Add confidence interval for MWU #226
raphaelvallat#225 Implemented the CI from 'Calculating confidence intervals for some non-parametric analyses' (Campbell and Gardner, 1988). The CI style is adapted from ttest. The same publication offers a solution for the Wilcoxon test, which is not yet implemented but could be added fairly easily.
Codecov Report
@@           Coverage Diff           @@
##           master     #226   +/-   ##
=======================================
  Coverage   98.99%   99.00%
=======================================
  Files          19       19
  Lines        3290     3304    +14
  Branches      527      531     +4
=======================================
+ Hits         3257     3271    +14
  Misses         17       17
  Partials       16       16
pingouin/nonparametric.py
Outdated
conf = confidence
N = scipy.stats.norm.ppf(conf)
ct1, ct2 = len(x),len(y) # count samples
diffs = sorted([i-j for i in x for j in y]) # get ct1xct2 difference
@kschuerholt could we use a numpy function / numpy broadcasting here to avoid the nested for loop?
Sure, that's easy enough. I'll add it in a new commit promptly.
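For reference, one way the nested loop could be replaced with NumPy broadcasting is sketched below. The helper name `pairwise_differences` is hypothetical and not part of Pingouin; it only assumes that `x` and `y` are one-dimensional numeric samples.

```python
import numpy as np


def pairwise_differences(x, y):
    """Return all len(x) * len(y) differences x_i - y_j, sorted ascending.

    Hypothetical helper for illustration only, not a Pingouin function.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # x[:, None] has shape (n_x, 1) and y has shape (n_y,), so the
    # subtraction broadcasts to an (n_x, n_y) matrix of differences.
    return np.sort((x[:, None] - y).ravel())
```

`np.subtract.outer(x, y)` would give the same matrix; either form avoids building a Python list of `len(x) * len(y)` elements one at a time.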
MWU   97.0   two-sided  0.00556  0.515  0.2425
>>> pg.mwu(x, y, alternative='two-sided',confidence=0.95)
     U-val alternative    p-val    RBC    CLES                                         CI95%
MWU   97.0   two-sided  0.00556  0.515  0.2425  [-0.39290395101879694, -0.09400270319896187]
Is this the actual output that you get? The CI should normally be rounded to two decimals by the _postprocess_dataframe function
That's the actual output I get. I was wondering about that, too. But then again, the t-test also gives me full floats (at least when confidence != 0.95), so I thought that was intentional.
I can of course round it in MWU, or do you want to address that elsewhere?
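If the rounding were done inside the function itself, a minimal sketch could look like the following. It only shows `np.round` applied to the two bounds from the example output in this thread; where this would sit relative to Pingouin's own post-processing is an assumption, not something this thread settles.

```python
import numpy as np

# Bounds copied from the example output discussed in this thread.
ci = np.array([-0.39290395101879694, -0.09400270319896187])

# Round both bounds to two decimals before storing them in the CI column.
ci_rounded = np.round(ci, 2)
print(ci_rounded)  # [-0.39 -0.09]
```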
pingouin/nonparametric.py
Outdated
@@ -222,16 +225,20 @@ def mwu(x, y, alternative='two-sided', **kwargs):
Association and the American Statistical Association, 25(2),
101–132. https://doi.org/10.2307/1165329


.. [5] Campbell, M. J. & Gardner, M. J. (1988). Calculating confidence
Could you add in the "Notes" section a one line explanation of the CI method?
Sure, I'll give that a go. Like I said, I'm not a statistician, so that'll have to be proofread by someone.
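As a starting point for that note, here is a rough sketch of the method as the code in this PR applies it: sort all pairwise differences, compute a rank `k` from a normal quantile, and take the k-th smallest and k-th largest difference as the bounds. The function name `mwu_ci_sketch` is hypothetical. One assumption differs from the snippet under review: this sketch uses the `(1 + confidence) / 2` quantile, which is the usual choice for a two-sided interval, whereas the PR passes `confidence` to `norm.ppf` directly. This would need checking against the paper.

```python
import numpy as np
from scipy.stats import norm


def mwu_ci_sketch(x, y, confidence=0.95):
    """Approximate CI for the median of all pairwise differences x_i - y_j.

    Illustrative sketch only, not the Pingouin implementation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    # All n1 * n2 differences, sorted ascending.
    diffs = np.sort((x[:, None] - y).ravel())
    # Assumption: two-sided interval, hence the (1 + confidence) / 2 quantile.
    z = norm.ppf((1 + confidence) / 2)
    k = int(round(n1 * n2 / 2 - z * np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)))
    k = max(k, 1)  # guard against very small samples where k would drop below 1
    # k-th smallest and k-th largest difference (1-based rank k).
    lower = diffs[k - 1]
    upper = diffs[n1 * n2 - k]
    return np.array([lower, upper])
```

The normal approximation for `k` is only reasonable for moderately large samples; very small groups would need exact tabulated values instead.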
pingouin/nonparametric.py
Outdated
N = scipy.stats.norm.ppf(conf)
ct1, ct2 = len(x),len(y) # count samples
diffs = sorted([i-j for i in x for j in y]) # get ct1xct2 difference
k = int(round(ct1*ct2/2 - (N * (ct1*ct2*(ct1+ct2+1)/12)**0.5)))
Please make sure that the code follows the flake8 guidelines, i.e. there must be whitespace around arithmetic operators.
Sorry about that. I was editing the file on the fly in GitHub directly, and there is no auto linting/formatting there yet, unfortunately. The next commit will be formatted accordingly.
Hi @kschuerholt, FYI I have just released a minor release of Pingouin (https://github.com/raphaelvallat/pingouin/releases/tag/v0.5.1) to fix some urgent dependency bugs. Could you please make sure to update the PR to the new master and solve any conflicts that may arise? Thank you,
Thanks for the heads-up. It's still on the todo list, but currently other things have to come first. I'm trying to get hold of an original source for CI computation of nonparametric tests. Or did you find something? Cheers,