How would this be analyzed?
DarkProtoman - 24 Oct 2008 21:53 GMT
How would this study be analyzed:
It's a single-blinded RCT to evaluate the effect of a special diet on 7 parameters, vs two other diets, the DASH diet and the USDA Food Pyramid diet. The sample is 75 patients with risk factors for coronary artery disease. Results will be considered statistically significant if p < .01.
Would a two-tailed MANOVA be good to analyze this? What's "p-rep"?
RichUlrich - 25 Oct 2008 02:49 GMT
>How would this study be analyzed:
> [quoted text clipped - 5 lines]
>
>Would a two-tailed MANOVA be good to analyze this? What's "p-rep"?
What is a "two-tailed MANOVA"? Are the "7 parameters" being treated as replications or parallel measures? Where do you get "p-rep" that causes you to ask about it? ("p" could be a count of "replications", if those are your 7.)
The study has more power if all 3 diets are tried on all 75 patients (a crossover design), with tests performed on large, short-term effects. But the default reading for me is to expect 3 groups with 25 each....
There's a reason that most simple trials have only two groups -- you have poor enough power with two groups when you don't know what you are looking for, and the power is worse with three.
Having "7 parameters" is obviously a tougher condition than having only one or two.
MANOVA will give you a test of an overall difference (or several), composed of any complex pattern of the 7 "parameters". Yes, this 3-groups discriminant function is the way to test that hypothesis. That's what MANOVA reduces to, for this model. Seven separate tests, with Bonferroni correction, are enormously more definitive about the parameters separately, if "separate" is how you see them.
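To make the contrast concrete, here is a rough R sketch of both routes on simulated data. Everything in it is an assumption made for illustration: the 3 x 25 layout follows the "default reading" above, the outcome names `param1` to `param7` are placeholders, and the numbers are random draws, not trial data.

```r
# Simulated stand-in for the trial: 3 diets x 25 patients, 7 outcomes.
# All values are made up for illustration.
set.seed(1)
n_per_group <- 25
diet <- factor(rep(c("special", "DASH", "USDA"), each = n_per_group))
y <- matrix(rnorm(3 * n_per_group * 7), ncol = 7,
            dimnames = list(NULL, paste0("param", 1:7)))
# Give the special diet a shift on the first two outcomes only.
y[diet == "special", 1:2] <- y[diet == "special", 1:2] + 1

# Route 1: one omnibus multivariate test across all 7 outcomes.
fit <- manova(y ~ diet)
summary(fit, test = "Pillai")

# Route 2: 7 separate one-way ANOVAs, Bonferroni-corrected.
p_raw <- apply(y, 2, function(col) {
  anova(lm(col ~ diet))[["Pr(>F)"]][1]
})
p_bonf <- p.adjust(p_raw, method = "bonferroni")
round(cbind(p_raw, p_bonf), 4)
```

The omnibus test answers "is there any difference in some combination of the 7 outcomes", while the corrected univariate p-values speak to each outcome by name.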
Shaving the 7 hypotheses down to 2 or 3 a priori would be good, too, for the sake of statistical power and clarity of thinking. Drop from your primary hypotheses the ones that are not needed to justify the cost of the study.
 Signature Rich Ulrich
Peter / Labo - 25 Oct 2008 17:59 GMT
Hey there,
A MANOVA sounds reasonable (although I don't know what a two-tailed MANOVA is) as an omnibus "global" test that helps guard against both false positives and false negatives for anything that you want to test afterwards. Except for showing that there is "something" "somewhere" in your data, the bare MANOVA result will be relatively useless, which is why you will do an ANOVA for each parameter afterwards. As your hypotheses are probably rather specific and the design is simple, you can replace the ANOVAs with planned-comparison t-tests. Just know that your ANOVA or contrast p-values will be correlated. After the ANOVAs you will test the pairwise comparisons that interest you. Whenever you do several tests at the same time, you might want to apply a multiple-comparison correction (like Bonferroni, but there are some more modern methods). If you do apply corrections, p<.01 would be considered relatively conservative in my field, but maybe it is not in yours.
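A possible shape for the per-parameter follow-up, sketched in R under stated assumptions: `ldl` is a hypothetical outcome standing in for one of the 7 parameters, the group means and SD are invented, and Holm is used only as one example of a "more modern" correction than plain Bonferroni.

```r
# One hypothetical outcome (ldl) for 3 diet groups of 25; numbers are invented.
set.seed(2)
diet <- factor(rep(c("special", "DASH", "USDA"), each = 25))
group_means <- c(special = 120, DASH = 130, USDA = 135)
ldl <- rnorm(75, mean = group_means[as.character(diet)], sd = 15)

# Univariate ANOVA for this one parameter.
summary(aov(ldl ~ diet))

# Pairwise t-tests between diets, with Holm-adjusted p-values.
pw <- pairwise.t.test(ldl, diet, p.adjust.method = "holm")
pw$p.value
```

With three groups there are three pairwise comparisons per parameter, so the number of tests grows quickly once this is repeated for all 7 parameters.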
p-rep is the probability of _qualitative_ replication of what you have found if you did the exact same experiment again. For example, if you find that your special diet is better than the DASH diet on some parameter, a common p-value will tell you the probability of obtaining a result at least this extreme by chance alone, _if_ there is really no mean effect. A replication p, on the other hand, will tell you (approximately) how likely it is that you would again find the new diet better than the DASH diet if you ran the same experiment. I believe that, in the two-sided comparison case, an alpha of 0.01 corresponds to a p_rep of about 0.97. That is very high, but it is really an upper bound, as this is calculated without taking into account the factors that do change from replication to replication (as "exact replication" is impossible). Try pasting the following code at an Rweb server (for example http://rweb.stat.umn.edu/Rweb/Rweb.general.html ) to test what you get as p_rep for other significance levels.
    p_value=0.01
    pnorm(qnorm(1-p_value/2)/sqrt(2))
There is more info on p_rep on Wikipedia (http://en.wikipedia.org/wiki/P-rep) and a much more accurate account in the manuscript/publication of Peter R. Killeen: An Alternative to Null-Hypothesis Significance Tests (which I found at http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1473027 ).
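For trying several significance levels at once, the same formula can be wrapped in a small function. This is only a convenience sketch: the name `p_rep` and the list of alpha levels are choices made here, and the formula is the one quoted in this post, taking a two-sided p-value.

```r
# p_rep from a two-sided p-value, using the same formula as quoted in the post.
p_rep <- function(p_two_sided) {
  pnorm(qnorm(1 - p_two_sided / 2) / sqrt(2))
}

# Tabulate p_rep for a few conventional significance levels.
alphas <- c(0.10, 0.05, 0.01, 0.001)
data.frame(alpha = alphas, p_rep = round(p_rep(alphas), 3))
```

For a two-sided p of 0.01 this gives roughly the 0.97 mentioned above, and a p-value of 1 maps to 0.5, i.e. a coin flip on the sign of the replication.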
Hope this helps,
Peter.
Peter / Labo - 25 Oct 2008 18:01 GMT
I'm desperately trying to post a reply on Google Groups, and for some reason I'm not managing to do so. Read what I wrote at http://mathforum.org/kb/message.jspa?messageID=6474135&tstart=0 . It will tell you what I think about MANOVA and what p-rep is.
Bruce Weaver - 25 Oct 2008 15:20 GMT
>> How would this study be analyzed:
>> [quoted text clipped - 14 lines]
> in your data, the bare manova result will be relatively useless, which
> is why you will do an ANOVA for each parameter afterwards.
--- snip ---
I profess no expertise in multivariate statistics. However, I just read this in Dave Garson's online stats notes:
"Multiple univariate ANOVAs should not be used to "follow up" on a significant MANOVA. A significant MANOVA means that groups differ significantly on the canonical variates. Put another way, examining group differences is best done by examining the canonical variates resulting from MANOVA. Examining group differences in terms of univariate ANOVAs involves a loss of power and increased chance of Type I error. See Huberty & Morris (1989). In general, MANOVA is conducted when it is expected that the resulting canonical variates reflect some construct differentiating the groups. If, however, it is known that the dependent variables are conceptually unrelated, then MANOVA may be inappropriate and univariate ANOVAs useful. Huberty & Morris (1989: 301) recommend MANOVA over multiple ANOVAs when the purpose is "to identify outcome variable system constructs, to select variable subsets, [or] to determine variable relative worth.""
from: http://faculty.chass.ncsu.edu/garson/PA765/manova.htm
Huberty, Carl J. & Morris, John D. (1989). Multivariate analysis versus multiple univariate analyses. Psychological Bulletin, 105(2), 302-308.
I'd be interested in comments from folks who are well-versed in MANOVA/DFA.
Cheers, Bruce
 Signature Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM."
Peter - 25 Oct 2008 20:21 GMT
Hey Bruce,
Well, I mostly based myself on (a translation of) Hair, Anderson, Tatham & Black (1998), Multivariate Data Analysis, 5th edition. In their examples in Chapter 6 they follow up on significant MANOVA p-values with univariate tests, in a stage that is very appropriately called "Interpretation of the results". In fact, I have to be a bit more specific: they report the results of common univariate F-tests followed by univariate "stepdown F" tests (Roy-Bargmann): the univariate F-values are ordered by significance, and the F-tests are then done sequentially, using the more significant dependent variables as covariates.
As for David Garson's recommendations, I would only look into canonical variates if the individual F-tests do not yield any significant result. I suppose that the situation of diet comparison corresponds more to what Huberty & Morris call conceptually independent variables. This is of course a highly subjective evaluation, but I am unconvinced by what I read in Huberty & Morris about the uselessness of MANOVA before multiple ANOVAs. They put a lot of stress on the non-interpretability of the MANOVA canonical variate for problems involving conceptually independent variables, but, in my opinion, they do not make a very strong case against using MANOVA to control Type I error when those variables are possibly correlated. They write: "The idea that one completely controls for Type I error probability by first conducting an overall MANOVA is open to question (Bird & Hadzi-Pavlovic, 1983; Bray & Maxwell, 1982, p. 343), because the alpha value for each ANOVA would be less than or equal to the alpha employed for the MANOVA only when the MANOVA null hypothesis is true. This notion does not have convincing empirical support in a MANOVA-ANOVAS context (Wilkinson, 1975), the Hummel and Sligo (1971) and Hummel and Johnston (1986) studies notwithstanding." (p. 303)
The formulations "open to question", "not convincing" and "notwithstanding" do not support the conviction with which they write: "We consider to be a myth the idea that one is controlling Type I error probability by following a significant MANOVA test with multiple ANOVA tests, each conducted using conventional significance levels." (p. 307)
I know about the problems of multiple correlated F-tests, but that is why we have correction procedures such as step-down Bonferroni, to control the increased Type I error rate. The reverse problem, that of power, only becomes an issue for me if there are indeed no significant univariate F-results after a significant MANOVA. If this happens, the MANOVA is diagnostic in the sense that it helps to discriminate between the case where there is nothing to be found and the case where one should switch to "explorative mode" and combine variables, based on the MANOVA output, the correlation matrix, or more conceptually sensible ways to identify constructs (physiological states?) that the variables measure indirectly.
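As a small illustration of the correction step, the sketch below compares plain Bonferroni with the step-down (Holm) variant using R's `p.adjust`. The seven raw p-values are invented for the example and are assumed to come from seven univariate F-tests; reading "stepdown bonferroni" as Holm's method is also an assumption about what was meant.

```r
# Hypothetical raw p-values from 7 univariate F-tests (invented for illustration).
p_raw <- c(0.001, 0.004, 0.012, 0.03, 0.20, 0.45, 0.80)

# Plain Bonferroni: every p-value is multiplied by the number of tests.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Holm step-down: the multiplier shrinks as one moves down the ordered p-values.
p_holm <- p.adjust(p_raw, method = "holm")

data.frame(p_raw, p_bonf, p_holm)
```

Holm-adjusted p-values are never larger than the Bonferroni ones, so the step-down version controls the same familywise error rate while giving up less power.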
Peter.
PS Sorry about that large number of posts with the same contents - I tried to post several times to google groups but my post just did not show up in google, while it _is_ showing up in mathforum, as I discovered afterwards!
RichUlrich - 25 Oct 2008 23:21 GMT
>>> How would this study be analyzed:
>>> [quoted text clipped - 32 lines]
>unrelated, then MANOVA may be inappropriate and univariate ANOVAs
>useful.
[break]
This is close to what I said already -- If you want to look especially at the single variables, MANOVA is probably not appropriate even for an overall test because it wastes power. I'm less hostile to what you may do as follow-up, owing to the fact that MANOVA can be a puzzle to interpret if the effects are not really strong.
> Huberty & Morris (1989: 301) recommend MANOVA over
>multiple ANOVAs when the purpose is "to identify outcome variable
>system constructs, to select variable subsets, [or] to determine
>variable relative worth."
And -- "variable relative worth" is always problematic. Consider that multiple regression is a simple subset of the general MANOVA, canonical-correlation model. MANOVA is just like multiple regression, with more complication since there are several roots, and thus, several canonical regression equations.
>from: http://faculty.chass.ncsu.edu/garson/PA765/manova.htm
> [quoted text clipped - 4 lines]
>I'd be interested in comments from folks who are well-versed in
>MANOVA/DFA.
 Signature Rich Ulrich
Ray Koopman - 25 Oct 2008 19:02 GMT
> [...] What's "p-rep"?
P-rep (or p_rep) purports to be an estimate of the probability that an effect would be significant if the experiment were replicated. http://en.wikipedia.org/wiki/P-rep gives a fair summary.
Peter - 25 Oct 2008 20:40 GMT
> On Oct 24, 1:53 pm, DarkProtoman
> <Protoman2...@gmail.com> wrote:
[quoted text clipped - 6 lines]
> http://en.wikipedia.org/wiki/P-rep gives a fair
> summary.
I feel I need to correct this... It is not the (estimate of) probability that it would be significant in a replication, but that it would have the same sign, significant or not.
Ray Koopman - 28 Oct 2008 00:26 GMT
>> On Oct 24, 1:53 pm, DarkProtoman
>> <Protoman2...@gmail.com> wrote:
[quoted text clipped - 8 lines]
>
> I feel I need to correct this... It is not the (estimate of) probability that it would be significant in a replication, but that it would have the same sign, significant or not.
You're right. I stand corrected.
RichUlrich - 30 Oct 2008 22:58 GMT
>> On Oct 24, 1:53 pm, DarkProtoman
>> <Protoman2...@gmail.com> wrote:
[quoted text clipped - 10 lines]
> probability that it would be significant in a replication, but that
> it would have the same sign, significant or not.
About p-rep: it is the probability that a replication will be in the same direction.
That is good to know. The replication has to be (only) in the same direction. That does seem like an explanation of the original p-value that might be helpful to some people, for the two-group problem.
For the user's problem where he is suggesting MANOVA, I don't see how p-rep could be relevant. The same would be true for ANOVA with more than 2 groups, it would seem. Unless there is some theoretical extension?
Are there special rules for where the numerator d.f. of the F-test is greater than 1?
 Signature Rich Ulrich