en:suppl_vars

en:suppl_vars [2019/03/16 06:14] David Zelený [Supplementary variables (unconstrained ordination)] |
en:suppl_vars [2019/03/16 06:20] (current) David Zelený [Multiple testing issue and available corrections] |

==== Multiple testing issue and available corrections ====

The more significance tests we conduct, the higher the chance of observing a significant result even if the null hypothesis is true (no relationship). This is called the //multiple testing issue// and can be illustrated with a simple example. I generated two random variables with a normal distribution, calculated their regression, and tested it (using a parametric F-test). One would expect the test not to return a significant result, since the variables are generated randomly. But if I repeat this 100 times (<imgref multiple-testing>), some of the results turn out to be significant. The proportion of significant results depends on the threshold used to deem a result significant; e.g., if you consider results with a P-value lower than 5% (alpha = 0.05) as significant, then about 5% of the tests are expected to be significant even though the variables are random (Type I error).
Put another way, the probability that at least one of m independent tests will be significant at P < alpha is 1 - (1 - alpha)<sup>m</sup>, where m is the number of tests. This is called the //family-wise Type I error rate//: the probability of making at least one Type I error if we interpret the results of multiple tests without any correction.
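
To get a feeling for how quickly the family-wise Type I error rate grows, it can be computed directly as 1 - (1 - alpha)<sup>m</sup>, where m is the number of tests. The following is only a minimal sketch in Python; the function name ''family_wise_error_rate'' is chosen here for illustration, and the formula assumes that the tests are independent.

```python
def family_wise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one Type I error among m tests.

    Assumption: the m tests are independent and each is run at level alpha.
    """
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    alpha = 0.05
    # Print the family-wise error rate for a growing number of tests.
    for m in (1, 5, 10, 20, 100):
        rate = family_wise_error_rate(alpha, m)
        print(f"m = {m:3d}  family-wise error rate = {rate:.3f}")
```

With alpha = 0.05, the rate is already about 0.40 for 10 tests and about 0.99 for 100 tests, so at least one "significant" result is close to certain.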

<imgcaption multiple-testing|Multiple testing issue. I generated two random variables (normally distributed) and tested the significance of their regression with a parametric F-test. I replicated this 100 times, each time with newly generated random variables. Significant regressions (P < 0.05) are displayed with a red regression line. Of the 100 analyses, four are significant at the level of 0.05 (4%, close to the expected 5%).>{{ :obrazky:multiple-testing-issue.jpg?direct |}}</imgcaption>
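
The experiment in the figure can be imitated with a short simulation. This is only a sketch under a simplifying assumption, not the code used to draw the figure: instead of fitting regressions, each P-value is drawn from a uniform distribution on (0, 1), which is how the P-value of an exact test behaves when the null hypothesis is true. The function names, the number of runs and the seed are hypothetical choices made for illustration.

```python
import random


def count_significant(n_tests: int, alpha: float, rng: random.Random) -> int:
    """Count tests with P < alpha when every null hypothesis is true.

    Assumption: each P-value is drawn from Uniform(0, 1); this stands in
    for the regression F-test, no regression is actually fitted here.
    """
    return sum(rng.random() < alpha for _ in range(n_tests))


def simulate(n_runs: int = 10_000, n_tests: int = 100,
             alpha: float = 0.05, seed: int = 1) -> tuple[float, float]:
    """Repeat the 100-test experiment n_runs times.

    Returns the mean number of significant tests per run and the share
    of runs with at least one significant test.
    """
    rng = random.Random(seed)
    counts = [count_significant(n_tests, alpha, rng) for _ in range(n_runs)]
    mean_count = sum(counts) / n_runs
    share_any = sum(c > 0 for c in counts) / n_runs
    return mean_count, share_any


if __name__ == "__main__":
    mean_count, share_any = simulate()
    print(f"mean number of significant tests per 100: {mean_count:.2f}")
    print(f"share of runs with at least one significant test: {share_any:.3f}")
```

On average about five of the 100 tests should come out as significant, so the four significant regressions in the figure are an unsurprising outcome, and almost every run should contain at least one significant test.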

en/suppl_vars.txt · Last modified: 2019/03/16 06:20 by David Zelený