Compute and plot predictions, slopes, marginal means, and comparisons (contrasts, risk ratios, odds ratios, etc.) for over 70 classes of statistical models in R. Conduct linear and non-linear hypothesis tests, as well as equivalence tests using the delta method.
Comparisons
Download Zip: https://urluss.com/2vGB8V
Compare the predictions made by a model for different regressor values (e.g., college graduates vs. others): contrasts, differences, risk ratios, odds, etc. comparisons(), avg_comparisons(), plot_comparisons().
In 2016, the impulse to draw comparisons to some of the worst episodes in European history may have been understandable and even useful. The future was opaque and elites were shaken by the election results. And there were strategic uses to such warnings. The horrors coming were likely, though no one knew their exact form. Sometimes, the sky does not fall in precisely the way the chickens fear, but it is still the right move to cluck.
This problem, that when you do multiple statistical tests, some fraction will be false positives, has received increasing attention in the last few years. This is important for such techniques as the use of microarrays, which make it possible to measure RNA quantities for tens of thousands of genes at once; brain scanning, in which blood flow can be estimated in 100,000 or more three-dimensional bits of brain; and evolutionary genomics, where the sequences of every gene in the genome of two or more species can be compared. There is no universally accepted approach for dealing with the problem of multiple comparisons; it is an area of active research, both in the mathematical details and broader epistomological questions.
The Bonferroni correction is appropriate when a single false positive in a set of tests would be a problem. It is mainly useful when there are a fairly small number of multiple comparisons and you're looking for one or two that might be significant. However, if you have a large number of multiple comparisons and you're looking for many that might be significant, the Bonferroni correction may lead to a very high rate of false negatives. For example, let's say you're comparing the expression level of 20,000 genes between liver cancer tissue and normal liver tissue. Based on previous studies, you are hoping to find dozens or hundreds of genes with different expression levels. If you use the Bonferroni correction, a P value would have to be less than 0.05/20000=0.0000025 to be significant. Only genes with huge differences in expression will have a P value that low, and could miss out on a lot of important differences just because you wanted to be sure that your results did not include a single false positive.
The Bonferroni correction and Benjamini-Hochberg procedure assume that the individual tests are independent of each other, as when you are comparing sample A vs. sample B, C vs. D, E vs. F, etc. If you are comparing sample A vs. sample B, A vs. C, A vs. D, etc., the comparisons are not independent; if A is higher than B, there's a good chance that A will be higher than C as well. One place this occurs is when you're doing unplanned comparisons of means in anova, for which a variety of other techniques have been developed, such as the Tukey-Kramer test. Another experimental design with multiple, non-independent comparisons is when you compare multiple variables between groups, and the variables are correlated with each other within groups. An example would be knocking out your favorite gene in mice and comparing everything you can think of on knockout vs. control mice: length, weight, strength, running speed, food consumption, feces production, etc. All of these variables are likely to be correlated within groups; mice that are longer will probably also weigh more, would be stronger, run faster, eat more food, and poop more. To analyze this kind of experiment, you can use multivariate analysis of variance, or manova, which I'm not covering in this textbook.
Salvatore Mangiafico's R Companion has a sample R programs for the Bonferroni, Benjamini-Hochberg, and several other methods for correcting for multiple comparisons.SASThere is a PROC MULTTEST that will perform the Benjamini-Hochberg procedure, as well as many other multiple-comparison corrections. Here's an example using the diet and mammographic density data from García-Arenzana et al. (2014).
Comparisons between strings and numbers using == and other non-strict comparison operators currently work by casting the string to a number, and subsequently performing a comparison on integers or floats. This results in many surprising comparison results, the most notable of which is that 0 == "foobar" returns true. This RFC proposes to make non-strict comparisons more useful and less error prone, by using a number comparison only if the string is actually numeric. Otherwise the number is converted into a string, and a string comparison is performed.
The current dogma in the PHP world is that non-strict comparisons should always be avoided, because their conversion semantics are rarely desirable and can easily lead to bugs or even security issues. The single largest source of bugs is likely the fact that 0 == "foobar" returns true. Quite often this is encountered in cases where the comparison is implicit, such as in_array() or switch statements. A classic example:
This is an unfortunate state of affairs, because the concept of non-strict comparisons is not without value in a language like PHP, which commonly deals with mixtures of numbers in both plain and stringified form. Considering 42 and "42" as the same value is useful in many contexts, in part also due to the implicit conversions performed by PHP (e.g. string array keys may be converted to integers). Additionally some constructs (such as switch) only support non-strict comparison natively.
Unfortunately, while the idea of non-strict comparisons has some merit, their current semantics are blatantly wrong in some cases and thus greatly limit the overall usefulness of non-strict comparisons.
This RFC intends to give string to number comparisons a more reasonable behavior: When comparing to a numeric string, use a number comparison (same as now). Otherwise, convert the number to string and use a string comparison. The following table shows how the result of some simple comparisons changes (or doesn't change) under this RFC:
This change to the semantics of non-strict comparisons is backwards incompatible. Worse, it constitutes a silent change in core language semantics. Code that worked one way in PHP 7.4 will work differently in PHP 8.0. Use of static analysis to detect cases that may be affected is likely to yield many false positives. 2ff7e9595c
Comments