Applied Statistics Notes
From enfascination
Snippets from http://www.sportsci.org/resource/stats/
RMSE: "Here's an example. Suppose you have heights for a group of females and males. If you analyze the data without regard to the sex of the subjects, the measure of spread you get will be the total variation. But stats programs can take into account the sex of each subject, work out the means for the boys and the girls, then derive a single SD that will do for the boys and the girls. That single SD is the RMSE. Yes, you can also work out the SDs for the boys and girls separately, but you may need a single one to calculate effect sizes. You can't simply average the SDs."
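As a rough illustration of the quote above, here is a minimal sketch of the single SD obtained after fitting a separate mean for each group (the pooled within-group SD). The height values and the names `pooled_sd` and `heights` are made up for this example and do not come from the source page.

```python
from math import sqrt
from statistics import mean, stdev

def pooled_sd(groups):
    """Single SD after fitting one mean per group (root mean square error)."""
    ss_within = 0.0   # sum of squared deviations from each group's own mean
    df = 0            # residual degrees of freedom: n minus number of groups
    for values in groups.values():
        m = mean(values)
        ss_within += sum((v - m) ** 2 for v in values)
        df += len(values) - 1
    return sqrt(ss_within / df)

# Hypothetical heights in cm, invented for illustration only.
heights = {
    "female": [160.0, 165.0, 170.0],
    "male": [172.0, 178.0, 184.0],
}

rmse = pooled_sd(heights)

# Total variation: SD computed without regard to sex.
total_sd = stdev([v for values in heights.values() for v in values])

# Naive average of the two separate SDs, shown only for comparison.
mean_of_sds = mean(stdev(values) for values in heights.values())

print(rmse, total_sd, mean_of_sds)
```

In this sketch the pooled value is the square root of the weighted average of the group variances, which is why it differs from the plain average of the two SDs. The gap grows when the groups differ more in size or spread.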
Log Transformations: http://www.sportsci.org/resource/stats/logtrans.html "Log transform? No, the standard deviations would need to be bigger for more training. Rank transform? Yes, non-parametric analysis is called for here. Just rank the entire column of data for training, then do the analysis as usual." http://www.sportsci.org/resource/stats/twoanova.html
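The rank transform mentioned in the quote can be sketched as follows. This is one common convention (1-based ranks, with ties given the average of their ranks), not necessarily what the source page's software does. The `training_hours` values and the function name `rank_transform` are invented for illustration.

```python
def rank_transform(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical, strongly skewed training data (hours), for illustration only.
training_hours = [2, 40, 5, 5, 3, 120]
ranks = rank_transform(training_hours)
print(ranks)
```

The ranked column would then replace the raw training values in whatever analysis was planned.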
Beginner Non-Parametric Stats: http://www.sportsci.org/resource/stats/nonparms.html
Beginner Logistic Regression, and other exotic dependents: http://www.sportsci.org/resource/stats/ordinal.html http://www.sportsci.org/resource/stats/counts.html
Quotes re Models:
"
The main trend with experience is linear, and we want to know about the differences in the slopes, so we need a full ANCOVA model:
attitude <= sport experience sport*experience
And finally, there is curvature for at least one sport, so we need to fit a quadratic term overall, and a quadratic term that might differ between the two sports. The way to do that is to include the quadratic term as a main effect and as an interaction with sport. So here's the full model:
attitude <= sport experience sport*experience experience² sport*experience²
The p value for sport*experience² tells you whether any difference in the curvature for the two sports is statistically significant. Once again you express this difference as a contribution to the overall R² for the model, as described for the simpler example above." http://www.sportsci.org/resource/stats/polynomial.html
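To make the terms of the full model in the quote concrete, here is a small sketch of how each term could map to a column of a design matrix, assuming sport is coded as a 0/1 dummy variable and the quadratic term is experience squared. The coefficients are hypothetical placeholders, not estimates from any data, and the names `design_row`, `predict` and `curvature` are invented for this example. No model fitting is done here.

```python
def design_row(sport, experience):
    """Columns for: intercept, sport, experience, sport*experience,
    experience squared, sport*experience squared. `sport` is 0 or 1."""
    return [
        1.0,
        sport,
        experience,
        sport * experience,
        experience ** 2,
        sport * experience ** 2,
    ]

def predict(coefs, sport, experience):
    """Predicted attitude for one subject under the full model."""
    return sum(b * x for b, x in zip(coefs, design_row(sport, experience)))

def curvature(coefs, sport):
    """Second difference of the fitted curve; for a quadratic this equals
    twice the coefficient on experience squared for that sport."""
    return (predict(coefs, sport, 2)
            - 2 * predict(coefs, sport, 1)
            + predict(coefs, sport, 0))

# Hypothetical coefficients in the same order as the design columns.
coefs = [3.0, 0.5, 0.4, -0.1, -0.02, 0.01]

curv_sport0 = curvature(coefs, 0)
curv_sport1 = curvature(coefs, 1)
print(curv_sport0, curv_sport1, curv_sport1 - curv_sport0)
```

Under this coding, the difference in curvature between the two sports depends only on the last coefficient, which is the quantity the quoted p value refers to.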
Sample Size: http://www.sportsci.org/resource/stats/samplesize.html
Don't Understand: "We don't have the interaction term athlete*time in the model, partly because athlete is a random effect, and partly because we would need multiple measurements for subjects at the pre and post time points for the interaction term to make any sense. Let's leave aside this complexity." http://www.sportsci.org/resource/stats/repanova.html#rmslide