Applied Statistics Notes

Snippets from http://www.sportsci.org/resource/stats/

RMSE: "Here's an example. Suppose you have heights for a group of females and males. If you analyze the data without regard to the sex of the subjects, the measure of spread you get will be the total variation. But stats programs can take into account the sex of each subject, work out the means for the boys and the girls, then derive a single SD that will do for the boys and the girls. That single SD is the RMSE. Yes, you can also work out the SDs for the boys and girls separately, but you may need a single one to calculate effect sizes. You can't simply average the SDs."
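A minimal sketch of the "single SD" idea in the quote above, assuming it corresponds to the pooled within-group SD (residuals taken about each group's own mean, divided by the pooled degrees of freedom). The function name `pooled_sd` and the height values are made up for illustration and do not come from the source.

```python
import math
import statistics


def pooled_sd(groups):
    """RMSE of a one-way model: spread of values about their own group mean."""
    ss_within = 0.0
    degrees_of_freedom = 0
    for values in groups:
        group_mean = statistics.fmean(values)
        ss_within += sum((v - group_mean) ** 2 for v in values)
        degrees_of_freedom += len(values) - 1
    return math.sqrt(ss_within / degrees_of_freedom)


# Hypothetical heights in cm, invented for this example only.
females = [160.0, 165.0, 170.0]
males = [170.0, 178.0, 180.0, 184.0]

total_sd = statistics.stdev(females + males)   # ignores sex: total variation
rmse = pooled_sd([females, males])             # one SD that serves both groups
mean_of_sds = (statistics.stdev(females) + statistics.stdev(males)) / 2

print(f"total SD        = {total_sd:.3f}")
print(f"pooled SD (RMSE) = {rmse:.3f}")
print(f"average of SDs   = {mean_of_sds:.3f}")
```

With unequal group sizes the pooled SD and the plain average of the two SDs come out different, which is one way to see why "you can't simply average the SDs".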

Log Transformations: http://www.sportsci.org/resource/stats/logtrans.html "Log transform? No, the standard deviations would need to be bigger for more training. Rank transform? Yes, non-parametric analysis is called for here. Just rank the entire column of data for training, then do the analysis as usual." http://www.sportsci.org/resource/stats/twoanova.html
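A rough sketch of "rank the entire column", assuming tied values should share the average of the ranks they span (a common convention, not something the quote specifies). The helper `rank_transform` and the `training_hours` values are hypothetical.

```python
def rank_transform(values):
    """Replace each value by its rank (1 = smallest); ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of positions holding the same value.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


# Hypothetical training column, invented for this example only.
training_hours = [2.0, 40.0, 3.5, 2.0, 7.0]
training_ranks = rank_transform(training_hours)
print(training_ranks)
```

The ranked column then replaces the raw column in whatever analysis was planned, so the one extreme value (40.0 here) no longer dominates.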

Beginner Non-Parametric Stats: http://www.sportsci.org/resource/stats/nonparms.html

Beginner Logistic Regression, and other exotic dependents: http://www.sportsci.org/resource/stats/ordinal.html http://www.sportsci.org/resource/stats/counts.html

Quotes re Models: "The main trend with experience is linear, and we want to know about the differences in the slopes, so we need a full ANCOVA model:

attitude <= sport experience sport*experience

And finally, there is curvature for at least one sport, so we need to fit a quadratic term overall, and a quadratic term that might differ between the two sports. The way to do that is to include the quadratic term as a main effect and as an interaction with sport. So here's the full model:

attitude <= sport experience sport*experience experience^2 sport*experience^2

The p value for sport*experience^2 tells you whether any difference in the curvature for the two sports is statistically significant. Once again you express this difference as a contribution to the overall R^2 for the model, as described for the simpler example above." http://www.sportsci.org/resource/stats/polynomial.html
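A sketch of the full model from the quote above, written as an ordinary least-squares fit with NumPy. It assumes two sports coded as a 0/1 dummy variable; the function `design_matrix`, the coefficient values, and the data are all invented for illustration. It only recovers the coefficients. The p value for the interaction term would need a statistics package.

```python
import numpy as np


def design_matrix(sport, experience):
    """Columns: intercept, sport, experience, sport*experience,
    experience^2, sport*experience^2."""
    sport = np.asarray(sport, dtype=float)            # 0/1 dummy for the two sports
    experience = np.asarray(experience, dtype=float)
    squared = experience ** 2
    return np.column_stack([
        np.ones_like(experience),
        sport,
        experience,
        sport * experience,   # difference in slope between sports
        squared,              # overall curvature
        sport * squared,      # difference in curvature between sports
    ])


# Synthetic, noise-free data generated from made-up coefficients.
experience = np.tile(np.arange(1.0, 9.0), 2)   # 8 experience levels per sport
sport = np.repeat([0.0, 1.0], 8)
true_coefs = np.array([10.0, 2.0, 1.5, -0.5, -0.1, 0.05])

X = design_matrix(sport, experience)
attitude = X @ true_coefs

coefs, *_ = np.linalg.lstsq(X, attitude, rcond=None)
print(np.round(coefs, 3))
```

The last coefficient is the one that corresponds to sport*experience^2, i.e. how much the curvature differs between the two sports.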

Sample Size: http://www.sportsci.org/resource/stats/samplesize.html

Don't Understand: "We don't have the interaction term athlete*time in the model, partly because athlete is a random effect, and partly because we would need multiple measurements for subjects at the pre and post time points for the interaction term to make any sense. Let's leave aside this complexity." http://www.sportsci.org/resource/stats/repanova.html#rmslide

From enfascination