Data, Central Tendency, and Normalcy

In exploring some old data recently to see if it was publishable, I began to contemplate the things we take for granted in scientific studies. Statistics are so commonly used today that any paper is expected to have a smattering of tests and the all-important p-values. (The incorrect use of alpha as interchangeable with p-value is an irksome issue but one that has been explored by others.) Canned programs like Systat, SPSS, and even SAS or Minitab enable quick data analysis, which has revolutionized many fields of study. As the scientific community continues to grow and the pressure to get grants and publish increases, we should be more concerned about the level of statistical knowledge our peers and students have. Knowledge of the software is not knowledge of statistical theory. I would hazard an argument that the latter is more important than the former.

Take, for instance, the issue of sample size. In some fields, small sample sizes are all that is available. Take paleontology as one of many examples. You can’t create fossils. Yet a sample size of at least 30 is the usual rule of thumb for approximating normalcy and relying on the Central Limit Theorem. I talked about normalcy in the socio-cultural sense before but, here, normalcy refers to a distribution of data. Without a normal distribution (or a tendency toward it, at the very least), parametric tests cannot be trusted to estimate population parameters accurately. While non-parametric statistical analysis has developed tremendously throughout the years, it is often viewed as a weak version of the more powerful multivariate analysis. How many people take the time to test whether their data are normally distributed before they run their factor analysis, discriminant function, time regression, and so on?
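To make the rule of thumb concrete, here is a minimal sketch of a simulation, not a formal test. It assumes a deliberately skewed population (an exponential distribution, chosen purely for illustration) and asks how lopsided the distribution of sample means still is at different sample sizes. The function names, the number of replicates, and the seed are all my own illustrative choices.

```python
import random
import statistics


def skewness(values):
    """Population skewness: third central moment over the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def skew_of_sample_means(n, replicates=10_000, seed=1):
    """Draw `replicates` samples of size n from an exponential population
    (an assumed, strongly right-skewed example) and return the skewness
    of the resulting sample means. A value near 0 suggests symmetry."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]
    return skewness(means)


if __name__ == "__main__":
    for n in (5, 30, 100):
        result = skew_of_sample_means(n)
        print(f"n={n:>3}  skewness of sample means = {result:.2f}")
```

For this particular population the skewness of the sample mean is 2 divided by the square root of n in theory, so it shrinks steadily but is not zero at n = 30. The number 30 is a convention, not a threshold, and how well it works depends on how far the underlying data depart from normal.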

There are many ways to analyze data, and statistics are a powerful tool, but every single test has a tremendous number of assumptions and rules. We can’t reinvent the statistics wheel with each study, but how do we know that the authors of each published study have contemplated the underlying assumptions of the tests they used and whether those are the most appropriate tests? We don’t. Quite simply, we rely on three things. First, we trust that our colleagues adhere to professional ethical standards. Second, in cases of honest error, we trust the peer review process to identify problems. Lastly, if the other two fail, the scientific method, with its specific emphasis on replicating results, is the ultimate failsafe. Unless the topic is a hot one, that ultimate failsafe may take time, lots of time.

What do we do before results are replicated? Think critically. Treat everything that is published as if one is the peer reviewer. Any gaps in reporting or unanswered questions should raise flags that perhaps the study at hand may not be the best one to cite in one’s own paper without some discussion of the problems.

Thinking critically requires extra time, precious extra time. But I think it produces a better, stronger community that fosters greater attention to detail and better results overall. I would like to see more of it, as opposed to sloppy lit reviews, hurried and often trite, uncritical discussions of methods, selective presentation of results that disguises true problems with the study, and overblown conclusions.
