Highest Rated Comments
beck1670 35 karma
Not OP, but the peer review process in many fields is currently addressing this!
The solution in academia is registered studies (see alltrials.org). You have to provide the exact analysis that you intend to do before you ever obtain the data, then you get funding to do the study. That way, you can't just go searching for spurious correlations.
Not sure how this would go for politics, but I'm sure there's a good modification of this process. Perhaps making the data available only on a password-protected server, and providing limited access through a portal once a plan of analysis is laid out?
Of course, this only works if politicians respect the process and call each other out for analyses that break the rules, which may be putting too much trust in politicians and political analysts.
beck1670 6 karma
Fair point! I also don't know how it can be applied, but perhaps a smarter person than me could figure it out.
Just to be clear, registered studies are a completely separate topic from peer review (registration can be part of the general review process, but doesn't have to be). By laying out the intended data analysis in advance, you let political opponents criticize any deviations from the plan as disingenuous. This is different from having somebody review your work post hoc for mistakes, bad arguments, etc.
beck1670 2 karma
Would you rather fight one Dr. Oz-sized Vani Hari or one hundred Vani Hari-sized Dr. Ozes?
beck1670 92 karma
I'm a statistician, and there are some very interesting ways of getting around this.
My favourite is: for each individual, remove the names and addresses (obviously). Find the summary statistics for the entire data set (e.g. the mean and variance of people's heights, along with how correlated height is with the other variables). Then, for each individual, generate 10 random sets of data from those summary statistics, keep the one closest to that individual's actual values, and throw out the rest.
The resulting data can never be tied back to a person, the process can be done completely transparently (the simulation procedure would be much more complicated than this, but can be released without compromising privacy), and the data can be released for re-evaluation. It can be proven that the simulated data will lead to the same results no matter what model you use to analyze it (assuming certain mathematical properties of the model).
Sorry I don't have a source handy, this is from memory of a conference presentation a couple years ago.
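For anyone who wants to play with the idea, here is a minimal sketch of the simplified procedure described above. It is not the method from the presentation: the multivariate normal model, the standardized Euclidean distance used for "closest", and the names `synthesize` and `n_candidates` are all assumptions made for illustration, and this toy version carries no privacy guarantee.

```python
import numpy as np


def synthesize(data, n_candidates=10, seed=0):
    """Replace each row with the closest of n_candidates simulated rows.

    Assumption: the data are modelled as multivariate normal, so the
    summary statistics are just the column means and the covariance matrix.
    Needs at least two numeric columns.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_rows = data.shape[0]

    # Summary statistics for the entire data set.
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    std = data.std(axis=0, ddof=1)

    # For each individual, draw n_candidates random rows from the summary
    # statistics. Resulting shape: (n_rows, n_candidates, n_columns).
    candidates = rng.multivariate_normal(mean, cov, size=(n_rows, n_candidates))

    # Distance from each candidate to the real individual, with every column
    # scaled by its standard deviation so no variable dominates.
    diffs = (candidates - data[:, None, :]) / std
    dist = np.linalg.norm(diffs, axis=2)

    # Keep the closest candidate for each individual, throw out the rest.
    best = dist.argmin(axis=1)
    return candidates[np.arange(n_rows), best]


# Made-up example data: correlated heights (cm) and weights (kg).
example_rng = np.random.default_rng(42)
heights = example_rng.normal(170, 10, size=200)
weights = 0.9 * heights - 85 + example_rng.normal(0, 8, size=200)
real = np.column_stack([heights, weights])

synthetic = synthesize(real)
print("real means:     ", real.mean(axis=0))
print("synthetic means:", synthetic.mean(axis=0))
```

Comparing the means (or the correlation) of `real` and `synthetic` gives a rough feel for how much of the overall structure survives, even though no row in `synthetic` is an actual person's record.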