Wednesday, September 4, 2024

An inverse for statistical analysis

 




I throw this out as a potentially new thought, though it may well exist somewhere already, as I am not as deep into statistics as I would have liked.

Many sets of observational data look very much like a bell curve, and obviously such a curve is a mathematical object.  The problem is the outliers found in many empirical sets.  Dropping those outliers hardly makes the data go away, even if the curve is now perfect.

My idea is to grab the data that falls in the third standard deviation and use it to select data points from inside the first two deviations, locating conforming data able to support a generated data set.  There may well be multiple variations.
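Here is a rough sketch of how that might look in Python. Much of it is my own assumption rather than a settled method: I treat "third standard deviation" as any point more than two deviations from the mean, and I assume each observation carries a second measurement (here a made-up "depth") that can be used to decide which inlier points conform to a given outlier. The names split_by_sigma and anchored_subsets, the choice of k nearest points, and the sample numbers are all hypothetical.

```python
import statistics

def split_by_sigma(values, inner=2.0):
    """Split indices into inliers (|z| <= inner) and outliers (|z| > inner)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    inliers, outliers = [], []
    for i, v in enumerate(values):
        z = (v - mean) / sd if sd > 0 else 0.0
        (inliers if abs(z) <= inner else outliers).append(i)
    return inliers, outliers

def anchored_subsets(records, value_key, context_key, inner=2.0, k=3):
    """For each outlier, pick the k inliers closest to it in context_key.

    Assumption: closeness in a second measurement is what "conforming" means.
    """
    values = [r[value_key] for r in records]
    inliers, outliers = split_by_sigma(values, inner)
    subsets = {}
    for o in outliers:
        anchor = records[o][context_key]
        nearest = sorted(
            inliers, key=lambda i: abs(records[i][context_key] - anchor)
        )[:k]
        subsets[o] = nearest
    return subsets

# Made-up assay data: one high-grade hit among ordinary samples.
grades = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.1, 0.9, 9.0]
depths = [10, 20, 30, 40, 50, 60, 70, 80, 90, 52]
records = [{"grade": g, "depth": d} for g, d in zip(grades, depths)]

subsets = anchored_subsets(records, "grade", "depth")
print(subsets)
```

Each outlier becomes the anchor of its own small data set, so several outliers give several candidate sets to examine rather than one cleaned-up curve.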

Then discover if any of it means anything.

My own empirical experience informs me that the established strategy of casting out outliers is extremely dumb.  It is sort of like tossing out a high-grade gold intersection when you know little about the geology.  Recall that the pay zone of the Rand was only inches thick but continuous forever.  Ground truth actually proved it, when statistical methods would likely have failed it.

Knowing that your data will satisfy a bell curve tells us to run filters anchored by the outliers.


