Let’s say you have some data points (x, y) in a file. Without looking at them, you could force a linear regression and calculate a slope. Or you could blindly fit a polynomial, Bayesian, or any other model and compute as many parameters as you like. You might even get the model published somewhere, but that doesn’t necessarily mean it’s any good.
Such ad hoc models can parrot correct answers within a narrow range of applicability, for example on data similar to what was used to tune the model, but they can produce nonsense otherwise. Since every dataset is unique, blindly applying a model gives undependable results.
The solution? Visually analyze near-raw data first, then apply a linear model only if necessary. That’s the simple idea behind SorcererScore(tm).
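To make the idea concrete, here is a minimal Python sketch of that look-before-you-fit workflow, assuming numpy and matplotlib and a hypothetical two-column data file; it illustrates the principle, not SorcererScore itself:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data file with two whitespace-separated columns: x, y.
x, y = np.loadtxt("datapoints.txt", unpack=True)

# Step 1: look at the near-raw data before committing to any model.
plt.scatter(x, y, s=10, label="raw data")

# Step 2: only if the scatter actually looks linear, fit a line.
slope, intercept = np.polyfit(x, y, deg=1)
plt.plot(x, slope * x + intercept, label=f"linear fit: slope={slope:.3g}")

plt.legend()
plt.show()
```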
Proteomics mass spectrometry is fundamentally a world-changing, precision analytical technology that takes a molecular snapshot in time of a cell lysate, like a “biochemical x-ray”. It can uniquely accelerate drug and diagnostics discovery by directly querying dynamic proteins instead of static DNA (genomics) or protein byproducts (metabolomics).
That’s the billion-dollar potential of proteomics.
However, in practice, most labs lack the training to discern data-analysis quality. They opt for fast-and-inexpensive PC programs that, per the “Triple Constraint” principle, yield poor results. Such software blindly force-fits data into aggressive, complex models tuned to excel on common benchmarks.
In short, proteomics seems to have stalled as the field unwittingly used flaky math to scrimp on computing. This penny-wise, pound-foolish approach in a “big data” field causes irreproducibility, a show-stopper for any analytical technology. We can think of no other field with such complex, voluminous data that invests so little in IT. For example, genomics would have stalled if it had scrimped on server software.
Such is your opportunity for breakout success.
The opportunity lies in the widening disconnect between academics pursuing speedy algorithms and clinicians needing dependable analysis. Academia incentivizes attention-grabbing proofs-of-concept with ever-more-complex equations, which is the opposite of what’s needed to find the low-abundance peptides that detect diseases early. Many publications contain mathematical errors and uninformed assumptions. (As a geeky example, redefining the cross-correlation search engine’s periodicity as a fragment mass tolerance, a popular recommendation, swaps the roles of “signal” and “noise”, which increases noise-to-signal instead of signal-to-noise.) Once past peer review, faulty math in non-robust software can propagate like fake news through social media.
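For readers who want the geeky example spelled out, here is a minimal numpy sketch of a classic cross-correlation (XCorr-style) score, assuming both spectra are already binned onto a common m/z grid; the exact formulation varies by implementation:

```python
import numpy as np

def xcorr_score(observed, theoretical, max_lag=75):
    """Classic cross-correlation score: the correlation at zero lag is
    the 'signal'; the mean correlation over a wide lag window is the
    'noise' baseline that gets subtracted."""
    n = len(observed)

    def corr(tau):
        # Dot product with `theoretical` shifted by tau bins.
        if tau >= 0:
            return float(np.dot(observed[: n - tau], theoretical[tau:]))
        return float(np.dot(observed[-tau:], theoretical[: n + tau]))

    signal = corr(0)
    noise = np.mean([corr(t) for t in range(-max_lag, max_lag + 1) if t != 0])
    return signal - noise
```

One way to read the parenthetical above: the wide lag window exists to estimate the noise baseline, so reinterpreting it as a fragment mass tolerance treats the background as if it were the signal.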
On the positive side, this sets the stage for the Precision Proteomics Revolution.
SorcererScore is designed so you can visually interpret the raw evidence of every peptide ID (like a raw mass spectrum or a medical x-ray). Or, if numerical scores are preferred, you can calculate a simple distance to a dividing line (or plane, in 3D) that separates “likely correct” from “likely incorrect” ID hypotheses.
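As an illustration of such a distance score, here is a minimal sketch assuming a dividing line (or plane) of the form w·x + b = 0 has already been chosen; the weights and example numbers are hypothetical, not SorcererScore’s actual model:

```python
import numpy as np

def signed_distance(features, w, b):
    """Signed distance from the dividing line/plane w·x + b = 0.
    Positive values fall on the 'likely correct' side."""
    return (np.dot(w, features) + b) / np.linalg.norm(w)

# Hypothetical 2D example: two score features for one peptide ID.
w = np.array([0.8, 0.6])   # normal vector of the dividing line (made up)
b = -1.0                   # offset (made up)
print(signed_distance(np.array([2.0, 1.5]), w, b))  # prints 1.5
```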
Better yet, unlike statistical metrics, SorcererScore peptide results can be made arbitrarily precise. (Precision can be increased asymptotically by increasing search space.) This means in principle a wet lab can validate any SorcererScore peptide result. In contrast, it is impractical or impossible to experimentally validate false-discovery rate or p-values.
SorcererScore is a revolution in simplicity. It delivers dependable, easy-to-interpret, and sensitive analytical results you can trust without a statistics PhD.
But be forewarned: SorcererScore is more tuna sashimi than tuna salad, meaning there is no hiding imperfect data by blending it into a complex model. It is engineered to show you exactly what your raw data holds, nothing more and nothing less, as a robust analytical technology should. As with medical x-rays, sometimes you have to redo the same experiment to get a clearer view. The point is, the researcher is in total command of deep data interpretation for the very first time.
Personal Commentary: Case for Optimism in Medical R&D
The medical research community seems anxious over past and possible future budget cuts. Some deploy the “hope and pray” strategy: do nothing different and hope good times return. A better strategy is to prepare for a “revolution” by proactively acquiring strategic skills.
We see parallels between microelectronics research in the 1980s and medical research today.
Electronics research was once done in dedicated research groups (Bell Labs, universities). But once a market developed with the rise of PCs, those groups gave way to merged R&D organizations. Today, electronics R&D is larger than ever at companies like Apple and Google.
One way to understand this: before there is a market, research needs dedicated funding. Once there are significant product sales, however, the R&D budget can grow exponentially by funding it from profits.
In other words, the reduction of government funds at Bell Labs and similar institutions was not a decline of electronics research per se, but rather an evolution of the funding model that allowed R&D to grow exponentially.
Applied to medicine, this predicts medical research will start to shift from government-funded research labs to industrial R&D with larger budgets.
A “revolution” is simply massive change in a relatively short time. The 80/20 rule suggests this rule of thumb: in a revolution, a field doubles in size mainly by the top 20% growing 10x. (Starting from a field of size 1, the top 20% alone grows from 0.2 to 2.0, which doubles the field even if the remaining 80% fades away.)
Therefore, to prepare for a technology revolution, acquire strategic skills to be among that successful 20%. Uniquely skilled researchers will experience success beyond their dreams.
Note: As of this writing, we offer a no-cost evaluation of SorcererScore on your data. We will search a representative subset (typically 10,000 spectra), then schedule a training conference call to interpret the results. Please contact us for details.