Major League Baseball is awash in advanced statistics that describe key aspects of players’ offensive and defensive performance more reliably than traditional measures do. It has been reported that through the use of Statcast, the MLB Advanced Media group can supply teams with 70 fields × 1.5 billion rows of data per season [i]. Yes, billion with a b. This flood of information has supercharged the development of ever more useful statistics for describing player performance, both by MLB teams and by the sabermetric community.
However, this volume of data brings significant challenges. Perhaps chief among them is that while some people are comfortable with reams of tables and an ever-increasing number of descriptive statistics, many others prefer or require analyses and visualization tools that convert disparate metrics into informative, readily interpretable graphics.
MLB’s situation resembles that of safety toxicology, where the use of high-information-content assays for characterizing chemicals’ toxicological profiles has exploded [ii]. Drawing conclusions from multiple biomarkers and test systems is challenging, as it requires synthesizing many dissimilar data sets. One tool that toxicologists have found useful is the Toxicological Prioritization Index, or ToxPi for short [iii]. ToxPi is an analytical software package that was developed to combine multiple sources of evidence by transforming data into integrated, visual profiles.
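To make the idea of combining dissimilar metrics a little more concrete, here is a minimal sketch of a ToxPi-style calculation: each metric is rescaled to a 0-to-1 range across players, and the rescaled values are blended with weights into a single score. This is not the ToxPi software itself. The function names, the player labels, the stat values, and the weights are all hypothetical and chosen purely for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the 0-1 range (all zeros if they are identical)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(metrics, weights):
    """Blend several metrics into one weighted score per player.

    metrics: {metric_name: {player: raw_value}}
    weights: {metric_name: relative_weight}
    """
    total_weight = sum(weights.values())
    players = list(next(iter(metrics.values())))
    scores = {p: 0.0 for p in players}
    for metric, by_player in metrics.items():
        scaled = min_max_scale([by_player[p] for p in players])
        share = weights[metric] / total_weight  # weights are normalized to sum to 1
        for p, s in zip(players, scaled):
            scores[p] += share * s
    return scores


# Made-up example data: three players, two metrics, on-base weighted double.
metrics = {
    "on_base_pct": {"Player A": 0.300, "Player B": 0.350, "Player C": 0.400},
    "slugging_pct": {"Player A": 0.500, "Player B": 0.400, "Player C": 0.450},
}
weights = {"on_base_pct": 2.0, "slugging_pct": 1.0}

scores = composite_scores(metrics, weights)
for player, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{player}: {score:.3f}")
```

In a ToxPi-style graphic, each metric's rescaled value would set the length of one pie slice and its weight would set the slice's width, so the single number above is only the summary behind the picture.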