Scaling new bioinformatic heights
Updated: Apr 18
Genpax analysis was built from the bottom up to address real-world problems, with scalability as an essential requirement.
Scalability is a constraint for existing solutions, for three main reasons:
• Relationship determinations are statistical in nature and cannot be scaled (they are ‘NP-hard’ problems)
• Resolution is lost as the diversity and number of compared strains increase when using ‘common genome SNP’ analysis
• Data generated with more than one reference genome when analyzing a species cannot be readily integrated
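The loss of resolution with a growing strain set can be illustrated with a toy example. The sketch below assumes that a "common genome" comparison only uses positions covered in every strain being compared. The strain names and position ranges are invented for illustration and are not real data, and the function is not part of any Genpax software.

```python
def common_genome_sizes(covered):
    """Return the number of shared positions after each strain is added.

    covered: dict mapping a strain name to the set of genome positions
    that strain covers (hypothetical toy input, in insertion order).
    """
    sizes = []
    common = None
    for positions in covered.values():
        # The common genome is the intersection of all strains seen so far.
        common = set(positions) if common is None else common & positions
        sizes.append(len(common))
    return sizes


# Invented toy data: each strain covers a slightly different region.
toy_strains = {
    "strain_A": set(range(0, 100)),
    "strain_B": set(range(5, 100)),
    "strain_C": set(range(0, 90)),
    "strain_D": set(range(20, 80)),
}

for name, size in zip(toy_strains, common_genome_sizes(toy_strains)):
    print(f"after adding {name}: {size} shared positions")
```

Because an intersection can only shrink or stay the same as sets are added, every additional or more divergent strain can reduce the sequence available for comparison, and never increases it.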
A key aspect of scalability is deliverability. With unscalable solutions, the time and computational cost of analysis grow rapidly as the resolution and the number of strains being compared increase. High-resolution studies are therefore typically limited to sets of highly related strains, or to local and recent isolates (for example, those collected over a 3-month window in a large hospital setting). This is because comparing hundreds or more strains quickly takes too long when the information is needed urgently, and the process has to be re-run each time additional strains are added to the analysis. Even when the analysis is restricted to Sequence Types with a good reference genome, there can be too many isolates; an extreme example is ST22 MRSA, which represents around 50% of the isolates in the UK.
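The arithmetic behind this growth can be sketched in a few lines. The example below assumes a simple all-against-all comparison, where every pair of strains is compared once. It counts comparisons only and says nothing about how any particular tool, including Genpax, is implemented. The function names are hypothetical.

```python
def pairwise_comparisons(n):
    """Number of unordered strain pairs among n strains: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def rerun_cost(start, added):
    """Pairs computed if the whole all-against-all analysis is re-run
    after each of `added` new strains joins `start` existing strains."""
    return sum(pairwise_comparisons(n) for n in range(start + 1, start + added + 1))


def incremental_cost(start, added):
    """Pairs computed if each new strain is compared only against the
    strains already present when it arrives."""
    return sum(range(start, start + added))


for n in (5, 50, 500, 5000):
    print(f"{n} strains: {pairwise_comparisons(n)} pairwise comparisons")

print("re-running for 10 strains added to 100:", rerun_cost(100, 10))
print("incremental for 10 strains added to 100:", incremental_cost(100, 10))
```

The pair count grows roughly with the square of the number of strains, and re-running the full analysis for every addition repeats almost all of that work, whereas comparing only the new strains against the existing set adds work proportional to the size of that set.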
Whether you wish to compare five strains or five thousand, Genpax will deliver in real clinical time. It does so for multiple new strains in parallel, comparing each against every other previously analyzed and co-analyzed strain within a real-world, clinically relevant turnaround target of 2 hours. This is not at the resolution, or with the errors, of Sequence Typing (which must be followed up with further analyses); it is at full SNP resolution, with greater accuracy than other methods, and with more sequence addressed than ‘common genome SNP’.