Sunday, May 6, 2018

The real impact of the All of Us Program

Today, May 6, happens to be the national launch day for NIH's All of Us program. This is a program that focuses on precision medicine - an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person (definition from NIH).

Precision medicine is not a new concept. A quick search turns up several documented cases, especially in cancer treatment, where the genomic information of a single patient was used to target his or her disease successfully. Eric Dishman, the Director of the All of Us program, happens to be one such case, and he is the kind of leader the program needs to make sure everyone can benefit from it.

Using genomic data to precisely understand and define medical treatment for an individual seems logical. But how can we generalize this and widen its reach? How can the treatment of one person at one end of the world have a positive impact on a patient at the other end? How can we connect the dots? Can we borrow the innovation Facebook built for ad targeting and engineer it into our own healthcare? Can we target medicines or health treatments to an audience instead of ads?

The answer is of course a resounding YES. So resounding, in fact, that All of Us is not the first program to attempt this. So why would it be more successful than other data collection programs? It has many good things going for it. Besides large national funding and a passionate team behind it, it attempts to engage the community, capture diversity, and use all available technology.

As I wrote in my previous article, the power of using data to complement today's medicine is huge. The All of Us program extends that power by drawing on a much wider data source: data collected from many people, and of many types beyond genomic data alone. It targets not just patients but also healthy people. So why is all of this necessary if we can just take someone's genome and use it to tweak their treatment? Think of what Facebook does by collecting large amounts of data from a large set of people with diverse behavior. Think of how Alexa learns to do smart things from the questions asked by a large number of people. Perhaps a quick lesson from machine learning would be useful here.

Machine learning from data involves looking for patterns that explain why certain choices, events, or input parameters led to a set of outcomes. Finding those patterns requires diverse data; without it, machine learning can go very wrong. Data that is skewed towards certain outcomes can lead to very poor models. The presence of redundant inputs, or the absence of key inputs, can lead to poor models too. Having too little data does not help either, and can produce models that fail in the real world. Data scientists often combine the results of many models, each built on a different set of data, to get a model that is more robust in the real world.
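
To make the point about skewed data concrete, here is a small Python sketch. Everything in it is made up for illustration: the 5-in-100 split, the function names, and the three toy "models" are assumptions, not anything taken from the All of Us program. It shows how a model that simply predicts the most common outcome can look accurate on skewed data while missing every rare case, and how a simple majority vote combines the results of several models.

```python
from collections import Counter


def majority_class(labels):
    """Return the most common label seen in the training data."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def recall(predicted, actual, positive=1):
    """Fraction of the actual positive cases that the model caught."""
    hits = [p for p, a in zip(predicted, actual) if a == positive]
    if not hits:
        return 0.0
    return sum(p == positive for p in hits) / len(hits)


def majority_vote(predictions_per_model):
    """Combine several models by taking the most common vote per case."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


# Hypothetical skewed data: 5 people with a condition (1), 95 without (0).
actual = [1] * 5 + [0] * 95

# A "model" that learned only the dominant outcome predicts 0 for everyone.
naive_predictions = [majority_class(actual)] * len(actual)
skewed_accuracy = accuracy(naive_predictions, actual)
skewed_recall = recall(naive_predictions, actual)

# Three hypothetical models voting on the same three cases.
combined = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])

if __name__ == "__main__":
    print("accuracy on skewed data:", skewed_accuracy)
    print("recall on the rare outcome:", skewed_recall)
    print("combined predictions:", combined)
```

The accuracy figure alone would make the naive model look good, which is exactly why data that covers the rare outcomes matters. Majority voting is only one of several ways to combine models; it is used here because it is the simplest to read.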

That is precisely why the program attempts to collect a diverse set of data from a diverse set of healthy and non-healthy individuals and make it available to a diverse set of researchers. This could fundamentally change how medicine works in a few years. Think of how much machine learning technologies have changed our day-to-day work in the last ten years, and how cheap it has become to do the research and deliver the benefits! Very simple health-related outcomes could be achieved within the next couple of years. More complex cloud-based solutions that use this data to buy health benefit programs "as a service" could evolve in a few years!
