If you wanted to build a machine learning model to understand human health, where would you get the data? A hospital database would be useful, but privacy laws make it difficult to release patient data to the public. To publish the data safely, you would have to anonymize it so that a patient’s identity could not be derived from data about that patient, and true anonymization is notoriously difficult.
Every industry where privacy is a concern faces a similar challenge. If there is no place with public data sets, there is no place where the machines can go to learn. The machine learning applications that we can build are limited by the data sets that are available.
Auren Hoffman started his company SafeGraph to unlock data sets so that machine learning algorithms can learn from that data. In this episode, we talk about the machine learning landscape over both short- and long-term time horizons. We also discuss some of Auren’s strategies for building companies, which have been crucial for me in thinking about how to build Software Engineering Daily.
Where Should Machines Go to Learn?