
Artificial intelligence (AI) is rapidly transforming the healthcare and life sciences industries, promising to accelerate drug discovery, improve patient outcomes, and streamline operations.
But beneath the hype lies a fundamental truth: AI is only as good as the data that powers it. Poor-quality, incomplete, or siloed data can lead to flawed predictions, harmful recommendations, and missed opportunities for innovation. For industries like biopharma and MedTech, where lives are on the line, the stakes couldn’t be higher.
At the forefront of addressing this challenge is Dr. Mitesh Rao, co-founder and CEO of OMNY Health, a company dedicated to unlocking the full potential of healthcare data. With access to one of the largest repositories of de-identified, structured, and unstructured clinical data in the US, OMNY Health provides the critical infrastructure that enables AI developers, pharma companies, and healthcare providers to train models on high-quality, representative datasets.
In an interview with Drug and Device World, Dr. Rao discusses why data quality, not just volume, is essential, how data democratization can reshape the industry, and what misconceptions still cloud the adoption of AI. He also shares how OMNY Health is helping accelerate innovation by bridging the gap between raw clinical data and AI-ready insights.
This conversation sheds light on the opportunities and challenges ahead as AI continues to advance in healthcare, and why the right data foundation is key to building safe, reliable, and transformative solutions.
This interview has been edited for clarity, consistency, and length.
Phalguni Deswal [PD]: Where do you see the biggest gaps in healthcare data quality today, especially when it comes to AI?
Dr. Mitesh Rao [MR]: Data quality determines how effective an AI solution can be. If the data is incomplete or inconsistent, the risk of errors, or worse, harmful outcomes, increases dramatically.
For example, if I ask an AI system to recommend dinner spots and it points me to a closed restaurant, it’s just an inconvenience. But if the same system suggests the wrong medication because the underlying data was poor, the consequences can be serious.
Unfortunately, much of healthcare data remains siloed within electronic health records (EHRs), often difficult to access or not in a clean, comprehensive format. Companies that have better access to this data have a major advantage in building reliable AI solutions.
PD: Why is data democratization such a critical shift for healthcare and pharma?
MR: Historically, data was seen as a commodity, but now it’s a building block for AI. Pharma companies, for instance, already rely heavily on data for research and drug development, but structured datasets like claims or trial data don’t tell the whole story. Real-world, unstructured data, such as physician notes explaining why a therapy was switched, is crucial for training models.
As large language models become commoditized, the real differentiator will be access to high-quality, proprietary, and representative datasets. Without democratized access, innovation will be limited to a few players sitting on massive data reserves.
PD: How does access to quality data differ from simply having large volumes of data?
MR: Volume alone isn’t enough. The source and depth of the data are what matter. For example, scraped ambulatory records might provide unstructured data, but they often lack the richness needed to train robust models. True value comes from comprehensive clinical data that captures the patient’s entire journey, whether in rare disease care, oncology, or chronic conditions. Without that level of detail, AI models will only see part of the picture and produce limited or misleading insights.
PD: Can you share how OMNY Health’s data network is being used to accelerate AI development?
MR: Our network spans over 100 million patients and includes more than 6.5 billion de-identified clinical notes across the US. We clean, structure, and tokenize the data to make it AI- and research-ready. This allows pharma companies, AI developers, and providers to train models on truly representative datasets that cover diverse populations, geographies, and health systems.
We don’t build AI applications ourselves; we provide the infrastructure, the “pipes and electricity,” so that innovators can build solutions faster and more effectively. The result is AI models that can meaningfully speed up drug discovery, clinical research, and patient care innovation.
PD: What misconceptions do companies still have when deploying AI in healthcare?
MR: One misconception is that AI will immediately replace human oversight. In reality, we still need humans in the loop to validate outputs and catch errors. Another is that third-party platforms alone will drive progress.
Increasingly, pharma and healthcare organizations are realizing they can build their own AI solutions in-house, if they have the right data foundation. This shift toward internal development, powered by democratized access to quality data, will drive more sustainable progress.
PD: Looking ahead, what positive changes could the industry see if AI is built on high-quality, regulatory-grade data?
MR: If companies invest in quality data now, the benefits will be long-term. We’ll see faster development of life-changing therapeutics, better disease management tools, and more equitable healthcare outcomes across populations. But the key is recognizing that unstructured clinical data, such as physician notes, patient experiences, and outcomes documentation, contains the richest insights. Training AI on that depth of data will unlock breakthroughs that claims data alone can never provide.
PD: Finally, what’s the one key takeaway you want readers to remember about AI and data in healthcare?
MR: AI is powerful, but it’s not a silver bullet. The quality, depth, and representativeness of the data powering these models are what make or break their value. Structured data is only part of the equation; the real knowledge lies in unstructured clinical notes that capture the “why” behind patient journeys. Without that, AI risks being incomplete, and in healthcare, incomplete isn’t good enough.


