Healthy datasets are the cornerstone of effective AI initiatives

The advent of artificial intelligence in healthcare, and its embrace by provider organizations, large and small, eager to explore its transformative potential, has come quickly. And it has come with a steep learning curve.

That has recently led to an interesting conundrum, said Richard Cramer, chief strategist for healthcare and life sciences at Informatica. Most health systems are, organizationally and attitudinally, “ready for AI,” he said. “But their data isn’t.”

At HIMSS24 earlier this month, Cramer spoke alongside Anna Schoenbaum, vice president of applications and digital health at Penn Medicine, and Sunil Dadlani, chief information and digital officer at Atlantic Health System (where he also serves as CISO).

They explored how hospitals and health systems should approach the process of assessing how artificial intelligence and automation can fit into their organizations, and how to start new AI initiatives and enhance existing ones as they scale up projects across the enterprise.

Despite all the buzz and excitement about generative AI, it’s important to stick with the basics, said Cramer.

“I think the enthusiasm around ChatGPT makes people think that it’s something intrinsically new,” he said. “But we, as an industry, have been doing AI for a long time.”

And a core lesson from years of experience is that any AI or machine learning project needs one essential prerequisite: “accessible, trustworthy, fit-for-purpose data.”

What does trustworthy mean? “It’s all about transparency, right? I need to know where the data came from, everything that happened [to it] on its way from source to being consumed,” Cramer explained.

“I’m a lifelong data analyst, and one of the things that I like to say is that if you’re transparent, I can disagree with your conclusion and still trust you, because I know what all your assumptions and everything are. But if you’re not transparent, I probably will never trust you, even if I agree with what your conclusion is.

“I think that really applies to what we’re talking about with AI,” he added. “Data doesn’t need to be perfect to be useful. But you don’t ever want to use data that’s not perfect and not know it.”

Dadlani teased out some key differences between the traditional AI that has been worked on for decades at health systems, and the new generative AI that’s currently at the tippy-top of the Gartner Hype Cycle.

“Traditional AI is just more deterministic, it’s trained for specific tasks,” he explained. “It’s more related to predictive analytics based on the data that you have in the real-world data. And I would say that traditional AI has become very mature in certain use cases where the output is more interpretable, more explainable, and it has matured and adopted across clinical and nonclinical areas.

“Whereas when you talk about generative AI, the way we differentiate is it’s more probabilistic, not deterministic. It’s self-learning, self-improving. It’s more about generalized solutions rather than a specific solution. It can learn, it can scale on its own.”

That “comes with its own risk, an explainability risk,” said Dadlani. “Because typically, generative AI are based on very advanced deep neural networks that are based on large language models. So the explainability and the interpretability of these AI models is really opaque.”

At Penn Medicine, data scientists have been working on AI for a long time, but genAI is “coming at a fast pace,” said Schoenbaum. “We do have processes in place, whether it is AI, predictive models or generative AI, into the same workflow. But what we’re trying to figure out is how to put policies and guardrails in place, and support model governance.”

Well-governed data is “absolutely important,” she said – and that requires robust interoperability, and information sharing with other healthcare organizations.

“You can’t just work within your own health system,” said Schoenbaum. “You’ve got to work regionally, in the community. You have to make sure that data is shareable with the right definition, because I think that’s how we can leverage the data in order to feed these systems.”

But data governance itself “should be within your own organization,” she said. “As you add things, somebody should be monitoring, as well as who gets access to that data and make sure that data is protected. It is all about the patient, but it needs to be shared across institutions in order to get the better benefits.”

Mike Miliard is executive editor of Healthcare IT News.
Email the writer: mike.miliard@himssmedia.com
Healthcare IT News is a HIMSS publication.
