Why healthcare LLMs should address clinical quality measures

Large language models, a form of artificial intelligence, are generating a lot of hype in healthcare circles, primarily because of their potential to transform and improve various aspects of healthcare delivery and management. The buzz is also driven by rapid advances in AI and machine learning.

But while there’s significant potential, challenges and ethical considerations remain, including concerns about data privacy and security, lingering bias, regulatory issues, data precision and more.

In short, AI is poised to do big things – but can it be made to work for clinicians?

Medicomp Systems CEO David Lareau believes it can – if the industry leverages complementary technologies that take advantage of the power of AI.

Healthcare IT News sat down with Lareau to talk about AI, LLMs and the future of healthcare.

Q. You suggest setting artificial intelligence to the task of identifying clinical quality measures and the coding of hierarchical condition categories for risk adjustment. How can AI help clinicians here? What can it do?

A. Artificial intelligence and large language models have powerful capabilities for generating textual content, such as drafting encounter notes and identifying multiple words and phrases that have similar meanings.

An example of this is the use of ambient listening technology with LLMs to capture and present draft notes of a clinical encounter by taking what is spoken during the patient encounter and converting it into text notes.

AI and LLMs enable a system to hear the patient say, “I sometimes wake up at night and have some trouble catching my breath,” and associate that with specific clinical concepts such as “shortness of breath,” “difficulty breathing” and “recumbent dyspnea,” along with related conditions or symptoms.

These concepts may have different diagnostic implications to a clinician. But by associating what a patient says with specific symptoms or conditions that are clinically relevant to potential problems or diagnoses, AI and LLMs together can help a clinician focus on conditions that qualify for risk adjustment, which in this case might include sleep apnea, heart failure, COPD or other ailments.

This powerful first step in identifying potential clinical quality measure applicability is crucial. However, it requires additional tools to evaluate complex and nuanced patient inclusion and exclusion criteria. These criteria must be clinically precise and involve additional content and diagnostic filtering of other information from a patient’s medical record.

Q. Regarding AI and CQM/HCC, you say even with advanced AI tools, challenges with data quality and bias loom large, as does the inherent complexity of medical language. Please explain some of the challenges.

A. In clinical settings, factors like gender, race and socioeconomic background play a crucial role. However, LLMs often struggle to integrate these aspects when analyzing individual medical records. Typically, LLMs draw from a broad range of sources, but these sources usually reflect the most common clinical presentations of the majority population.

This can lead to biases in the AI’s responses, potentially overlooking unique characteristics of minority groups or individuals with specific conditions. It’s important for these AI systems to account for diverse patient backgrounds to ensure accurate and unbiased healthcare support.

Data quality presents a significant challenge in using AI effectively for chronic condition management and documentation. This issue is particularly relevant for the thousands of diagnoses that qualify for HCC risk adjustment and CQMs. Different standard healthcare code sets, including ICD, CPT, LOINC, SNOMED, RxNorm and others, have unique formats and don’t seamlessly integrate, making it hard for AI and natural language processing to filter and present relevant patient information for specific diagnoses.

Additionally, interpreting medical language for coding is complex. For example, the term “cold” can be related to having a cold, being sensitive to lower temperatures, or cold sores. Also, AI systems like LLMs struggle with negative concepts, which are crucial for distinguishing between diagnoses, as most current code sets don’t effectively process such data.

This limitation hinders LLMs’ ability to accurately interpret subtle but significant differences in medical phrasings and patient presentations.
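To make the negation problem concrete, the following is a minimal Python sketch, not a description of any vendor's product. It assumes a tiny, hypothetical list of negation cues and symptom phrases and simply checks whether a cue appears in the few words before a symptom mention. Real clinical text would need far more robust handling.

```python
import re

# Hypothetical negation cues and symptom vocabulary, for illustration only.
NEGATION_CUES = ("no", "denies", "without", "not")
SYMPTOMS = ("shortness of breath", "chest pain", "cough")


def find_symptoms(sentence: str) -> dict[str, bool]:
    """Map each symptom found in the sentence to True (affirmed) or False (negated)."""
    text = sentence.lower()
    results = {}
    for symptom in SYMPTOMS:
        match = re.search(re.escape(symptom), text)
        if match is None:
            continue
        # Look only at the three words immediately preceding the symptom mention.
        preceding = re.findall(r"[a-z']+", text[:match.start()])[-3:]
        negated = any(word in NEGATION_CUES for word in preceding)
        results[symptom] = not negated
    return results


note = "Patient denies chest pain but reports shortness of breath."
print(find_symptoms(note))
```

Even this toy version shows why negation matters: a system that only matched the words "chest pain" would record a symptom the patient explicitly ruled out.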

Q. To overcome these challenges and guarantee compliance with risk-based reimbursement programs, you propose CQM/HCC technology that has the ability to analyze information from patient charts. What does this technology look like and how does it work?

A. CQMs serve as proxies for determining if quality care is being provided to a patient, given the existence of a set of data points indicating that a specific quality measure is applicable. Participation in a risk-adjusted reimbursement program such as Medicare Advantage requires providers to address the Monitoring, Evaluation, Assessment and Treatment (MEAT) protocol for diagnoses included in HCC categories, and to ensure that the documentation supports the MEAT protocol.

Given that there are hundreds of CQMs and thousands of diagnoses included in the HCC categories, effective care and compliance will require a clinical relevance engine that can process a patient chart, filter it for information and data relevant to any condition, and normalize the presentation for a clinical user to review and act upon.
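As a rough illustration of the filtering step described above, here is a short Python sketch. It is an assumption-laden toy, not Medicomp's engine: the `ChartEntry` type, the `RELEVANCE` map and every code value are hypothetical placeholders rather than real ICD, LOINC or RxNorm identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartEntry:
    system: str  # source code set, e.g. "ICD-10-CM", "LOINC", "RxNorm"
    code: str    # placeholder value, NOT a real identifier
    label: str   # human-readable description shown to the clinician


# Hypothetical relevance map: condition -> (system, code) pairs considered relevant.
RELEVANCE = {
    "heart failure": {
        ("ICD-10-CM", "DX-HF"),
        ("LOINC", "LAB-BNP"),
        ("RxNorm", "RX-DIURETIC"),
    },
}


def filter_chart(chart: list[ChartEntry], condition: str) -> list[ChartEntry]:
    """Return only the chart entries mapped as relevant to the given condition."""
    relevant = RELEVANCE.get(condition, set())
    return [entry for entry in chart if (entry.system, entry.code) in relevant]


chart = [
    ChartEntry("ICD-10-CM", "DX-HF", "Heart failure diagnosis"),
    ChartEntry("LOINC", "LAB-BNP", "BNP lab result"),
    ChartEntry("RxNorm", "RX-DIURETIC", "Diuretic prescription"),
    ChartEntry("ICD-10-CM", "DX-ANKLE", "Ankle sprain diagnosis"),
]

for entry in filter_chart(chart, "heart failure"):
    print(f"{entry.system}: {entry.label}")
```

The point of the sketch is the shape of the problem: entries arrive in different code sets, and a single condition-centered view has to pull the relevant ones together while leaving the rest of the chart out.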

With the adoption of FHIR, the establishment of the first QHINs, and the opening up of systems to SMART-on-FHIR apps, enterprises have new opportunities to keep their current systems in place while adding new capabilities to address CQMs, HCCs and clinical data interoperability.

This will require use of clinical data relevancy engines that can convert text and disparate clinical terminologies and code sets into an integrated, computable data infrastructure.

Q. Natural language processing is part of your vision here. What role does this form of AI have in the future of AI in healthcare?

A. Given a prompt, LLMs can produce clinical text, which NLP can convert into codes and terminologies. This capability stands to reduce the burden of creating documentation for a patient encounter.

Once that documentation is created, other challenges remain, since it is not the words alone that have clinical meaning, but the relationships between them and the ability of the clinician to quickly find relevant information and act upon it.

These actions include CQM and HCC requirements, of course, but the greater challenge is to enable the clinical user to mentally process the LLM/NLP outputs using a trusted “source of truth” for clinical validation of the output from the AI system.

Our focus is on using AI, LLMs and NLP to generate and analyze content, and then process it using an expert system that can normalize the outputs, filter it by diagnosis or problem, and present actionable and clinically relevant information to the clinician.

Follow Bill’s HIT coverage on LinkedIn: Bill Siwicki
Email him: bsiwicki@himss.org
Healthcare IT News is a HIMSS Media publication.
