How Legacy Chart Review Slows Everything Down
It’s surprising how many health plans are still using PDFs for their risk adjustment coding.
Not because PDFs don’t work at all; they’ve worked for years. The problem is that healthcare technology has leaped past the antiquated practice of delivering medical records to payers as PDFs.
CMS and the OIG have been pushing advances in interoperability for years. Now the tools exist to retrieve charts directly from EHR systems in machine-readable formats like XML, with greater speed, lower cost, and higher capture rates.
And yet, many risk adjustment programs are still relying on dated chart retrieval methods that deliver charts as static PDF files.
At first glance, that may seem like a minor technical detail. In practice, it shapes everything.
PDFs weren’t built for modern risk adjustment
A PDF is essentially an image of a medical record. It looks clean on the screen, but underneath, it’s not structured data; it’s a picture of text that has to be interpreted.
That’s where Optical Character Recognition (OCR) comes in.
OCR attempts to “read” the PDF, extract text, and convert it into something usable. Sometimes it works well. Sometimes it doesn’t. And when it doesn’t, the consequences aren’t always obvious.
- Dates of service can be misread
- The beginning and end of codable encounters can be difficult to identify
- Non-codable documents get mixed into the file
- Shading, formatting, and scanning artifacts distort text
What looks simple to the human eye becomes complex and imperfect for a machine.
Over time, that imperfection adds friction.
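To make the date-of-service problem concrete, here is a minimal sketch in Python. The OCR output, the character misread, and the XML field name are all hypothetical, invented purely for illustration; the point is that OCR text must be repaired and then interpreted, while a structured field carries one unambiguous value.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical OCR output: '1' misread as 'l', and no way to tell
# from the image alone which date format the clinic used.
ocr_text = "Date of Service: 0l/12/2024  Progress Note"

# OCR path: repair characters, then guess at the format.
cleaned = ocr_text.replace("l", "1")  # crude character repair
match = re.search(r"\d{2}/\d{2}/\d{4}", cleaned)
us_guess = datetime.strptime(match.group(0), "%m/%d/%Y")  # Jan 12?
eu_guess = datetime.strptime(match.group(0), "%d/%m/%Y")  # Dec 1?
print(us_guess.date(), "vs", eu_guess.date())  # two plausible readings

# Structured path: the same field in a (hypothetical) XML extract
# carries an unambiguous ISO-8601 value; nothing to interpret.
xml_chart = "<encounter><serviceDate>2024-01-12</serviceDate></encounter>"
date = ET.fromstring(xml_chart).findtext("serviceDate")
print(datetime.fromisoformat(date).date())  # exactly as documented
```

The two plausible readings on the OCR path are the "subtle documentation details" problem in miniature: nothing errors out, but downstream logic may silently pick the wrong one.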
The bottleneck most teams don’t see
When coding performance stalls, the instinct is often to look at staffing, training, or productivity. But in many cases, the real constraint isn’t the coders.
It’s the format.
When charts arrive as PDFs:
- Teams must manually search and scroll
- Coders must interpret where a codable document starts and ends
- Dates and sections must be visually confirmed
- Text inconsistencies require validation
- Retrieval timelines introduce delays
Even the best coders can only move so fast through an unstructured document.
And no matter how much automation you layer on top, OCR is still interpreting an image – not reading the original clinical data as it was documented.
At scale, this becomes a structural speed limit.
Accuracy suffers quietly
The impact isn’t just about speed.
When OCR misreads text, loses formatting, or distorts characters, subtle documentation details can be missed. A single misread phrase can change how a diagnosis is interpreted. Dates of service can blur together. Context can be harder to trace.
This doesn’t always show up as obvious errors. Instead, it shows up as:
- Lower recall of risk-adjusting diagnoses
- Missed specificity
- Increased validation burden
- More rework
- More coding mistakes that raise RADV audit risk
And because the workflow is built on document images, there’s always a layer of uncertainty about whether the extracted text truly matches what was documented.
That uncertainty affects both accuracy and defensibility.
Speed isn’t just about coding faster
Another hidden cost of PDF-based workflows is time – not just the time spent reviewing charts, but the time spent waiting for them.
Traditional retrieval methods can take weeks or even months to complete. In some cases, a meaningful percentage of a chase list may never be retrieved at all.
When charts arrive late, everything downstream moves late. Coding is delayed. Reimbursement is delayed. Visibility is delayed.
More importantly, opportunity is delayed.
When medical records are retrieved directly from EHR systems in structured formats, charts can be available within minutes of an encounter. That changes what’s possible. It opens the door to concurrent review rather than retrospective review. It allows faster identification of HCCs. It enables quicker reimbursement cycles.
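The concurrent-versus-retrospective distinction can be reduced to a simple freshness check. The sketch below assumes a hypothetical routing rule and a made-up two-day window; the function name and threshold are illustrative, not a description of any particular system.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rule: charts retrieved within a short window of the
# encounter can enter concurrent review; anything older falls back
# to the retrospective queue. The window itself is an assumption.
CONCURRENT_WINDOW = timedelta(days=2)

def route_chart(encounter_time: datetime, retrieved_time: datetime) -> str:
    """Pick a review queue based on how fresh the retrieved chart is."""
    if retrieved_time - encounter_time <= CONCURRENT_WINDOW:
        return "concurrent"
    return "retrospective"

encounter = datetime(2024, 1, 12, 10, 0, tzinfo=timezone.utc)

# EHR-direct retrieval: the chart lands minutes after the encounter.
print(route_chart(encounter, encounter + timedelta(minutes=20)))
# Traditional retrieval: the chart arrives weeks later.
print(route_chart(encounter, encounter + timedelta(weeks=6)))
```

Under PDF-based retrieval timelines, nearly everything routes to the retrospective branch by default; minute-level structured retrieval is what makes the concurrent branch reachable at all.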
PDF-based workflows simply can’t operate at that speed because their retrieval methods weren’t designed to.
Completeness affects financial performance
There’s another dimension to this issue: hit rate.
Older retrieval processes often fail to capture a meaningful percentage of the chase list. When charts aren’t retrieved, they can’t be coded. When they can’t be coded, revenue tied to those encounters is lost.
Modern interoperability-based retrieval methods can dramatically increase retrieval completeness, often into the high 90-percent range; some systems even guarantee 99% retrieval when the record exists in the EHR.
That difference isn’t incremental. It directly impacts RAF performance and overall financial results.
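A back-of-the-envelope calculation shows why the difference is structural rather than incremental. The chase-list size, per-chart value, and 70% legacy hit rate below are hypothetical numbers chosen purely for illustration (only the 99% figure comes from the text above).

```python
# Illustrative arithmetic with hypothetical inputs.
chase_list = 10_000          # charts requested (assumption)
avg_value_per_chart = 120.0  # revenue tied to each coded chart (assumption)

def coded_value(hit_rate: float) -> float:
    """Revenue reachable when only retrieved charts can be coded."""
    return chase_list * hit_rate * avg_value_per_chart

legacy = coded_value(0.70)   # assumed legacy retrieval hit rate
modern = coded_value(0.99)   # guaranteed rate cited above
print(f"value left behind by the legacy process: ${modern - legacy:,.0f}")
```

Whatever the real inputs are for a given plan, the gap scales linearly with the chase list: every percentage point of hit rate is revenue that either enters the coding workflow or never exists.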
Why this matters now
Risk adjustment is under more scrutiny than ever. CMS expectations around documentation linkage, specificity, and defensibility continue to tighten. Accuracy matters. Transparency matters. Compliance margins are thin.
In that environment, building workflows on top of PDF images and OCR interpretation becomes increasingly difficult to justify.
The issue isn’t that PDFs are unusable. It’s that they introduce avoidable friction at every stage:
- Retrieval
- Interpretation
- Coding accuracy
- Validation
- Audit defense
That friction compounds over time, limiting both efficiency and performance.
The realization many leaders eventually reach
If your team is working harder each year just to maintain results, if coding throughput feels capped, if retrieval timelines slow down visibility, or if validation effort keeps growing, it may not be a people problem.
It may be a format problem.
PDF-based risk adjustment isn’t just a legacy technical choice. It’s a workflow constraint.
And once that becomes clear, a new question naturally follows:
What would change if charts weren’t being interpreted from images – but read directly from the source?
Recognizing that distinction is often the first step toward understanding why modern risk adjustment performance requires more than incremental process improvements.
Sometimes, it requires rethinking the foundation entirely.
