Out With The Old

Friday, June 6, 2025

Deepak Singh
IDP

First-Mile Data™ is a term we developed to describe all the (often messy) data created outside of a business’ internal systems and departments – and therefore outside of that organization’s direct control.

It’s a growing point of frustration in many industries, as much of this external data is increasingly critical to carrying out successful business operations and digital transformation.

On top of that, much of a business’ first-mile data exists in unstructured formats, like emails, PNG files, Slack messages and PDFs.

In this post, we will explore one type of unstructured data – documents housed in PDF format – and explain how Adeptia built a tool that helps businesses take an expensive, week-long legacy process and turn it into something that takes just 5 minutes to complete – all with a higher degree of accuracy than before.

Let’s get started.

The $25,000 Question

Several months ago, we were approached by a midsized business loan provider that was looking to streamline its application processing and underwriting.

Every day, they received thousands of small business loan applications from other businesses, all in the form of PDFs. As this data was unstructured, their first priority was to extract and convert that data into a CSV format that would fit with their existing system.

To do that, they were offshoring this data to a partner facility, where it was manually processed into Excel by a team of workers overseas. Despite being significantly cheaper than it would have been to handle in-house, this manual data entry process was costing the company around $25,000 per month.

Additionally, because the data was all manually entered, they were only getting about 91-92% accuracy due to human error and the number of handoffs involved in their data processing workflows.

Most importantly, however, it took nearly 3 days just to go from PDF to structured data via this process. With that delay, it could be a full week before the company could even get back to the applicant with a loan offer, and by that time, many of those businesses had received and accepted offers from competitors.

This company knew there had to be a better, more automated way, and they turned to Adeptia’s intelligent data automation platform.

The Limits Of LLM Solutions

Our client’s number one goal was to be able to take their highly manual process and create a tool capable of handling it automatically, quickly and with a high degree of accuracy.

Initially, they tried feeding simple mock-up PDF data into a commonly used LLM to see how it performed. Right away, it returned processed data with about 95-96% accuracy, which was already better than the 91-92% accuracy that our client was getting from their manual approach.

It was certainly much faster, and far cheaper than the $25,000 they were spending each month on their manual process overseas. However, the OpenAI model they tested had several limitations.

  • Many of their documents were too large, and with too many pages, for processing via OpenAI.
  • It was a closed system, and very difficult for users to verify the accuracy of the data as the process went on.
  • There was no way to validate the data automatically at the end of the process.

Most importantly, even 95% accuracy isn’t good enough for most businesses, which stand to lose substantial amounts of time and business if their data cannot be trusted.

In the end, we knew Adeptia’s solution could do better.

Enterprise-Grade Document Processing

Adeptia offers a solution capable of processing thousands of loan applications to a degree of accuracy that met and often exceeded our client’s needs.

In the end, we were able to take that week-long, error-prone process and build a solution that accomplished the entire workflow in five minutes or less.

Not only that, but it did so with over 99% total accuracy, every single time.

So how did we do it?

The solution we built is Adeptia Intelligent Document Processing, or AIDP™, and it’s already helping businesses save time, streamline operations and keep up with the exponentially growing pace of innovation we’re seeing today.

I’ll be covering all the big breakthroughs, lessons and strategies we used in depth in my next post later this month, but at a high level it came down to solving several core challenges that off-the-shelf AI tools simply aren’t able to solve. These are:

  • Human-In-The-Loop Support
  • Automated Business Rules
  • Large File Processing
  • Fuzzy Scanned Documents
  • Pre-Built Data Formats (e.g. Invoices, Orders, etc.)
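
To make the first two items a little more concrete, here is a minimal Python sketch of how automated business rules and human-in-the-loop review can work together on extracted loan data. It is an illustration only, not how AIDP is implemented: the field names (`business_name`, `loan_amount`, `ein`), the rules themselves and the 0.90 confidence threshold are all assumptions made up for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical threshold: anything extracted with lower confidence goes to a person.
CONFIDENCE_THRESHOLD = 0.90


@dataclass
class ExtractedField:
    value: str
    confidence: float  # 0.0 to 1.0, as reported by some upstream extraction step


@dataclass
class ReviewResult:
    needs_review: bool
    reasons: list = field(default_factory=list)


def _is_positive_amount(text):
    """Return True if text parses as a number greater than zero."""
    try:
        return float(text.replace(",", "").replace("$", "")) > 0
    except ValueError:
        return False


# Hypothetical business rules: field name -> (description, check function).
RULES = {
    "business_name": ("must not be empty", lambda v: bool(v.strip())),
    "loan_amount": ("must be a positive number", _is_positive_amount),
    "ein": ("must look like NN-NNNNNNN",
            lambda v: re.fullmatch(r"\d{2}-\d{7}", v) is not None),
}


def review_application(fields):
    """Apply every rule and flag the record for human review if anything fails."""
    reasons = []
    for name, (description, check) in RULES.items():
        extracted = fields.get(name)
        if extracted is None:
            reasons.append(f"{name}: missing")
            continue
        if not check(extracted.value):
            reasons.append(f"{name}: {description}")
        if extracted.confidence < CONFIDENCE_THRESHOLD:
            reasons.append(f"{name}: low confidence ({extracted.confidence:.2f})")
    return ReviewResult(needs_review=bool(reasons), reasons=reasons)


if __name__ == "__main__":
    application = {
        "business_name": ExtractedField("Acme LLC", 0.99),
        "loan_amount": ExtractedField("$250,000", 0.97),
        "ein": ExtractedField("12-345678", 0.60),  # one digit short, low confidence
    }
    result = review_application(application)
    print(result.needs_review, result.reasons)
```

The idea is that clean records flow straight through to the CSV output, while anything that breaks a rule or falls below the confidence threshold is queued for a person to check, along with the reasons it was flagged.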

If you're interested in exploring how our Intelligent Document Processing (IDP) solution can help you gain control of your business’ unstructured first-mile data, we’d be happy to schedule a demo to show you the platform in action.