Process PDF documents automatically

Capturing PDF orders, invoices, delivery notes or packing slips automatically and processing them without errors increases productivity and reduces costs. After all, who wants to type mass data into their ERP, WMS, TMS or OMS by hand? We show how OCR processes can be automated with maximum accuracy so that employees can devote themselves to more demanding tasks.

Millions of PDF documents, TIFF files and other image formats are exchanged between companies every day, and most of them are still processed manually. The widespread view is that their structure, format and content vary too much for OCR technology to recognize and process the text without errors. In addition, 90% accuracy is usually not sufficient, especially if the errors are spread across a large number of documents rather than confined to a few. After all, what good is 90% if people still have to manually check, correct or even re-enter 50% of the PDFs?
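A rough calculation shows how 90% accuracy and 50% manual rework can go together. The sketch below rests on assumptions that are not taken from any measurement: the 90% is read as accuracy per recognized field, errors are treated as independent, and a document counts as automatically processed only if every field is correct. The field counts are purely illustrative.

```python
def share_needing_review(field_accuracy: float, fields_per_document: int) -> float:
    """Share of documents that contain at least one wrongly recognized field.

    Assumes recognition errors are independent across fields, which is an
    illustrative simplification rather than a property of any real OCR engine.
    """
    error_free_share = field_accuracy ** fields_per_document
    return 1.0 - error_free_share


if __name__ == "__main__":
    # Hypothetical field counts per document, chosen only for illustration.
    for fields in (1, 3, 7, 15):
        share = share_needing_review(0.90, fields)
        print(f"{fields:>2} fields per document: {share:.0%} need manual review")
```

Under these assumptions, a document with only seven relevant fields already has a roughly one-in-two chance of containing at least one error, which is why accuracy per field says little about the share of documents that pass without human intervention.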

 

90% accuracy sounds good, but is usually not enough

Instinctive solutions fail

Problem identified, problem solved, one is tempted to say. The right software will sort it out. Or even better: force PDF senders to send only highly standardized documents that match the capabilities of your own OCR software. Unfortunately, both approaches are doomed to fail, because reality is, as always, more complicated.

Although powerful OCR software makes a difference, practice shows that ML-capable IDP (Intelligent Document Processing) software in particular does not always deliver convincing results with structured PDF documents, despite intensive training. Getting business partners to send only standardized PDF documents adapted to your own requirements is only possible with a very high level of negotiating power. In a complex and dynamic world, however, that is rarely the case.

Artificial intelligence and standardization alone will not solve the problem

What to do? Fortunately, resourceful companies have built their IDP platforms with integrated OCR software in such a way that, with the right know-how, they can be adapted to almost all use cases, that is, to almost any kind of PDF document. With the help of sophisticated filters, operations and configurations, the relevant text can be read from the source and then passed through an integration and automation solution into the desired ERP, TMS, WMS or OMS - without any human interaction. So is it all a question of the right tool?

 

Intelligent processing instead of pure readout

No, because automation requires much more than just recognizing information. Data such as article numbers, contractual partners, line items or prices must not only be read from the PDF documents but also be processed further before being imported into the target system. Dates and times must be converted; prices, quantities and discounts must be calculated or validated before they flow into the ERP system; addresses or reference numbers must be validated against other systems and supplemented or corrected if necessary; several orders, invoices and delivery notes must be combined into one data record before they can be processed further. All of this should happen automatically in the background, without manual intervention. And these are just a few examples of how PDF data can or must be enriched and structured.
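Two of these steps, converting dates and recalculating prices, can be sketched in a few lines. This is a minimal illustration, not the way any particular IDP platform works: the field names (`delivery_date`, `unit_price`, `printed_total` and so on), the German date format and the rounding rule are all assumptions made for the example.

```python
from datetime import datetime
from decimal import Decimal, ROUND_HALF_UP


def to_iso_date(raw: str, source_format: str = "%d.%m.%Y") -> str:
    """Convert a date as printed on the document into ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(raw, source_format).date().isoformat()


def expected_line_total(quantity: str, unit_price: str, discount_percent: str = "0") -> Decimal:
    """Recalculate a line total from quantity, unit price and percentage discount."""
    gross = Decimal(quantity) * Decimal(unit_price)
    net = gross * (Decimal("100") - Decimal(discount_percent)) / Decimal("100")
    # Rounding to two decimals, half up, is an assumed business rule.
    return net.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def enrich_line(line: dict) -> dict:
    """Return a copy of an extracted line item with normalized and checked values."""
    total = expected_line_total(
        line["quantity"], line["unit_price"], line.get("discount_percent", "0")
    )
    return {
        **line,
        "delivery_date": to_iso_date(line["delivery_date"]),
        "line_total": str(total),
        # Flag whether the recalculated total agrees with the printed one.
        "total_matches_document": total == Decimal(line["printed_total"]),
    }


if __name__ == "__main__":
    # Hypothetical values as they might come out of the recognition step.
    extracted = {
        "article_number": "A-1001",
        "quantity": "12",
        "unit_price": "4.95",
        "discount_percent": "10",
        "printed_total": "53.46",
        "delivery_date": "05.03.2024",
    }
    print(enrich_line(extracted))
```

Using `Decimal` instead of floating-point numbers avoids rounding surprises when the recalculated total is compared with the amount printed on the document.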

 

PDF data can be automatically enriched and enhanced as required

Faulty PDF documents, for example those with missing customer or article numbers, are rejected by error-handling routines and forwarded by email or chat to Customer Service for correction or clarification with the sender. This means that you can rely on complete and error-free processing in the core systems. After all, what good is a 95% hit rate if you can't rely on correct processing? Every document must be processed correctly, or at least be flagged and corrected: management by exception.
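The principle can be sketched as a simple routing step that lets complete documents through and diverts everything else to an exception list. The required field names and the structure of the exception entries are assumptions for illustration; the actual forwarding by email or chat is deliberately left out.

```python
from dataclasses import dataclass, field

# Assumed mandatory fields; a real process would define its own list.
REQUIRED_FIELDS = ("customer_number", "article_number", "quantity")


@dataclass
class RoutingResult:
    accepted: list = field(default_factory=list)    # ready for the core system
    exceptions: list = field(default_factory=list)  # to be handled by Customer Service


def missing_fields(document: dict) -> list:
    """Return the names of required fields that are absent or blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = document.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


def route(documents: list) -> RoutingResult:
    """Separate complete documents from those that need human attention."""
    result = RoutingResult()
    for document in documents:
        missing = missing_fields(document)
        if missing:
            result.exceptions.append({"document": document, "missing": missing})
        else:
            result.accepted.append(document)
    return result


if __name__ == "__main__":
    # Hypothetical extracted documents.
    batch = [
        {"customer_number": "C-42", "article_number": "A-1001", "quantity": "12"},
        {"customer_number": "", "article_number": "A-2002", "quantity": "3"},
    ]
    outcome = route(batch)
    print(len(outcome.accepted), "accepted,", len(outcome.exceptions), "sent to exception handling")
```

Only the entries in `exceptions` require a person; everything in `accepted` can flow into the core system unattended.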

Successful automation involves several steps

How do we achieve this result? As is so often the case, success is the child of several parents.

  1. Firstly, the right software must be selected for the application. To make an accurate selection, however, you need to proceed independently of the provider and the situation - free from subjective preferences or existing contractual relationships. In addition, the costs must be attractive enough that the total process costs end up below the status quo. Software-as-a-Service lends itself here due to the pay-as-you-go principle.
  2. Secondly, you have to master the selected software in detail and solve borderline cases creatively - because every piece of software has its limits somewhere, but there is usually a workaround that you either have to know or find. This is where human creativity comes into play.
  3. Thirdly, the overall process must always be considered when it comes to automation. It is not one system that leads to success, but the combination of several services that can be put together as required. Gone are the days when large software providers could satisfy all requirements. Today's requirements are far too heterogeneous and complex for a single provider to be and remain successful that way.
  4. Fourthly, safe operation must already be considered during technical implementation. How are new special cases recognized so that they can be incorporated at short notice? How are problems escalated in a system-controlled manner thanks to logging and monitoring so that they can be resolved at an early stage - ideally before the customer even notices anything?
  5. Fifthly, further development must not be ignored. After all, success breeds success. The more manual activities are automated in a working world short of staff, the greater the demand from the specialist departments and the more automation ideas will be discussed. The solution used must satisfy these requirements quickly in order to exploit the momentum and spread the start-up costs optimally. This is how a "culture of automation" of recurring activities develops. People should be allowed to work with their creativity and not be forced to copy and paste or perform monotonous tasks. Nobody can keep that up in the long term.
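The fourth point, recognizing new special cases and escalating them in a system-controlled way, can be sketched with Python's standard logging module. The logger name, the threshold of three occurrences and the idea of counting per sender and reason are assumptions for illustration, not a description of a specific monitoring product.

```python
import logging
from collections import Counter

logger = logging.getLogger("pdf_automation")  # hypothetical logger name


class SpecialCaseMonitor:
    """Counts unhandled cases per sender and reason and escalates at a threshold."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.counts = Counter()

    def record(self, sender: str, reason: str) -> bool:
        """Log an unhandled case; return True if it should be escalated."""
        self.counts[(sender, reason)] += 1
        count = self.counts[(sender, reason)]
        if count >= self.threshold:
            # Repeated occurrences suggest a new special case worth implementing.
            logger.error("Escalating: %s from %s seen %d times", reason, sender, count)
            return True
        logger.warning("Unhandled case: %s from %s (%d so far)", reason, sender, count)
        return False


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    monitor = SpecialCaseMonitor(threshold=2)
    monitor.record("supplier-a", "unknown layout")
    monitor.record("supplier-a", "unknown layout")  # reaches the threshold
```

In practice, the escalation would be connected to whatever alerting channel is already in use, so that recurring special cases surface before the customer notices them.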

Success feeds success

 

Project planning and implementation

What remains is the question of the specific procedure. As we are in the field of business process automation, implementation requires both technical and specialist expertise. Describing and understanding the functional requirements of the process to be automated is essential for success. True to the motto "whoever writes stays", the technical process as well as the respective input and output information should be documented immediately and backed up with examples. Once there is a common understanding of the target state, the influencing variables and the functional constraints, implementation can begin.

Ideally, this is done in several cycles, i.e. iteratively, so that the basic case is implemented first. Once this has been successfully tested and proven in operation, the special cases are implemented in order of importance. This not only ensures that the benefits are realized as early as possible, but also eliminates misunderstandings that, despite the best planning, only become apparent shortly before or even during implementation. As a result, this interactive and iterative approach ensures maximum project efficiency.

In addition, knowledge of the business processes, technical implementation and automation options grows on all sides. Involving in-house IT also ensures independence from the service provider and allows the technologies and methods used to be scaled further, provided in-house resources are available. Because one thing is certain: appetite comes with eating.

About Business Automatica GmbH:

Business Automatica GmbH specializes in the automation of business processes of all kinds. From the core process of order processing to HR and financial processes, Business Automatica supports companies in increasing productivity and optimizing IT-supported workflows.