
Understand default line recognition

Sebastian Radloff edited this page Nov 29, 2021 · 1 revision

Understand functionality of basic line recognition

Document Capture can recognize documents with lines by default, but with some limitations that a user should be aware of. To get advanced line recognition working, it is important to understand the basic functionality first.

Requirements of basic line recognition

For Document Capture to identify line-based data in the document, the following requirements must be met:

  • The values must be placed in separate columns in tabular form. Each column can then be read through its own template field and configured as required via rules, field translations, etc.
  • All data belonging to one document line must be on a single line and not staggered over several lines.
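
To make the two requirements concrete, here is a minimal Python sketch of the general idea. It is not Document Capture's actual implementation: the `Word` type, the `COLUMNS` ranges, the row tolerance and all function names are hypothetical and chosen only for illustration. Words are grouped into rows by their vertical position, and each word is then assigned to a column by its horizontal position.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # horizontal position of the word's left edge
    y: float  # vertical position of the word


# Hypothetical column layout: column name -> (left, right) x-range.
COLUMNS = {
    "No.": (0, 100),
    "Description": (100, 300),
    "Quantity": (300, 400),
    "Amount": (400, 500),
}


def group_rows(words, tolerance=2.0):
    """Group words into rows when their y positions are (almost) equal."""
    rows = []
    for word in sorted(words, key=lambda w: w.y):
        if rows and abs(rows[-1][0].y - word.y) <= tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    # Read each row from left to right.
    return [sorted(row, key=lambda w: w.x) for row in rows]


def read_row(row, columns=COLUMNS):
    """Assign every word of a row to the column whose x-range contains it."""
    values = {}
    for word in row:
        for name, (left, right) in columns.items():
            if left <= word.x < right:
                values[name] = (values.get(name, "") + " " + word.text).strip()
    return values


def read_lines(words):
    """Return one dictionary of column values per physical row."""
    return [read_row(row) for row in group_rows(words)]


def is_complete(line, columns=COLUMNS):
    """A line is complete when every column has a value on the same row."""
    return all(name in line for name in columns)
```

With input where every value of an invoice line shares one row, each result of `read_lines` is complete. If the same values are staggered over two rows, the sketch produces two partial rows and neither passes `is_complete`, which mirrors the limitation described here.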

If the requirements are met, Document Capture can easily find all line information in the document, as in the following example:


Figure 1 - Simple document for basic line recognition

If the requirements are not met and the information of each invoice or document line is staggered over several lines, basic line recognition will not return a usable result.

The following figure shows one document line of an invoice that cannot be processed with basic line recognition:


Figure 2 - Document line that cannot be recognized by basic line recognition

How to capture lines with Document Capture

This documentation is not intended to explain the training of basic line recognition.

As a Continia partner, you can use Continia Learn to find a guide that describes the concept:

https://continia.docebosaas.com/learn/course/41/play/439:392/basic-line-recognition

Or visit the Continia YouTube channel:

Recognize lines in PDF invoices - YouTube