OCR engine

Specifies the optical character recognition (OCR) engine for extracting text from images. For information about additional OCR options, see OCR.

Enums

The following table shows the enums available for the OCR Engine parameter.

enum	description
Amazon	Default engine for the OCR preprocessor.
Microsoft	Default engine for document types. Suited to typewritten documents and large documents up to 50 MB in size.
Lazarus	Faster than Microsoft and produces similar output.
Google	Faster than Microsoft and suited to handwriting and documents that are 5 pages or fewer. The Google engine doesn’t merge words into lines automatically. Use the Merge Lines preprocessor in your configurations to do so.

Note: When Sensible extracts from portfolios, it uses Microsoft OCR and ignores any OCR settings in the portfolio’s document types.
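
The guidance in the preceding table and note can be read as a small decision rule. The following Python sketch is illustrative only: `DocumentProfile` and `suggest_ocr_engine` are hypothetical names, not part of Sensible's API or SDK, and the order in which the rules are checked is an assumption. It only encodes the trade-offs described above as a planning aid for choosing a value for the OCR Engine parameter.

```python
from dataclasses import dataclass

# Page limit taken from the Google row of the table above.
GOOGLE_MAX_PAGES = 5


@dataclass
class DocumentProfile:
    """Hypothetical description of a document; not a Sensible type."""
    page_count: int
    handwritten: bool = False
    in_portfolio: bool = False
    prefer_speed: bool = False


def suggest_ocr_engine(doc: DocumentProfile) -> str:
    """Return the enum name the table above suggests for this document."""
    # Portfolios always use Microsoft OCR, whatever the document type says.
    if doc.in_portfolio:
        return "Microsoft"
    # Google is described as suited to handwriting and to short documents.
    if doc.handwritten and doc.page_count <= GOOGLE_MAX_PAGES:
        return "Google"
    # Lazarus is described as faster than Microsoft with similar output.
    if doc.prefer_speed:
        return "Lazarus"
    # Otherwise fall back to the default engine for document types.
    return "Microsoft"


if __name__ == "__main__":
    print(suggest_ocr_engine(DocumentProfile(page_count=3, handwritten=True)))
```

If the helper suggests Google, remember that this engine doesn't merge words into lines automatically, so the configuration would also need the Merge Lines preprocessor.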

Notes

You can use the Query Group method’s Multimodal Engine parameter as an alternative to OCR engines to extract from non-text images or from poor-quality text images, such as handwriting.

