OCR (text extraction)

A machine-learning model that uses optical character recognition to identify and extract text from images or scanned documents.

Overview

The OCR model is a component of our service that provides optical character recognition with support for multiple languages.

The model converts text in images or scanned documents into editable and searchable data. It uses machine learning to recognize characters and words, which makes the extracted text available for search and analysis.

The model examines the input image, detects any text fragments present, and recognizes the characters in those fragments according to the specified language. Only fragments recognized with a sufficiently high confidence level are returned as text strings.
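The confidence filtering described above can be sketched as follows. This is a minimal illustration, not the service's actual implementation: the `Fragment` type, its field names, the `filter_fragments` helper, and the `0.5` threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical detected text fragment (not the service's real schema)."""
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def filter_fragments(fragments, min_confidence=0.5):
    """Return the text of fragments whose confidence meets the threshold.

    The 0.5 default is an assumed value chosen for illustration only.
    """
    return [f.text for f in fragments if f.confidence >= min_confidence]


# Made-up detection results for a scanned invoice.
detected = [
    Fragment("INVOICE", 0.98),
    Fragment("Total: 42.00", 0.91),
    Fragment("~~|", 0.12),  # likely noise, dropped by the filter
]

print(filter_fragments(detected))
```

Raising `min_confidence` trades recall for precision: fewer noisy strings are returned, but faint or stylized text is more likely to be dropped.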

Use cases

The OCR service supports several use cases, including:

  • Text extraction and indexing - The model extracts text from images or scanned documents, enabling efficient indexing and searching of digital assets. Users can find images or documents based on specific keywords or phrases mentioned within the text content.

  • Document digitization - Important information can be preserved by converting physical documents into digital formats.

  • Language translation - Combined with language translation capabilities, OCR can facilitate multilingual asset management, enabling users to search and translate text in various languages.

  • Improved accessibility - Converting text within images or scanned documents into machine-readable format enhances accessibility for visually impaired individuals, enabling screen readers or assistive technologies to interpret the content.
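As an illustration of the text extraction and indexing use case, the sketch below builds a simple keyword index over extracted text. It assumes the OCR output has already been collected into a dictionary of asset ID to text strings; the function names, the asset IDs, and the sample strings are hypothetical and not part of the service.

```python
import re
from collections import defaultdict


def build_index(extracted):
    """Map each lowercase word to the set of asset IDs whose text contains it.

    `extracted` is a hypothetical dict of asset ID -> list of OCR text strings.
    """
    index = defaultdict(set)
    for asset_id, strings in extracted.items():
        for s in strings:
            for word in re.findall(r"\w+", s.lower()):
                index[word].add(asset_id)
    return index


def search(index, keyword):
    """Return the sorted asset IDs whose extracted text contains the keyword."""
    return sorted(index.get(keyword.lower(), set()))


# Made-up OCR results for two assets.
extracted = {
    "scan-001.png": ["INVOICE", "Total: 42.00"],
    "photo-017.jpg": ["Invoice copy"],
}
index = build_index(extracted)

print(search(index, "invoice"))
print(search(index, "total"))
```

This sketch matches single keywords only. Phrase search would need word positions or a full-text search engine.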

API endpoints

An up-to-date reference with all API endpoints is available here:

Example API responses

Input image | API response
