smartICR
Revolutionize your text extraction process with smartICR, a leading tool in the industry. Utilizing cutting-edge deep learning techniques in computer vision and natural language processing, the software is capable of accurately extracting text from images and videos in over 100 languages. Its use of an LSTM neural network allows for unparalleled performance in sequence prediction, making it impervious to repeated text within image sequences. Experience the efficiency and effectiveness of smartICR today.
Max file size: 10 MB
Max requests per second: 1
Processing time: depends on the number of characters in the image
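The limits above can be enforced on the client side before a request is sent. The following is a minimal sketch, not part of the Laigo API: the names `check_file_size` and `Throttle` are hypothetical, and it assumes that "10 MB" means 10,000,000 bytes (the more conservative reading).

```python
import os
import time

# Assumption: "10 MB" is read as 10,000,000 bytes (the stricter interpretation).
MAX_FILE_BYTES = 10 * 1000 * 1000
# One request per second, as stated in the limits.
MIN_INTERVAL_S = 1.0


def check_file_size(path, limit=MAX_FILE_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes; the limit is {limit} bytes")
    return size


class Throttle:
    """Blocks so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(self, min_interval=MIN_INTERVAL_S, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first call

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `check_file_size(path)` and then `throttle.wait()` before each upload keeps a batch job within both limits.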
All rights reserved by Laigo
POST
https://use.laigo.io/api/FileUpload/v1/Upload/smartICR
First, you need a sample document to use with smartICR. It can be any document type, such as an invoice, receipt, letter, or email. For example, you can use the invoice sample below:
To use smartICR, you need an API key, which can be generated here.
If you don’t have an account, you first need to create one before you can generate the API key. Create an account here.
Once you have generated your API key, you can call smartICR from any popular programming language; code snippets can be found here.
Run your code. You will receive a JSON response with the invoice details.
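The steps above can be sketched in Python using only the standard library. This is a minimal sketch, not Laigo's official snippet: it assumes the endpoint accepts `multipart/form-data`, that `accessToken` and the optional parameters are sent as form fields next to `file`, and that the response body is JSON. The helper names `build_multipart`, `build_upload_request`, and `upload` are hypothetical.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

UPLOAD_URL = "https://use.laigo.io/api/FileUpload/v1/Upload/smartICR"


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: {file_type}\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(path, access_token, **optional_fields):
    """Build (but do not send) the POST request for one document."""
    # Assumption: the token travels as a form field named "accessToken".
    fields = {"accessToken": access_token, **optional_fields}
    with open(path, "rb") as fh:
        data = fh.read()
    body, content_type = build_multipart(fields, "file", os.path.basename(path), data)
    return urllib.request.Request(
        UPLOAD_URL, data=body, headers={"Content-Type": content_type}, method="POST"
    )


def upload(path, access_token, **optional_fields):
    """Send the request and parse the response (performs a network call)."""
    request = build_upload_request(path, access_token, **optional_fields)
    with urllib.request.urlopen(request, timeout=60) as response:
        # Assumption: the response body is a JSON document.
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `upload("invoice.png", "<your token>", languageHint="en")`, where the file name and token are placeholders.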
The calculation for processing a page:
1. A general description of the AI system including:
(a) its intended purpose:
smartICR is developed for text detection and recognition in document images. It extracts text from documents and converts scanned PDFs or images into searchable PDF/A files, so that text in images can be detected, marked, and copied.
the person/s developing the system, the date and the version of the system:
The AI system smartICR was developed at
Laigo GmbH Eckenerstr. 65 88046 Friedrichshafen
and
Laigo DOO Cevahir Sky City, 16B
Municipality Aerodrom, 1000 Skopje North Macedonia
Version 1.0 of the AI system was released on 20.12.2022.
(b) how the AI system interacts or can be used to interact with hardware or software that is not part of the AI system itself, where applicable:
The AI system is integrated into the website https://laigo.ai/ and can be called through an API. It is served from Laigo's internal servers.
(c) the versions of relevant software or firmware and any requirement related to version update:
[…]
(d) the description of all forms in which the AI system is placed on the market or put into service:
The AI system smartICR is placed on the market as a secured web application and as a standalone API.
(e) the description of hardware on which the AI system is intended to run;
The AI system smartICR runs on a Laigo internal server located in Germany.
(f) where the AI system is a component of products, photographs or illustrations showing external features, marking and internal layout of those products;
The AI system smartICR is deployed on Laigo's internal server, located in Germany. It is part of a Software as a Service product and is therefore an IT service.
(g) instructions of use for the user and, where applicable installation instructions;
All instructions for using the product are available on Laigo's website at https://laigo.ai/docs.
2. A detailed description of the elements of the AI system and of the process for its development, including:
(a) the methods and steps performed for the development of the AI system, including, where relevant, recourse to pre-trained systems or tools provided by third parties and how these have been used, integrated or modified by the provider;
The AI system smartICR is based on Tesseract, the open-source OCR engine from Google. Laigo GmbH uses a Python wrapper for Tesseract and provides the model through an API on https://laigo.ai/. The pre- and post-processing steps required for OCR were developed by Laigo.
(b) the design specifications of the system, namely the general logic of the AI system and of the algorithms;
The AI system keeps the original design of Google's Tesseract OCR. For text detection, an LSTM-based approach is used. Detailed information can be found in Google's technical papers.
the key design choices including the rationale and assumptions made, also with regard to persons or groups of persons on which the system is intended to be used;
Laigo GmbH has made no design choices of its own and no modifications to the existing system.
the main classification choices;
Laigo GmbH classifies this AI system as “no risk”, since none of the EU requirements apply to this use case.
what the system is designed to optimise for and the relevance of the different parameters;
The AI system is designed to optimize OCR by taking context into account. This yields higher accuracy than classical OCR approaches.
[…]
(c) the description of the system architecture explaining how software components build on or feed into each other and integrate into the overall processing;
Laigo uses the Python wrapper for the open-source Tesseract OCR engine and integrates it into Laigo's own API system using microservices.
the computational resources used to develop, train, test and validate the AI system;
Laigo uses the pre-trained OCR engine from Google. No additional training, validation, or testing has been done. Further information can be found in Google's technical papers.
(d) where relevant, the data requirements in terms of datasheets describing the training methodologies and techniques and the training data sets used, including information about the provenance of those data sets, their scope and main characteristics;
Laigo GmbH did not fine-tune the existing Tesseract OCR engine, so no additional datasets or training methodologies have been used besides the ones mentioned by Google.
[…]
3. Detailed information about the monitoring, functioning and control of the AI system, in particular with regard to:
its capabilities and limitations in performance, including the degrees of accuracy for specific persons or groups of persons on which the system is intended to be used and the overall expected level of accuracy in relation to its intended purpose;
If the image quality or resolution is poor, text can be missed or recognized in the wrong language. The same can happen if the orientation of the picture or PDF does not match the orientation of the text. The preprocessing methods that are applied improve the accuracy of text recognition.
The overall accuracy is independent of the specific persons or groups of persons on which the system is intended to be used.
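The source does not say which preprocessing methods are applied. As an illustration only, the sketch below shows one common OCR preprocessing step, Otsu binarization, which separates dark text from a lighter background. It is not a description of Laigo's pipeline; the function names `otsu_threshold` and `binarize` are hypothetical, and the image is represented as a plain list of rows of 0-255 gray values to keep the example dependency-free.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximises the between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += hist[level]
        if weight_bg == 0:
            continue  # no pixels at or below this level yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += level * hist[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_level = between_var, level
    return best_level


def binarize(image):
    """Map a grayscale image (rows of 0-255 ints) to pure black (0) and white (255)."""
    flat = [p for row in image for p in row]
    threshold = otsu_threshold(flat)
    return [[255 if p > threshold else 0 for p in row] for row in image]
```

For a tiny image such as `[[10, 20, 200], [220, 210, 15]]`, the three dark values end up black and the three bright values white.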
the foreseeable unintended outcomes and sources of risks to health and safety, fundamental rights and discrimination in view of the intended purpose of the AI system;
No risk to health and safety is expected. Fundamental rights are respected, and no discrimination against individuals is foreseen.
[…]
5. A description of any change made to the system through its lifecycle;
As the AI system is in its first released version, no changes have been made so far.
[…]
EU DECLARATION OF CONFORMITY
1. AI system name and type and any additional unambiguous reference allowing identification and traceability of the AI system;
The AI system is called “smartICR” and is based on Google's open-source OCR engine “Tesseract”.
Technical information and documentation about Tesseract can be found here:
Changes, adaptations, and interactions with Tesseract in the context of “smartICR” can be found at https://laigo.ai/docs.
2. Name and address of the provider or, where applicable, their authorised representative;
Laigo GmbH Eckenerstr. 65 88046 Friedrichshafen
[…]
7. Place and date of issue of the declaration, name and function of the person who signed it as well as an indication for, and on behalf of whom, that person signed, signature.
Friedrichshafen, 20.12.2022 Yvonne Gaissmaier
INFORMATION TO BE SUBMITTED UPON THE REGISTRATION OF HIGH-RISK AI SYSTEMS IN ACCORDANCE WITH ARTICLE 51
1. Name, address and contact details of the provider;
Laigo GmbH Eckenerstr. 65 88046 Friedrichshafen
[…]
3. Name, address and contact details of the authorised representative, where applicable;
Yvonne Gaissmaier Eckenerstr. 65 88046 Friedrichshafen
4. AI system trade name and any additional unambiguous reference allowing identification and traceability of the AI system;
The AI system trade name is “smartICR”.
5. Description of the intended purpose of the AI system;
smartICR is developed for text recognition in document images. It extracts text from documents and converts scanned PDFs or images into searchable PDF/A files, so that text in images can be detected, marked, and copied.
6. Status of the AI system (on the market, or in service; no longer placed on the market/in service, recalled);
On the market
[…]
9. Member States in which the AI system is or has been placed on the market, put into service or made available in the Union;
The AI system is placed on the market in Friedrichshafen, Germany.
[…]
11. Electronic instructions for use; this information shall not be provided for high-risk AI systems in the areas of law enforcement and migration, asylum and border control management referred to in Annex III, points 1, 6 and 7.
All instructions for using smartICR can be found at https://laigo.ai/docs.
12. URL for additional information (optional).
Page | Laigos |
---|---|

Name | Type | Description |
---|---|---|
languageHint | String | Define language of the file content with ISO code (ISO-639-1) |
outputFormats | String | Define output format(s) like PDF, TXT, JSON. Default is PDF |
[…] | String | Define receiver email address |
threshold | String | Set minimum probability to be reached by model |
accessToken* | String | A JWT issued to your application by the Laigo identity provider. |
file* | String | The file for uploading |
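The request parameters can be validated before they are sent. The following is a minimal sketch under stated assumptions, not Laigo's own client code: the function name `build_form_fields` is hypothetical, several output formats are assumed to be joined into one comma-separated string, and `threshold` is assumed to be a probability between 0 and 1. All values are serialized as strings because every parameter is typed as String.

```python
import re

# Formats named in the parameter description; whether others exist is unknown.
ALLOWED_FORMATS = {"PDF", "TXT", "JSON"}


def build_form_fields(access_token, language_hint=None, output_formats=None, threshold=None):
    """Return a dict of string form fields, validating the optional ones."""
    if not access_token:
        raise ValueError("accessToken is required")
    fields = {"accessToken": access_token}

    if language_hint is not None:
        # ISO 639-1 codes are two lowercase letters, e.g. "en" or "de".
        if not re.fullmatch(r"[a-z]{2}", language_hint):
            raise ValueError(f"languageHint must be an ISO 639-1 code, got {language_hint!r}")
        fields["languageHint"] = language_hint

    if output_formats is not None:
        formats = [f.strip().upper() for f in output_formats]
        unknown = sorted(set(formats) - ALLOWED_FORMATS)
        if unknown:
            raise ValueError(f"unsupported output format(s): {unknown}")
        # Assumption: multiple formats are sent as one comma-separated string.
        fields["outputFormats"] = ",".join(formats)

    if threshold is not None:
        # Assumption: the minimum probability is expressed on a 0..1 scale.
        value = float(threshold)
        if not 0.0 <= value <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        fields["threshold"] = str(value)

    return fields
```

For example, `build_form_fields("<your token>", language_hint="de", output_formats=["pdf", "json"])` yields the fields `accessToken`, `languageHint`, and `outputFormats`, with the formats normalized to `PDF,JSON`.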