Explore chapters and articles related to this topic
Review on Optical Character Recognition-Based Applications of Industrial IoT
Published in Sudan Jha, Usman Tariq, Gyanendra Prasad Joshi, Vijender Kumar Solanki, Industrial Internet of Things, 2022
The demand for automation has benefited many businesses. Automating mundane tasks helps businesses save money and increases the productivity of the organization. With OCR in place, many tasks are now automated, including data entry for documents, automatic number plate extraction, searching PDFs, etc. [10, 11].
Video-Image-Text Content Mining
Published in Wahiba Ben Abdessalem Karaa, Nilanjan Dey, Mining Multimedia Documents, 2017
OCR (optical character recognition) is a technique that converts different types of scanned or camera-captured document images (PDF files, sales receipts, mail, handwritten or typewritten pages, or any number of printed records) into searchable and editable data. It is widely used for extracting textual metadata, that is, machine-encoded text. According to ABBYY (www.abbyy.com), a document recognized by OCR looks like the original. These textual data can therefore be used in machine processes such as machine translation, text to speech, and text mining. OCR software saves much of the time and effort spent creating, processing, and repurposing documents. OCR is a field of research in pattern recognition, artificial intelligence, and computer vision [12,13].
Using computer software packages to assist engineering activities
Published in David Salmon, Penny Powdrill, Mechanical Engineering Level 2 NVQ, 2012
Scanners are input devices. They are able to ‘read’ both images and text, feed this information into the computer and display it on the screen. Scanners are extremely useful and provide a quick method of copying information into computer formats that can then be stored, printed or sent elsewhere. Scanners are a type of hardware but they need specialised software; the most common kind allows editing of the scanned image. Another aspect of the software is optical character recognition (OCR), which means that the scanned text can be used with word processor software.
Recognition of expiry data on food packages based on improved DBNet
Published in Connection Science, 2023
Jishi Zheng, Junhui Li, Zhigang Ding, Linghua Kong, Qingqiang Chen
As the demand for OCR has increased, many open-source OCR methods have emerged, such as Tesseract OCR, Keras OCR, Easy OCR, etc. A Tesseract OCR-based application is presented in (Hosozawa et al., 2018) and its performance is compared with two other open-source OCR engines: NHocr and OCRopus (Breuel, 2008). Tesseract supports the widest range of character types and has the best recognition accuracy, but on complex backgrounds the input image must be pre-processed to remove the influence of the background before good recognition results can be obtained. An open-source OCR method is used in (Kamisetty et al., 2022) to recognise invoice content: the paper first performs image preprocessing, after which three different OCR engines are tested (Keras OCR, Easy OCR and Tesseract OCR), with Tesseract OCR giving the best recognition accuracy. However, these open-source OCR engines are mainly used for character recognition on simple backgrounds such as electronic documents. Images with background patterns require complex pre-processing operations, and even then the recognition results are not satisfactory.
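As an illustration of the kind of background-removing pre-processing mentioned above, the following minimal sketch binarizes a grayscale image with Otsu's global threshold. Otsu's method is chosen here as an assumption for illustration; the excerpt does not say which pre-processing the cited papers apply. The function names `otsu_threshold` and `binarize` are hypothetical, the image is a plain nested list of 8-bit gray levels, and no OCR engine is called; the binarized result is what would be handed to one.

```python
def otsu_threshold(gray):
    """Return Otsu's threshold for a 2-D list of 8-bit gray levels (0-255).

    Pixels with value <= threshold form the dark class, the rest the
    light class. The threshold maximizes the between-class variance.
    """
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0      # number of pixels in the dark class so far
    sum_dark = 0    # sum of gray levels in the dark class so far
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue            # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break               # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray):
    """Map dark (ink) pixels to 0 and background pixels to 255."""
    t = otsu_threshold(gray)
    return [[0 if value <= t else 255 for value in row] for row in gray]


if __name__ == "__main__":
    # Toy 4x4 image: dark strokes (20-30) on an uneven light background.
    image = [
        [200, 210, 200, 190],
        [200,  20,  30, 210],
        [190,  25,  20, 200],
        [210, 200, 190, 200],
    ]
    for row in binarize(image):
        print(row)
```

A single global threshold only works when ink and background are well separated in brightness; patterned or unevenly lit backgrounds, such as those the excerpt describes, would likely need local (adaptive) thresholding or a learned text detector instead.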
Separation of Machine-Printed and Handwritten Texts in Noisy Documents using Wavelet Transform
Published in IETE Technical Review, 2019
The development of OCR started in the early 1950s, and after 1980 it advanced rapidly because of in-depth research and development. Building an OCR that works for both printed and handwritten texts is not an easy task. Moreover, it is an ineffective solution from the time and cost perspectives. One efficient way to solve this problem is to use separate OCRs for these two types of text. Separating the two types of text thus becomes an essential step. Such text separation will increase OCR accuracy because only the appropriate one of the two OCRs, printed or handwritten, will be activated at a time.
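The routing idea described above, where each text region is classified first and only the matching recognizer is run, can be sketched as follows. Everything in this sketch is hypothetical: `route_regions`, the region dictionaries, and the stand-in classifier and recognizers are illustrative names, not part of the cited work, and the separation step is a placeholder for an actual printed/handwritten separator.

```python
from typing import Callable, Dict, List, Tuple


def route_regions(
    regions: List[dict],
    classify: Callable[[dict], str],
    recognizers: Dict[str, Callable[[dict], str]],
) -> List[Tuple[str, str]]:
    """Send each text region to the one recognizer that matches its class."""
    results = []
    for region in regions:
        kind = classify(region)  # expected: "printed" or "handwritten"
        if kind not in recognizers:
            raise ValueError(f"no recognizer registered for class {kind!r}")
        # Only the appropriate OCR is activated for this region.
        results.append((kind, recognizers[kind](region)))
    return results


# Stand-ins for illustration only. A real system would plug in a trained
# separator and two actual OCR engines here.
def demo_classify(region: dict) -> str:
    return region["label"]


def demo_printed_ocr(region: dict) -> str:
    return "printed-OCR:" + region["text"]


def demo_handwritten_ocr(region: dict) -> str:
    return "handwritten-OCR:" + region["text"]


if __name__ == "__main__":
    regions = [
        {"label": "printed", "text": "Invoice No. 42"},
        {"label": "handwritten", "text": "paid"},
    ]
    engines = {"printed": demo_printed_ocr, "handwritten": demo_handwritten_ocr}
    for kind, text in route_regions(regions, demo_classify, engines):
        print(kind, "->", text)
```

Keeping the classifier and the recognizers as interchangeable callables means either OCR can be replaced or retrained without touching the routing logic.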
Mutual neighbors and diagonal loading-based sparse locally linear embedding
Published in Applied Artificial Intelligence, 2018
Rassoul Hajizadeh, Ali Aghagolzadeh, Mehdi Ezoji
In this study, we are interested in offline recognition of Persian handwritten digits and characters. OCR is a technique that converts digital images and PDF documents into searchable and editable documents. OCR enables machines to automatically recognize and sort bank checks, bank statements, postal codes, license plates, etc. OCR techniques are categorized into online and offline.