Smart Parking System Using YOLOv3 Deep Learning Model
Published in Sam Goundar, Archana Purwar, Ajmer Singh, Applications of Artificial Intelligence, Big Data and Internet of Things in Sustainable Development, 2023
Narina Thakur, Sardar M N Islam, Zarqua Neyaz, Deepanshu Sadhwani, Rachna Jain
An extensive literature review was conducted to identify deep learning models capable of real-time number plate detection. The Faster R-CNN model, VGG16, YOLOv3, and Tiny-YOLOv3 emerged from the review as the most efficient and appropriate algorithms for detecting number plates in real time. The proposed system was trained using the YOLOv3-Darknet framework; the license plate detection model uses YOLOv3, a CNN-based detector capable of locating objects and entities. Our method achieved an accuracy of 94.2 percent on training data and 80 percent on validation data. YOLOv3 proved more accurate than VGG16, leading to the conclusion that YOLOv3 is the most effective algorithm for real-time detection. Because each stage of the ANPR pipeline depends on the output of the previous one, achieving 100 percent overall accuracy is currently not feasible. However, if the bounding boxes are accurate, the algorithm can extract the correct license plate number from an image.

The work can be further augmented by applying a noise reduction technique to improve license plate recognition accuracy without dramatically increasing computation time. A disadvantage of using single-class classifiers in an ensemble model is that they significantly increase computation time. Two strategies were investigated to address this issue: a proposal-based technique such as Fast R-CNN can be used to reduce the computation time of the underlying classifier, and parallel computation can be employed to evaluate the base classifiers simultaneously. Super-resolution algorithms can be applied to low-resolution images, and a coarse-to-fine methodology may be beneficial for segmenting multiple vehicle license plates.
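The claim that accurate bounding boxes enable correct plate extraction is usually quantified with intersection-over-union (IoU) between the predicted and ground-truth boxes. A minimal sketch in Python (the function name, box format, and 0.5 acceptance threshold are illustrative conventions, not taken from the paper):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A predicted plate box is commonly accepted as a detection when IoU >= 0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # ≈ 0.333
```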
Since OCR has become a commonplace tool in recent years, ANPR developers are focusing on improving OCR accuracy rather than redesigning the entire ANPR pipeline. Even the open-source Tesseract model can be modified to improve accuracy.
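Improvements in OCR accuracy of the kind described here are commonly measured with character error rate, the edit distance between the recognized string and the ground truth, normalized by the truth length. A minimal sketch, not tied to any particular Tesseract configuration:

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def char_error_rate(recognized, truth):
    """Edit distance normalized by the length of the ground-truth string."""
    return edit_distance(recognized, truth) / max(len(truth), 1)

# A single confused character ('0' read for 'O') over a 7-character plate.
print(char_error_rate("AB0 123", "ABO 123"))
```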
Pre-Processing of Dogri Text Corpus
Published in Durgesh Kumar Mishra, Nilanjan Dey, Bharat Singh Deora, Amit Joshi, ICT for Competitive Strategies, 2020
Tesseract has been used by various researchers to identify text in images. (Kumar Audichya and Saini 2017) used this open-source tool to recognize Gujarati characters with an available training script, achieving a mean confidence of 86% even under variations in font size and style. (L, J, and N 2016) compared text extraction using the combination of ImageMagick and Tesseract against OCRopus; the former combination produced more promising results. For pre-processing tasks such as stop-word removal, the methods used by researchers range from DFA-based models to frequency-based approaches to consulting linguistic experts for creating these lists, as discussed by (Gandotra 2018). (Siddiqi and Sharan 2018) prepared a generic stop-word list of more than 800 words, entered manually in consultation with linguistic experts. Manual creation of such lists is time-consuming and expensive; the lists are biased, and some important information may be missed. (Jha et al. 2016) employed a DFA approach to construct a stop-word list by exploiting linguistic features of the Hindi language, with patterns based on sequences of Hindi characters used for DFA modelling. The generated stop-word list was tested on 200 documents, attaining an accuracy of 99% with the least execution time, only 1.77 seconds. The dictionary-based technique, also known as the classical method, was used by (Vijayarani, Ilamathi, and Nithya 2015) to create a stop-word list. In this case too, the list is created manually, but more than one linguistic expert is employed for the task. Manual creation is helpful when no digital data or corpus is available. A statistics-based approach was used by (Garg et al. 2014) for Hindi stop-word list creation.
Zipf’s law was applied to extract high-frequency, low-rank words from the corpus. (Puri, Bedi, and Goyal 2013) likewise applied a frequency-based technique to generate a stop-word list for Punjabi, combining two approaches, frequency distribution and probability distribution, to extract stop-words from the corpus and produce the desired list.
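A frequency-based candidate list of the kind these statistical approaches produce can be sketched with a simple cut-off on the head of the word-frequency ranking; the function name, threshold, and toy corpus below are illustrative, not from the cited papers:

```python
from collections import Counter

def stopword_candidates(corpus_tokens, top_fraction=0.01):
    """Return the most frequent words as stop-word candidates.

    Under Zipf's law a small set of high-frequency words dominates the
    corpus, so the head of the frequency ranking approximates a
    stop-word list (candidates would still be reviewed by a linguist).
    """
    counts = Counter(corpus_tokens)
    k = max(1, int(len(counts) * top_fraction))
    return [word for word, _ in counts.most_common(k)]

tokens = "the cat sat on the mat and the dog sat on the rug".split()
print(stopword_candidates(tokens, top_fraction=0.3))
```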
Multi-modal features and correlation incorporated Naive Bayes classifier for a semantic-enriched lecture video retrieval system
Published in The Imaging Science Journal, 2018
Tesseract classifier for OCR: Figure 2 shows the text extraction steps using OCR. This technology allows the machine to recognize keywords automatically. For keyword extraction, OCR uses the Tesseract classifier [23], an open-source optical character recognition engine that extracts keywords from images. The step-by-step procedure for extracting keywords from images is as follows: the input to the OCR block is an image, which is first converted into a binary image; this binary image provides all the information needed to extract the character lines. Blobs are created from the character lines in order to organize the text lines, which are further analysed to determine the text size and organize the keywords. An advantage of the Tesseract classifier is that it recognizes keywords in both white and black environments. Tesseract uses a command-line tool to convert and process an image into keywords; the tool requires two parameters, namely the name of the image file carrying the keywords and the output text file that holds all the extracted keywords. The output file extension of Tesseract is .txt. Tesseract supports many languages; in this paper, it extracts keywords in English. The accuracy of the Tesseract classifier is reported as 100 per cent.
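The initial binarisation step described above can be sketched as a simple global threshold over pixel intensities; this is a simplified stand-in for Tesseract's internal thresholding, and the threshold value and list-of-lists image representation are illustrative assumptions:

```python
def binarize(gray, threshold=128):
    """Convert a grayscale image (rows of 0-255 intensities) to binary:
    1 for dark (ink) pixels, 0 for light (background) pixels."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

# A tiny synthetic "image": low values are ink, high values are paper.
image = [
    [250, 250,  20, 250],
    [250,  30,  30, 250],
    [ 10, 250, 250,  15],
]
print(binarize(image))
```

From the resulting binary map, connected runs of ink pixels are what the excerpt calls blobs, which are then grouped into text lines.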
Fuzzy String Matching with a Deep Neural Network
Published in Applied Artificial Intelligence, 2018
Daniel Shapiro, Nathalie Japkowicz, Mathieu Lemay, Miodrag Bolic
A baseline measurement for keyword detection accuracy is to count exact matches between OCR input and output text and compute the resulting accuracy. The default OCR configuration achieved 61% accuracy on the TESTING dataset. The tesseract-OCR software configuration can be tuned in various ways to improve OCR accuracy on non-dictionary text (Morris et al. 2016). Disabling the dictionaries did not improve OCR accuracy (60%), so it was not an effective strategy for processing error message text from TESTING. These results leave significant room for other approaches to improve accuracy.
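The exact-match baseline described above can be sketched as follows; the function name and the sample error-message strings are illustrative, not drawn from the TESTING dataset:

```python
def exact_match_accuracy(ocr_outputs, ground_truth):
    """Fraction of OCR outputs that exactly match the expected text."""
    assert len(ocr_outputs) == len(ground_truth)
    matches = sum(o == t for o, t in zip(ocr_outputs, ground_truth))
    return matches / len(ground_truth)

# One near-miss ('0' read for 'O') drops the exact-match score,
# even though the strings differ by a single character.
truth = ["ERR_CONN_RESET", "segfault at 0x0", "stack overflow"]
ocr   = ["ERR_C0NN_RESET", "segfault at 0x0", "stack overflow"]
print(exact_match_accuracy(ocr, truth))
```

Because a single wrong character counts as a full miss, exact match is a strict baseline; it is exactly this strictness that leaves room for fuzzy-matching approaches like the one this paper proposes.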