Explore chapters and articles related to this topic
Introduction
Published in Yu-Jin Zhang, A Selection of Image Analysis Techniques, 2023
The above-mentioned technologies can be unified under the name Image Engineering (IE). IE is a new interdisciplinary subject that systematically studies various image theories, technologies, and applications (Zhang 1996). In terms of research methods, it draws on many disciplines, such as mathematics, physics, physiology, psychology, electronics, and computer science. In terms of research scope, it is related to and overlaps with many disciplines, such as pattern recognition, computer vision, and computer graphics. In addition, the research progress of IE is closely tied to theories and technologies such as artificial intelligence, neural networks, genetic algorithms, fuzzy logic, and machine learning. Its development and application are inseparable from medicine, remote sensing, communication, document processing, industrial automation, intelligent transportation, and so on.
Artificial Intelligence for Document Image Analysis
Published in Sk Md Obaidullah, KC Santosh, Teresa Gonçalves, Nibaran Das, Kaushik Roy, Document Processing Using Machine Learning, 2019
Himadri Mukherjee, Payel Rakshit, Ankita Dhar, Sk Md Obaidullah, KC Santosh, Santanu Phadikar, Kaushik Roy
Artificial intelligence has developed significantly over the years and has provided different simplified solutions. Document image processing has also advanced with the passage of time. Document processing involves both recognition and understanding. Optical character recognition (OCR) is required to recognize characters, words, and sentences in document images, and it is followed by natural language processing (NLP) for the purpose of understanding. Research in these fields has progressed significantly for non-Indic languages, but not for Indic languages. In this chapter, we present some avenues for OCR and NLP that are critical to developing a full-fledged system that can truly understand a document in an Indic language.
Introduction
Published in Anuradha D. Thakare, Shilpa Laddha, Ambika Pawar, Hybrid Intelligent Systems for Information Retrieval, 2023
Anuradha D. Thakare, Shilpa Laddha, Ambika Pawar
An information collection system plays a critical role in information retrieval, query processing, and wireless networks, in addition to its significant role within the network information platform. As methods for record recovery, researchers are aided in extracting documents from datasets. Traditional keyword-based information retrieval models ignore semantic information and therefore cannot reflect the user’s needs. As a result, it is important for consumers to be able to get personalized information quickly. For exact index term frequency, there is no list of experts on ontology-based systems. Therefore, the researchers proposed an ontology model for document processing and document recognition [4].
Composition pattern-aware web service recommendation based on depth factorisation machine
Published in Connection Science, 2021
Bing Tang, Mingdong Tang, Yanmin Xia, Meng-Yen Hsieh
Then, we compare the proposed EWACP-DeepFM with TF-IDF, LDA-FM, RTM-FM, and WDDF; the comparison results are shown in Figure 5. These methods differ in their document processing and feature extraction methods, and in the factors they consider. The recommended number of Web services is also set to 2, 3, 5, 8, and 10. Overall, EWACP-DeepFM and WDDF perform better than the others. Both use DeepFM to learn features, which also demonstrates the effectiveness of DeepFM. TF-IDF is inferior to the others because it simply calculates document similarity. The precision of EWACP-DeepFM is distinctly higher than that of the other methods. The recall of EWACP-DeepFM improves continuously as the recommended number of Web services increases. As that number increases, EWACP-DeepFM also outperforms all other methods in terms of average F-measure.
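As a rough illustration of the metrics named in this excerpt, the following minimal sketch computes precision, recall, and F-measure for the top N recommended Web services. The function name, the service identifiers, and the sample data are hypothetical, and the standard set-based definitions are assumed; the excerpt does not state the exact formulas used in the article.

```python
def precision_recall_f(recommended, relevant, n):
    """Precision, recall, and F-measure for the top-n recommendations.

    recommended: ranked list of service ids (best first).
    relevant: set of service ids that are actually relevant.
    Standard set-based definitions are assumed here.
    """
    top_n = recommended[:n]
    hits = len(set(top_n) & set(relevant))
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical ranked recommendations and ground truth.
ranked = ["s1", "s2", "s3", "s4", "s5"]
truth = {"s1", "s3", "s6", "s7"}

# Evaluate at a few of the cut-offs mentioned in the excerpt.
for n in (2, 3, 5):
    p, r, f = precision_recall_f(ranked, truth, n)
    print(f"N={n}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

With a fixed ranking, recall can only stay the same or grow as N increases, while precision may fall, which is why the F-measure is commonly reported alongside both.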
Creating research topic map for NIMS SAMURAI database using natural language processing approach
Published in Science and Technology of Advanced Materials: Methods, 2021
Sae Dieb, Kou Amano, Kosuke Tanabe, Daitetsu Sato, Masashi Ishii, Mikiko Tanifuji
The research output for a researcher is represented as a set resulting from the merging of five sets. In research publications, sections in the paper structure have different levels of influence on the publication topic [34]. For example, an occurrence in the title is more topic representative than an occurrence in the body. Several studies have used a term weighting strategy for document processing applications [35,36]. For this reason, occurrences of the terms should be weighted differently based on their location in the publication. We use a simple strategy by assigning a heavier weight for terms in the title and in the keywords section than those in the abstract. Because domain knowledge terms are extracted separately from other terms, the frequencies of domain knowledge terms are used in the case of double extraction to avoid double-counting. We denote , , , , and as the weights for , , , , and , respectively. Here is given by Equation (1)
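The location-based weighting described above can be sketched as follows. This is only a minimal illustration: the weight values, the section names, and the function name are assumptions, since the excerpt states only that title and keyword terms receive a heavier weight than abstract terms. The handling of domain knowledge terms and Equation (1) is not modeled.

```python
from collections import Counter

# Hypothetical weights: the excerpt says only that title and keywords
# are weighted more heavily than the abstract.
SECTION_WEIGHTS = {"title": 3.0, "keywords": 3.0, "abstract": 1.0}


def weighted_term_scores(sections, weights=SECTION_WEIGHTS):
    """Sum each term's occurrences, scaled by the weight of its section.

    sections: dict mapping a section name to the list of terms found there.
    Sections without a configured weight contribute nothing.
    """
    scores = Counter()
    for name, terms in sections.items():
        weight = weights.get(name, 0.0)
        for term in terms:
            scores[term] += weight
    return dict(scores)


# Hypothetical publication, already split into extracted terms.
paper = {
    "title": ["graphene", "synthesis"],
    "keywords": ["graphene", "cvd"],
    "abstract": ["graphene", "synthesis", "substrate", "graphene"],
}

scores = weighted_term_scores(paper)
for term, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(term, score)
```

In this sketch a term that appears once in the title counts as much as three occurrences in the abstract, which reflects the idea that title occurrences are more representative of the topic.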
A knowledge-based document assembly method to support semantic interoperability of enterprise information systems
Published in Enterprise Information Systems, 2022
Marko Marković, Stevan Gostojić
Along with simplifying document creation, facilitating its machine readability is important to enable the automation of additional steps during document processing. Izza (2009) distinguished three areas for the potential automation of enterprise operations, including internal operations (management and optimisation of resources), inbound logistics (supply chain management of goods and services), and outward logistics (delivery of goods and services to customers). Document flow is present in each of these areas, so the incorporation of automation in all three operational efforts would benefit from the availability of machine-readable documents.