XPath – Knowledge and References

Published in Uwe Engel, Anabel Quan-Haase, Sunny Xun Liu, Lars Lyberg, Handbook of Computational Social Science, Volume 2, 2021

Stefan Bosse, Lena Dahlhaus, Uwe Engel

XPath is a language for matching paths and patterns in tree-structured documents, that is, XML or HTML documents. The XPath patterns address structure as well as data values. An XPath query uses a search pattern to return a list of matching nodes. An XPath is just a string descriptor composed of expressions, shown in Table 4.3.
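To make the idea of a query returning a list of matching nodes concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which implements only a subset of XPath. The sample document and its element names (`catalog`, `item`, `name`, `price`) are invented for illustration and do not come from the chapter.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """<catalog>
  <item id="a1"><name>Lamp</name><price>19.50</price></item>
  <item id="b2"><name>Desk</name><price>120.00</price></item>
  <note>Prices exclude tax</note>
</catalog>"""

root = ET.fromstring(DOC)

# Pattern on structure: every <name> that is a child of an <item>.
names = [n.text for n in root.findall("./item/name")]

# Pattern on a data value: <item> elements whose <name> child has the text "Desk".
desk_items = root.findall("./item[name='Desk']")
desk_ids = [item.get("id") for item in desk_items]

print(names)     # ['Lamp', 'Desk']
print(desk_ids)  # ['b2']
```

Both calls return plain Python lists of matching elements, so an empty list simply means that nothing matched the pattern.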

Opinion Classification from Blogs

Published in Wahiba Ben Abdessalem Karaa, Nilanjan Dey, Mining Multimedia Documents, 2017

Eya Ben Ahmed, Wahiba Ben Abdessalem Karaa, Ines Chouat

XPath is a language used to locate a portion of an XML document. We use it to extract the required data from the resulting XML document. In our case, we employ the Firefox extension called “Firebug,” which determines the XPath of a website element.

XML

Published in David Austerberry, Digital Asset Management, 2012

David Austerberry

XLink can be used for hyperlinking, but it is not limited to that function. Related to XLink are XPath and the XPointer framework. XPath is a language used for addressing the internal elements of an XML document by navigating its tree structure.

Personalized file data query matching method based on SOA

Published in International Journal of Computers and Applications, 2021

Siyang Liu, Cheng Zhang

XPath and XQuery are the main query languages for XML documents. XQuery is built on top of XPath expressions, and both are based on the twig pattern query. XPath uses axes to define a set of nodes relative to the current node, including the child axis ‘/’ (selects all the child elements of the current node) and the descendant axis ‘//’ (selects all descendant elements of the current node), together with the wildcard ‘*’ (matches any element node).
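The difference between the child axis, the descendant axis, and the wildcard can be sketched with Python's standard-library `xml.etree.ElementTree`, which supports a subset of XPath that is enough for this purpose. The tiny document below is a made-up example, and paths are written relative to the root element (`.`) because that is how this library evaluates them.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <c> elements sit at three different depths.
DOC = "<a><b><c>1</c></b><c>2</c><d><e><c>3</c></e></d></a>"
root = ET.fromstring(DOC)

# Child axis '/': only <c> elements that are direct children of the root.
child_c = [e.text for e in root.findall("./c")]

# Descendant axis '//': <c> elements at any depth below the root.
descendant_c = [e.text for e in root.findall(".//c")]

# Wildcard '*': any element node that is a direct child of the root.
any_child = [e.tag for e in root.findall("./*")]

# Wildcard combined with the child axis: <c> exactly two levels down.
grandchild_c = [e.text for e in root.findall("./*/c")]

print(child_c)       # ['2']
print(descendant_c)  # ['1', '2', '3']
print(any_child)     # ['b', 'c', 'd']
print(grandchild_c)  # ['1']
```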

Robust Web Data Extraction Based on Weighted Path-layer Similarity

Published in Journal of Computer Information Systems, 2022

Peng Gao, Hao Han

A sample page of a shopping site and one of its page variants are shown together in the left part of figure 1. Their HTML trees are illustrated in the right part of the figure. For the sake of simplicity, the variant only inserts a new color option for the product. Correspondingly, a red dotted branch carrying an image node, marked as inserted, has been added to the HTML tree, and the other nodes are unchanged. The XPath (XML Path Language) expression [25], often simply referred to as the XPath, can be used to navigate through elements and attributes in an HTML tree and locate the target. We define the following format to describe an XPath:

N1[O1][@A1-0 = "V1-0"][…][@A1-m1 = "V1-m1"]/ … /Nn[On][@An-0 = "Vn-0"][…][@An-mn = "Vn-mn"]/Nn+1 …

where Nn is the HTML node name of the n-th node; [On] is the order of the n-th node among its siblings named Nn; An-mn is the name of the m-th attribute of the n-th node; Vn-mn is the corresponding value of attribute An-mn; and Nn is the parent node of Nn+1. For instance, suppose that the target data is the price value of the phone product, i.e., “$451.75”, the text node labeled ⑥ in figure 1. On the original page (before the insertion), the following XPath expressions can be used to extract the target:

Pa = /html/body/div[2][@id = "container"]/div[2][@id = "price"]/table[@id = "main"][@class = "product"]/tr[1]/td[2][@class = "price"]/text()
Pb = /html/body/div[2]/div[2]/table/tr[1]/td[2]/text()
Pc = //div[@id = "price"]/*/tr[1]/td[2]/text()
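As a rough illustration of how Pa, Pb, and Pc behave when a page changes, the sketch below evaluates adapted versions of the three expressions on a small page and on a variant with one inserted node. Everything about the markup is an assumption: figure 1 is not reproduced here, so the page structure, the `id` values other than those appearing in the expressions, and the position of the inserted color option are hypothetical. The paths are also adapted to Python's standard-library `xml.etree.ElementTree`, which evaluates paths relative to the root element and has no `text()` step, so the leading `/html` becomes `.` and the text is read from the matched `td` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical original page; not the page shown in figure 1.
ORIGINAL = """<html><body>
<div id="header"></div>
<div id="container">
  <div id="colors"><img src="black.png"/></div>
  <div id="price">
    <table id="main" class="product">
      <tr><td>Price</td><td class="price">$451.75</td></tr>
    </table>
  </div>
</div>
</body></html>"""

# Hypothetical variant: one new color option inserted before the price block.
VARIANT = ORIGINAL.replace(
    '<div id="price">',
    '<div id="colors-new"><img src="red.png"/></div>\n  <div id="price">',
)

# Adapted forms of Pa, Pb and Pc (see the assumptions stated above).
PATHS = {
    "Pa": "./body/div[2][@id='container']/div[2][@id='price']"
          "/table[@id='main'][@class='product']/tr[1]/td[2][@class='price']",
    "Pb": "./body/div[2]/div[2]/table/tr[1]/td[2]",
    "Pc": ".//div[@id='price']/*/tr[1]/td[2]",
}


def extract(page: str, path: str) -> list:
    """Return the text of every element that the path matches on the page."""
    root = ET.fromstring(page)
    return [td.text for td in root.findall(path)]


# For each expression: (matches on the original page, matches on the variant).
results = {
    name: (extract(ORIGINAL, path), extract(VARIANT, path))
    for name, path in PATHS.items()
}

for name, (before, after) in results.items():
    print(name, before, after)
```

On this made-up pair of pages, all three expressions find the price on the original page, but only the adapted Pc still finds it on the variant, because the inserted sibling shifts the positional index `div[2]` that Pa and Pb rely on. A different insertion point would give a different outcome.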