Reading in data locally and from the web
Published in Tiffany Timbers, Trevor Campbell, Melissa Lee, Data Science, 2022
Tiffany Timbers, Trevor Campbell, Melissa Lee
Another interesting thought: websites themselves are data! When you type a URL into your browser window, your browser asks the web server (another computer on the internet whose job it is to respond to requests for the website) to give it the website's data, and then your browser translates that data into something you can see. If the website shows you some information that you’re interested in, you could create a data set for yourself by copying and pasting that information into a file. This process of taking information directly from what a website displays is called web scraping (or sometimes screen scraping). Of course, copying and pasting information manually is a painstaking and error-prone process, especially when there is a lot of information to gather. So instead of asking your browser to translate the information that the web server provides into something you can see, you can collect that data programmatically, in the form of Hypertext Markup Language (HTML) and Cascading Style Sheets (CSS) code, and process it to extract useful information. HTML provides the basic structure of a site and tells the browser what content to display (e.g., titles, paragraphs, bullet lists), whereas CSS styles the content and tells the browser how the HTML elements should be presented (e.g., colors, layouts, fonts).
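As a rough illustration of collecting a page's data programmatically rather than by copy and paste, the sketch below uses only Python's standard library to turn an HTML table into a list of rows. The class and function names (TableScraper, scrape_table, fetch_html) and the sample page are invented for this example and do not come from the excerpt; a real project would more likely rely on a dedicated scraping library.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # row currently being filled, or None
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html):
    """Return the table cells found in an HTML string as a list of rows."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_html(url):
    """Download a page's HTML. Not called below because it needs network access."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Made-up page content standing in for what a web server would send back.
SAMPLE_PAGE = """
<html><body>
<table>
  <tr><th>name</th><th>count</th></tr>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
</body></html>
"""

rows = scrape_table(SAMPLE_PAGE)
print(rows)  # [['name', 'count'], ['alpha', '1'], ['beta', '2']]
```

To scrape a live page, the string returned by fetch_html(url) could be passed to scrape_table in place of SAMPLE_PAGE, subject to the site's terms of use.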
HTML and Scripts
Published in Tom Hutchison, Paul Allen, Web Marketing for the Music Business, 2013
Hypertext Markup Language (HTML) is the predominant authoring language for the creation of web pages. HTML defines the structure and layout of a web document by using a variety of tags and attributes to mark certain text as headings, paragraphs, and lists. HTML is written in the form of tags enclosed in angle brackets (the less-than and greater-than symbols), such as <tag>. Most tags come in pairs: the opening tag is written as <tag> and the closing tag as </tag>, which marks where the preceding instruction ends. For example, if you wanted to italicize a word in a sentence, you would precede the word with the tag for italics, <i>, and follow the word with the end tag, </i>. Failure to include the end tag would result in everything from that point forward being presented in italics. A web visitor’s browser examines the HTML for instructions on how to display the graphics, text, and other multimedia components. Tutorials can be found at www.w3schools.com.
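The pairing of opening and closing tags can be checked mechanically. The following is a minimal sketch, using Python's standard html.parser module, that reports tags that were opened but never closed, such as a missing </i>. The names UnclosedTagFinder and unclosed_tags are made up for this example, and the check is deliberately simplified: real HTML allows some end tags (for instance </p> and </li>) to be omitted, which this sketch would still report.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not tracked.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class UnclosedTagFinder(HTMLParser):
    """Keep a stack of opening tags that have not yet been closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the most recent matching opening tag, if there is one.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i] == tag:
                del self.open_tags[i]
                break


def unclosed_tags(html):
    """Return the tags in an HTML string that were opened but never closed."""
    finder = UnclosedTagFinder()
    finder.feed(html)
    finder.close()
    return finder.open_tags


print(unclosed_tags("<p>This is <i>very</i> important.</p>"))  # []
print(unclosed_tags("<p>This is <i>very important.</p>"))      # ['i']
```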
Speaking Naturally: Text and Natural Language Processing
Published in Jesús Rogel-Salazar, Advanced Data Science and Analytics with Python, 2020
Given the semi-structured nature of data on the web, we need to determine what information is worth scraping and whether multiple pages must be parsed. Typically, we will need to parse HTML (Hypertext Markup Language), the standard markup language for creating webpages and web applications. The elements that describe a page are defined by tags written in angle brackets, such as <body>…</body>. In this case we have a body tag, and the text between <body> and </body> corresponds to the visible content of the page.
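To make the role of the body tag concrete, here is a small sketch that keeps only the text found between <body> and </body>, skipping script and style content, which a browser does not display as text. It relies on Python's built-in html.parser; the class name BodyTextExtractor, the function body_text, and the sample page are assumptions made for illustration and are not taken from the book.

```python
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Gather the text that appears inside <body>, ignoring scripts and styles."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._in_body = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._in_body and self._skip_depth == 0 and text:
            self.chunks.append(text)


def body_text(html):
    """Return the visible text of the page body as one space-separated string."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Invented sample page: the title and the script should not appear in the result.
SAMPLE_PAGE = """
<html>
<head><title>Sample page</title></head>
<body>
<h1>Hello</h1>
<p>HTML stands for <b>Hypertext</b> Markup Language.</p>
<script>var hidden = 1;</script>
</body>
</html>
"""

print(body_text(SAMPLE_PAGE))  # Hello HTML stands for Hypertext Markup Language.
```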
EZRVS: An AI-Based Web Application to Significantly Enhance Seismic Rapid Visual Screening of Buildings
Published in Journal of Earthquake Engineering, 2023
EZRVS is a web application used for rapid assessment of structures using both the FEMA-154 method and the neural-network-based method introduced in the previous section. Given that the FEMA-154 building assessment is conducted at the building site, there is a need for a platform that can be used on smartphones and is compatible with various operating systems. On the other hand, the hardware limitations of smartphones increase the analysis time. Thus, unlike some earlier applications that were developed for a single platform (such as Windows or Android), EZRVS is designed to be responsive and web-based so that it can be used on any device, such as smartphones, laptops, PCs, and tablets. In addition, all the calculations are conducted on the server to prevent hardware limitations from affecting the application. The application uses a Model-View-Template (MVT) architecture, a software design pattern for web application development. The “Model” acts as the data interface and stores the data; it is the logical data structure behind the entire application and is represented by a database (MySQL in this case). The “View” is the user interface, which includes everything the user sees in their browser when a page is rendered, and all processing is done in the view. The “Template” is written in HTML, CSS, and JavaScript. A template consists of the fixed parts of the desired HTML output together with special syntax describing how the dynamic content is inserted. Figure 5 demonstrates the respective process.
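The idea of a template, fixed HTML plus placeholders for dynamic content, can be sketched with Python's standard string.Template class. This is only a stand-in: the excerpt does not say which template engine EZRVS uses, and the placeholder names (building, score), the page layout, and the function render_page are hypothetical.

```python
from html import escape
from string import Template

# Fixed HTML output with two placeholders for dynamic content.
# The field names "building" and "score" are hypothetical.
PAGE_TEMPLATE = Template(
    "<html><body>"
    "<h1>Assessment for $building</h1>"
    "<p>Score: $score</p>"
    "</body></html>"
)


def render_page(building, score):
    """Fill the template, escaping values so they cannot inject HTML."""
    return PAGE_TEMPLATE.substitute(
        building=escape(building),
        score=escape(str(score)),
    )


print(render_page("Block <A>", 2.5))
# <html><body><h1>Assessment for Block &lt;A&gt;</h1><p>Score: 2.5</p></body></html>
```

In an MVT application, a function like render_page would sit in the view layer and receive its values from the model rather than from hard-coded arguments.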
Design and implementation of a VoIP PBX integrated Vietnamese virtual assistant: a case study
Published in Journal of Information and Telecommunication, 2023
Hai Son Hoang, Anh Khoa Tran, Thanh Phong Doan, Huu Khoa Tran, Ngoc Minh Duc Dang, Hoang Nam Nguyen
The front-end languages used to create graphical user interfaces (GUI) in this article are Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript (JS). The Bootstrap framework and the jQuery library are also used due to their customizability, speed of development, and ease of use. For back-end development, the Python and Perl languages are used. Asterisk's Monitor library is used to record voice commands and encode them into a .wav file before any analysis tasks are performed. Rasa chatbot's modules were chosen because they are free and support the Python language. Additionally, contacting Rasa support is easier and faster than consulting a forum of many members worldwide. The tool also has the advantage of supporting local languages, including Vietnamese, unlike similar products provided by AWS or Microsoft.
Managing the Intervention Costs of Musculoskeletal Disorders in the Hospital Workplace
Published in IISE Transactions on Occupational Ergonomics and Human Factors, 2021
Kyriakos Koklonis, Michail Sarafidis, Maria Vastardi, Stamatis Philippakis, Dimitrios Koutsouris
These indicators and their values are thoroughly explained in the manuals of the German Federal Institute for Occupational Safety and Health (Länderausschuss für Arbeitsschutz und Sicherheitstechnik (LASI), 2001, 2002). When multiple work activities are combined, the average risk value is calculated. All the basic indicators, along with the time indicator, should be estimated in every assessment for every work activity. There are four different scales according to the final risk value, as summarized in Table 1 (Koklonis et al., 2019; Schmitter et al., 2010; Steinberg, 2012). An electronic application of the KIM was implemented using HTML, CSS, PHP, and JavaScript, with Apache HTTP Server as the web server and MySQL Server as the database management system. This application has been presented in detail in a previous study (Koklonis et al., 2019).