Explore chapters and articles related to this topic
R and the Tidyverse
Published in Tiffany Timbers, Trevor Campbell, Melissa Lee, Data Science, 2022
Tiffany Timbers, Trevor Campbell, Melissa Lee
Below you’ll see the code used to load the data into R using the read_csv function. Note that the read_csv function is not included in the base installation of R, meaning that it is not one of the primary functions ready to use when you install R. Therefore, you need to load it from somewhere else before you can use it. The place from which we will load it is called an R package. An R package is a collection of functions that, once loaded, can be used in addition to the built-in R functions. The read_csv function, in particular, can be made accessible by loading the tidyverse R package [Wickham, 2021b; Wickham et al., 2019] using the library function. The tidyverse package contains many functions that we will use throughout this book to load, clean, wrangle, and visualize data.
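The excerpt refers to code that is not reproduced here. As a minimal sketch of the pattern it describes, assuming the tidyverse package is installed, the following attaches it with library and then calls read_csv. The file and its contents are hypothetical and are created on the fly only so that the example is self-contained.

```r
library(tidyverse)  # attaches readr (which provides read_csv) along with dplyr, ggplot2, and others

# Hypothetical data file, written to a temporary location for illustration only
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,value", "a,10", "b,25"), csv_path)

# read_csv is available only after the package has been attached
data <- read_csv(csv_path)
print(data)
```

Calling read_csv before library(tidyverse) would fail with a "could not find function" error, which is the situation the excerpt warns about.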
Attribute data operations
Published in Robin Lovelace, Jakub Nowosad, Jannes Muenchow, Geocomputation with R, 2019
Robin Lovelace, Jakub Nowosad, Jannes Muenchow
Descriptive raster statistics belong to the so-called global raster operations. These and other typical raster processing operations are part of the map algebra scheme, which is covered in the next chapter (Section 4.3.2). Some function names clash between packages (e.g., select(), as discussed in a previous note). In addition to not loading packages and instead referring to functions verbosely (e.g., dplyr::select()), another way to prevent function name clashes is to unload the offending package with detach(). The following command, for example, unloads the raster package (this can also be done in the package tab, which by default resides in the bottom-right pane in RStudio): detach("package:raster", unload = TRUE, force = TRUE). The force argument makes sure that the package will be detached even if other packages depend on it. This, however, may restrict the usability of packages that depend on the detached package, and is therefore not recommended.
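The two approaches described above can be sketched as follows. This is an illustration only: the splines package, which ships with R, stands in for raster so that the example runs without extra installations, and force = TRUE is left out because the excerpt advises against it.

```r
# splines stands in for any attached package (an assumption for illustration)
library(splines)
stopifnot("package:splines" %in% search())

# Detach the package from the search path; unload = TRUE also tries to unload its namespace
detach("package:splines", unload = TRUE)
attached_after <- "package:splines" %in% search()

# A qualified call (package::function) works without attaching the package,
# so it cannot mask a same-named function from another package
basis <- splines::bs(1:10, df = 4)
```

After detach(), attached_after is FALSE, yet splines::bs() is still reachable through its qualified name.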
Introduction to R
Published in Jan Žižka, František Dařena, Arnošt Svoboda, Text Mining with Machine Learning, 2019
Jan Žižka, František Dařena, Arnošt Svoboda
To load a package in R, we call the library() function with the name of the package as its argument. We may also use the function require(). It is a similar function which accepts a slightly different set of arguments; it also returns FALSE with a warning, rather than raising an error, when the package cannot be loaded.
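A small sketch of how the two functions behave when a package is missing may help here. The package name below is hypothetical and is assumed not to be installed.

```r
pkg <- "notARealPackage123"  # hypothetical name, assumed not to be installed

# require() returns FALSE (and issues a warning) when the package is unavailable
loaded <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))

# library() raises an error in the same situation; tryCatch() captures it here
result <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "attached"
  },
  error = function(e) "error"
)
```

Because require() reports failure through its return value, it is sometimes used inside if () conditions, whereas library() stops a script as soon as a package is missing.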
Jumping and throwing performance in the World Masters’ Athletic Championships 1975-2016
Published in Research in Sports Medicine, 2019
Alexandra M. L. Kundert, Stefania Di Gangi, Pantelis T. Nikolaidis, Beat Knechtle
For data visualization, the ggplot2 package was used. An independent t-test was performed to compare the average performance between genders for each discipline (i.e. event). Two-way analysis of variance (ANOVA) was used to compare effects of sex and time (i.e. calendar year, as factor variable) and effects of sex and age on performance for each event. Then effects (i.e. sex, time, age-group) and interactions (i.e. time and sex, age-group and sex) were considered more rigorously through a linear regression model for each event separately. Since there were repeated measurements within athletes, a mixed model was fitted, with random effects on the intercept for each athlete. The R function lmer was used. Different regression model specifications were considered, with none, one, and two interaction terms and with different hypotheses about the time trend, linear and non-linear up to fifth order. Model selection was performed using both the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). The selected model, for each event except discus and javelin throw, was specified as follows:
Skill Requirements in Big Data: A Content Analysis of Job Advertisements
Published in Journal of Computer Information Systems, 2018
Adrian Gardiner, Cheryl Aasheim, Paige Rutner, Susan Williams
Before the content analysis phase, the job advertisements within the corpus were processed by removing all extraneous materials (e.g., HTML tags, company logos), leaving the core of the job description text. The R package tm was then used to eliminate typical stop words, stem terms, eliminate white space, and convert all text to lower case. This is a typical pre-processing step when dealing with automated and computer-aided text analysis. The R package tm was then used to identify n-grams (1–8 word phrases), using the NGramTokenizer procedure from the Weka machine learning library. Stemming, stop word removal, and term tokenization are standard text processing practices so there are a number of packages that can be used to reach this point in pre-processing.
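To make these pre-processing steps concrete, here is a rough base-R sketch of lower-casing, white space handling, stop word removal, and n-gram extraction. It deliberately does not use tm or the Weka NGramTokenizer, it omits stemming, and the example sentence, the stop word list, and the ngrams helper are all hypothetical.

```r
# Hypothetical job-advertisement text and stop word list (not taken from the study)
text <- "The  analyst is building the data pipelines"
stop_words <- c("the", "is", "a", "of")

# Lower-case the text, split on runs of white space, and drop stop words
tokens <- strsplit(tolower(text), "\\s+")[[1]]
tokens <- tokens[nzchar(tokens) & !(tokens %in% stop_words)]

# Hypothetical helper: build all n-word phrases from consecutive tokens
ngrams <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

bigrams <- ngrams(tokens, 2)
print(bigrams)
```

Calling the helper for n from 1 to 8 would produce the range of phrase lengths mentioned in the excerpt.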
A Study on the Estimation of Optimal Traffic Distribution near Breakwater in Busan Port
Published in Journal of International Maritime Safety, Environmental Affairs, and Shipping, 2020
Woo-Ju Son, Hyeong-Tak Lee, Ik-Soon Cho
The analysis of the optimal probability distribution model showed that the traffic distribution in the approach area near the breakwater of Busan Port is lognormal for arrivals and gamma for departures. In this study, the “fitDistr” function of the R package propagate was used to increase the reliability of these results. This method derives the optimal distribution based on BIC by comparing the observed values with 32 candidate probability distributions, using a histogram of the Monte Carlo simulation results. Table 9 lists the 32 probability distributions compared through Monte Carlo simulation.
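The idea of ranking candidate distributions by BIC can be illustrated with a small sketch. Note the assumptions: it uses maximum-likelihood fits from MASS::fitdistr rather than propagate's fitDistr, it compares only two candidates instead of 32, and the data are simulated rather than taken from the study.

```r
set.seed(1)
x <- rlnorm(200, meanlog = 1, sdlog = 0.5)  # simulated, hypothetical observations

# Fit two candidate distributions by maximum likelihood (MASS ships with R)
fit_lnorm <- MASS::fitdistr(x, "lognormal")
fit_gamma <- suppressWarnings(MASS::fitdistr(x, "gamma"))

# Lower BIC indicates the preferred candidate
bic <- c(lognormal = BIC(fit_lnorm), gamma = BIC(fit_gamma))
best <- names(which.min(bic))
print(bic)
print(best)
```

The same comparison extends to more candidates by adding further fits to the bic vector.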