R

Whenever I run an R workshop, the first thing I need to do is give my participants some motivation for learning R.

Below are some of the cool things you can do with R, ranging from data analysis, statistical modeling, and visualization to making slides, dashboards, blog posts, and interactive tutorials.

Statistical Modeling

First things first: most people get to know R and start using it, particularly for research purposes, because they have to do some statistical modeling and there is an R package that can do the job. In fact, statistical modeling is one of R’s greatest strengths. For simple statistical models, SPSS, a program that requires no coding, can do as well as R, and many people still rely on it. However, SPSS is proprietary and owned by a company, while R is open-source. This means that R users all around the world, rather than a group of people in one company, can enrich its functionality by contributing packages. This advantage is not obvious for simple models such as t-tests or ANOVA. That said, the website Summary and Analysis of Extension Program Evaluation in R is my go-to place when I need to see how to do t-tests (of all kinds), ANOVA, or regression. For more complicated and sophisticated models, you can simply google “model name” with R.
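To give a flavour of the simple models mentioned above, here is a minimal sketch that uses only base R and two datasets that ship with R (`sleep` and `mtcars`). The choice of datasets and variables is mine, purely for illustration; it is not taken from the website recommended above.

```r
# Built-in dataset: extra hours of sleep for 10 patients under two drugs.
data(sleep)

# Paired t-test: the same 10 patients received both drugs.
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]
t_result <- t.test(drug1, drug2, paired = TRUE)
print(t_result)

# One-way ANOVA: does fuel economy differ by number of cylinders?
data(mtcars)
aov_fit <- aov(mpg ~ factor(cyl), data = mtcars)
print(summary(aov_fit))

# Simple linear regression: fuel economy as a function of car weight.
lm_fit <- lm(mpg ~ wt, data = mtcars)
print(coef(lm_fit))
```

Each call returns an object (`t_result`, `aov_fit`, `lm_fit`) that can be inspected further, for example with `summary()`.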

Data wrangling

However, the data you get may not be ready to use as model input, and you will need to pre-process them or, in more technical terms, do data wrangling. Much of what R offers here can also be done in Excel, but Excel cannot record the workflow in a natural way and has problems with very large datasets. In R, there are many packages (maybe too many!) that can do almost everything in terms of data pre-processing. A good starting point and reference is R for Data Science by Garrett Grolemund and Hadley Wickham. I also have a tutorial on data wrangling, which I use to run a three-hour workshop. Other books: Data Science Live Book
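As a rough illustration of what a recorded wrangling workflow looks like, the sketch below cleans and summarises a tiny data frame using only base R, so it runs without installing anything. The data values and column names (`speaker`, `vowel`, `duration`) are made up for this example; the packages covered in R for Data Science offer more readable ways to write the same steps.

```r
# A small, made-up dataset (hypothetical values, for illustration only).
raw <- data.frame(
  speaker  = c("s1", "s1", "s2", "s2", "s3", "s3"),
  vowel    = c("a", "i", "a", "i", "a", "i"),
  duration = c(120, 95, NA, 88, 134, 101),  # milliseconds; NA = missing
  stringsAsFactors = FALSE
)

# Step 1: drop rows with a missing measurement.
clean <- raw[!is.na(raw$duration), ]

# Step 2: derive a new column (duration in seconds).
clean$duration_s <- clean$duration / 1000

# Step 3: summarise, here the mean duration per vowel.
by_vowel <- aggregate(duration ~ vowel, data = clean, FUN = mean)
print(by_vowel)
```

Because every step is a line of code, the whole workflow can be rerun on a new file, which is the point Excel struggles with.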

Data visualization

Usually, as linguistic researchers or speech scientists, we do data visualization in a static way, for journal publications or reports. R has a very powerful data visualization package called ggplot2.

A good and comprehensive introduction would be ggplot2: Elegant Graphics for Data Analysis.

For a quick reference, this is my go-to place when I do not know how to change small things like the font size of a legend. The author of this site also has a free online e-book, R Graphics Cookbook, 2nd edition.
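For readers who have not seen ggplot2 before, here is a minimal sketch of a static scatter plot, including the kind of small tweak mentioned above (changing the legend font size). It assumes the ggplot2 package is installed; the built-in `mtcars` dataset and the font sizes are arbitrary choices for illustration.

```r
library(ggplot2)

# Scatter plot of fuel economy against weight, coloured by cylinder count.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders") +
  theme(
    legend.title = element_text(size = 14),  # font size of the legend title
    legend.text  = element_text(size = 12)   # font size of the legend labels
  )

# Draw the plot on the current graphics device.
print(p)
```

The plot is built up layer by layer with `+`, so small changes usually mean adding or editing one line.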

For people who are interested in map visualization, Geocomputation with R, a book on geographic data analysis, visualization, and modeling, provides a good introduction.

You can also use R for network visualization, and here are some very comprehensive tutorials:

And this is an academic paper about network visualization issues.

It is also possible to draw diagrams in R, using a package called DiagrammeR.

Text analysis/mining

Text mining is the process of deriving high-quality information from text. Text mining usually involves the process of structuring the input text (usually parsing, along with the addition of some derived linguistic features and the removal of others, and subsequent insertion into a database), deriving patterns within the structured data, and finally evaluation and interpretation of the output. ‘High quality’ in text mining usually refers to some combination of relevance, novelty, and interest.

Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling (i.e., learning relations between named entities). (From Wikipedia)

Text analysis involves information retrieval, lexical analysis to study word frequency distributions, pattern recognition, tagging/annotation, information extraction, data mining techniques including link and association analysis, visualization, and predictive analytics. The overarching goal is, essentially, to turn text into data for analysis, via application of natural language processing (NLP), different types of algorithms and analytical methods. An important phase of this process is the interpretation of the gathered information.

Books: Text mining with R
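The idea of turning text into data can be sketched with a simple word frequency count. The example below uses only base R and two made-up sentences; real projects would typically rely on dedicated packages such as those covered in the book above, and the very simple tokenisation rule here (split on anything that is not a letter a to z) is an assumption that only suits plain English text.

```r
# Two made-up sentences standing in for a corpus.
text <- c(
  "R is a language for statistical computing.",
  "Text mining turns text into data, and R can analyse that data."
)

# Lower-case everything, then split on any run of non-letter characters.
words <- unlist(strsplit(tolower(text), "[^a-z]+"))
words <- words[words != ""]  # drop empty strings, if any

# Count how often each word occurs and sort from most to least frequent.
freq <- sort(table(words), decreasing = TRUE)
print(head(freq, 5))
```

The resulting table is ordinary R data, so it can be fed into the modeling and visualization tools described earlier.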

Web scraping

We are living in a world of information, and the Internet has become a giant hub of it. Getting information from the Internet can be useful for language researchers who are interested in social variation in language.

Here are some packages and tutorials. rvest: easy web scraping with R

Tutorial 1

Tutorial 2
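To show what scraping with rvest looks like, here is a minimal sketch that parses a small inline HTML snippet instead of a live website, so it runs without a network connection. It assumes rvest (version 1.0 or later) is installed; the HTML and the example.org URLs are invented for illustration. For a real page you would start from `read_html()` with the page's address instead of `minimal_html()`.

```r
library(rvest)

# A tiny, made-up page; minimal_html() wraps the snippet in a full document.
page <- minimal_html('
  <h1>Workshop links</h1>
  <ul>
    <li><a href="https://example.org/a">First page</a></li>
    <li><a href="https://example.org/b">Second page</a></li>
  </ul>
')

# Select every link inside a list item with a CSS selector.
links <- html_elements(page, "li a")

# Extract the visible text and the href attribute into a data frame.
link_table <- data.frame(
  title = html_text2(links),
  url   = html_attr(links, "href")
)
print(link_table)
```

Before scraping a real site, check its terms of use and whether it offers an API.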

Social media analysis

As more and more people are deeply immersed in social media, analyzing their verbal behavior could be revealing.

Book: Learning Social Media Analytics with R

Tutorial 1

Speech analysis

wrassp is a wrapper for R around Michel Scheffers’s libassp (Advanced Speech Signal Processor). The libassp library aims at providing functionality for handling speech signal files in most common audio formats and for performing analyses common in phonetic science/speech science. This includes the calculation of formants, fundamental frequency, root mean square, auto correlation, a variety of spectral analyses, zero crossing rate, filtering etc. This wrapper provides R with a large subset of libassp’s signal processing functions and provides them to the user in a (hopefully) user-friendly manner. Tutorial
soundgen builds upon the functionality of seewave, adding high-level functions for sound synthesis (see the vignette on sound synthesis), manipulation, and analysis. Tutorial
EMU-R: The EMU Speech Database Management System (EMU-SDMS) is a collection of software tools which aims to be as close to an all-in-one solution for generating, manipulating, querying, analyzing and managing speech databases as possible. Manual
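Two of the measures mentioned above, root mean square and zero crossing rate, are simple enough to sketch in base R. The example below is a toy illustration on a synthetic sine wave, not how wrassp, soundgen, or EMU-R compute them; the sample rate, frequency, and phase offset are arbitrary assumptions.

```r
sample_rate <- 8000  # samples per second (assumed)
duration    <- 1     # seconds
f0          <- 100   # frequency of the synthetic tone in Hz

# Time points and a pure sine wave; the small phase offset avoids
# samples that fall exactly on zero.
t <- (0:(sample_rate * duration - 1)) / sample_rate
signal <- sin(2 * pi * f0 * t + 0.1)

# Root mean square amplitude of the signal.
rms <- sqrt(mean(signal^2))

# Zero crossings: count the sign changes between neighbouring samples.
crossings <- sum(diff(sign(signal)) != 0)

# A pure tone crosses zero twice per cycle, which gives a crude
# frequency estimate.
f0_estimate <- crossings / (2 * duration)

print(c(rms = rms, crossings = crossings, f0_estimate = f0_estimate))
```

Real speech is far messier than a pure tone, which is why the dedicated packages above use more robust methods.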

Writing reports, blogs, slides or books with code

When you want to share your work in R, it is desirable to embed your code alongside your analysis. This is not easy with Word or other file types. Thanks to the rmarkdown package, you can now:

Compile a single R Markdown document to a report in different formats, such as PDF, HTML, or Word.
Create notebooks in which you can directly run code chunks interactively. Tutorial: R Markdown: The Definitive Guide Tufte DataTables
Generate websites and blogs. Tutorial: blogdown: Creating Websites with R Markdown
Author books of multiple chapters. Tutorial: bookdown: Authoring Books and Technical Documents with R Markdown
Make slides for presentations. Tutorials: xaringan (creating remark.js slides through R Markdown); reveal.js Presentations; reveal.js
Create interactive tutorials for R. Tutorial
Produce dashboards with flexible, interactive, and attractive layouts. Short tutorial Full tutorial
Build interactive applications based on Shiny. Tutorial Example More examples
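To make the first item in the list concrete, here is a minimal sketch that writes a tiny R Markdown document to a temporary file and, if possible, compiles it to HTML. The file name and document contents are made up for illustration. Rendering is only attempted when the rmarkdown package and pandoc are both available, so the script still runs on a machine without them.

```r
# Three backticks, built in code so they do not clash with this code block.
fence <- strrep("`", 3)

# The lines of a minimal R Markdown document: a YAML header, one sentence
# of prose, and one R code chunk.
rmd_lines <- c(
  "---",
  'title: "A minimal report"',
  "output: html_document",
  "---",
  "",
  "The mean fuel economy in the mtcars data is computed below.",
  "",
  paste0(fence, "{r}"),
  "mean(mtcars$mpg)",
  fence
)

# Write the document to a temporary file.
rmd_path <- file.path(tempdir(), "minimal-report.Rmd")
writeLines(rmd_lines, rmd_path)

# Compile to HTML only if rmarkdown and pandoc are both available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_path, output_format = "html_document",
                                quiet = TRUE)
  print(out_file)
}
```

In everyday use you would write the .Rmd file in an editor rather than from code; the point is that the prose, the code, and its output live in one document.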

More advanced R programming

Books

Advanced R: The book is designed primarily for R users who want to improve their programming skills and understanding of the language. It should also be useful for programmers coming to R from other languages, as it helps you understand why R works the way it does.

R Packages: Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data. In this book you’ll learn how to turn your code into packages that others can easily download and use. Writing a package can seem overwhelming at first. So start with the basics and improve it over time. It doesn’t matter if your first version isn’t perfect as long as the next version is better.

Hands-On Programming with R

Efficient R Programming

Annotated R learning resources
