This document accompanies the Exploratory Data Analysis with R tutorial at DH Downunder 2019, held at the University of Newcastle, Australia, from 9 to 13 December.

I am a speech scientist working on cross-language lexical tone perception and production. I have extensive experience working with experimental data, and I am keen to help others with data wrangling, data visualization and statistical modelling problems. I aim to promote a streamlined workflow built on R packages that makes quantitative analysis more efficient in the social sciences and linguistics.

If you have any questions about the tutorial, please e-mail me at: j.chen2@westernsydney.edu.au

This workshop shows how to use data transformation and visualization to explore your data in a systematic way, which in statistical terms is called exploratory data analysis. Participants will learn to generate questions about the data, search for answers by transforming, visualizing and modeling the dataset, and use what they learn to refine those questions and/or generate new ones. The workshop starts by exploring variation in a single variable (categorical or continuous) and moves on to covariation among two or three variables. Participants will learn to produce summary tables (for example, the mean or standard deviation of one or more variables, grouped by one or more other variables) and to draw figures with ggplot2. The workshop builds on some knowledge of data wrangling, so participants without that background are encouraged to take the Introduction to data wrangling with R first. Participants are welcome to bring their own data and apply what they learn on the spot.
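To give a flavour of the kind of output described above, here is a minimal sketch of a grouped summary table and two ggplot2 figures. It is not taken from the workshop materials: the built-in iris dataset stands in for your own data, and the use of dplyr for the summary table is an assumption (only ggplot2 is named above).

```r
library(dplyr)    # assumed here for grouped summaries
library(ggplot2)

# iris ships with R; it is a stand-in for workshop data (an assumption)
summary_tbl <- iris %>%
  group_by(Species) %>%
  summarise(
    n = n(),
    mean_sepal_length = mean(Sepal.Length),
    sd_sepal_length = sd(Sepal.Length)
  )
print(summary_tbl)

# Variation in one continuous variable: a histogram
p_hist <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.25)

# Covariation between two continuous variables,
# with a third (categorical) variable mapped to colour
p_scatter <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                              colour = Species)) +
  geom_point()

print(p_hist)
print(p_scatter)
```

The summary table has one row per species, and the two plots correspond to the "one variable" and "two or three variables" stages of the workshop. With your own data you would swap in your data frame and column names.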

Before we start exploring data with R, you will need to install R on your laptop. R is cross-platform, which means you can install it on a PC or a Mac.

Here are the workshop materials.