Text Analysis With R

Over the last few years, social research methodologies have evolved rapidly. Particularly in communication science, the enormous number of documents made available by digital media has motivated researchers to identify and develop new methods for large-scale analysis of text data. Computational methods, which rely on algorithms to extract information from texts, have proved helpful for this purpose. Researchers have increasingly employed them to analyze data from various sources, such as social media, online newspapers, and political speeches. This three-day seminar provides an overview of computational methodologies for text analysis and introduces participants to the essential tools for conducting computational analysis on text data.

Main targets and objectives

This course is intended for communication and social sciences researchers at the Ph.D. and postdoc levels who want an overview of the computational techniques available for analyzing text data. The most important analytical techniques and their R software implementations will be presented. Hands-on tutorials will guide participants in the analysis of real-world data.

Program

The seminar introduces basic concepts and principles of computational text analysis and their implementation in R. It offers an overview of the most common approaches, from exploratory analysis to supervised, semi-supervised, and unsupervised text analysis methods, including dictionary-based analysis, topic modeling, and machine learning classification techniques. It guides participants step by step through the phases of a text analysis project, from data collection and preprocessing to the R code needed to conduct the analysis, and offers the opportunity to apply this methodological knowledge to real-world data during hands-on coding sessions. A detailed program is outlined below.
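To give a flavor of one of the approaches mentioned above, here is a minimal sketch of a dictionary-based analysis in base R. It is only an illustration and not the seminar's actual material: the three example texts, the two-category word list, and the helper name `tokenize` are all invented for this sketch, and real projects typically rely on validated dictionaries and dedicated text analysis packages.

```r
# Toy corpus (invented for illustration only)
texts <- c(
  doc1 = "The new policy is a great success, and citizens are happy.",
  doc2 = "Critics call the reform a terrible failure.",
  doc3 = "The committee met on Tuesday."
)

# Hypothetical mini-dictionary; not a validated research instrument
dictionary <- list(
  positive = c("great", "success", "happy"),
  negative = c("terrible", "failure")
)

# Preprocessing: lowercase, replace punctuation with spaces, split on whitespace
tokenize <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]+", " ", x)
  tokens <- strsplit(trimws(x), "\\s+")[[1]]
  tokens[nzchar(tokens)]
}

tokens <- lapply(texts, tokenize)

# Count dictionary matches per document (rows) and category (columns)
counts <- t(sapply(tokens, function(tok) {
  sapply(dictionary, function(words) sum(tok %in% words))
}))

print(counts)
```

The result is a small document-by-category matrix of match counts. The same sequence of steps (collect texts, preprocess, apply a measurement rule, inspect the output) recurs in the more advanced methods, with the dictionary lookup replaced by a topic model or a trained classifier.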

Prerequisites and requirements

No previous experience with R or automated text analysis techniques is assumed. An introduction to R will be provided at the beginning of the course.

Participants need to bring their own laptops, with R and RStudio installed beforehand. Both programs are free and available for Mac and Windows. Please follow the instructions at https://posit.co/download/rstudio-desktop/ to install R and RStudio (preferably the latest versions).

If you prefer to open a free account on Posit Cloud (formerly RStudio Cloud) instead, follow the instructions at this link: https://posit.cloud/plans/free.

About me

My name is Nicola Righetti, and I am a researcher in Computational Communication Science at the Department of Communication of the University of Vienna (Austria), where I teach Advanced Data Analysis with R and other methodological classes. I am also an associate researcher at the Computational Communication Lab of the same university, a research center whose goal is to facilitate the development and application of computational methods through excellence in empirical research and the advancement of novel theoretical and methodological perspectives.

More about me:

Website: https://www.nicolarighetti.net

Mastodon: https://mas.to/@nicolarighetti

GitHub: https://github.com/nicolarighetti

LinkedIn: https://www.linkedin.com/in/nicolarighetti79/