Social scientists are increasingly using unconventional sources of web data that were not originally intended for scientific purposes but contain valuable information on human behavior, interactions, attitudes, or institutional settings. Applied examples include, among many others, studies of discrimination on Blablacar (Tjaden et al. 2018), political advertising on Wikipedia (Göbel & Munzert 2017), and the impact of published news articles on social media activity on Twitter (King et al. 2017).
This course aims to equip students with the R programming skills needed to gather online data for their own research and to transform it into a format suitable for analysis. It covers different types of online data sources (static web pages, dynamic web pages, APIs), each of which requires a different approach in R. To scrape multiple pages, automation techniques such as for-loops will be covered. The course will also discuss how large language models such as ChatGPT can assist you in writing your code. Research papers applying these methods will be provided as readings and can be discussed if time permits.
Students are required to bring their own laptop with a recent version of RStudio installed. Some basic knowledge of the R programming language is an advantage when taking the course. If you do not have this knowledge and would still like to participate, I recommend working through one of the many online resources beforehand (e.g. https://jaspertjaden.github.io/course-intro2r/).
At the end of the course, there will be a graded assignment in which students gather web data themselves and report the key insights in a seminar paper. Depending on course size and preferences, the assignment may also be completed in small groups.
Basic R programming skills
Access to a personal laptop