

Fundamentals of Data Science - Course Details

Course type: Lecture/Exercise
Course number: 418112
Weekly hours per semester (SWS): 4
Semester: Summer semester (SoSe) 2023
Department: Economics (Wirtschaftswissenschaften)
Language: English
Further links: Moodle
Enrollment period: 03.04.2023 - 10.05.2023

Enrollment via PULS
Group 1:
  • Lecture/Exercise: Tue 10:00 to 14:00, weekly, 18.04.2023 to 25.07.2023, room 3.06.S18, instructors: Abramova, Gladkaya. Cancelled/alternative date: 04.07.2023.
  • Lecture/Exercise: Tue 10:00 to 14:00, single session on 04.07.2023, room 3.06.S17, instructor: Gladkaya.
Comments

The first class will take place on 25.04.2023: introductory session (kick-off, introduction to data science & R).
Location: 3.06.S18 (Haus 6)
Please self-enroll in the Moodle course: https://moodle2.uni-potsdam.de/course/view.php?id=37026, password fods2023

Please expect information updates after 18.04., not earlier.

Please refrain from writing individual emails about topics, ECTS, and organization. These will be announced during the first session.
Instead, invest time in installing R and RMarkdown on your laptops:

  • Download and install R from the Comprehensive R Archive Network (CRAN) https://cran.r-project.org/
  • Download and install RStudio from http://www.rstudio.com/download

or familiarize yourself with an online solution https://posit.cloud/
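
Once R is installed, a quick way to check the setup is to run a few lines in the R console. The following is only a minimal sketch: the helper name `missing_packages` and the package list (`rmarkdown`, `ggplot2`) are assumptions made for illustration, not an official course requirement.

```r
# Return the subset of `pkgs` that is not installed in the current R library.
# `missing_packages` is a hypothetical helper name used only in this sketch.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Print the installed R version to confirm that R itself runs.
print(R.version.string)

# Assumed package list; the course may ask for different packages.
to_install <- missing_packages(c("rmarkdown", "ggplot2"))
print(to_install)

# Uncomment to install whatever is missing from CRAN:
# install.packages(to_install, repos = "https://cloud.r-project.org")
```

If the second printed value is empty (`character(0)`), the assumed packages are already available; otherwise the commented-out line installs the ones that are missing.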

For those who would like to refresh their statistics knowledge, browse the book (uploaded to Moodle):

  • Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python
Literature
  • Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python (uploaded to Moodle)
Prerequisites

Interest in Data Science.

This class is limited to 30 students.

The class will be held in English. Project presentations can be held in German or in English. Exam answers can be written in German or in English.

Assessment

Graded assignments, project presentation & short report, and written exam.
Even if you need only a "pass" grade, the exam is obligatory. Presentations and assignments cannot replace the exam.

The course is worth 6 ECTS. If needed, we can award fewer ECTS.
Erasmus students get a certificate upon successful completion of the course.

Course content

Data is increasingly seen as a driving force behind many industries, ranging from data-driven start-ups to traditional manufacturing companies. Recent years have been marked by the hype around big data technologies and the implications that go along with them. In response to these developments, data science has become one of the most in-demand specializations. Against this background, this class will introduce students to the fundamentals of data science, using R for data analysis.

Purpose of the class: This course is an introduction to data science using the statistical programming language R. Prior R knowledge is not required. We start by introducing the very basic concepts of R programming and work our way through more sophisticated tasks of data representation, manipulation, and analysis. We illustrate every step with easy-to-follow examples.

After taking the course, you should be able to do the following:

  • Program in R for data science, which includes (a) getting help and (b) applying the code contributed by the active community of R developers
  • Get data in and out of R
  • Understand the data by conducting descriptive analysis and visualizing the data
  • Create beautiful graphs and visualizations with the ggplot2 package
  • Use the power of R to build and assess statistical and machine learning models
  • Write reports and blog posts in R Markdown

Audience: Bachelor students who are interested in data science and data analysis. At a broader level, the course serves as good preparation for writing a bachelor thesis or doing an internship in the "data science" field.

Format: Each week, we will cover a new topic and offer materials for practicing new skills and self-studying (HW assignments). Towards the end of the semester, group project work will allow course participants to apply their R-programming and data science skills and share results with fellow students. Each project group is assigned a specific dataset and works on the corresponding task, e.g., predicting customer churn, earthquakes, defaults on a loan or mortgage.

The language of project presentations: German or English. Lectures and Exercises will be held in English.

Syllabus (Tentative)

Tue 19.04 -- Organizational matters & introduction to R

Tue 26.04 -- Objects in R

Tue 27.04 -- Functions & flow of the code & Data Import/Export

Tue 03.05 -- EDA & Visualization I

Tue 10.05 -- EDA & Visualization II

Tue 17.05 -- EDA & Visualization III: ggplot

Tue 24.05 -- Modeling Part I

Tue 31.05 -- Modeling Part II

Tue 07.06 -- Modeling Part III

Tue 14.06 -- Modeling Part IV & Consultation Hours

Tue 21.06 -- Project work (there will be no Q&A and no tutorial this week)

Tue 28.06 -- Project work: Deadline & Presentations-Session

Tue 05.07 -- Academic coordination

Tue 12.07 -- Academic coordination

Tue 19.07 -- Exam preparation

Tue 26.07 -- Exam date 1. The exam questions will be formulated in English. You can answer them in either English or German.

End of September -- Exam date 2.

