Digital Data Analysis (Autumn 2023)
Official course description:
Abstract
The goal of the course is two-fold. On the one hand, you will learn how to clean, manipulate, process and visualize data in Python, with a specific focus on the unstructured data that is typically produced online. On the other hand, you will learn how to formulate hypotheses based on this data that can be used in the evaluation or redesign of a digital product.
Description
Every digital service leaves tons of data that, when analysed, can reveal important insights about the users as well as about the services they are using. Unfortunately, these data are not produced to answer specific questions, as is the case with surveys, but are the byproduct of users' activity. To tap the full potential of users' data we need to understand both how to ask relevant questions and how to answer those questions with the data available.
The course will cover the following areas:
- Digital data types, format and structure
- Formulating hypotheses and statistical testing
- Data visualization and data exploration with Vega-Altair
- Data wrangling with pandas
- Analysis of unstructured text data
- Analysis of network data
The course builds on two previous courses: Brugerundersøgelser og kvantitative metoder (for hypothesis testing) and Introduction to Programming (for the basics of Python).
Please note that the course shifts focus from using R to using Python from Autumn 2023.
Formal prerequisites
This course is a 3rd semester course on the BSc Digital Design and Interactive Technologies. Students are expected to be familiar with quantitative methods and descriptive statistics.
The course builds on two previous courses: Brugerundersøgelser og kvantitative metoder (for hypothesis testing) and Introduction to Programming (for basic knowledge of Python).
Intended learning outcomes
After the course, the student should be able to:
- Analyze and visualize quantitative data produced in a variety of contexts (e.g. social media, online reviews)
- Use the Vega-Altair library to visualize data
- Understand interactions on social media, and other relational data, through network analysis
- Plan and design a data-driven hypothesis-testing analysis in the context of a digital product
Learning activities
14 lectures + 14 exercise sessions with ad-hoc exercises.
During the course the students will:
- Use various Python libraries to analyze data produced in digital contexts (e.g. social media, review data)
- Use the Vega-Altair library to visualize data
- Explore relational data using the NetworkX library
- Understand hypothesis testing in the context of digital user data
Course literature
Selected chapters from:
- Python for Data Analysis, Wes McKinney, O'Reilly
- Designing with Data, Rochelle King, Elizabeth F. Churchill & Caitlin Tan, O'Reilly
Student Activity Budget
Estimated distribution of learning activities for the typical student:
- Preparation for lectures and exercises: 20%
- Lectures: 30%
- Exercises: 30%
- Exam with preparation: 20%
Ordinary exam
Exam type: C: Submission of written work, Internal (7-point scale)
Exam variation:
C1G: Submission of written work for groups
The students will work in small groups (2-3 people) and submit a report analyzing a dataset provided by the teacher. The students will be able to choose between multiple datasets, but they will have to perform a fixed set of data analyses mapped to the intended learning outcomes (ILOs).
Group
- Group size 2-3 people
Reexam
Exam type: C: Submission of written work, Internal (7-point scale)
Exam variation:
C1G: Submission of written work for groups
The students will work in small groups (2-3 people) and submit a report analyzing a dataset provided by the teacher. The students will be able to choose between multiple datasets, but they will have to perform a fixed set of data analyses mapped to the intended learning outcomes (ILOs).
Group
- Group size 2-3 people
Time and date
Ordinary Exam - submission Mon, 8 Jan 2024, 08:00 - 14:00
Reexam - submission Wed, 28 Feb 2024, 08:00 - 14:00