Reflections on Data Science
Abstract
In this course you will learn to reflect on the use and societal implications of data, models and algorithms.
Whether to provide predictions, information, assessments, or evidence of some sort, the use of data comes with important responsibilities, ethical concerns, and social impact, and sometimes generates unintended consequences. How can we check that a claim based on data is plausible? How can we ensure that our data analysis is sound and reproducible? What are the technical and societal consequences of using biased data to train our algorithms? In this course we will explore these and other similar questions using real-world case studies, and we will provide a set of concepts, approaches and tools
- to think critically about the data, models and algorithms that constitute evidence in the social and natural sciences and that provide predictions of any sort, and
- to reflect on the consequences and ethical concerns when using data.
The course covers the following topics:
- Calling BS with data
- Reviewing causality and lying with statistics
- Algorithmic bias
- Traps in the use of big data
- Diffusion processes important for society (fake news, epidemics, performance and success)
For each topic, the course will focus on the societal impact of the studied concepts, and will emphasize how we, data scientists, can ensure and promote data use that is correct, ethical and unbiased.
This course is designed for 6th-semester students of the Bachelor in Data Science, and as such builds on the knowledge acquired in the courses of the previous five semesters.
Intended learning outcomes
After the course, the student should be able to:
- Describe cases of misuse of data, and identify wrong or inaccurate claims using various appropriate tools
- Apply tools to ensure the reproducibility of data results
- Provide and discuss causal explanations based on data
- Describe issues that can arise with the use of big data
- Identify biased analyses and algorithms, and discuss possible solutions to correct for the biases
- Apply theoretical concepts and approaches to think critically about the data and models that constitute evidence in the social and natural sciences
- Reflect on the benefits and drawbacks of using digital data in research, business, and in our everyday life
The course consists of lectures and exercises. Beyond lectures and exercise sessions, we will have various online activities to be done before and after classes.
These online activities will include readings and videos, forum discussions, document writing, and statistical analyses. During class, we will have group discussions, class discussions, quizzes, writing sessions, and various hands-on exercises based on the preparation activities done at home.
During the course the teachers will offer the opportunity to submit optional assignments (with specific format and deadlines) and receive feedback.
There are no mandatory activities. The students are however strongly encouraged to hand in assignments during the course to receive feedback.
The student will receive the grade NA (not approved) at the ordinary exam if the mandatory activities are not approved, and the student will use an exam attempt.
Study materials will be provided during the course from multiple sources (book excerpts, research papers, videos).
Student Activity Budget
Estimated distribution of learning activities for the typical student
- Preparation for lectures and exercises: 15%
- Lectures: 25%
- Exercises: 20%
- Assignments: 10%
- Project work, supervision included: 15%
- Exam with preparation: 10%
- Other: 5%
Ordinary exam
Exam type:
C: Submission of written work, Internal (7-point scale)
C1G: Submission of written work for groups
Time and date
Ordinary Exam - submission Fri, 4 Jun 2021, 08:00 - 14:00
Reexam - submission Wed, 14 Jul 2021, 08:00 - 14:00