Reflections on Data Science
Abstract
In this course you will learn to reflect on the use and societal implications of data, models and algorithms.
Whether it serves to provide predictions, information, assessments, or evidence of some sort, the use of data comes with important responsibilities, ethical concerns and social impact, and sometimes generates unintended consequences. How can we check that a claim based on data is plausible? How can we ensure that our data analysis is sound and reproducible? What is surveillance and how is it used? What are the technical and societal consequences of using biased data to train our algorithms? In this course we will explore these and other similar questions using real-world case studies, and we will provide a set of concepts, approaches and tools:
- to think critically about the data, models and algorithms that constitute evidence in the social and natural sciences and that provide predictions of any sort, and
- to reflect on the consequences and ethical concerns when using data.
The course covers the following topics:
- Calling BS with data
- Reviewing causality and lying with statistics
- Algorithmic bias
- Traps in the use of big data
- Good and bad surveillance, privacy
- Diffusion processes important for society (fake news, epidemics, performance and success)
For each topic, the course will focus on the societal impact of the studied concepts, and will emphasize how we, data scientists, can ensure and promote data use that is correct, ethical and unbiased.
This course is designed for 6th-semester students of the Bachelor in Data Science, and as such builds on the knowledge acquired in the courses of the previous five semesters.
Intended learning outcomes
After the course, the student should be able to:
- Describe cases of misuse of data, and identify wrong or inaccurate claims using various appropriate tools
- Apply tools to ensure the reproducibility of data results
- Provide and discuss causal explanations based on data
- Describe issues that can arise with the use of big data
- Identify biased analyses and algorithms, and discuss possible solutions to correct for the biases
- Apply theoretical concepts and approaches to think critically about the data and models that constitute evidence in the social and natural sciences
- Reflect on the benefits and drawbacks of using digital data in research, business, and in our everyday life
The course consists of lectures and exercises. Beyond lectures and exercise sessions, we will have various online activities to be done before and after classes.
These online activities will include readings and videos, forum discussions, writing documents, and using tools to ensure reproducibility. In class we will have frontal lectures, group discussions, class discussions, quizzes, writing sessions, and various hands-on exercises based on the preparation activities done at home.
During the course the teachers will offer the opportunity to submit optional assignments (with specific format and deadlines) and receive feedback.
Student Activity Budget
Estimated distribution of learning activities for the typical student
- Preparation for lectures and exercises: 15%
- Lectures: 25%
- Exercises: 20%
- Assignments: 10%
- Project work, supervision included: 15%
- Exam with preparation: 10%
- Other: 5%
Ordinary exam
Exam type:
C: Submission of written work, internal (7-point grading scale)
CG: Submission of written work for groups.
Students will receive detailed information during the course about the work to submit for the exam.
Group and individual
- Group size 3-4