Official course description:
Abstract
This course aims to familiarize students with the pipeline for Data Science projects: From a domain-specific context and associated data we need to identify and formulate a domain-specific research question and translate it into a technical problem, which can then be addressed with techniques within Data Science. After performing the relevant data analysis, the results should be communicated in the context of the domain.
Description
The course consists of a series of full-fledged Data Science mini-projects from start to finish, including the initial memo, technical translation of the problem, some methodology decisions, implementation, evaluation, and translation of the results back into non-technical language. Through this course students will gain experience with online collaboration using platforms such as GitHub and Overleaf.
This course combines knowledge from the first-semester courses Introduction to Data Science and Programming, Linear Algebra and Optimisation, and Data science in Research, Business and Society with knowledge that will be acquired during the second semester from the two concurrent courses.
The course is only open for students enrolled in BSc in Data Science.
Intended learning outcomes
After the course, the student should be able to:
- Identify and delimit a problem in Data Science within a given domain-specific context
- Discuss the relevant options for an appropriate scientific methodology to address the problem; this covers considerations on the data-analytical approach and on the implementational approach
- Carry out the full analysis according to the selected methodology
- Communicate their work to both experts and non-experts; this should cover the entire pipeline from problem formulation to analysis methods and their results
The course comprises two group projects. Each project is associated with a lecture series on the topics of the project, exercise sessions, and project supervision. The hand-in and presentation of the first project are a mandatory activity.
The second project is handed in as the exam submission and is followed by an oral exam.
In this project you will learn to measure features in images of skin lesions, and predict the diagnosis (for example, melanoma) from these features in an automatic way. You will:
- Implement methods to measure "handcrafted" features
- Explore and transform these features
- Use the features with a machine learning classifier to predict the diagnosis of the lesion
- Perform experiments to evaluate different parts of your method
At the start of the project you will receive data and Python code to get started.
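As an illustration of what measuring a "handcrafted" feature can look like, here is a minimal sketch in plain Python. It assumes the lesion has already been segmented into a binary mask (nested lists, 1 = lesion pixel, 0 = background). The representation and the function names `area` and `horizontal_asymmetry` are hypothetical and are not the code handed out in the course. The asymmetry measure mirrors the mask about the vertical centre line of the image, not about the centre of the lesion, which is a simplification.

```python
from typing import List

# Hypothetical representation: 1 = lesion pixel, 0 = background.
Mask = List[List[int]]


def area(mask: Mask) -> int:
    """Count the lesion pixels in the mask."""
    return sum(sum(row) for row in mask)


def horizontal_asymmetry(mask: Mask) -> float:
    """Fraction of lesion pixels without a lesion pixel at the mirrored position.

    The mask is mirrored left-right about the vertical centre line of the
    image. 0.0 means the lesion is symmetric under that flip, 1.0 means no
    lesion pixel has a mirror partner. An empty mask returns 0.0.
    """
    total = area(mask)
    if total == 0:
        return 0.0
    mismatched = sum(
        1
        for row in mask
        for pixel, mirrored in zip(row, reversed(row))
        if pixel == 1 and mirrored == 0
    )
    return mismatched / total


# Small made-up masks, for illustration only.
symmetric = [[0, 1, 1, 0],
             [1, 1, 1, 1]]
lopsided = [[1, 1, 0, 0],
            [1, 1, 0, 0]]

features = [(area(m), horizontal_asymmetry(m)) for m in (symmetric, lopsided)]
```

Each image then contributes one row of feature values, which is the kind of table a classifier can be trained and evaluated on.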
The obligatory activity comprises a hand-in of the first project and an oral presentation of the first project.
The pedagogical function of the obligatory activities is to provide the students with an opportunity to practice the ILOs of the course, specifically competences such as oral presentation. In relation to the entire study programme, the mandatory activity will act as a "lab" for testing out and getting feedback on their level of competence.
The students will be given formative feedback in order to further develop their competences, such as oral presentation, written presentation, and choice of data visualisation. The feedback will serve as supervision-like feedback. The students will be praised for the good parts and given constructive feedback on what can be improved.
If students fail to participate in the oral presentation, or the presentation itself is insufficient, they will be given another attempt.
If the mandatory activities are not approved, the student will receive the grade NA (not approved) at the ordinary exam and will use an exam attempt.
The course literature is published on the course page in LearnIT.
Student Activity Budget
Estimated distribution of learning activities for the typical student
- Preparation for lectures and exercises: 15%
- Lectures: 10%
- Exercises: 15%
- Assignments: 5%
- Project work, supervision included: 50%
- Exam with preparation: 5%
Ordinary exam
Exam type:
D: Submission of written work with following oral, Internal (7-point scale)
D2G: Submission for groups with following oral exam supplemented by the submission. Shared responsibility for the report.
You must hand in:
- groupX report.pdf: A project report, written in LaTeX. A template will be provided.
- groupX code.zip: A zip file of your GitHub repository, with code that can reproduce your results and classify other images.
More specifically the zip file should contain:
- A .csv file with the features you measured
- A Python script that starts from the raw data, and creates the figures and tables in your report. Do not include the data itself.
- A Python script that takes an image and outputs its probability of being melanoma
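As a rough sketch of the last step of such a script, the function below turns a vector of measured features into a probability with a logistic model. The function name, the weights, and the bias are hypothetical placeholders. In a real submission they would come from your trained classifier, and loading the image and measuring its features would happen before this step. Any classifier that outputs probabilities could take the place of the logistic model.

```python
import math
from typing import Sequence


def melanoma_probability(features: Sequence[float],
                         weights: Sequence[float],
                         bias: float) -> float:
    """Logistic model: sigmoid(bias + weights . features), a value between 0 and 1."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    score = bias + sum(w * x for w, x in zip(weights, features))
    # Two algebraically equal forms of the sigmoid; choosing by sign
    # avoids overflow in math.exp for scores of large magnitude.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


# Made-up numbers, for illustration only: two features, two weights.
example = melanoma_probability([0.2, 0.7], weights=[1.5, -0.5], bias=0.05)
```

The script in the hand-in would print or return this value for the image it was given.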
- Group size: 4-5 students
- The exam starts with a maximum 10 minute presentation by the group
- We will then ask questions while the whole group is present. The questions can be about your project and about general material covered during the lectures.
- Typically the questions will start with undirected questions (anybody in the group can answer), followed by directed questions (for example if a specific student did not volunteer any answer yet).
- The group leaves for a few minutes while the examiners deliberate, then we invite you back for your grade/short feedback
- More feedback will be available afterwards
Group exam: Joint student presentation followed by a group dialogue. All the students are present in the examination room throughout the examination.
B: Oral exam, Internal (7-point scale)
B1I: Oral exam with time for preparation. In-house.
Time and date
Ordinary Exam - submission Fri, 2 June 2023, 08:00 - 14:00
Ordinary Exam Mon, 19 June 2023, 09:00 - 21:00
Ordinary Exam Tue, 20 June 2023, 09:00 - 21:00
Ordinary Exam Wed, 21 June 2023, 09:00 - 21:00
Reexam Mon, 14 Aug 2023, 09:00 - 21:00