Official course description:

Full info last published 29/06-21
Course info
Language:
English
ECTS points:
7.5
Course code:
KSAASMC1KU
Participants max:
25
Offered to guest students:
no
Offered to exchange students:
yes
Offered as a single subject:
no
Programme
Level:
MSc. Master
Programme:
MSc in Computer Science
Staff
Course manager
Assistant Professor
Teacher
Postdoc
Course semester
Semester
Spring 2021
Start
1 February 2021
End
14 May 2021
Exam
Exam type
ordinary
Internal/External
external examiner
Grade Scale
7-point scale
Exam Language
English
Abstract
This course introduces fundamental and advanced concepts in statistics and probability from a data-science perspective. The aim of the course is to familiarize the student with probabilistic and statistical methods that are widely used in data analysis.
Description

The aim of the course is to enable the student to work systematically with data sets containing several variables, which is important for performing statistical analyses in data science. The course builds on the knowledge acquired in courses such as “Applied statistics” and “Machine Learning” and intends to give the student additional tools to identify and solve statistical problems.

The course will cover the following subjects:
  • Probability theory 
  • Random variables 
  • Multivariate random variables 
  • High-dimensional problems 
  • Convergence of random processes 
  • Expectation-maximization 
  • Bayesian methods 
  • Outlier detection and clustering
Formal prerequisites
  • The prerequisites required for admission to the course are Linear Algebra and Probability, or similar.
  • It is recommended to have a basic knowledge of statistics.
  • Students must be able to program. The default language is Python, but other languages are possible.
Intended learning outcomes

After the course, the student should be able to:

  • Analyze statistical problems and reason about the most appropriate methods to apply
  • Apply and reflect on advanced applied statistical methods and tools for multivariate calculus
  • Identify and describe problems that can be solved using multivariate techniques
  • Implement basic statistical algorithms and interpret results
  • Summarize the results of an analysis in a statistical report
Learning activities

This course is run as a PC 2.0 course.
The PC 2.0 structure is outlined as follows:
The teacher (in agreement with the Head of Programme) chooses another teaching format, such as lectures, workshops, or seminars, that is aligned with the intended learning outcomes and the student population.
By definition, the learning activities of a PC 2.0 course also include a written assignment with supervision.
The exam form for a PC 2.0 course is either C or D.
For this course the exam form remains C.
Please read more about the learning activities below and in the course room in LearnIT.

The course consists of lectures and seminars and ends with a project in the last part of the course. Classes will consist of lectures, seminars, independent programming exercises, and discussion sessions.
The default language is Python, but other languages are possible.

For the final project you will specify and work on a relevant project of your choice. In this project you will apply the techniques and algorithms studied during the course to relevant problems. In addition to the hours planned for lectures, seminars, tutorials, and exercises, supervision sessions are planned for the projects. These sessions complement the theory covered during the lectures and are necessary for meeting the learning objectives of the course. Short lectures will provide theoretical foundations and walk-through examples of relevant data mining algorithms, while programming exercises will focus on students discussing, applying, and implementing the central algorithms themselves. (Depending on the number of students taking the course, the projects can be done in groups of two.)

Course literature

The course literature is published on the course page in LearnIT.

Student Activity Budget
Estimated distribution of learning activities for the typical student
  • Preparation for lectures and exercises: 20%
  • Lectures: 20%
  • Exercises: 30%
  • Project work, supervision included: 30%
Ordinary exam
Exam type:
C: Submission of written work, External (7-point scale)
Exam variation:
C1G: Submission of written work for groups
Exam submission description:
The submission is a report describing a statistical analysis of a self-selected real-world topic. Techniques from the course must be described and applied in the report.
If there is more than one student in the group, the written submission will allow for an allocation of effort within the group.

Group submission:
Group
  • Group size 1-2 students.


Reexam
Exam type:
C: Submission of written work, External (7-point scale)
Exam variation:
C1G: Submission of written work for groups

Time and date