Data Mining (Autumn 2019)

Official course description:

Full info last published 27/06-19

Course info

Language:

English

ECTS points:

7.5

Course code:

KSDAMIN1KU

Offered to guest students:

yes

Offered to exchange students:

Offered as a single subject:

yes

Price for EU/EEA citizens (Single Subject):

10625 DKK

Programme

Level:

MSc. Master

Programme:

MSc in Software Design

Staff

Course manager

Leon Derczynski

Assistant Professor

Course semester

Semester

Autumn 2019

Start

26 August 2019

End

31 January 2020

Abstract

This course gives an introduction to the field of data mining. The course has a relatively practical orientation, focusing on applicable algorithms. Practical exercises will involve both use of a freely available data mining package and individual implementation of algorithms.

Description

The course will cover the following main topics:

The data mining process
Cluster analysis
Data pre-processing
Pattern and association mining
Classification and prediction

Application examples will be given from domains including demographics, image processing and healthcare.

Formal prerequisites

Students must have experience with and be comfortable with programming, and be capable of independently implementing algorithms from descriptions. This corresponds to at least having passed an introductory programming course, and preferably also an intermediate-level programming course. The course will contain compulsory programming.

Information about study structure:
This course is a specialisation course on the MSc Software Design study programme, as well as an elective for other MSc study programmes.
Moreover, the student must always meet the admission requirements of the IT University.

Intended learning outcomes

After the course, the student should be able to:

Analyze data mining problems and reason about the most appropriate methods to apply to a given dataset and knowledge extraction need.
Implement basic pre-processing, association mining, classification and clustering algorithms.
Apply and reflect on advanced pre-processing, association mining, classification and clustering algorithms.
Work efficiently in groups and evaluate the algorithms on real-world problems.

Learning activities

The course consists of lectures, followed by a project in the last part of the course. Most lectures are followed by a lab exercise, which involves independent programming. Students must be able to program. The default language is Python, but other languages are possible.

A large part of the course will be taken up by the weekly exercises. For the final project you will specify and work on a relevant data mining project of your choice. In this project you will apply the techniques and algorithms studied during the course to relevant real-world problems. This will be done in groups of three.

In addition to the hours planned for lectures, tutorials, and exercises, supervision sessions are planned for the group projects. These sessions complement the theory covered in the lectures and are necessary for meeting the learning objectives of the course. Lectures provide theoretical foundations and walk-through examples of relevant data mining algorithms, while exercises focus on students discussing and implementing the central algorithms themselves.

Mandatory activities

There will be one mandatory assignment, consisting of using self-implemented data mining techniques on a simple data set and writing a report about it.

If the mandatory activities are not approved, the student will receive the grade NA (not approved) at the ordinary exam and will use an exam attempt.

Course literature

The course literature is published on the course page in LearnIT.

Ordinary exam

Exam type:
D: Submission of written work with following oral, external (7-point grading scale)
Exam variation:
D2G: Submission of written work for groups with following oral exam supplemented by the work submitted. The group has a shared responsibility for the content of the report.
Exam description:
Mixed exam 1
The students give a joint presentation followed by a group dialogue. Subsequently, each student is examined individually through a presentation and/or dialogue with the supervisor and external examiner while the rest of the group is outside the room.
Duration: 20 minutes per student.
The oral exam will cover the entire course's material. The written hand-in is the result of a self-selected data mining project performed in a group of three. The written submission will allow for an allocation of effort in the group.

Reexam

Exam type:
Z. To be decided, - (-)