Official course description:

Full info last published 27 November 2023
Course info
Language:
English
ECTS points:
7.5
Course code:
KSDAMIN1KU
Participants max:
70
Offered to guest students:
yes
Offered to exchange students:
yes
Offered as a single subject:
yes
Price for EU/EEA citizens (Single Subject):
10625 DKK
Programme
Level:
MSc. Master
Programme:
MSc in Software Design
Staff
Course manager
Associate Professor
Course semester
Semester
Autumn 2023
Start
28 August 2023
End
26 January 2024
Exam
Exam type
ordinary
Internal/External
external examiner
Grade Scale
7-point scale
Exam Language
English
Abstract

This course introduces the field of data mining. It is practically oriented and focuses on applicable algorithms. Practical exercises involve both using a freely available data mining package and implementing algorithms individually.

Description

The course will cover the following main topics:

  • The data mining process
  • Cluster analysis
  • Data pre-processing
  • Pattern and association mining
  • Classification and prediction

Application examples will be given from domains including demographics, image processing and healthcare.

Formal prerequisites

Students must be experienced and comfortable with programming, and be capable of independently implementing algorithms from descriptions. This corresponds to having passed at least an introductory programming course, and preferably also an intermediate-level programming course. The course includes compulsory programming in Python.

Students must be familiar with basic mathematical notation and concepts such as variables, sets, functions, averages, and variance. These competencies can be obtained, for example, by taking a course on discrete mathematics.

Information about study structure:

This course is a specialisation course in the MSc in Software Design study programme, as well as an elective for other MSc study programmes.
Moreover, the student must always meet the admission requirements of the IT University.


Intended learning outcomes

After the course, the student should be able to:

  • Analyze data mining problems and reason about the most appropriate methods to apply to a given dataset and knowledge extraction need.
  • Implement basic pre-processing, association mining, classification and clustering algorithms.
  • Apply and reflect on advanced pre-processing, association mining, classification and clustering algorithms.
  • Work efficiently in groups and evaluate the algorithms on real-world problems.
Learning activities

The course consists of lectures, followed by a project in the last part of the course. Most lectures are followed by a lab exercise that involves independent programming. Students must be able to program. The default language is Python; an introduction to it is given in week 1 and during the labs.

There is one mandatory assignment around the midpoint of the course, in which you will apply the techniques learned so far.

For the final project, you will specify and work on a relevant data mining project of your choice. In this project you will apply the techniques and algorithms studied during the course to relevant real-world problems. This will be done in groups of 2-4 people.

In addition to the hours planned for lectures, tutorials, and exercises, supervision sessions are planned for the group projects. These sessions complement the theory covered in the lectures and are necessary for meeting the learning objectives of the course. Lectures provide theoretical foundations and walk-through examples of relevant data mining algorithms, while exercises focus on students discussing and implementing the algorithms themselves.

Mandatory activities

There will be one mandatory assignment, which consists of applying self-implemented data mining techniques to a simple dataset and writing a report about the work.

The students will receive a pass/fail grade on the assignment, with follow-up formative feedback.

The pedagogical function of the mandatory assignment is to give students an activity through which they gain experiential knowledge supporting the intended learning outcomes (ILOs) covered in the course so far, including data preparation and machine learning classification.

Students who fail to hand in or fail to pass the mandatory activity must pass a repeat mandatory examination, which is offered within a month of the grade.


If the mandatory activities are not approved, the student will receive the grade NA (not approved) at the ordinary exam and will use an exam attempt.

Course literature

The 100-page Machine Learning Book: http://themlbook.com/

Data Mining: Concepts and Techniques, 3rd ed.

Student Activity Budget
Estimated distribution of learning activities for the typical student
  • Preparation for lectures and exercises: 20%
  • Lectures: 15%
  • Exercises: 10%
  • Assignments: 10%
  • Project work, supervision included: 35%
  • Exam with preparation: 10%
Ordinary exam
Exam type:
D: Submission of written work with following oral, External (7-point scale)
Exam variation:
D1G: Submission for groups with following oral exam based on the submission. Shared responsibility for the report.
Exam submission description:
The final assessment will be a jointly written report. There will be a group presentation of this report followed by individual questions about the report and the work behind it, resulting in individual grades.
Group submission:
Group
  • 2-4
Exam duration per student for the oral exam:
15 minutes
Group exam form:
Mixed exam 2: Joint student presentation followed by an individual dialogue. The group gives its presentation together, and afterwards each student takes part in the dialogue individually while the rest of the group waits outside the room.


Reexam
Exam type:
B: Oral exam
Exam variation:
B22: Oral exam with no time for preparation.
Exam duration per student for the oral exam:
20 minutes

Time and date
Ordinary Exam - submission Thu, 21 Dec 2023, 08:00 - 14:00
Ordinary Exam Mon, 15 Jan 2024, 09:00 - 21:00
Ordinary Exam Tue, 16 Jan 2024, 09:00 - 21:00
Ordinary Exam Wed, 17 Jan 2024, 09:00 - 21:00