Official course description:

Full info last published 15/11-22
Course info
Language:
English
ECTS points:
7.5
Course code:
KSDASCP1KU
Participants max:
50
Offered to guest students:
yes
Offered to exchange students:
yes
Offered as a single subject:
yes
Price for EU/EEA citizens (Single Subject):
10625 DKK
Programme
Level:
MSc. Master
Programme:
MSc in Data Science
Staff
Course manager
Associate Professor, Head of study programme
Course semester
Semester
Spring 2023
Start
30 January 2023
End
25 August 2023
Exam
Exam type
ordinary
Internal/External
external examiner
Grade Scale
7-point scale
Exam Language
English
Abstract

This course will introduce classes of tasks that are at the core of most real-world production systems. It will teach advanced solutions to these tasks on complex and large-scale data with state-of-the-art tools.

Description

At the core of most IT production systems there are algorithmic solutions to problems of ranking and matching. Solving these two fundamental tasks enables a wide variety of services: getting the best list of images for a search query, getting a recommendation for the best next song to listen to, finding new friends in online social media, and much more. In this course, we will introduce advanced concepts of Information Retrieval, Recommender Systems, and Computational Advertising, along with Dev-Ops tools to deploy these services at scale.

In particular, the course will cover the following subjects:

  • Information retrieval systems
    • Indexing large-scale data
    • Ranking and weighting for relevance
    • Search strategies
    • Learning to Rank
    • Grouping and detection of near duplicates
    • Elasticsearch
  • Recommender systems
    • Content-based recommendations
    • Collaborative filtering
    • Dimensionality reduction
    • Matrix factorization for personalization
    • Multi-armed bandits
    • Link recommendation
  • Computational Advertising
    • Advertisement auctions and bidding
    • Advertisement matching
    • A/B testing
  • Dev-Ops concepts and tools
    • Deployment
    • Orchestration
    • Dev-Ops tools (e.g., Docker, Kubernetes)
  • Metrics to evaluate the performance of ranking and matching systems

Formal prerequisites

A solid background in Python programming, Linear Algebra, and fundamentals of machine learning is required. 


Intended learning outcomes

After the course, the student should be able to:

  • Design and implement a recommender system that satisfies given requirements
  • Design and implement simple information retrieval systems
  • Design and implement methods to extract structured information from linked data
  • Discuss possible architectural solutions to address complex problems of ranking and matching
  • Recommend the most appropriate techniques and metrics to evaluate the performance of a given production task
  • Design and implement software for basic deployment and orchestration of services
Learning activities

The course will consist of lectures and hands-on practice with coding, mostly in Python.

The students will be presented with tasks that are typical of IT production systems and they will be asked to reflect on them and to propose possible solutions. These activities will be similar to those that the students will need to complete for their exam. The students will also have the opportunity to code some of the solutions they come up with.

After each lecture, the students will be invited to complete complementary activities, including readings and videos that expand on the concepts discussed during the lectures.


Course literature

Some of the material included in the following books will be part of the course. These books are intended as optional reading and support material. The course will be self-contained, and reading these books is not necessary to pass the exam with top marks.

Student Activity Budget
Estimated distribution of learning activities for the typical student
  • Preparation for lectures and exercises: 5%
  • Lectures: 25%
  • Exercises: 35%
  • Exam with preparation: 35%
Ordinary exam
Exam type:
C: Submission of written work, External (7-point scale)
Exam variation:
C11: Submission of written work
Exam submission description:
The exam will consist of two parts. First, a series of open questions about the topics taught in class, each to be answered with a short paragraph. Second, a set of coding exercises in the domains of recommender systems, information retrieval, graph mining, computational advertising, and DevOps. The students will be asked to comment their code to justify and explain their choices. The final submission will contain a PDF with the answers to the questions and the software that implements the coding exercises (mostly Python code).


Reexam
Exam type:
C: Submission of written work, External (7-point scale)
Exam variation:
C11: Submission of written work

Time and date