Official course description:
Full info last published 15/06-22

Data Science in Production

Course info
Language:
English
ECTS points:
7.5
Course code:
KSDASCP1KU
Participants max:
40
Offered to guest students:
no
Offered to exchange students:
no
Offered as a single subject:
no
Programme
Level:
MSc. Master
Programme:
MSc in Data Science
Staff
Course manager
Associate Professor
Course semester
Semester
Spring 2022
Start
31 January 2022
End
31 August 2022
Exam
Exam type
ordinary
Internal/External
external examiner
Grade Scale
7-point scale
Exam Language
English
Abstract

This course introduces classes of tasks that are at the core of most real-world production systems. It teaches advanced solutions to these tasks on complex, large-scale data using state-of-the-art tools.

Description

At the core of most IT production systems there are algorithmic solutions to problems of ranking and matching. Solving these two fundamental tasks enables a wide variety of services: getting the best list of images for a search query, getting a recommendation for the best next song to listen to, finding new friends in online social media, and much more. In this course, we will introduce advanced concepts of Information Retrieval, Recommender Systems, and Linked Data Mining, which use a variety of advanced ranking and matching techniques to enable complex services. We will also introduce basic concepts of computational advertising.


In particular, the course will cover the following subjects:

  • Recommender systems
    • Content-based recommendations
    • Collaborative filtering
    • Dimensionality reduction
    • Matrix factorization for personalization
  • Information retrieval systems
    • Indexing large-scale data
    • Grouping and detection of near duplicates
    • Ranking and weighting for relevance
    • Search strategies
  • Computational Advertising
    • Advertisement auctions and bidding
    • Advertisement matching
  • Linked-data mining
    • Link analysis
    • Predicting the evolution of graphs
    • Graph representation learning
  • Metrics to evaluate the performance of ranking and matching systems
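
As an illustration of the kind of ranking task and evaluation metric listed above, the following is a minimal sketch in plain Python. It ranks items by cosine similarity to a query vector and scores the ranking with precision at k. The function names, the toy item vectors, and the set of relevant items are all hypothetical and chosen only for this example; they are not taken from the course material.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_items(query, items):
    """Return item ids sorted by decreasing cosine similarity to the query."""
    return sorted(items, key=lambda item_id: cosine(query, items[item_id]), reverse=True)

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are in the relevant set."""
    return sum(1 for item_id in ranked[:k] if item_id in relevant) / k

# Hypothetical toy data: three items described by two features each.
items = {
    "a": [1.0, 0.0],
    "b": [0.8, 0.2],
    "c": [0.0, 1.0],
}
query = [1.0, 0.0]

ranked = rank_items(query, items)
score = precision_at_k(ranked, relevant={"a", "c"}, k=2)
```

Here `ranked` orders the items from most to least similar to the query, and `score` reports how many of the top two are marked relevant. Real systems would typically replace the hand-written vectors with learned or indexed representations.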

Formal prerequisites

A solid background in Python programming is required. Basic knowledge of statistics, linear algebra, and the fundamentals of machine learning is strongly recommended.


Intended learning outcomes

After the course, the student should be able to:

  • Design and implement a recommender system that satisfies given requirements
  • Design and implement simple information retrieval systems
  • Design and implement methods to extract structured information from linked data
  • Discuss possible architectural solutions to address complex problems of ranking and matching
  • Recommend the most appropriate techniques and metrics to evaluate the performance of a given production task
Learning activities

The course will consist of lectures and hands-on practice with coding, mostly in Python.

The students will be presented with tasks that are typical of IT production systems and they will be asked to reflect on them and to propose possible solutions. These activities will be similar to those that the students will need to complete for their exam. The students will also have the opportunity to code some of the solutions they come up with and to submit them as optional assignments to receive feedback.

During the lectures, there will be opportunities for the students to engage with each other and with the teacher through quizzes and open group discussions. After each lecture, the students will be invited to carry out complementary activities, including reading and watching videos that expand on the concepts discussed during the lectures.


Course literature

Some of the material included in the following books will be part of the course. These books are intended as optional reading and support material. The course will be self-contained, and reading these books is not necessary to pass the exam with the top grade.

Student Activity Budget
Estimated distribution of learning activities for the typical student
  • Preparation for lectures and exercises: 15%
  • Lectures: 25%
  • Exercises: 30%
  • Exam with preparation: 30%
Ordinary exam
Exam type:
C: Submission of written work, External (7-point scale)
Exam variation:
C11: Submission of written work
Exam submission description:
The exam will consist of a series of tasks in the domains of recommendation systems, information retrieval, graph mining, or computational advertising. The student will be asked to write a report on how these tasks can be solved and to write code to implement the proposed solutions. The final submission will contain a written report and Python code.


Reexam
Exam type:
C: Submission of written work, External (7-point scale)
Exam variation:
C11: Submission of written work

Time and date
Ordinary Exam - submission Fri, 3 Jun 2022, 08:00 - 14:00
Reexam - submission Wed, 27 Jul 2022, 08:00 - 14:00