Official course description:

Full info last published 16/08-19
Course info
Language:
English
ECTS points:
15.0
Course code:
KSADDAS1KU
Offered to guest students:
yes
Offered to exchange students:
Offered as a single subject:
yes
Price for EU/EEA citizens (Single Subject):
21250 DKK
Programme
Level:
MSc. Master
Programme:
MSc in Computer Science
Staff
Course manager
Full Professor
Teacher
Associate Professor
Course semester
Semester
Autumn 2019
Start
26 August 2019
End
31 January 2020
Exam
Exam type
ordinary
Internal/External
internal examiner
Grade Scale
7-point grading scale
Exam Language
English (GB)
Abstract

To transform the sheer amount of complex data into timely discoveries that influence society, data-intensive systems (including database systems and machine learning platforms) must utilize the full processing power offered by modern servers.

In this course, you will learn how to design, implement, and evaluate new components of a production-grade open-source data-intensive system. You will learn techniques for data management on modern hardware (multi-core processors, microsecond-scale storage, and 100 GbE networking) and apply them through hands-on work with the internals of an open-source system.

Description

Formal prerequisites

Computer Systems Performance class.

Intended learning outcomes

After the course, the student should be able to:

  • Analyze the functional and performance requirements of a data-intensive system (database system or machine learning platform);
  • Navigate the codebase of production-grade open-source software;
  • Design and implement components in the context of a production-grade data system;
  • Evaluate the performance characteristics of a software system.
Learning activities

The course is based on lectures, a seminar and assignments:

  • The lectures focus on fundamental principles underlying the design and implementation of modern operating systems, network systems, file systems and database systems;
  • The seminar (presentation and structured discussion of research articles) will focus on recent advances in data systems, including computational storage, cross-layer design and in-network processing, as well as experience reports on machine learning and data systems solutions in modern data centers;
  • The assignments will consist of two new software components to be developed in the context of the OX NVMe controller, accessed from a user-space NVMe driver. The first component is a data system task (e.g., partitioning, hashing, matrix multiplication) on the host side; the second is a computational storage component on the storage side. Development will take place on a Stingray platform (100 GbE, ARMv8, SSD) accessed from an x86 server (32 cores, 256 GB RAM).
Note! The course has been transformed into a PC 2.0 course, and the learning activities stated above no longer apply. Please consult LearnIT for the current description.

Mandatory activities

There are 2 mandatory deliverables corresponding to the two components to be developed.

If the mandatory activities are not approved, the student will receive the grade NA (not approved) at the ordinary exam and will use an exam attempt.

Course literature

The course literature is published on the course page in LearnIT.

Ordinary exam
Exam type:
D: Submission of written work followed by an oral exam, internal (7-point grading scale)
Exam variation:
D2G: Submission of written work for groups, followed by an oral exam supplemented by the submitted work. The group shares responsibility for the content of the report.
Exam description:

The report will describe the design choices and implementation techniques used during the assignments and present an experimental study of the developed system.

Group size: 2 persons.

Group form: Group exam

Duration of the oral exam: 20 minutes per student.



Time and date