Advanced Data Systems
In this course, you will learn state-of-the-art techniques that power data-intensive applications and systems running on modern hardware, and you will apply these techniques to a modern data-intensive system.
To transform the sheer amount of complex data into timely discoveries that influence society, data-intensive systems (including database systems, big data processing systems, and machine learning frameworks) must utilize the full processing power offered by modern processor and storage technologies.
In this course, you will learn state-of-the-art techniques for data management and processing on modern hardware (multicores, hardware accelerators, microsecond-scale storage, and 100 GbE).
In parallel, you will apply some of these techniques to widely used open-source data-intensive systems.
As a result, you will get hands-on experience with how to design, implement, and evaluate new components of an open-source data-intensive system.
If you are taking this course as part of the Data Systems specialization of the ITU Computer Science MSc, then the Computer Systems Performance class is a prerequisite.
On the other hand, while the knowledge gained from Computer Systems Performance is very useful in this class, the course is also open to students who haven't taken Computer Systems Performance. Therefore, if you wish to learn about advanced data systems topics or are interested in research in this area, don't hesitate to register.
Intended learning outcomes
After the course, the student should be able to:
- Analyze the functional and performance requirements of a data-intensive system (e.g., database system or machine learning platform)
- Navigate the codebase of complex production-grade open-source software
- Design and implement components in the context of a production-grade data system
- Evaluate the performance characteristics of a software system
- Reflect upon the evolution of the hardware (processors, storage, networks) and its impact on the landscape of data-intensive systems and applications
- Reflect upon research papers published by others and present research work to a broad technical audience
The course is composed of lectures and exercise sessions.
- The lectures will consist of presentations and discussions of recent research papers (given by the lecturers, invited external speakers, and students) that focus on data systems on modern hardware.
- Each week, we will have a specific topic to focus on in terms of papers.
- The exercise sessions will focus on project assignments. Therefore, in the distribution of time below for different class activities, exercises and project time add up to the total time spent on projects.
The course literature is published on the course page on LearnIT.
There will be research papers to read each week on that week's topic.
Student Activity Budget
Estimated distribution of learning activities for the typical student
- Preparation for lectures and exercises: 20%
- Lectures: 25%
- Exercises: 25%
- Project work, supervision included: 25%
- Exam with preparation: 5%
Ordinary exam
Exam type:
D: Submission of written work with following oral, External (7-point scale)
D22: Submission with following oral exam supplemented by the submission.
Students submit a report based on the project assignments; this report is the written work submitted for the exam.
Time and date
Ordinary Exam - submission Wed, 4 Jan 2023, 08:00 - 14:00
Ordinary Exam Mon, 23 Jan 2023, 09:00 - 21:00
Reexam Fri, 10 Mar 2023, 09:00 - 12:00