Please be aware that changes may occur.
This course gives an introduction to the evolution of the data management landscape during the past 4-5 decades, with particular emphasis on relational databases and recent hardware trends.
An important problem solved by computers is that of data storage and retrieval: for example, efficiently storing the grade obtained in every course by every student of the IT University over the last 5 years, and querying such a database, e.g., to find the grades of all students who enrolled after 2016 and who did not take a given course. This problem arises very broadly, in essentially every sector, every industry, and every application.
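The grade query mentioned above can be sketched in a few lines. This is only an illustration under assumptions: the table names (`student`, `grade`), their columns, the sample rows, and the helper `grades_without_course` are all hypothetical, and Python's built-in `sqlite3` module is used in place of a full DBMS so that the example is self-contained.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema (not the course's actual schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, enrolled_year INTEGER);
CREATE TABLE grade (student_id INTEGER REFERENCES student(id), course TEXT, grade INTEGER);
INSERT INTO student VALUES (1, 'Ada', 2015), (2, 'Bo', 2017), (3, 'Cy', 2018);
INSERT INTO grade VALUES
    (1, 'Algorithms', 10),
    (2, 'Algorithms', 7),
    (2, 'Databases', 12),
    (3, 'Algorithms', 4);
""")

def grades_without_course(conn, after_year, course):
    """Grades of students enrolled after `after_year` who have no grade in `course`."""
    query = """
        SELECT s.name, g.course, g.grade
        FROM student AS s
        JOIN grade AS g ON g.student_id = s.id
        WHERE s.enrolled_year > ?
          AND NOT EXISTS (            -- subquery: the student never took `course`
              SELECT 1 FROM grade AS x
              WHERE x.student_id = s.id AND x.course = ?
          )
        ORDER BY s.name, g.course
    """
    return conn.execute(query, (after_year, course)).fetchall()

rows = grades_without_course(conn, 2016, "Databases")
print(rows)
```

With the sample rows above, only one student both enrolled after 2016 and has no grade in the given course, so a single grade row is returned. The `NOT EXISTS` subquery is one of several ways to express "did not take a given course"; an outer join or `NOT IN` would be reasonable alternatives.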
The course gives an introduction to relational databases as well as an introduction to data analytics, both from a practical and theoretical point of view.
The main part of the course deals with relational databases, including theory and practice for modeling and querying a database. Towards the end of the course the focus will be on techniques for data analytics.
For relational databases the following topics will be covered:
- Programming in SQL, including all basic operations as well as some more advanced constructs (e.g., subqueries).
- Other basic concepts related to relational databases and SQL, such as views, procedures, triggers, etc.
- Using SQL in applications, e.g., Java applications.
- Database design using E-R modelling.
- Defining a database design using the relational model and SQL schemas.
- Normalization of relations.
- Query processing and optimization basics.
- Use of different indexes, including hash indexes, B-tree indexes, non-clustered and clustered indexes.
- Transactional concepts and transaction handling.
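As a small illustration of the transaction topic in the list above, the following sketch shows the all-or-nothing behaviour of a transaction. It is an assumption-laden example, not course material: the `grade` table, its key, and the helper `record_grades` are hypothetical, and Python's built-in `sqlite3` module stands in for the DBMS used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: at most one grade per (student, course) pair.
conn.execute(
    "CREATE TABLE grade ("
    " student_id INTEGER, course TEXT, grade INTEGER,"
    " PRIMARY KEY (student_id, course))"
)
conn.commit()

def record_grades(conn, rows):
    """Insert all rows as one transaction; return True if committed, False if rolled back."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            conn.executemany("INSERT INTO grade VALUES (?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False

ok = record_grades(conn, [(1, "Databases", 10), (2, "Databases", 7)])
# The second row duplicates an existing key, so the whole batch is undone,
# including the first row, which on its own would have been valid.
failed = record_grades(conn, [(3, "Databases", 12), (1, "Databases", 4)])
count = conn.execute("SELECT COUNT(*) FROM grade").fetchone()[0]
print(ok, failed, count)
```

The point of the sketch is atomicity: after the failed batch, the table contains only the rows from the first, successful transaction.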
For data analytics the following topics will be covered:
- Approaches to data analytics (including OLAP and data warehousing).
- Introduction to data wrangling/processing.
- Societal context, including ethics concerns.
- Distributed map-reduce processing.
The course is only open to third-semester BSc DS students. The course assumes that students have taken an introductory programming course and have some prior knowledge of data structures (e.g., from an Algorithms and Data Structures course). Moreover, the student must always meet the admission requirements of the IT University.
Intended learning outcomes
After the course, the student should be able to:
- Select an appropriate data management system (or set of systems), access methods, and data layout given a data science use case.
- Describe the pros and cons of different (classes of) data management systems for modern analytics and data science.
- Reflect upon the landscape of data management applications/workloads and their impact on data management system design.
- Reflect upon the evolution of the hardware and storage hierarchy and its impact on data management system design.
- Explain the internals of a traditional database system.
- Query and modify data using SQL.
- Design a database using the relational model and normal form theory.
Lectures will provide tools and methods for describing, creating and using databases. Weekly exercises consist of coding exercises, applying techniques, and using them to analyze and improve designs.
The course has 4 mandatory assignments. 3 of the assignments need to be completed and approved before you can take the examination. Deadlines will be advertised during the course on LearnIT. Approval will be communicated via LearnIT and general feedback will be given during subsequent exercise sessions.
If the mandatory activities are not approved, the student will receive the grade NA (not approved) at the ordinary exam and will have used an exam attempt.
Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data - Wilfried Lemahieu, Seppe vanden Broucke, Bart Baesens
Cambridge University Press; 1st edition (August 30, 2018)
Ordinary exam
Exam type:
A: Written exam on premises, external (7-point grading scale)
A22: Written exam on premises with restrictions. Restrictions may concern which software and which books you may use.
A22 LearnIT exam with restricted networks
The final grade is based solely on the written examination.
The duration of the written examination on premises is 4 hours with the following restrictions:
1. Physical copies of the course textbook and other printed materials are permitted.
2. e-books on laptops, iPads, and other e-book readers are permitted.
3. Use of a local DBMS on your laptop is permitted.
4. Accessing material posted on the course web-page on LearnIT is permitted.
5. Accessing any other information from the internet (including newsgroups, social media, email, Facebook, Twitter, etc.) or from any other source that is not in book form is *not* permitted.
6. Use of a pocket calculator is not permitted.
Students should bring a computer with Wi-Fi and with the MySQL database system installed.