Official course description:
Full info last published 3/07-19

Data Management

Course info
Language:
English
ECTS points:
7.5
Course code:
BSDAMAN1KU
Offered to guest students:
no
Offered to exchange students:
Offered as a single subject:
no
Programme
Level:
Bachelor
Programme:
BSc in Data Science
Staff
Course manager
Associate Professor
Teaching Assistant
Teaching Assistant (TA)
Course semester
Semester
Autumn 2019
Start
26 August 2019
End
31 January 2020
Exam
Exam type
ordinary
Internal/External
internal examiner
Grade Scale
7-point scale
Exam Language
English
Abstract

Please be aware that changes may occur.

This course gives an introduction to the evolution of the data management landscape over the past four to five decades, with particular emphasis on relational databases and recent hardware trends.

Description

An important problem solved by computers is that of data storage and retrieval: for example, efficiently storing the grade that every student at the IT University obtained in every course over the last 5 years, and querying such a database, e.g., to find the grades of all students who enrolled after 2016 and did not take a given course. This problem arises very broadly, in essentially every sector, every industry, and every application.
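
The grade query described above can be sketched in SQL. The following is only a minimal illustration under stated assumptions: the two-table schema (student, grade), the column names, and the sample rows are hypothetical, and an in-memory SQLite database driven from Python stands in for the MySQL system used in the course.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    id INTEGER PRIMARY KEY,
    name TEXT,
    enrolled_year INTEGER
);
CREATE TABLE grade (
    student_id INTEGER REFERENCES student(id),
    course TEXT,
    grade INTEGER
);
INSERT INTO student VALUES (1, 'Ana', 2015), (2, 'Bo', 2017), (3, 'Cy', 2018);
INSERT INTO grade VALUES
    (1, 'Algorithms', 10),
    (2, 'Algorithms', 7),
    (2, 'Data Management', 12),
    (3, 'Algorithms', 4);
""")


def grades_of_students_without(conn, course, after_year):
    """Grades of all students who enrolled after `after_year`
    and have no grade registered for `course`."""
    query = """
        SELECT s.name, g.course, g.grade
        FROM student AS s
        JOIN grade AS g ON g.student_id = s.id
        WHERE s.enrolled_year > ?
          AND NOT EXISTS (
              SELECT 1
              FROM grade AS x
              WHERE x.student_id = s.id AND x.course = ?
          )
        ORDER BY s.name, g.course
    """
    return conn.execute(query, (after_year, course)).fetchall()


result = grades_of_students_without(conn, "Data Management", 2016)
print(result)  # [('Cy', 'Algorithms', 4)]
```

The NOT EXISTS subquery expresses "did not take a given course"; the same condition could also be written with NOT IN or an outer join, and which form runs fastest depends on the database system.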

The course gives an introduction to relational databases as well as an introduction to data analytics, both from a practical and theoretical point of view.

The main part of the course deals with relational databases, including theory and practice for modeling and querying a database. Towards the end of the course the focus will be on techniques for data analytics. 

For relational databases the following topics will be covered: 

  • Programming in SQL, including all basic operations as well as some more advanced constructs (e.g. subqueries).
  • Other basic concepts related to relational databases and SQL, such as views, procedures, and triggers.
  • Using SQL in applications, e.g. Java applications.
  • Database design using E-R modelling.
  • Defining a database design using the relational model and SQL schemas.
  • Normalization of relations.
  • Query processing and optimization basics.
  • Use of different indexes, including hash indexes, B-tree indexes, non-clustered and clustered indexes.
  • Transactional concepts and transaction handling.


For data analytics the following topics will be covered: 

  • Approaches to data analytics (including OLAP and data warehousing).
  • Introduction to data wrangling/processing.
  • Societal context, including ethics concerns.
  • Distributed map-reduce processing.

Formal prerequisites

The course is open only to third-semester students of the BSc in Data Science. The course assumes that students have taken an introductory programming course and have some prior knowledge of data structures (e.g., from an Algorithms and Data Structures course). Moreover, the student must always meet the admission requirements of the IT University.

Intended learning outcomes

After the course, the student should be able to:

  • Select an appropriate data management system (or set of systems), access methods, and data layout given a data science use case.
  • Describe the pros and cons of different (classes of) data management systems for modern analytics and data science.
  • Reflect upon the landscape of data management applications/workloads and their impact on data management system design.
  • Reflect upon the evolution of the hardware and storage hierarchy and its impact on data management system design.
  • Explain the internals of a traditional database system.
  • Query and modify data using SQL.
  • Design a database using the relational model and normal form theory.

Learning activities

Lectures will provide tools and methods for describing, creating and using databases. Weekly exercises consist of coding tasks in which students apply these techniques and use them to analyze and improve designs.

Mandatory activities

The course has four mandatory assignments, three of which must be completed and approved before you can take the examination. Deadlines will be announced during the course on LearnIT. Approval will be communicated via LearnIT, and general feedback will be given during subsequent exercise sessions.

If the mandatory activities are not approved, the student will receive the grade NA (not approved) at the ordinary exam and will use an exam attempt.

Course literature

Wilfried Lemahieu, Seppe vanden Broucke, Bart Baesens: Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data. Cambridge University Press, 1st edition (August 30, 2018).

Ordinary exam
Exam type:
A: Written exam on premises, external (7-point scale)
Exam variation:
A22: Written exam on premises with restrictions. Restrictions may concern which software and which books you may use.
Exam description:

A22 LearnIT exam with restricted networks

The final grade is based solely on the written examination. 

The duration of the written examination on premises is 4 hours with the following restrictions: 

1. Physical copies of the course textbook and other printed materials are permitted. 

2. E-books on laptops, iPads, and other e-book readers are permitted. 

3. Use of a local DBMS on your laptop is permitted. 

4. Accessing material posted on the course web-page on LearnIT is permitted. 

5. You are *not* permitted to access any other information from the internet (including newsgroups, social media, email, Facebook, Twitter, etc.) or from any other source that is not in book form. 

6. Use of a pocket calculator is not permitted.

Students should bring a computer with Wi-Fi and with the MySQL database system installed. 




Reexam
Exam type:

Exam variation: