Official course description:
Abstract
The course gives an introduction to and overview of data engineering techniques.
A data ‘revolution’ is underway, one that is already reshaping how knowledge is produced, business conducted, and governance enacted. Data has traditionally been time-consuming and costly to generate, analyse and interpret, and generally provided a relatively static and coarse snapshot of phenomena.
This state of affairs is now changing. Rather than being scarce and limited in scope, data is increasingly produced as a ‘deluge’, i.e. a wide flow of real-time, varied, fine-grained and relational data that is relatively low in cost.
Outside of business, data is increasingly becoming open as well. This data abundance (as opposed to data scarcity) is reshaping how we work with, circulate, trade, analyse and exploit data. This development is founded on the latest wave of information and communication technologies, such as the plethora of digital devices encountered in homes, workplaces and public spaces; mobile, distributed and cloud computing; social media; and internetworked sensors and devices.
These technical infrastructures are leading to ever more aspects of everyday life – work, consumption, travel, communication, and leisure – being captured as data. Moreover, they are reconfiguring the production, circulation and interpretation of data, producing what has been termed ‘big data’.
The students will gain an understanding of the technical aspects of data management and the opportunities and risks they create for organisations. During the course, the students will engage with the (changing) nature of database use and design, including:
- Data representation and modelling
- Data storage and retrieval
- Data Engineering and Big Data Processes
- Architecture of Unbundled Data Systems
Python programming is a prerequisite.
This course is part of the second semester in the bachelor's degree in Global Business Informatics.
Intended learning outcomes
After the course, the student should be able to:
- Explain the difference between the relational and non-relational data models
- Describe the architecture and components of a database system
- Design an ER model and a relational model in a concrete scenario
- Define SQL queries in a concrete scenario
- Define a Python programme for batch data processing
April 21st, 2020: Exam changed due to the Covid-19 situation and the change to online exams.
Lectures and exercises. The exercises will essentially be based on programming tasks, in Python and SQL.
Designing Data-Intensive Applications - Martin Kleppmann, O'Reilly
Student Activity Budget
Estimated distribution of learning activities for the typical student
- Preparation for lectures and exercises: 40%
- Lectures: 20%
- Exercises: 20%
- Exam with preparation: 20%
Ordinary exam
Exam type:
B: Oral exam, Internal (7-point scale)
B22: Oral exam with no time for preparation.