Abstract
This course introduces the basics of Bayesian statistics, Bayesian data analysis, Bayesian learning, and the programming tools that enable automation of these methods. The course emphasizes programmable statistical methods over pen-and-paper analysis.
Bayesian statistics and probabilistic programming are believed to be a proper foundation for the development and industrialization of the next generation of AI systems. Bayesian statistics provides a well-defined theoretical basis that is analytically understandable, while probabilistic programming provides the automation needed to industrialize the method.
This course introduces the basics of Bayesian statistics, Bayesian data analysis, Bayesian learning, Bayesian hypothesis testing, and the programming tools that enable automation of these methods. We will cover Bayesian reasoning and diagnosis and build models of concrete examples. We will learn several sampling methods and apply them to the problems at hand. The course emphasizes programmable statistical methods over pen-and-paper analysis. Every week we solve a programming exercise associated with the topic of the lecture.
We will study Bayesian analysis using an established textbook. For each chapter we will implement examples and exercises of models and analyses using Python's PyMC3 framework, probably the most popular probabilistic programming library today. Occasionally we will show examples in other probabilistic programming languages to illustrate concepts.
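To give a flavor of the kind of analysis involved, the sketch below estimates the bias of a coin by grid approximation. It is only an illustration in plain Python, not PyMC3 code and not course material: the data (7 heads, 3 tails), the uniform prior, and the function names `grid_posterior` and `posterior_mean` are all assumptions made for this example.

```python
# Grid approximation of the posterior over a coin's bias theta.
# The data, the uniform prior, and all names here are illustrative assumptions.

def grid_posterior(heads, tails, grid_size=1001):
    """Return (grid, posterior) for a uniform prior and a Bernoulli likelihood."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalized posterior: the uniform prior is constant, so only the
    # likelihood theta^heads * (1 - theta)^tails matters.
    unnorm = [t ** heads * (1 - t) ** tails for t in grid]
    total = sum(unnorm)
    posterior = [u / total for u in unnorm]
    return grid, posterior


def posterior_mean(grid, posterior):
    """Expected value of theta under the discretized posterior."""
    return sum(t * p for t, p in zip(grid, posterior))


grid, post = grid_posterior(heads=7, tails=3)
mean = posterior_mean(grid, post)
# For this prior and data the exact posterior is Beta(8, 4), with mean 8/12.
print(f"posterior mean of theta: {mean:.3f}")
```

A probabilistic programming framework automates exactly this step: the model is declared once, and the inference machinery replaces the hand-written grid.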
- You need to be a confident programmer in object-oriented and functional programming styles
- You should know basic probability theory at high-school level (concepts like discrete probability or the normal distribution). We will recall all the necessary notions in the course, but we will not provide a systematic course on probability theory. You may consider taking Linear Algebra and Probability in parallel if you lack systematic exposure to these topics.
- You should know Python, or be willing to learn it fast (learning Python fast is possible). We will spend only one class explaining the basics of Python, so if you do not know Python before taking the course, you will have to pick it up by yourself. Python is a language with a low barrier to entry, but if you have no experience we recommend that you start learning it before the class begins.
Intended learning outcomes
After the course, the student should be able to:
- Identify applications of Bayesian analysis
- Formulate Bayesian models
- Implement Bayesian models using the PyMC3 probabilistic programming framework in Python
- Learn model parameters from data
- Explain the differences between sampling algorithms, and select and use an appropriate inference algorithm
- Evaluate the models, inference, and learning queries experimentally
- Test probabilistic programs
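As one example of what a sampling algorithm looks like, here is a minimal random-walk Metropolis sketch for the bias of a coin. It is written in plain Python rather than PyMC3, and the data (7 heads, 3 tails), the uniform prior, the step size, the burn-in length, and the function names are assumptions chosen for illustration only.

```python
import math
import random


def log_posterior(theta, heads, tails):
    """Unnormalized log posterior of a coin bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior mass outside (0, 1)
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def metropolis(heads, tails, n_samples=20000, step=0.2, seed=0):
    """Random-walk Metropolis sampler with a Gaussian proposal."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, heads, tails)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, heads, tails)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


samples = metropolis(heads=7, tails=3)
kept = samples[2000:]  # discard an (assumed) burn-in period
estimate = sum(kept) / len(kept)
print(f"sampled posterior mean: {estimate:.2f}")
```

The sampled mean can be compared against the exact posterior mean of 8/12 for this prior and data, which is one simple way to test a probabilistic program experimentally.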
- About 12 lectures on probabilistic Bayesian modeling and programming. The lectures are an opportunity to reflect on the prescribed reading material and to engage in a discussion on this topic.
- About 12 exercise sessions on building models and programming analyses. The major part of the learning takes place in the exercises and the associated homework, as this is a programming-skill-oriented course.
- About 12 homework assignments: each exercise is meant to be continued as self-study homework after the exercise session. Group work and individual study are both permitted.
- Final mini-project (part of the final exam)
John Kruschke. Doing Bayesian Data Analysis. Academic Press 2015 (2nd edition)
Supporting material: Bayesian Methods for Hackers; available at: https://nbviewer.jupyter.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Prologue/Prologue.ipynb
Student Activity Budget
Estimated distribution of learning activities for the typical student
- Preparation for lectures and exercises: 27%
- Lectures: 11%
- Exercises: 31%
- Project work, supervision included: 26%
- Exam with preparation: 5%
Ordinary exam
Exam type:
D: Submission of written work with following oral, External (7-point scale)
D1G: Submission for groups with following oral exam based on the submission. Shared responsibility for the report.
A report on a small data analysis project prescribed by the teachers, produced in the JupyterLab programming system (an executable report).
The report is a joint group report, but students indicate how they share responsibility for its different parts.
- group size: 2-3 persons.
Individual exam: Individual student presentation followed by an individual dialogue. The student is examined while the rest of the group is outside the room.