Section: Master of Science in Information Technology (Software Design)

This elective will leave you comfortable understanding, applying and presenting results from data mining.

Basic practices

How to do data mining well and effectively - the steps before running the tool

  • Formulating questions
  • Features, classification, basic machine learning
  • Managing data: loading, storing, accessing
  • Independent vs dependent variables
  • Data manipulation and visualisation
  • Overfitting and underfitting
  • Interpreting results: what we can say and what we can't

Algorithms

Most of these will be introduced

  • Clustering: k-Means
  • Clustering: RBF
  • Clustering: hierarchical agglomerative methods
  • Naive Bayes
  • Decision trees
  • Regression
  • Correlation
  • Neural net basics
  • Association rule mining
  • SVM and SVM variants
  • Dimensionality reduction, e.g. PCA

Tuning

How to get the best results, and fix data mining problems

  • Clustering: how many clusters?
  • Pruning trees
  • Outlier detection
  • High dimensional data
  • t-SNE: what it can and cannot do
  • Fast k-means

Applications

These are use-cases for demonstrating techniques.

  • mining text data
  • wage and age, wage and gender
  • image analysis
  • cultural similarity between countries
  • this list is open: send suggestions
