CONTACT

email: lich@vscht.cz
address:
ICT Prague
Technická 5
166 28 Prague 6

Data mining (N143034SIS)

Lecture   Exercise   Laboratory   Examination   Credits
2         2          0            Z+Zk          5

Guarantor: Svozil Daniel
Teacher: Svozil Daniel

Annotation:

The aim of the course is to introduce students to the field of data mining. Students will become familiar with methods of data pre-processing, with machine learning algorithms, and with model performance measures and model selection approaches. The knowledge gained will be practised on real data using the open-source software RapidMiner.

Syllabus:

  1. Introduction to data mining. CRISP-DM. Data warehousing. OLAP.
  2. Pattern recognition - basic concepts. Supervised/unsupervised learning. Classification and regression. Generalization. Overfitting. Bias-variance tradeoff.
  3. Test set. Cross validation. k-nearest neighbors.
  4. Cluster analysis.
  5. Information theory. Decision trees.
  6. Neural networks I. Threshold neuron. ADALINE. Linear perceptron.
  7. Neural networks II. Multilayer Perceptron.
  8. Neural networks III. Radial Basis Function (RBF) Networks.
  9. Neural networks IV. Self-Organizing Map.
  10. Support Vector Machines.
  11. Genetic Algorithms.
  12. Feature Selection. Feature Extraction.
  13. Ensemble learning.
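
As a rough illustration of how two of the topics in item 3 fit together, the sketch below estimates the accuracy of a k-nearest-neighbors classifier with k-fold cross validation. It is plain Python, not the RapidMiner workflow used in the course, and it only assumes the standard definitions of both techniques. The function names `knn_predict` and `cross_val_accuracy` and the toy data are hypothetical, chosen for this example only.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, n_folds=5, k=3):
    """Mean accuracy over n_folds held-out folds.

    Fold i holds out every n_folds-th sample starting at index i;
    the remaining samples form the training set for that fold.
    """
    scores = []
    for i in range(n_folds):
        test = data[i::n_folds]
        train = [item for j, item in enumerate(data) if j % n_folds != i]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / n_folds


# Hypothetical toy data: two well-separated clusters of 2-D points.
data = [((x * 0.1, x * 0.1), "a") for x in range(10)]
data += [((5 + x * 0.1, 5 + x * 0.1), "b") for x in range(10)]

print(cross_val_accuracy(data))
```

Each sample is used for testing exactly once and is never part of the training set that classifies it, which is what keeps the accuracy estimate honest about generalization.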

Notes:

  • Credit project during the term, concluding with an oral presentation.
  • Written exam at the end of the term.