
Statistical methods for machine learning


Semester: Spring course. The course is offered in block 3 and in schedule group A. The course gives 7.5 ECTS credits.

 


Edition: Spring 2013 NAT
Credits (ECTS): 7.5
Block structure: Block 3
Schedule group: A
Subject area: dat
Duration: 8 weeks
Workload: 20 hours per week
Departments: Department of Computer Science
Curriculum: Computer Science, Master's
Programme level: Master's level
Contact persons: Christian Igel. E-mail: igel@diku.dk
Other teachers: Sami Brandt
Teaching format: Lectures and exercise classes
Purpose: Traditionally, computer science has dealt with modeling and analyzing deterministic data. However, certain types of data are stochastic by nature. This may be caused by massive amounts of data, by noise from physical measurements (sensor noise), or by complex dependencies in the data, as found in web page relationships (Google), IP network traffic, patterns in databases, signals and images, and the evaluation of human-computer interfaces. Machine learning and pattern recognition deal with the problem of learning models of data that can be used to analyze and predict new, previously unseen data. Statistical machine learning and pattern recognition form a set of tools that are widely applicable for data analysis in a diverse set of problem domains, such as data mining, search engines, digital image and signal analysis, natural language modeling, bioinformatics, physics, economics, and biology. The purpose of the course is to introduce students to probabilistic data modeling and the most common techniques from statistical machine learning and pattern recognition. This will be done in a case-oriented manner (in both lectures and exercises), with a focus on examples of applications from the different problem domains. The students will obtain a working knowledge of probabilistic data modeling and statistical machine learning for pattern recognition. The course is relevant for students of, among others, Computer Science, E-Science, Bioinformatics, Physics, and Mathematics.
Content: The course covers the following tentative list of topics:
* Foundations of statistical learning: probability theory, the likelihood framework, and parametric and non-parametric representations. This includes Gaussian distributions, histograms, kernel density estimation, and neighborhood-based estimation (KNN).
* Classification methods: linear models, including kernel-based methods such as support vector machines (SVM); K-nearest neighbor (KNN); and neural networks.
* Probabilistic interpretations of classification algorithms.
* Linear models for regression.
* K-means clustering and mixture modeling.
* Dimensionality reduction and visualization techniques such as principal component analysis (PCA).
Learning objectives: On completion of the course, the student should be able to:
1. Recognize and describe possible applications of machine learning for pattern recognition and data mining.
2. Explain, contrast and apply basic Bayesian probability theory for modeling stochastic data, including both parametric and non-parametric representations.
3. Explain and contrast the concepts of supervised and unsupervised learning.
4. Explain the concepts of classification and clustering.
5. Identify, explain and handle the common pitfalls of machine learning.
6. Describe and apply linear techniques for classification and regression.
7. Implement selected machine learning techniques.
8. Use software libraries for solving machine learning problems.
9. Visualize and evaluate results obtained with a machine learning method.
10. Compare, appraise and select methods of machine learning for solving specific problems of pattern recognition and data mining.
Textbooks: Expected to be Christopher M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006. ISBN: 0-387-31073-8
Registration: November 15 to December 1, 2012, via KUnet, www.kunet.dk
Academic prerequisites: Knowledge of basic linear algebra corresponding to the course Linear Algebra. Knowledge of programming at an introductory level.
Formal requirements: None
Exam form: Mandatory assignments must be passed in order to be eligible for the exam.
Exam: Written mini-project. Grade: 7-point scale. External grading. Submission in Absalon.
Re-exam: 20-minute oral exam in the course curriculum, without preparation. Grade: 7-point scale. External grading.
Exam date: Written mini-project, to be submitted on April 4, 2013.
Re-exam date: Oral exam on June 27, 2013.
Course website:
Teaching language: English only
Last edited: 31 October 2012



Københavns Universitet