


Supervised Learning: Regression and Classification 


13.5h L / 3h R / 1 WE + 1 OE / 4 ECTS credits in common with IIC_OAD1 and IIC_OAD2 / IIC_OAD3 


Michel IANOTTO (4h30) and Hervé FREZZA-BUET (9h) 


Statistical Learning Theory 


We will present the main ideas of Vapnik’s statistical learning theory and show how this theory provides tools to control the risk of overfitting. We will cover the notions of real risk, empirical risk, and the Vapnik-Chervonenkis dimension, as well as different induction principles (empirical and structural risk minimization). Students will be prepared to understand the consequences of these theoretical results for any given learning technique.
Real and empirical risks. Induction principles. Empirical risk minimization. Structural risk minimization. Bounds, PAC learning, VC dimension of a hypothesis space. Cross-validation.
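The gap between empirical and real risk can be made concrete with a small sketch (an illustration only, not part of the course material; the toy data and polynomial model are made up for the example, and only numpy is assumed): raising a model's capacity drives the empirical risk toward zero, while a cross-validation estimate of the real risk exposes overfitting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x) + noise
x = rng.uniform(0, 3, 30)
y = np.sin(x) + rng.normal(0, 0.2, 30)

def risk(deg, x_tr, y_tr, x_te, y_te):
    """Squared-loss risk of a degree-`deg` polynomial fitted on the
    training data and evaluated on the test data."""
    coeffs = np.polyfit(x_tr, y_tr, deg)
    return np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)

def cv_risk(deg, x, y, k=5):
    """k-fold cross-validation estimate of the real risk."""
    folds = np.array_split(rng.permutation(len(x)), k)
    risks = []
    for fold in folds:
        mask = np.ones(len(x), bool)
        mask[fold] = False  # hold this fold out for testing
        risks.append(risk(deg, x[mask], y[mask], x[~mask], y[~mask]))
    return np.mean(risks)

for deg in (1, 3, 12):
    # Empirical risk: evaluated on the training sample itself
    emp = risk(deg, x, y, x, y)
    cv = cv_risk(deg, x, y)
    print(f"degree {deg:2d}: empirical risk {emp:.3f}, CV risk {cv:.3f}")
```

The empirical risk shrinks monotonically as the hypothesis space grows, while the cross-validated estimate typically bottoms out at a moderate degree and rises again for high-capacity models.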



Support Vector Machines (SVM) (6h) 


This course will cover the SVM method for regression and classification, emphasizing the expressive power of kernel methods. The main ideas of the classical SMO algorithm for solving the SVM optimization problem will be presented, so that the significance of each parameter can be understood.
Classification: defining the margins, then stating the optimization problem. Lagrangian resolution: dual problem, principle of the SMO method. Regression: another type of optimization problem, the epsilon-insensitive loss. Other SVM variants. Using kernels: Mercer’s property and construction rules, Gaussian and polynomial kernels for textual data, kernels for structured data, real risk bounds.
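The effect of the kernel choice can be illustrated with a minimal sketch (assuming scikit-learn is available, which the course does not specify; the circular dataset is made up for the example): a Gaussian kernel lets an SVM separate classes that are not linearly separable in input space.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy binary problem: points inside vs outside a circle,
# which no linear separator in the original space can handle
X = rng.uniform(-1, 1, (200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5).astype(int)

# The Gaussian (RBF) kernel implicitly maps the data into a richer
# feature space where the two classes become separable
for kernel in ("linear", "rbf", "poly"):
    clf = SVC(kernel=kernel, C=1.0).fit(X, y)
    print(f"{kernel:6s} training accuracy: {clf.score(X, y):.2f}")
```

The `C` parameter trades margin width against training errors, which is exactly the soft-margin formulation whose dual SMO optimizes.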



Perceptrons (4h30) 


Perceptrons are among the most frequently used learning methods in industry, and they have been investigated in depth. Viewing perceptron training as a gradient descent method, results from the second-year course “Numerical Methods and Optimization” will be applied to this specific case (backpropagation, conjugate gradients, Newton methods). The topic will be broadened by introducing perspectives on pruning techniques and radial basis function networks.
Single-layer perceptron. Multilayer perceptron (backpropagation, conjugate gradients, etc.). Optimal Brain Damage, OWE. Regularization. RBF networks.
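The gradient-descent view of multilayer perceptron training can be sketched in a few lines (a toy illustration with made-up layer sizes and learning rate, not the course's own implementation; only numpy is assumed). The backward pass below applies the chain rule layer by layer, which is all backpropagation is.

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR: the classic problem a single-layer perceptron cannot solve
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
t = np.array([[0], [1], [1], [0]], float)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# One hidden layer of 4 units, weights drawn small at random
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)

lr = 1.0
losses = []
for _ in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((y - t) ** 2)))
    # Backward pass: gradients of the squared error via the chain rule
    dy = (y - t) * y * (1 - y)
    dh = (dy @ W2.T) * h * (1 - h)
    # Gradient-descent updates
    W2 -= lr * h.T @ dy
    b2 -= lr * dy.sum(0)
    W1 -= lr * X.T @ dh
    b1 -= lr * dh.sum(0)

print(f"final training MSE: {losses[-1]:.3f}")
```

Replacing the plain gradient step with conjugate gradients or a Newton method, as discussed in the course, only changes how the update direction is computed from these same gradients.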






References
A. Cornuéjols, L. Miclet and Y. Kodratoff, Apprentissage artificiel, concepts et algorithmes, Eyrolles, 2002.
Christopher M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995.
Nello Cristianini and John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, 2000.
Vladimir Vapnik, The Nature of Statistical Learning Theory, Springer, 1995.



Last update 06/07/2007 by Cl.M. 
