RDFIA – Pattern Recognition for Image Analysis and Interpretation

Objective: This course introduces key concepts and methods for the automatic analysis and interpretation of visual content in images. It relies on modern machine learning approaches to cover both fundamental and advanced methods in computer vision. A particular emphasis is placed on deep learning architectures and their training, which play a central role in today’s computer vision. The course addresses a broad spectrum of vision tasks, including image classification, segmentation, vision-language models, and generative modeling. Recent advances such as convolutional neural networks at scale, Vision Transformers, self-supervised learning, and diffusion models are covered in detail. Issues of model robustness, explainability, transfer learning, and domain adaptation are also discussed.

Theoretical lectures are complemented by hands-on programming sessions in Python, where students implement and experiment with the models studied. Information about the practicals: Here


Course 1 (Sept. 24, 2025)

Introduction to Computer Vision and ML basics slides1
Visual (local) feature detection and description
Bag-of-Words image representation
Linear classification (SVM)

Course 2
Introduction to Neural Networks (NNs), training, and statistical decision theory slides2


Course 3

Datasets, benchmarks and evaluation slides3_DATA
Neural Nets for Image Classification slides3


Course 4
End of Course 3: convolutional architectures and the first large ConvNet, AlexNet
Weight initialization: Xavier formula
More on normalization: BatchNorm, LayerNorm, InstanceNorm
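
As a point of reference for the Xavier formula listed under Course 4, here is a minimal sketch of the uniform variant of Xavier (Glorot) initialization, which draws weights from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), so that their variance is 2 / (fan_in + fan_out). The function names, the layer sizes (256 and 128) and the pure-Python implementation are assumptions made for illustration; they are not taken from the course material.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    """Sample a fan_in x fan_out weight matrix from U(-limit, limit)."""
    # Xavier/Glorot uniform bound: sqrt(6 / (fan_in + fan_out))
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

def variance(matrix):
    """Empirical variance over all entries of a nested-list matrix."""
    values = [w for row in matrix for w in row]
    mean = sum(values) / len(values)
    return sum((w - mean) ** 2 for w in values) / len(values)

# Illustrative layer sizes (assumed, not from the course).
rng = random.Random(0)
fan_in, fan_out = 256, 128
weights = xavier_uniform(fan_in, fan_out, rng)

# A uniform law on (-a, a) has variance a^2 / 3 = 2 / (fan_in + fan_out).
target_variance = 2.0 / (fan_in + fan_out)
print(f"empirical variance: {variance(weights):.5f}, target: {target_variance:.5f}")
```

In practice a deep learning framework's built-in initializer would be used; the sketch only makes the variance argument concrete.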


Course 5

Large Neural Nets for Image Classification slides5_LargeConvNet

Vision Transformers slides_vit1


Course 6
Details of the Vision Transformer architecture: embedding matrix, positional encoding, attention module, FFN module, norms, class token, …

Transfer learning slides6_Transfer

Vision-Language models (part I): CLIP models slides6_VLM_CLIP
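
To make the Vision Transformer components listed for Course 6 more concrete, the sketch below computes the token sequence length produced by patching an image (with an optional class token) and runs single-head scaled dot-product attention on a few toy tokens. The image size 224, patch size 16, the helper names and the toy token values are assumptions chosen for illustration. The learned query/key/value projections, positional encoding, FFN and normalization layers are deliberately omitted.

```python
import math

def vit_token_shapes(image_size, patch_size, channels=3, use_class_token=True):
    """Return (num_tokens, flattened_patch_dim) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    num_tokens = num_patches + (1 if use_class_token else 0)
    patch_dim = channels * patch_size ** 2  # input size of the embedding matrix
    return num_tokens, patch_dim

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Single-head scaled dot-product attention on lists of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# Assumed configuration: 224x224 RGB image cut into 16x16 patches.
num_tokens, patch_dim = vit_token_shapes(224, 16)

# Toy self-attention: the same tokens serve as queries, keys and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, tokens, tokens)
```

Each output token is a convex combination of the value vectors, which is why attention is often described as a content-dependent averaging step.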


Course 7
Vision-Language models (part II): MLLMs
Explaining and monitoring MLLMs


Course 8
Segmentation
Domain adaptation


Course 9
Self-Supervised Learning in Vision


Course 10
Generative models with GANs


Course 11 (Jan. 07)
Exam 2pm-3:45pm + practicals 4pm-6pm

Conditional GANs and a teaser on diffusion models


Course 12
Diffusion models for Image Generation (Alasdair)


Course 13
Bayesian deep learning (Clement)


Course 14
Failure and OOD detection (Clement)


Prerequisites: Basic knowledge of digital image representation, statistical data processing, and scientific computing in Python.

Further reading (available at SorbonneU library):
Pattern Recognition and Machine Learning, C. M. Bishop
Deep Learning, I. Goodfellow, Y. Bengio, A. Courville
Computer Vision: Algorithms and Applications, R. Szeliski