Matthieu Cord - Miscellaneous

Demo Foodingue!

Amazing Sorbonne Univ collection (MNHN)

News Archives

2023

Main publi/infos:

3 NeurIPS OBELICS, Rewarded soups and REx: Data-Free Quantization
2 ICCV eP-ALM: Efficient Perceptual Augmentation of Language Models and ZestGuide accepted to ICCV 2023 in Paris!
1 paper UnIVAL: Unified Model for Image, Video, Audio and Language Tasks accepted to MMFM Workshop @ICCV.
1 ICML Model Ratatouille: Recycling Diverse Models for OOD generalization
4 CVPR Counterfactual Explanations, Semantic and Panoptic Segmentation, Visual Recognition, Improving Selective VQA
2 ICLR Image editing with diffusion models and Data free quantization of deep nets

1 IJCV paper on Explainability of vision-based autonomous driving systems: Review and challenges

Andrew Ng’s newsletter pointed out our recent strategy for training Transformers in vision: Cookbook for Vision Transformers: good insight about our DeiT III!

2022

Main publi/infos:

Dec.: 2 papers presented at NeurIPS, 1 paper at CoRL
Nov.: 1 paper at BMVC Conference Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment!
Oct: 3 papers presented at ECCV including STEEX! and 2 papers about DeiT
July: 1 paper at ICML Fishr: Invariant Gradient Variances for Out-of-distribution Generalization
June: 2 papers at CVPR 1- Flexible Semantic Image Translation, 2- Transformers for continual learning + 8 workshop papers:
T-FOOD Transformer for Cross-Modal Food Retrieval,
Raising context awareness in motion forecasting,
Embedding Arithmetic of Multimodal Queries for Image Retrieval,
Towards efficient feature sharing in MIMO architectures,
Dynamic Query Selection for Fast Visual Perceiver,
Swapping Semantic Contents for Mixing Images,
Multi-Head Distillation for Continual Unsupervised Domain Adaptation in Semantic Segmentation
and RED++: Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging published in T-PAMI
Jan. 2022: MLIA is joining the ISIR lab of Sorbonne University; it is a fantastic opportunity with a lot of new challenges for us!

2021

Main publi/infos:

2 NeurIPS – Data-Free Compression of Deep Neural Networks and on Efficient Black-box Explanations for deep nets
5 ICCV – domain adaptation, bias, Mixing and data augmentation, self supervised, transformers (Xtarget Domain adaptation, Bias and Xmodal shortcuts, CaiT Deeper with transformers, Deep ensembles with 1 net, Coarse labels but fine-grained classification)
1 ICML – DeiT data-efficient image transformers
3 CVPR – Continual Semantic Segmentation, Semantic editing GANs, and Self Supervised BoW + workshops: Insights from the Future for Continual Learning at CVPR workshop on continual learning in computer vision, Learning Reasoning Mechanisms for Unbiased Question-based Counting at CVPR workshop on VQA
1 ICLR – diversity in deep ensembles
1 BMVC – GAN editing
1 PAMI – confidence for deep
and others …
1 IEEE Trans on Intelligent Transportation Systems – Detecting 32 Pedestrian Attributes for Autonomous Vehicles
1 CVIU – zero-shot semantic segmentation with domain adaptation

1 paper BEEF on XAI and Autonomous driving at the NeurIPS workshop: Machine Learning for Autonomous Driving

2020

Main publications:

2 ECCV papers accepted on PODNet incremental learning and Knowledge distillation
1 CVPR paper accepted on self learning: Learning Representations by Predicting Bags of Visual Words
1 PR Journal paper on semantic segmentation SEMEDA
1 ICASSP paper on GAN: This dataset does not exist: training models from generated images
1 AAAI paper accepted on our GAN strategy for Cross-Channel Image Completion

Participation in PhD defenses in 2020: Benjamin Piwowarski (HDR 10/23), Srijan Das (10/01), Thomas Lucas (09/25), Yifu Chen (09/09), Pierre Jacob (08/09), Rémi Cadene (07/08), Daniel Brooks (07/03), Mickaël Chen (07/02), Martin Engilberge (06/12), Adrien Poulenard (04/15)

2019

Main publications:
4 NeurIPS (NIPS) – Bias in VQA, zero-Shot Segmentation, Failure Detection, Riemannian NNets
4 ICCV – UDA, Face alignment, Few-shot, 3D manifold learning
3 CVPR – SoDeep, Advent and MUREL
1 AAAI – Tensor decomposition

Oct.
ICCV conf.
PhD defense of E. Mehr with MP. Cani, L. Guibas, T. Boubekeur, K. Bally, M. Ovsjaniko, V. Guiteny, M. Cord (Supervisor)
PhD defense of T. Robert with S. Canu, G. Mori, C. Achard, K. Amahari, D. Picard, N. Thome, M. Cord (Supervisor)

Sept.
4 NeurIPS (NIPS) accepted! Bias in VQA, 0-Shot Segmentation, Failure Detection, Riemannian NNets

Aug.
PhD defense of J. Peyre, with J. Sivic, C. Schmid, I. Laptev, F. Jurie, S. Lazebnik, M. Cord (Rev)

July
4 ICCV accepted!
PhD defense of Junhao Wen with F. Barkhof, P. Coupé, O. Commowick, O. Colliot, S. Durrleman, M. Cord (President)

June
CVPR conference

May
PhD defense of H. Ben-younes Multi-modal representation learning towards visual reasoning with Y. LeCun, J. Verbeek, V. Ferrari, C. Wolf, P. Perez, L. Soulier, N. Thome, M. Cord (Supervisor)

March
3 CVPR SoDeep, Advent and MUREL accepted this year, including 2 orals!
PhD defense of Arthur Guillon with N. Labroche, J. Velcin, A. Cornuéjols, C. Frélicot, MJ. Lesot, C. Marsala, M. Cord (President)

Feb
HdR defense of K. Bailly with L. Heutte, JM. Odobez, B. Schuller, M. Chetouani, M. Cord (Pres), J. Crowley

Jan
Our PAMI, IJCV and Neurocomputing journal papers are online on my publication page!
Our CVPR 2018 code on multimodal (text+im) embedding is online

2018

Dec
NeurIPS presentation of our poster Revisiting Multi-Task Learning with ROCK
PhD defense of J. Ogier du Terrail with E. Fromont, S. Lefevre, F. Jurie, M. Cord (Pres), F. Oudyi
PhD defense of K. Blanc with F. Bremond, B. Merialdo, T. Menguy, N. Thome, M. Cord (Rev), D. Lingrand
PhD defense of M. Carvalho Deep space representation with L. Soulier, E. Gaussier, S. Lefevre, F. Precioso, H. Leborgne, N. Thome, M. Cord (Supervisor)
Nov AAAI 2019: our paper BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection accepted!
PhD defense of T. Mordan Designing deep architectures for visual understanding with F. Perronnin, J. Sivic, A. Alahi, N. Neverova, N. Thome, M. Cord (Supervisor)
PhD defense of M. Blot Study on training methods and Generalization performances of deep learning with A. Rakotomamonjy, C. Wolf, E. Fromont, A. Bellet, L. Ralaivola, N. Thome, M. Cord (Supervisor)
Oct ICIP 2018: BEST PAPER AWARD for our paper SHADE: INFORMATION-BASED REGULARIZATION FOR DEEP LEARNING!
Sept NIPS 2018: our paper Revisiting Multi-Task Learning with ROCK: a Deep Residual Auxiliary Block for Visual Detection accepted!
July ECCV 2018: our paper HybridNet: Classification and Reconstruction Cooperation for SSL accepted
June RFIAP conference: with 3 invited speakers: A. Alahi, Y. LeCun, Maja Pantic
PhD defense of H. Jain with E. Kijak, F. Jurie, M. Cord(R), F. Perronnin(R), P. Perez, C. Guillemot, R. Gribonval, H. Jégou
PhD defense of Y. Tamaazousti with I. Kokkinos, PH. Gosselin, C. Hudelot, H. Leborgne, M. Cord(P), P. Piantanida, F. Perronnin
May PhD defense of B. Moysset with E. Fromont, L. Heutte, C. Couprie, C. Wolf, C. Kermorvant, M. Cord (Pres)
April Our paper Cross-modal retrieval in the cooking context: Learning semantic text-image embeddings accepted as a full paper for oral presentation at SIGIR 2018! Complemented by an ICDE 2018 workshop paper dedicated to computational cooking scenarios
April Interview on France Culture with René Frydman
March 2 papers accepted at CVPR 2018!
March Our TNNLS paper SyMIL: MinMax Latent SVM for Weakly Labeled Data, with T. Durand and N. Thome accepted!
Feb DeepVision Workshop in Vancouver
Jan PAMI paper on Negative Evidence for Deep Latent Structured Models accepted!

2017

Dec 19 PhD defense of R. Rezende with P. Perez, F. Perronnin, J. Ponce, F. Bach, F. Jurie, M. Cord (Pres)
Nov 20 Talk at Deep Learning UPMC
Nov 09 VQA Talk at INRIA Sophia
Nov 09 PhD defense of M. Koperski with JM Odobez, F. Precioso, F. Bremond, M. Cord (rev), L. Sigal, F. Gianpiero
Nov 02 and 03 deep learning course at Institut d’automne de l’IA, Lyon
Oct 06 PhD defense of E. Oyallon with P. Perez, F. Perronnin, S. Mallat, I. Laptev, N. Paragios, M. Cord (Pres)
Oct Talk on Visual Question Answering VQA in FranceIsIA event, Paris
Sept 29 PhD defense of my Student X. Wang with F. Precioso, N. Thome, M. Cord, P. Le Callet, C. Achard, C. Wang, PH. Gosselin
Sept 20 PhD defense of my Student T. Durand with F. Bach, C. Schmid, N. Thome, M. Cord, P. Perez, A. Rakotomamonjy, V. Serfati (Invited)
Sept BMVC 2017 Best Science paper Award for our paper DEFORMABLE PART-BASED FULLY CONVOLUTIONAL NETWORK FOR OBJECT DETECTION
July ICCV 2017 paper on Tensor decomposition for VQA task accepted
July BMVC 2017 paper deep architecture for detection accepted
July CVPR WILDCAT paper presentation, VQA workshop (Challenge VQA2)
June Invited speaker, deep learning session in Big data advanced school, San Carlos, Brazil
Feb Talk on Global average pooling in deep ConvNets at THOTH INRIA Grenoble
Feb, 06: PhD def. of M. Paulin with J. Sivic, V. Lepetit, C. Wolf, F. Perronnin, J. Mairal, C. Schmid, Z. Harchaoui, M Cord (Reviewer)
Jan, 23: PhD def. of P. Kulkarni with P. Perez, F. Jurie, S. Canu, J. Zepeda, J. Verbeek, M Cord (Reviewer)

2016

Dec., 2016: Invited speaker in Panel session in Future of Emerging Technologies – Memristors and Machine Learning, in 23rd IEEE International Conference on Electronics Circuits and Systems (ICECS), Monte Carlo, Monaco
Dec., 2: PhD def. of M. Chevalier with S. Canu, F. Bremond, C. Achard, P. Perez, M. Cord, G. Henaff, N. Thome
Nov., 3: PhD def. of J. Pasquet with M. Chaumont, C. Garcia, M. Cord, V. Charvillat, G. Subsol, P. Poncelet
Oct., 2016: Invited talk in Symposium on Deep Learning and Artificial Intelligence, Tokyo, Japan
Oct., 2016: Invited talk Deep learning and weak supervision for visual recognition in GdR ISIS workshop
June, 2016: 2 papers accepted at CVPR 2016
June 8, 2016: Talk at INRIA Thoth symposium on Computer Vision and Deep Learning
June 7, 2016: HDR defense of Jakob Verbeek with M. Cord (reviewer), A. Zisserman, E. Gaussier, E. Learned-Miller, C. Schmid, T. Tuytelaars.
May 19, 2016: Talk at I3S lab. on deep and weak supervision
April 14, 2016: Half Day Deep learning Workshop (2nd edition) with GdR-ISIS at UPMC
- Introduction by M. Cord
- Plenary Talk of Yann LeCun (Facebook AI research, NYU, College de France) on predictive learning
- Poster session
March 8, 2016: PhD def. of J. Nicolle with M. Cord (Pres), JP Thiran, B. Schuller, C. Garcia, K. Bailly, M. Chetouani
Feb, 2016: (Press) France Culture Radio show on LeCun and deep learning and broccoli…

2015

Dec: Paper on min-max LSSVM for classification & ranking presented at ICCV 2015 conf.
Sept, 7: Tutorial with Aurélien Bellet on Metric Learning at ECML-PKDD, link to talks
Sept, 3: Invited talk at the SMART summer school
Aug, 31: PhD def. of Christion A. Kuoman with M. Cord (Pres), S. Tollari, M. Detyniecki, Ph. Mulhem, A. Popescu, B. Ionescu
July, 1: HDR def. of N. Thome with C. Schmid, N. Paragios, M. Cord (Pres), S. Canu, F. Jurie, S. Marchand-Maillet
June, 30: PhD def. of C. LeBarz with F. Brémond, L. Denoyer, N. Thome, M. Cord, JY. Dufour, D. Filliat, S. Herbin
June: seminar in Brazil, UNICAMP and UFMG
Apr, 28: seminar at The Hong Kong Polytechnic University while visiting my colleague Lei Zhang
March, 20: Half Day Deep learning Workshop with GdR-ISIS
Feb, 02: invited talk at the LIRIS monthly seminars on visual metric learning
Jan, 22: PhD def. of A. Sanoja with M. Cord (Pres), M. Rukoz, E. Murisasco, L. Bouganim, P. Senellart, S. Gancarski
Jan, 20: PhD def. of M. Law with F. Bach, F. Precioso, J. Ponce, P. Perez, P. Gallinari, M. Cord, S. Gancarski, N. Thome
Jan, 15: PhD def. of I. Mhedhbi with P. Garda, A. Beghdadi, H. Rabbah, K. Hachicha, A. Bensrhair, D. Heudes, S. Hochberg, M. Cord (President)

2014

Dec: PhD Thesis of Hanlin Goh Learning deep visual representations
Dec: our CVPR 2014 paper on fantope available
Dec, 02: HDR def. of M. Visani with I. Bloch, F. Sedes, R. Ingold, JM. Ogier, K. Tombre, JM. Jolion, M. Cord (Pres)
Oct, 06: HDR def. of V. Courboulay with JP Domenger, C. Fernandez-Maloigne, P. Le Callet, A. Guerin, M. Cord (Reviewer), A. Tabbone, A. Revel
Sept, 02-03: Invited speaker at the first Franco-Taiwanese workshop at EURECOM
Sept, 29: PhD def. of A. Cahn Hon Tong with S. Canu, A. Caplier, M. ElYacoubi, C. Achard, L. Lucat, M. Cord (Pres)
Sept, 16: HDR def. of S.A. Berrani with T. Ebrahimi, N. Boujemaa, M. Cord (Rev), B. Merialdo, P. Gros, J. Carrive, Ph. Joly
Sept, 05: PhD def. of D. Awad with F. Precioso, A. Revel, V. Courboulay, M. Cord, B. Girau
June, 13: PhD def. of A. Fagette with I. Laptev, L. Wong (NUS), F. Moutarde, O. Koch, D. Racoceanu, J. Dufour, M. Cord (Pres)
May, 13: PhD def. of A. Bourrier with P. Perez, R. Gribonval, F. Bach, E. LePennec, G. Blanchard, M. Cord (Chair)
April, 15: Talk at Lyon LIMA2
April, 9: PhD def. of M. Jain with A. Smeulders, C. Schmid, P. Perez, P. Bouthemy, P. Gros, H. Jegou, M. Cord (reviewer)
April, 3: PhD def. of M. Jiu with C. Schmid, G. Taylor, M. Rombaut, A. Baskurt, C. Wolf, M. Cord (reviewer)
March, 31: PhD def. of DP Vo with H. Sahbi, J. Benois-Pineau, C. Djeraba, F. Jurie, JM Ogier, M. Cord
Feb, 17-18: SCAPE project meeting
Feb, 11: PhD def. of A. Znaidia with H. Maitre, B. Merialdo, P. Lambert, S. Ayache, S. Marchand-Maillet, N. Paragios, C. Hudelot, H. Le Borgne, M. Cord
Jan, 6: PhD def. of Z. Akata with C. Lampert, C. Schmid, V. Ferrari, F. Perronnin, G. Quenot, M. Cord (reviewer)

2013

Dec: NIPS conf presentation of our paper on deep learning
Dec: Talk at Google, Mountain View on metric learning
Nov, 12: Final presentation for ANR GeoPeuple Detection
Oct, 11: PhD def. of Pehlivan Zeynep
Oct, 09: PhD def. of P. Phothisane with T. Chateau, R. Seguier, M. Cord, K. Bailly, L. Prevost, E. Bigorne
Oct, 02: PhD def. of LAI Hien Phuong about interactive CBIR with C. Garcia, F. Sedes, B. Merialdo, M. Cord, JM Ogier, M. Visani, A. Boucher
Sept: Invited talk in ERMITES summer school
Sept, 30: PhD def. of my student Denis Pitzalis with N. Paparoditis, F. Niccolucci, F. Precioso, M. Joly and M. Detyniecki
Sept: 1 paper accepted at NIPS 2013
Sept: 1 paper accepted at ICCV 2013
Aug: Rodrigo Minetto First place of PhD Thesis of SIBGRAPI 2013
July, 12: PhD def. of my student Hanlin Goh with Y. LeCun, F. Jurie, A. Rakotomamonjy, JH. Lim as external members
July, 9: PhD def. of F. Martinez on Eye tracking with L. Chen, JM Odobez, A. Caplier, F. Brémond, M. Cord as external members
June, 27: Poster CVPR on learning local motion descriptors for video classification
June, 14: PhD def. of my student Sandra Avila! with C. Schmid, F. Perronnin, P. Perez, M. Campos, and P. Gallinari (+superv.)
June, 5: PhD def. of M. Trad on multimedia mining (chair) with B. Merialdo, A. Joly, N. Boujemaa, F. Precioso, D. Teyssou, M. Cord
May, 21: PhD Def. of Alina Abduraman on TV content analysis with M. Cord (chair), B. Merialdo, Ph. Joly, G. Gravier, J. Carrive, S. Berrani
Jan: Session (Math. and Computer Science) chairman, JFFoS symposium, Kyoto, Japan
- Introduction by M. Cord Speakers: T. Harada and I. Laptev
Jan: JMLR paper accepted in JMLR Track for Machine Learning Open Source Software project

2012

Dec, 20: talk at GdR Isis workshop
Dec, 13: seminar at Columbia Univ., NY, USA
Dec, 12: talk at ICMLA conf., Boca Raton, FL, USA
Dec, 07: HDR def. of Antoine Manzanera with P. Bouthémy, M. Couprie, M. Cord, JM Jolion, B. Zavidovique, M. Paindavoine
Oct, 23: PhD Def. of MM Ullah on Human Action Recognition in Video (reviewer), with P. Perez, I. Laptev, F. Precioso, T. Tuytelaars, P. Bouthemy, E. Kijak
Oct, 15: Keynote speaker at 2nd Singaporean-French IPAL Symposium
Oct, 7-13: ECCV conf, poster presentation
Sept: PhD Def. of Roxana Horincar on dynamic Web content analysis (chair)
Sept: PhD Def. of Bahjat Safadi on semantic indexing of videos (reviewer)
Aug: ERMITES summer school
July: seminar in CNRS Summer School on Images, Content, recognition, classification
June: Workshop SCAPE in Paris
May, 23: INRIA Sophia seminar (Stars team) on Beyond the Bag of Word representation for image classification
May, 2: PhD defense (reviewer) of JP. Burochin, with P. Sturm, C. Heipke, F Tupin, N. Paparoditis, O. Tournaire
April, 3: PhD defense (reviewer) of Chao Zhu, with Liming Chen, CE Bichot, C. Schmid and J. Benois-Pineau
March, 19: PhD defense of my student Rodrigo Minetto with Patrick Perez and Marcin Detyniecki for the French side and Jorge Stolfi and 3 colleagues in Brazil
Feb: 1 paper accepted at ESANN 12 on learning product combinations of kernels with D. Picard, N. Thome and A. Rakotomamonjy

2011

Dec: seminar at Columbia University and visit to Prof. Chang’s DVMM lab
Dec: seminar at NYU
Dec: PhD Def. of Alex Spengler on Web content analysis (chair)
Dec: PhD Def. (reviewer) of V. Dovgalecs
Nov: PhD Def. (chairman) of J. Defretin
Nov: Talk at ICCV workshop on Text Detection SnooperPlus.pdf
Nov: Poster at ICCV workshop on kernel learning cordWshopIccv.pdf
Oct: Talk at I3S lab., Sophia Antipolis Talki3sOct2011.pdf
Oct: Steve Jobs papers lemonde, libe
Oct: SCAPE Workshop at UPMC: Web page comparison talk
Sept: 2 papers accepted to workshops at ICCV 2011
Sept: ICIP conf. in Brussels
Sept: Midterm defense on deep architectures for image, Hanlin Goh midterm.pdf
Sept: Midterm evaluation of ANR ASAP slidesASAPmiparcours.pdf
August: paper on chi2 LSH accepted to PAMI
June: PhD Def. of Shuji Zhao
May: Yann LeCun invited in LIP6 for 1 month Talk_1.pdf
March: IGN french presentation on our text detection system IGNcord.pdf
Feb: Kick-off meeting of EU IP SCAPE project
Jan: Kick-off meeting of ANR Geopeuple project

2010

Dec: PhD Def. of D. Gorisse, HdR Def. of F. Precioso, PhD def. of JE. Haugeard
Nov: HdR def. of Maria Rifki (Pres.), PhD of P. Massip
Sept: ICIP in Hong Kong
Sept: SCAPE, new European IP project, accepted
Sept: Geopeuple, new ANR project with EHESS and IGN, accepted
July: PhD Def of T. Napoleon (reviewer), K. Bailly (Pres.)
June: PhD Def of Trinh-Minh-Tri Do (Pres.)
June: 4 papers accepted to ICIP 2010
April: invited seminar on similarity and scalability issues on CBIR at UniG university and at ECAIS
March: My paper An Application of Swarm Intelligence to Distributed Image Retrieval (with Picard and Revel) accepted to Information Sciences Journal
Feb: PhD Def of A. Bordes (Pres.), AM Tousch (reviewer)

Old News

2009

Dec09: IUF nomination
PhD Def of M. Hanif, E. Aldea
Nov09: ASAP new ANR project on deep learning
3 papers presented at ICIP 09 in Egypt
PhD Def. of Corina Iovan
GdR ISIS workshop organisation on scalability and cross-media
Oct09: Organized the Digital Video event at Sibgrapi09, Brazil
Sept09: 2 papers in ISPRS workshop CMRT
PhD Def. of A. Auclair, MI. Akodjènou, J. Forest
March09: ITOWNS presentation at the DAPA day
Feb09: 1 week invited at I2R Institute in Singapore
Several Participations:
Program committee of CMRT09
Program committee of SinFra’09
Technical program committee of The 15th International MultiMedia Modeling Conference (MMM2009) http://mmm2009.eurecom.fr/
Program committee of CBMI 2009
Program committee of the International Conference on Imaging Theory and Applications 2009 http://www.imagapp.org//cfp.htm
Ad hoc reviewer for NSF, program 2009, USA
CR2/CR1 committee of INRIA Rocquencourt 2009
Reference expert for Agropolis fondation, 2009

2008

Dec08: seminar at LEAR lab.
ANR review presentation
Paper in ICPR 08 on fast LSH kernels
Paper in ICIP 08 on distributed CBIR
Paper in CIKM 08 on high-dim indexing
Paper in TrecVid workshop TVS 08
PhD Def. of Diane Larlus Nov 28, LEAR lab, INRIA Grenoble
PhD Def. of David Picard Dec 5, ETIS lab

Some PhD theses for which I served on the jury (or am scheduled to):

[2008] Nicolas Burrus

Apprentissage a contrario et architecture efficace pour la détection d’évènements visuels significatifs

[2008] Diane Larlus

Création et utilisation de vocabulaires visuels pour la catégorisation d’images et la segmentation de classes d’objets
Abstract:
This thesis deals with the interpretation of static images, with a focus on recognizing object categories. We consider several different approaches, which are all variations on the bag-of-words model, and all use local image descriptors. The first part of the thesis examines different methods for creating visual vocabularies. We aim to create vocabularies which perform well for image categorization. The first method proposed uses dense image representations. Feature descriptors are extracted and then quantized to visual words, using a two-stage clustering algorithm. We provide a full quantitative evaluation of the method. The second method we propose for creating visual vocabularies integrates the vocabulary into an image representation model. This generative model uses the image class labels via latent variables describing object aspect. Training the model leads to the creation of a compact and discriminative set of visual words. Next we show that traditional visual vocabularies (like the ones used above) can be replaced by random decision trees. Each tree provides a quantization of the space of descriptor representations into visual words. Since the trees are constructed using the image class labels, they have good classification performance. Each node uses a simple classifier, so processing of test images is fast. The random trees are also used for online learning of saliency maps which guide the process of descriptor sampling. The second part of the thesis deals with object category segmentation. We first present a method which uses an extended latent aspect based model. Instead of considering aspects at the image level, the method models them at the sub-region level. These semi-global regions can overlap and share information, allowing the local predictions to be improved. The local classifications are based on visual word statistics. 
The second segmentation method combines the low-level consistency properties of a Markov Random Field with an appearance model which provides higher-level constraints. The appearance model is based on regions which each represent a single object as a set of visual words. We also evaluate using decision trees instead of visual words. Finally, the method is applied to a real-world visual search problem for a humanoid robot. The method is used to generate hypotheses about the position of an object in the robot’s field of view.

[2008] Karim Yousfi
Segmentation hiérarchique optimale par injection d’a priori radiométrique, géométrique ou spatial

[2008] Trang Vu
Apprentissage d’ordonnancements pour la constitution de Corpus d’evaluation et pour l’Agregation de listes en Recherche d’Information

[2008] Anne-Lise Chesnel
Damage assessment on buildings due to major disasters using very high resolution satellite multimodal images

[2008] Anastasia Krithara
Learning Aspect Models with Partially Labeled Data
Abstract:
Machine learning techniques have been used for various information access tasks, such as categorization, clustering or information extraction. Acquiring the annotated data necessary to apply supervised learning techniques is a major challenge for these applications, especially in very large collections. Annotating the data usually requires humans who can read and understand them, and is therefore very costly, especially in technical domains. Over the last years, two main approaches have been explored towards this direction, namely semi-supervised (SSL) and active learning. Both paradigms address the issue of annotation cost, but from two different perspectives. On the one hand, semi-supervised learning tries to learn by taking into account both labeled and unlabeled data. On the other hand, active learning tries to find the most informative examples to label, in order to minimize the number of labeled examples necessary for learning.
Both methods try to reduce the human labeling effort.
In this thesis, we address the problem of reducing this annotation burden. In particular, we investigate extensions of aspect models for the classification task, where the training set is partially labeled. We propose two semi-supervised PLSA algorithms, which incorporate a mislabeling error model. We then combine these semi-supervised algorithms with two active learning algorithms. Our models are developed as extensions of the classification system previously developed in Xerox Research Centre Europe. We evaluate the proposed models in three well-known datasets and in one coming from a Business Group of Xerox.

[2008] Eric Galmar Representation and Analysis of Video Content for Automatic Object Extraction
Abstract:
The recent explosion of multimedia applications has created an increasing demand for advanced search and indexing of multimedia information. Among them, digital video content is certainly one of the most complex to analyze and represent. From this point of view, video objects are considered as essential elements for handling video contents, as they provide accurate and flexible representation for numerous applications such as semantic content analysis or video coding. Automated object extraction from videos is a difficult task that has been widely addressed in the past years in the context of MPEG-4 video coding. Methods developed so far mostly rely on motion estimation to define the object model and adapt this model from frame to frame. However, it is generally agreed that the robustness of motion and the accuracy of the support depend on each other. In this thesis, we first introduce a framework for video object modeling based on a spatiotemporal representation with graphs. The model describes both the internal structure of object regions and their spatiotemporal relationships inside the shot. This approach is fully supported by the MPEG-7 multimedia standard, where the information is structured hierarchically in scenes, shots, objects and regions. As the next step, we propose a 2D+T scheme for the extraction of spatiotemporal volumes. The method we developed uses local and global properties of the volumes to propagate them coherently in space and time. At this point we investigate grouping of spatiotemporal regions into complex objects using motion models. To address the difficulty of building motion models, we propose a method to propagate and match moving objects to areas where motion information is less relevant. In a third step, we investigate the benefit of semantic knowledge for spatiotemporal segmentation and labeling of video shots. For this purpose we extend a knowledge-based system providing fuzzy semantic labeling of image regions to video shots.
The shot is split into smaller block units and, for each block, volumes are sampled temporally into frame regions that receive semantic labels. The semantic labels are then propagated within volumes and a consistent labeling of the shot is finally obtained by joint propagation and re-estimation of the semantic labels between the temporal segments. Finally, we explore the capabilities of the representation for indexing and retrieval tasks. We first consider the context of a region-based indexing framework called the Vector Space Model. We present a study of the model properties and show that the spatiotemporal representation gives more robustness to the visual signatures compared to the traditional keyframe representation. This dissertation concludes by proposing a strategy to compare efficiently object graphs. To this aim we introduce a similarity measure between graphs that we further use to search for a given object.

[2008] Lech Szumilas
Scale and Rotation Invariant Shape Matching
Abstract:
Recognition of objects from images is one of the central research topics of computer vision. The use of shape for recognizing objects has been actively studied since the beginning of object recognition in the 1950s. Several authors suggest that object shape is more informative than its appearance – the object appearance properties such as texture and color vary between object instances more than the shape, e.g. bottles, caps, cars, airplanes, cows, horses etc. Recent methods concentrate on extracting shape features and learning the object models directly from images, which poses problems such as object occlusion, incomplete and often fragmented object boundaries, and varying camera viewpoints. While these approaches are designed to learn object models from fragmented and incomplete object boundaries, achieving invariance to rotation, scale and affine transformations has not been fully solved. This thesis addresses the problem of learning object models that use shape properties with full rotational and scale invariance. A new approach is proposed where invariance to image transformations is obtained through invariant matching rather than typical invariant features. This philosophy is especially applicable to shape features, represented by edges detected in images which do not have a specific scale or specific orientation until assembled into an object. Our primary contributions are: a new shape-based image descriptor that encodes a spatial configuration of edge parts, a technique for matching descriptors that is rotation and scale invariant, and shape clustering that can extract frequently appearing image structures from training images without supervision. This thesis also presents an overview of the object recognition field and our other contributions in the area of local appearance based methods, texture detection and image segmentation. Keywords: object recognition, shape, image descriptors, interest points.

[2008] Avik Bhattacharya
Indexing of satellite images using structural information
Abstract:
From the advent of human civilization on our planet to modern urbanization, road networks have not only provided a means for transportation of logistics but have also helped us to cross cultural boundaries. The properties of road networks vary considerably from one geographical environment to another. The networks present in a satellite image can therefore be used to classify and retrieve such environments. In this work, we have defined several such environments, and classified them using geometrical and topological features computed from the road networks occurring in them. Due to certain limitations of these extraction methods there was a relative failure of network extraction in some urban regions containing narrow and dense road structures. This loss of information was circumvented by segmenting the urban regions and computing a second set of geometrical and topological features from them […].
Keywords: Satellite images, road networks, urban regions, classification, indexing.

[2007] Thomas Retornaz
Automatic detection of text from natural scenes. A semantic descriptor for content based image retrieval
Abstract:
Multimedia databases, both personal and professional, are continuously growing and the need for automatic solutions becomes mandatory. Effort devoted by the research community to content-based image indexing is also growing, but the semantic gap is difficult to cross: the low level descriptors used for indexing are not efficient enough for an ergonomic manipulation of big and generic image databases. The text present in a scene is usually linked to image semantic context and constitutes a relevant descriptor for content-based image indexing. In this thesis we present an approach to automatic detection of text from natural scenes, which aims to handle text of different sizes, orientations, and backgrounds. The system uses a non linear scale space based on the ultimate opening operator (a morphological numerical residue). In a first step, we study the action of this operator on real images, and propose solutions to overcome its intrinsic limitations. In a second step, the operator is used in a text detection framework which additionally contains various tools for text categorisation. The robustness of our approach is demonstrated on two different datasets. First, we took part in the ImagEval evaluation campaign and our approach was ranked first in the text localisation contest. Second, we produced results (using the same framework) on the free ICDAR dataset; the results obtained are comparable with those of the state of the art. Lastly, a demonstrator was carried out for EADS. Because of confidentiality, this work could not be integrated into this manuscript.

[2007] Laurence Boudet
Qualification automatique de modèles 3D de bâtiments à partir d’images aériennes haute résolution

[march 2006] Roger Trias Sanz
[pdf] Semi-automatic high-resolution rural land cover classification

[nov 2006] W. Touhami
Identification et classification automatique de régions d’intérêt dans des images tomographiques : application aux kystes du rein

[dec 2005] Seriy Kosinov
Machine Learning Approach to Semantic Augmentation of Multimedia Documents for Efficient Access and Retrieval

[june 2005] Nesrine Chehata
Modélisation 3D de scènes urbaines à partir d’images satellitaires à très haute résolution

[may 2005] Greet Frederix
Beyond Gaussian Mixture Models: Unsupervised Learning with applications to Image Analysis