Tutorials

Tutorial 1: Deep Learning and Neural Networks for computer vision and signal processing

Abstract

Representation learning, or deep learning, consists in automatically learning layered, hierarchical representations with various levels of abstraction from large amounts of data. This presentation will review the history of the field, its main actors and the major scientific challenges. We will first give a brief introduction to the general challenges of machine learning on high-dimensional inputs such as images, signals and text. We will then present common deep models such as convolutional neural networks and recurrent networks, along with widely used tools and problems such as attention mechanisms, transfer learning and structured output learning. Implementing these models in deep learning frameworks (TensorFlow, PyTorch) will be touched on briefly. Finally, we will go into more detail on selected applications in computer vision and signal processing.
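As a small, self-contained companion to the framework remarks above, here is a minimal sketch of a convolutional classifier in PyTorch; the architecture, input size and hyper-parameters are illustrative assumptions, not material from the tutorial itself.

```python
# Minimal convolutional classifier in PyTorch (illustrative sketch only).
import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    """Two convolutional blocks followed by a linear classifier."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 3-channel input image
            nn.ReLU(),
            nn.MaxPool2d(2),                               # halve spatial resolution
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # assumes 32x32 inputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

if __name__ == "__main__":
    model = SmallConvNet()
    images = torch.randn(4, 3, 32, 32)              # dummy batch of 32x32 RGB images
    logits = model(images)
    loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1, 2, 3]))
    loss.backward()                                  # gradients via autograd
    print(logits.shape, loss.item())
```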

Biography:

Christian WOLF has been an associate professor (Maître de Conférences, HDR) at INSA de Lyon and LIRIS, a CNRS laboratory, since September 2005. He is interested in computer vision and machine learning, especially the visual analysis of complex scenes in motion: gesture and activity recognition, pose estimation, and learning from interactions. His work puts an emphasis on modelling complex interactions between large numbers of variables: deep learning, structured models, and graphical models.

He received his MSc in computer science from Vienna University of Technology in 2000, and a PhD in computer science from the National Institute of Applied Sciences (INSA de Lyon), France, in 2003. In 2012 he obtained the habilitation diploma, also from INSA de Lyon. From September 2004 to August 2005 he was an assistant professor at Louis Pasteur University, Strasbourg. Since September 2017 he has been on leave (en délégation) with INRIA (Chroma group) at the CITI Laboratory.


Tutorial 2: A guided tour of computational modelling of visual attention

Abstract

Since the first computational model of visual attention, proposed in 1998 by Itti et al. [1], a lot of progress has been made. This progress concerns both the modelling itself and the way we assess the performance of saliency models. Recently, advances in machine learning, more specifically in deep learning, have brought new momentum to this field. In this tutorial, we present saliency models as well as the metrics used to assess their performance. In particular, we will emphasize new saliency models based on convolutional neural networks. We will present different deep architectures and the different loss functions used during the training process. We will conclude this presentation by introducing saccadic models [2,3], which are a generalization of saliency models.
[1] Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254-1259.
[2] Le Meur, O., & Liu, Z. (2015). Saccadic model of eye movements for free-viewing condition. Vision Research, 116, 152-164.
[3] Le Meur, O., & Coutrot, A. (2016). Introducing context-dependent and spatially-variant viewing biases in saccadic models. Vision Research, 121, 72-84.
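To make the loss-function and evaluation discussion above concrete, the sketch below computes two quantities commonly associated with saliency models: a KL-divergence loss between a predicted saliency map and a ground-truth fixation density, and the Normalized Scanpath Saliency (NSS) metric. Shapes, normalisation choices and the toy data are assumptions for illustration, not the specific choices of the models covered in the tutorial.

```python
# Sketch of a saliency loss and a saliency metric (illustrative assumptions).
import numpy as np

def kl_divergence_loss(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """KL divergence between two saliency maps, each normalised to sum to 1."""
    p = pred / (pred.sum() + eps)
    q = target / (target.sum() + eps)
    return float(np.sum(q * np.log(q / (p + eps) + eps)))

def nss(saliency: np.ndarray, fixations: np.ndarray) -> float:
    """Normalized Scanpath Saliency: mean of the standardised map at fixated pixels."""
    s = (saliency - saliency.mean()) / (saliency.std() + 1e-8)
    return float(s[fixations > 0].mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pred = rng.random((64, 64))                       # predicted saliency map
    target = rng.random((64, 64))                     # ground-truth fixation density
    fix = (rng.random((64, 64)) > 0.99).astype(int)   # binary fixation map
    print(kl_divergence_loss(pred, target), nss(pred, fix))
```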

Biography:

Dr Olivier Le Meur (PhD 2005) is an Associate Professor at the University of Rennes 1 (Ecole Supérieure d'Ingénieurs de Rennes). He leads the PERCEPT research team at IRISA/Rennes. Olivier Le Meur received his PhD from the University of Nantes in 2005 and an HDR (Habilitation à Diriger des Recherches) from the University of Rennes 1 in 2014. Before joining the university in 2009, he was a project leader at the research center of Technicolor R&D. He is the author of more than 50 peer-reviewed publications in international conferences and journals in the fields of image processing (IEEE TIP, IEEE PAMI, ICIP), computer vision (ECCV, ACCV) and applied perception (Vision Research). His expertise lies in image processing, cognitive sciences and the computational modelling of visual attention. For more details, please refer to http://people.irisa.fr/Olivier.Le_Meur/ and http://www-percept.irisa.fr/

Tutorial 3: The HEVC and VVC video coding standards: tutorial and current status

Abstract

The HEVC video coding standard, standardized in 2013 by ISO and ITU-T, achieved a 50% compression gain over the previous state of the art. We present the specific features of the HEVC standard, in terms of both compression tools and functionalities, focusing on the main novelties compared with previous standards. In addition, ISO and ITU-T are currently working on its successor, VVC (Versatile Video Coding), which is expected to be standardized at the end of 2020. We will also present the current status of this future standard, which promises a further 50% gain over HEVC.
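Compression gains such as the 50% figures above are commonly quantified with the Bjøntegaard delta-rate (BD-rate), the average bitrate difference between two rate-distortion curves at equal quality. The sketch below is a minimal, illustrative implementation of that computation; the sample rate/PSNR points are invented, not HEVC or VVC measurements.

```python
# Minimal Bjontegaard delta-rate (BD-rate) sketch: average bitrate difference (%)
# between two codecs at equal PSNR, via cubic fits of log-rate versus PSNR.
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test) -> float:
    """Average bitrate change of 'test' vs 'ref' in percent (negative = bitrate saving)."""
    log_ref, log_test = np.log(rates_ref), np.log(rates_test)
    # Fit log-rate as a cubic polynomial of PSNR for each codec.
    p_ref = np.polyfit(psnr_ref, log_ref, 3)
    p_test = np.polyfit(psnr_test, log_test, 3)
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    # Integrate both fits over the common PSNR interval.
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_diff = (int_test - int_ref) / (hi - lo)
    return float((np.exp(avg_diff) - 1.0) * 100.0)

if __name__ == "__main__":
    # Invented rate (kbit/s) / PSNR (dB) points, for illustration only.
    rates_a = [1000, 2000, 4000, 8000]; psnr_a = [34.0, 36.5, 39.0, 41.5]
    rates_b = [600, 1200, 2400, 4800];  psnr_b = [34.2, 36.8, 39.3, 41.8]
    print(f"BD-rate of codec B vs A: {bd_rate(rates_a, psnr_a, rates_b, psnr_b):.1f}%")
```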

Biography:

Félix Henry is a graduate engineer of Telecom SudParis, Évry, France, and holds a PhD from Telecom ParisTech, Paris, France. He began his career in still-image coding in 1995 at Canon Research Center, France, where he actively contributed to the construction of the JPEG2000 standard. He joined Orange in 2010, where he was involved in the development of compression tools for the HEVC standard. He now contributes to the VVC standard. His areas of interest include intra coding, transform coefficient coding, high-level parallelism, and disruptive approaches. Félix Henry is the author of more than 150 patents in the field of signal processing.

Tutorial 4: Watermarking and steganalysis: a wavelet-based approach

Abstract

In this talk, we present the problem of securing multimedia data (images, video) through both information embedding (watermarking) and the detection of malicious messages (steganalysis). More precisely, we will review approaches that exploit the strengths of wavelet-based representations for identifying the salient information present in multimedia data.
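As a minimal illustration of wavelet-domain embedding, the sketch below adds a key-dependent spread-spectrum pattern to the diagonal detail coefficients of a one-level 2-D DWT and detects it by correlation, using the PyWavelets package. The wavelet choice, embedding strength and detector are illustrative assumptions, not the specific schemes reviewed in the talk.

```python
# Additive spread-spectrum watermarking in the wavelet domain (illustrative sketch).
# Requires: numpy, PyWavelets (pip install pywavelets).
import numpy as np
import pywt

def embed(image: np.ndarray, key: int, strength: float = 2.0) -> np.ndarray:
    """Add a key-dependent pseudo-random pattern to the diagonal detail band."""
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(float), "haar")
    pattern = np.random.default_rng(key).standard_normal(cD.shape)
    cD_marked = cD + strength * pattern
    return pywt.idwt2((cA, (cH, cV, cD_marked)), "haar")

def detect(image: np.ndarray, key: int) -> float:
    """Correlate the diagonal detail band with the key's pattern (higher = likely marked)."""
    _, (_, _, cD) = pywt.dwt2(image.astype(float), "haar")
    pattern = np.random.default_rng(key).standard_normal(cD.shape)
    return float(np.mean(cD * pattern))

if __name__ == "__main__":
    original = np.random.default_rng(0).integers(0, 256, (64, 64)).astype(float)
    marked = embed(original, key=42)
    print("marked image score  :", detect(marked, key=42))    # close to 'strength'
    print("original image score:", detect(original, key=42))  # close to 0
```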

Biography:

The main objective of my work is the definition and optimization of multi-scale, multi-resolution representations for the whole color image and video processing chain, taking into account the vector nature of the information and, where relevant, the specificities of the human visual system. The mathematical models studied essentially relate to space-scale-frequency techniques.
Space-scale-frequency techniques aim at representing the information initially present in an nD signal (scalar or vector) in an optimal manner. However, extending the wavelet transform to dimensions higher than one raises questions about how information is described in the transformed space. We have proposed new discrete atomic decompositions that improve the coding of the structuring elements of an image or a sequence. In this framework, we have, for example, studied quaternionic and monogenic representations for processing color images. New color tools are proposed to analyze the digital spectrum embedded in each of these formalisms, together with new approaches for data security (watermarking, steganalysis).