Lectures

A tentative list of lecture titles and abstracts is available here. We will be adding further lecture information as individual speakers confirm. The final schedule will be published once all arrangements have been made.

Processing of spectral images

Matija Milanič, Ph.D.
University of Ljubljana, Faculty of Mathematics and Physics; Jozef Stefan Institute

Spectral imaging is imaging in which an object or a scene is recorded in more than one spectral band. A common example is RGB imaging, where images are acquired in three spectral bands: red, green and blue. However, spectral images can contain many more bands – in the case of hyperspectral imaging, a few hundred or even thousands of bands. Because of the combination of spatial and spectral information, spectral images are typically processed with different algorithms than monochromatic images.

In this lecture I will give an overview of spectral imaging techniques and technology and of the typical spectral image processing pipeline, including pre-processing (data normalisation, noise reduction) and processing (feature extraction, classification), all illustrated by examples from medicine, cultural heritage and food quality inspection. At the end, hot topics in spectral imaging will be presented.
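
As a small illustration of such a pipeline (not part of the lecture material, and with purely synthetic data), the following Python sketch normalises a hypothetical hyperspectral cube band by band, extracts spectral features with PCA, and produces an unsupervised per-pixel classification:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Hypothetical hyperspectral cube: height x width x number of spectral bands.
cube = np.random.rand(128, 128, 200).astype(np.float32)

h, w, bands = cube.shape
pixels = cube.reshape(-1, bands)

# Pre-processing: normalise each band to zero mean and unit variance.
pixels = (pixels - pixels.mean(axis=0)) / (pixels.std(axis=0) + 1e-8)

# Processing: extract a handful of spectral features per pixel, then classify.
features = PCA(n_components=10).fit_transform(pixels)
labels = KMeans(n_clusters=5, n_init=10).fit_predict(features)

label_map = labels.reshape(h, w)  # per-pixel class map
print(label_map.shape)
```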

Keywords: multispectral, hyperspectral, medical imaging, cultural heritage

Analysis of sports scenes using deep neural networks

Marina Ivašić Kos, Ph.D.
University of Rijeka, Department of Informatics

This lecture deals with the application of deep neural networks to the analysis of sports scenes, covering tasks such as player detection, tracking and recognition of player activities. Scenes recorded during handball games and training will be used as examples.

Handball is a team sport played indoors by two teams of seven players each, with a ball and according to well-defined rules. Players change roles during the game, switching between defense and offense and using various techniques and actions to score or to defend their goal. The analysis of handball scenes serves multiple purposes, from helping players and coaches monitor performance and progress, to analyzing techniques and assisting with refereeing. This is a very challenging task for object detectors and trackers, given that players move quickly on the court under artificial lighting and against a cluttered background, change their position and distance to the camera, and often overlap.

The lecture will present various experiments involving player and ball detection using state-of-the-art deep convolutional neural networks such as YOLOv3 or Mask R-CNN, player tracking using the Hungarian algorithm and Deep SORT, and player action recognition using LSTMs. The use of additional low-level features such as optical flow and STIPs to determine the leading player in the game will also be demonstrated.
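
As an aside, the data-association step that the Hungarian algorithm solves in such a tracker can be sketched in a few lines; the bounding boxes below are hypothetical stand-ins for detector output:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def centroids(boxes):
    """Box format assumed: (x1, y1, x2, y2)."""
    b = np.asarray(boxes, dtype=float)
    return np.stack([(b[:, 0] + b[:, 2]) / 2, (b[:, 1] + b[:, 3]) / 2], axis=1)

tracks_prev = [(10, 10, 50, 90), (200, 40, 240, 120)]   # boxes in frame t-1
detections  = [(205, 45, 245, 125), (12, 14, 52, 94)]   # boxes in frame t

# Cost = Euclidean distance between box centroids.
cost = np.linalg.norm(centroids(tracks_prev)[:, None, :] -
                      centroids(detections)[None, :, :], axis=2)

row, col = linear_sum_assignment(cost)  # optimal one-to-one assignment
for r, c in zip(row, col):
    print(f"track {r} -> detection {c} (cost {cost[r, c]:.1f})")
```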

In conclusion, open questions and challenges in the application of deep learning methods in such a dynamic sports environment will be discussed.

Keywords: object detection, object tracking, action recognition, deep neural networks, handball

Anomaly detection in video: approaches and challenges

Radu Tudor Ionescu, Ph.D.
University of Bucharest

Anomaly detection in video is a challenging computer vision problem, as the classification of an event as normal or abnormal always depends on the context. For instance, driving a truck on the street is considered normal, but, if the truck enters a pedestrian area, the event becomes abnormal. Considering the commonly adopted definition of abnormal events and the reliance on context, it is difficult to obtain a sufficiently representative set of anomalies for all possible contexts, making traditional supervised methods less applicable to abnormal event detection. In this talk, we will present a series of state-of-the-art anomaly detection methods that are trained without direct supervision. The presented methods propose alternative approaches such as designing proxy self-supervised tasks or using pseudo-abnormal examples in an adversarial fashion. Finally, we will discuss open challenges in abnormal event detection.
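
As a point of reference (and not one of the methods presented in the talk), a classical unsupervised baseline captures the basic idea of training on normal data only: model the normal frames, then flag frames whose reconstruction error is high. In the sketch below a PCA model stands in for a deep autoencoder and the frame data are synthetic:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
normal_frames = rng.normal(size=(500, 1024))            # flattened "normal" frames
test_frames = np.vstack([rng.normal(size=(10, 1024)),   # more normal frames
                         rng.normal(loc=3.0, size=(2, 1024))])  # "abnormal" frames

model = PCA(n_components=32).fit(normal_frames)         # trained on normal data only
recon = model.inverse_transform(model.transform(test_frames))
score = np.mean((test_frames - recon) ** 2, axis=1)     # anomaly score per frame

threshold = score[:10].mean() + 3 * score[:10].std()    # hypothetical threshold
print(np.where(score > threshold)[0])                   # indices flagged as abnormal
```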

Keywords: anomaly detection, deep learning, video analysis, convolutional neural networks

Selected topics on biomedical image processing

Erich Sorantin, MD, Ph.D.
Medical University of Graz

Today’s imaging modalities, such as multi-row detector computed tomography or magnetic resonance tomography, offer high geometric and temporal resolution, so the amount of data generated is already in the same range as the human genome. Traditional reporting techniques such as film or monitor reading are therefore no longer appropriate. Moreover, the high computational power of recent workstations allows the implementation of more complex algorithms to assist the reporting radiologist, especially in quantitative tasks.
The aim of the lecture is to demonstrate selected applications of biomedical imaging and visualization concerning the quantitative assessment of airway stenosis, computer-aided diagnosis and virtual imaging, with special emphasis on virtual endoscopic techniques as well as virtual operation planning.

Keywords: medical imaging, image processing, 3D

Skeletonization and its applications

Kálmán Palágyi, Ph.D.
University of Szeged

Skeleton-like shape features (i.e., centerlines, medial surfaces, and topological kernels) are frequently used region-based shape descriptors that summarize the general form of objects and represent their topological structure. They play an important role in various applications in image processing, pattern recognition, and visualization. I shall define skeletons and present their properties. Then the three major skeletonization techniques will be presented. Finally, some applications will be outlined.
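
To give a flavour of the topic (this is not lecture material), the following sketch computes a thinning-based skeleton and a distance-transform-based medial axis of a toy binary shape with scikit-image:

```python
import numpy as np
from skimage.morphology import skeletonize, medial_axis

# Toy binary object: a filled rectangle with a notch.
shape = np.zeros((60, 100), dtype=bool)
shape[10:50, 10:90] = True
shape[25:35, 40:60] = False

skeleton = skeletonize(shape)                       # iterative thinning
medial, dist = medial_axis(shape, return_distance=True)  # distance-transform based

print(skeleton.sum(), medial.sum())                 # number of skeleton pixels
```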

Keywords: shape representation, skeleton, distance transform, Voronoi diagram, thinning

Forward and inverse problems in computer graphics and image processing

László Varga, Ph.D.
University of Szeged, Department of Image Processing and Computer Graphics

Inverse problems are common in many fields of computer science and image processing, including, for example, tomography, deconvolution, camera calibration, training neural networks, or simply solving systems of equations. The solution of an inverse problem usually starts with a forward model of the phenomenon: we have measurements that are a transformed version of what we want to obtain, and we want to recover the original, “un-transformed” data. In deconvolution we have a blurred image but would like the original sharp version; in tomography we have X-ray attenuations but want the densities inside an object. This leads us to the task of inverting a data transformation. The solution usually starts by understanding how the data were transformed, i.e. creating a forward model of the distortion, after which the problem can be inverted using numerical tools. The lecture will give some examples of such forward problems and their inversions.
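
As a small illustration of the forward/inverse pattern described above (not the lecturer's code, and with synthetic data), the sketch below blurs an image with a known kernel as the forward model and then inverts it with Tikhonov-regularised division in the Fourier domain:

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64))                      # "original" sharp image

# Forward model: circular convolution with a small Gaussian kernel.
y, x = np.mgrid[-32:32, -32:32]
psf = np.exp(-(x**2 + y**2) / (2 * 2.0**2))
psf /= psf.sum()
H = np.fft.fft2(np.fft.ifftshift(psf))
blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * H))

# Inverse: regularised division instead of a naive (unstable) 1/H.
lam = 1e-3                                        # regularisation weight
restored = np.real(np.fft.ifft2(np.fft.fft2(blurred) * np.conj(H) /
                                (np.abs(H)**2 + lam)))

print(np.abs(restored - image).mean())            # reconstruction error
```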

Keywords: image processing, tomography, inverse problems, optimization

Mathematical models in image processing

Tibor Lukić, Ph.D.
University of Novi Sad, Faculty of Technical Sciences

Energy minimization models are widely used in image processing problems such as tomography, image denoising and segmentation. Incorporating a priori information about the solution into the energy minimization model is called regularization, and shape descriptors are often applied as regularization terms. Several important shape descriptors will be presented and analyzed. The calculus of variations is a mathematical discipline that provides a good basis for solving several segmentation problems, especially with the help of active contour models. The presentation will provide a brief overview of this area.
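
As a toy illustration of an energy-minimization model (using a simple quadratic smoothness regularizer rather than the shape descriptors or active contours discussed in the lecture), the following sketch denoises a synthetic image by gradient descent on the energy E(u) = ||u - f||^2 + lambda * ||grad u||^2:

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.zeros((64, 64)); clean[16:48, 16:48] = 1.0
f = clean + 0.2 * rng.standard_normal(clean.shape)   # noisy observation

def laplacian(u):
    """5-point Laplacian with periodic boundary conditions."""
    return (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
            np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)

u, lam, step = f.copy(), 1.0, 0.05
for _ in range(300):
    grad_E = 2 * (u - f) - 2 * lam * laplacian(u)    # dE/du
    u -= step * grad_E                               # gradient descent step

print(np.abs(u - clean).mean(), np.abs(f - clean).mean())
```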

Keywords: energy minimization, calculus of variations, active contours, shape descriptors, optimization methods 

Image reconstruction

Péter Balázs, Ph.D.
University of Szeged, Institute of Informatics

Computerized Tomography (CT) was originally a method of diagnostic radiology for obtaining the density distribution within the human body from X-ray projection samples. From a mathematical point of view, it seeks to determine an unknown function defined over 3D Euclidean space from weighted integrals over subspaces, called projections. Since the values of the function can vary over a wide range, a huge number of projections is needed to ensure an accurate reconstruction. In the first part of the talk we investigate this image reconstruction problem.

Outside medicine, there are applications of tomography where the number of projections one can acquire is limited, so standard CT reconstruction methods are no longer applicable with success. However, there is still a chance to obtain an accurate reconstruction from just a small number of projections: by exploiting the prior knowledge that the range of the image function is discrete and consists of only a small number of known values, the reconstruction quality can be enhanced. This leads us to the field of Discrete Tomography, which will be discussed in the second part of the talk.
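
For a hands-on impression of the difference between the well-sampled and the few-projection settings (not part of the talk itself), the following scikit-image sketch projects a phantom with the Radon transform and reconstructs it with filtered back projection from many and from few angles:

```python
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, rescale

image = rescale(shepp_logan_phantom(), 0.25)          # small test phantom

many = np.linspace(0., 180., 180, endpoint=False)     # well-sampled case
few = np.linspace(0., 180., 12, endpoint=False)       # limited number of projections

rec_many = iradon(radon(image, theta=many), theta=many)  # filtered back projection
rec_few = iradon(radon(image, theta=few), theta=few)

print(np.abs(rec_many - image).mean(), np.abs(rec_few - image).mean())
```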

Keywords: image reconstruction, tomography

Deep learning for ophthalmology

Hrvoje Bogunović, Ph.D.
Medical University of Vienna

Ophthalmology is at the forefront of deep learning applications in medicine due to the ability to image the retina quickly and non-invasively. Deep learning recently enabled the first FDA-approved fully autonomous diagnostic system of its kind, and the field is attracting deep learning giants such as Google, DeepMind and Baidu. I will show how deep learning is used for the quantification of imaging biomarkers, automated diagnosis, and progression prediction of prominent retinal diseases, the leading causes of blindness today.

Keywords: machine learning, medical image analysis, retina, optical coherence tomography

Segmenting medical images

Antal Nagy, Ph.D.
University of Szeged

In my lecture I will give an overview of the basics of medical image processing, from image acquisition through possible preprocessing steps and segmentation to the final evaluation.
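
As a minimal, purely illustrative example of this chain on synthetic data (not the lecture's material), the sketch below applies Gaussian smoothing as pre-processing, Otsu thresholding as segmentation, and a Dice coefficient as the evaluation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.filters import threshold_otsu

rng = np.random.default_rng(0)
truth = np.zeros((128, 128), dtype=bool); truth[40:90, 30:100] = True
image = truth.astype(float) + 0.5 * rng.standard_normal(truth.shape)

smoothed = gaussian_filter(image, sigma=2)            # pre-processing
segmented = smoothed > threshold_otsu(smoothed)       # segmentation

# Evaluation against the (synthetic) ground truth.
dice = 2 * np.logical_and(segmented, truth).sum() / (segmented.sum() + truth.sum())
print(f"Dice coefficient: {dice:.3f}")
```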

Keywords: medical image processing, medical imaging modalities, segmentation

Medical image segmentation with topological constraints

Ilkay Oksuz, Ph.D.
Istanbul Technical University

Segmentation is the process of assigning a meaningful label to each pixel in an image and is one of the fundamental tasks in image analysis. Significant progress has been made on this problem in recent years by using deep convolutional neural networks (CNN), which are now the basis for most newly developed segmentation algorithms. Typically, a CNN is trained to perform image segmentation in a supervised manner using a large number of labelled training cases, i.e. paired examples of images and their corresponding segmentations. For each case in the training set, the network is trained to minimise some loss function, typically a pixelwise measure of dissimilarity (such as the cross-entropy) between the predicted and the ground-truth segmentations. However, errors in some regions of the image may be more significant than others, in terms of the segmented object’s interpretation, or for downstream calculation or modelling. Nonetheless, loss functions that only measure the degree of overlap between the predicted and the ground-truth segmentations are unable to capture the extent to which the large-scale structure of the predicted segmentation is correct, in terms of its shape or topology. In this talk, I will cover the neural network based segmentation methods that can incorporate topology and shape information for medical image segmentation. The examples will focus on computer assisted interventions and cardiac MRI myocardium segmentation.
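
For reference, the two standard loss types contrasted above, pixelwise cross-entropy and a soft Dice (overlap) loss, can be written down in a few lines of NumPy; the topology- and shape-aware terms discussed in the talk are not shown here:

```python
import numpy as np

def cross_entropy(pred, target, eps=1e-7):
    """Pixelwise binary cross-entropy; pred holds probabilities in (0, 1)."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def soft_dice_loss(pred, target, eps=1e-7):
    """1 - soft Dice overlap between predicted probabilities and ground truth."""
    inter = np.sum(pred * target)
    return 1 - (2 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

# Synthetic ground truth and a noisy "prediction".
target = np.zeros((64, 64)); target[20:44, 20:44] = 1.0
pred = np.clip(target + 0.1 * np.random.default_rng(0).standard_normal(target.shape), 0, 1)

print(cross_entropy(pred, target), soft_dice_loss(pred, target))
```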

Keywords: image segmentation, topology, shape, neural networks

Selected topics in video compression: wavelets and autoencoders

André Kaup, Ph.D.
Friedrich-Alexander University Erlangen-Nuremberg

Image and video compression is a key technology in media streaming applications as well as modern communication systems, and has gained increasing relevance during the coronavirus pandemic as an enabling technology for video conferencing systems such as Zoom, WebEx, NetMeeting, and others. In this course we will first briefly review basic image and video compression concepts such as motion-compensated hybrid compression, as standardized within JPEG and MPEG. In the second part we will investigate novel graph-based motion-compensated wavelet lifting technologies and show how they can be integrated into efficient lossless video compression; this technology is specifically useful for coding medical hypervolume data. The final, third part will focus on recent conditional autoencoder principles for video coding, which use neural networks and learning-based compression as a new coding paradigm. We will highlight advantages and compare this technology in terms of its rate-distortion performance to current video compression standards.
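
To make the lifting idea concrete (in a far simpler setting than the graph-based, motion-compensated lifting of the lecture), the following sketch implements the Haar wavelet via predict and update lifting steps on a 1D signal and verifies perfect reconstruction:

```python
import numpy as np

def haar_lift(signal):
    """One level of Haar wavelet lifting: split, predict, update."""
    even, odd = signal[0::2].astype(float), signal[1::2].astype(float)
    detail = odd - even                 # predict step
    approx = even + detail / 2          # update step
    return approx, detail

def haar_unlift(approx, detail):
    """Inverse lifting: undo update, undo predict, merge."""
    even = approx - detail / 2
    odd = detail + even
    out = np.empty(even.size + odd.size)
    out[0::2], out[1::2] = even, odd
    return out

x = np.array([4, 6, 10, 12, 14, 14, 2, 0])
a, d = haar_lift(x)
print(a, d)
print(haar_unlift(a, d))                # perfect reconstruction of x
```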

Keywords: image coding, video compression, wavelet lifting, conditional autoencoding, latent spaces

Deep learning in medical imaging

Ernst Schwartz
Medical University of Vienna

Driven by a dramatic increase in the availability of both powerful hardware and large datasets, deep learning has revolutionized the field of computer vision in the last decade, dramatically outperforming previous approaches in diverse areas such as classification, semantic image understanding, segmentation and object detection, to name a few. The application of these methods in the domain of medical imaging has shown that it is possible to achieve human-level performance in disease diagnostics and understanding. In this presentation, I will provide a quick overview of the essentials of deep learning on image data before presenting recent applications in the medical domain.

Keywords: deep learning, medical imaging, neural networks, convolution, U-net

Integration of spatial configuration into CNNs

Darko Štern, Ph.D.
Medical University of Graz

In many medical image analysis applications, only a limited amount of training data is available due to the costs of image acquisition and the large manual annotation effort required from experts. Training recent state-of-the-art machine learning methods like convolutional neural networks (CNNs) from small datasets is a challenging task. In this talk, I will present a CNN architecture that learns to split the task into two simpler sub-problems, reducing the overall need for large training datasets. Thus, the CNN dedicates one component to locally accurate but ambiguous predictions, while the other component improves robustness to ambiguities by incorporating the spatial configuration of the anatomy. The efficiency of the proposed deep learning model will be shown for both localization and segmentation tasks.
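
As a schematic illustration of this idea (not the actual architecture), the NumPy sketch below multiplies a locally accurate but ambiguous appearance heatmap with a coarse spatial-configuration heatmap, which suppresses the implausible response:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

h = w = 64
true_landmark = (20, 40)

# Local appearance component: sharp response at the landmark plus a false one.
appearance = np.zeros((h, w))
appearance[true_landmark] = 1.0
appearance[50, 10] = 1.0                               # ambiguous false response
appearance = gaussian_filter(appearance, sigma=1.5)

# Spatial configuration component: coarse, blurry prior around the expected region.
spatial = np.zeros((h, w))
spatial[true_landmark] = 1.0
spatial = gaussian_filter(spatial, sigma=10)

combined = appearance * spatial                        # element-wise combination
print(np.unravel_index(np.argmax(combined), combined.shape))  # -> near (20, 40)
```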

Keywords: medical image analysis, machine learning, deep learning, localization, segmentation

Integrated 3D visualisation of heterogeneous 3D data & applications

Alexander Bornik, Ph.D.
Ludwig Boltzmann Institute for Archaeological Prospection and Virtual Archaeology

The widespread and joint use of 3D imaging modalities, including laser scanners, CT, MRI and image-based modelling, leads to an increasing number of heterogeneous 3D datasets. Fully understanding their mutual information content and the spatial feature relations inherent in the datasets requires visualisation techniques capable of producing conjoint visualisations that seamlessly combine the visual contributions of the data representations involved, namely 3D models, 3D point clouds, and 3D volumes. Moreover, pre-processing and flexible tools to control the influence of the respective representations are required to avoid clutter.
In the talk I am going to present such an integrated 3D rendering approach including practical applications in legal medicine, forensics, anthropology, and archaeology.

Keywords: conjoint 3D visualisation, heterogeneous data, forensics, archaeology

Sensor modalities for autonomous vehicles

Janez Perš, Ph.D.
University of Ljubljana, Faculty of Electrical Engineering

In the lecture, I will present some of the common sensor modalities used in autonomous and semi-autonomous vehicles. After a brief introduction to RGB cameras and their extension to stereo matching, we will examine LIDAR, automotive radar, thermographic cameras, polarization cameras, inertial sensors, and high-precision GPS. The focus of the lecture will be on the nature and properties of the data provided by these sensors, with only a brief explanation of each sensor’s physics. I will focus on possible use cases for each modality in the fields of image processing and artificial intelligence as they relate to self-driving vehicles. As a benefit to students, a short dataset containing data from the sensors listed above will be provided for download, so that students can examine the data and develop their own algorithms during or after the summer school. The multimodal dataset was captured by our own multimodal sensor system for water-borne autonomous vehicles.

Keywords: artificial intelligence, autonomous driving, cameras, LIDAR, RADAR

Establishing geometric correspondence between images: from theory to applications

Jiri Matas, Ph.D.
Czech Technical University in Prague

Establishing geometric correspondences in a pair of images, or more generally in a collection of images, depicting the same object or scene is a core component of a number of computer vision problems, such as 3D reconstruction, image retrieval and image-to-image registration.

The difficulty of image matching stems from the range of factors that may change between the observations – the viewpoint, the illumination, the properties or even the modality of the acquisition device; some parts might be occluded. The scene itself may change geometrically and photometrically.

We will first formulate the problem and present the standard image matching pipeline, which currently includes a number of CNN-based modules, as well as some recent end-to-end methods. Finally, we will present selected applications.
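
As a taste of the classical part of such a pipeline (the image paths below are placeholders), the following OpenCV sketch detects ORB features, matches descriptors, and verifies the matches geometrically while estimating a homography with RANSAC:

```python
import cv2
import numpy as np

img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input images
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

# Local feature detection and description.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Descriptor matching (brute force, Hamming distance, cross-check).
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# Robust estimation of the geometric relation (here: a homography).
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=3.0)
print(H, int(inliers.sum()))
```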

Keywords: computer vision, two-view matching