Education and research

Training Activities of the PhD Programme

This page shows the courses and classes of the PhD programme for the academic year 2023/2024. Additional courses and classes will be added during the year. Please check for updates regularly!

Non-monotonic reasoning

Credits: 3

Language of instruction: English

Lecturer: Matteo Cristani

Sustainable Embodied Mechanical Intelligence

Credits: 3

Language of instruction: English

Lecturer: Giovanni Gerardo Muscolo

Brain Computer Interfaces

Credits: 3

Language of instruction: English

Lecturer: Silvia Francesca Storti

A practical interdisciplinary PhD course on exploratory data analysis

Credits: 4

Language of instruction: English

Lecturer: Prof. Vincenzo Bonnici (Università di Parma)

Multimodal Learning and Applications

Credits: 5

Language of instruction: English

Lecturer: Cigdem Beyan

Introduction to Blockchain

Credits: 3

Language of instruction: English

Lecturer: Sara Migliorini

Autonomous Agents and Multi-Agent Systems

Credits: 5

Language of instruction: English

Lecturer: Alessandro Farinelli

Cyber-Physical System Security

Credits: 3

Language of instruction: English/Italian

Lecturer: Massimo Merro

Foundations of quantum languages

Credits: 3

Language of instruction: English

Lecturer: Margherita Zorzi

Advanced Data Structures for Textual Data

Credits: 3

Language of instruction: English

Lecturer: Zsuzsanna Liptak

AI and explainable models

Credits: 5

Language of instruction: English

Lecturers: Gloria Menegaz, Lorenza Brusini

Automated Software Testing

Credits: 4

Language of instruction: English

Lecturer: Mariano Ceccato

Elements of Machine Teaching: Theory and Applications

Credits: 3

Language of instruction: English

Lecturer: Ferdinando Cicalese

Introduction to Quantum Machine Learning

Credits: 4

Language of instruction: English

Lecturer: Alessandra Di Pierro

Laboratory of quantum information in classical wave-optics analogy

Credits: 3

Language of instruction: English

Lecturer: Claudia Daffara

Multimodal Learning and Applications

Coordinator: Cigdem Beyan

Credits: 5

Language of instruction: English

Class attendance: Free choice

Location: VERONA

Learning objectives

For intelligent systems, the ability to interpret, reason over, and fuse multimodal information is essential. One of the latest and most promising trends in machine/deep learning research is multimodal learning, a multi-disciplinary field focused on integrating and modeling multiple modalities, such as acoustic, linguistic, and visual signals. This course explores fundamental concepts in multimodal learning, including alignment, fusion, joint learning, temporal learning, and representation learning. Through an examination of recent state-of-the-art papers, the course emphasizes effective computational algorithms tailored for diverse applications. Various datasets, sensing approaches, and computational methodologies will be explored, with discussions on existing limitations and potential future directions. Course evaluation will involve a small project assigned to student groups.
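
To make the representation-learning theme above concrete, here is a minimal sketch (not part of the course material) of a CLIP-style symmetric contrastive loss in PyTorch; the embedding dimension, batch size, and temperature are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    # L2-normalise so the dot products below are cosine similarities
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (B, B) similarity matrix: entry [i, j] compares image i with text j
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0))  # matching pairs lie on the diagonal

    # Cross-entropy in both directions: image-to-text and text-to-image
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Toy usage: 8 image/text pairs with 512-dimensional encoder outputs
image_emb = torch.randn(8, 512)
text_emb = torch.randn(8, 512)
print(clip_style_loss(image_emb, text_emb))
```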

Teaching methods

June 2024

Scheduled lessons

Monday 17 June 2024, 14:00 - 18:00 (duration: 4 hours)
Room: Ca' Vignal 2 - L [67 - 1°]
Lecturer: Cigdem Beyan
Topics: The definition of multimodality; multimodality versus multimedia; heterogeneous and interconnected data; modalities and common sensors; definitions of multimodal machine learning and multimodal artificial intelligence. Research tasks: audio-visual speech recognition, affective computing, synthesis, human-human-robot interaction analysis, content understanding, multimedia information retrieval, among others. Multimodal technical challenges: a) representation (joint, coordinated), contrastive learning, CLIP; b) alignment (explicit, implicit), dynamic time warping, self-attention, cross-attention, transformers, why attention is important, semantic alignment, visual grounding, text grounding, referring expression segmentation. State-of-the-art examples for each challenge.
Tuesday 18 June 2024, 14:00 - 18:00 (duration: 4 hours)
Room: Ca' Vignal 2 - L [67 - 1°]
Lecturer: Cigdem Beyan
Topics: Multimodal learning challenges: c) translation (example-based, generative), GAN-based examples, avatar creation, DALL-E, DALL-E 2, Stable Diffusion; d) fusion (early, late), multimodal kernel learning, graphical models, neural networks (a minimal early/late fusion sketch appears after this schedule); e) co-learning: definition, co-learning via representation; f) generation for summarization and creation: multimodal summarization and example approaches, evaluation metrics for creation (IS, FID, SID) and their limitations, open challenges in generation; g) learning and optimization (overfitting-to-generalization ratio), gradient blending; h) modality bias; i) fairness, explainability, interpretability.
Wednesday 19 June 2024, 14:00 - 18:00 (duration: 4 hours)
Room: Ca' Vignal 2 - L [67 - 1°]
Lecturer: Cigdem Beyan
Topics: Applications: introduction to human behavior understanding. The definition of Social Signal Processing, social signals, verbal and nonverbal communication, and nonverbal cues (body activity, eye gaze, facial expressions, vocal behavior, physical appearance, proxemics); methodologies, toolboxes, and libraries used to extract these nonverbal cues. Types of interactions (joint-focused, common-focused, among others), f-formations, example applications with references, introduction to OpenFace, MediaPipe, OpenPose, and openSMILE (a minimal landmark-extraction sketch appears after this schedule). Human-human interaction datasets.
Thursday 20 June 2024, 14:00 - 18:00 (duration: 4 hours)
Room: Ca' Vignal 2 - L [67 - 1°]
Lecturer: Cigdem Beyan
Topics: SSP examples: a) emergent leader detection in meeting environments: dataset creation, annotation, nonverbal cues used, results, future work; b) gaze target detection: unimodal state of the art, multimodal state of the art with depth maps, multimodal state of the art with skeletons and depth maps, privacy-preserving gaze target detection, transformer-based gaze target detection, multi-task gaze target detection; c) predicting gaze from egocentric social interactions (dataset creation, methodology, evaluation, future work); d) social group detection (methodology, evaluation). SSP challenges and future directions (privacy preservation, domain adaptation, unsupervised learning, among others).
Friday 21 June 2024, 14:00 - 18:00 (duration: 4 hours)
Room: Ca' Vignal 2 - L [67 - 1°]
Lecturer: Cigdem Beyan
Topics: Multimodal human activity recognition (HAR): definition, possible sensors, importance, challenges. Approaches and datasets: HAR using an RGB camera, HAR using RGB+depth, point-cloud-based HAR, egocentric action recognition datasets. Introducing the Ego4D dataset, its challenges, and a methodology for short-term object interaction anticipation. Introducing the Ego-Exo4D dataset: benchmarks, sensors, tasks. Multimodal emotion recognition: definition of emotions, discrete emotions, Russell's theory, cues to represent and predict emotions automatically, datasets from unimodal to multimodal, open questions, rare applications, open research problems. Methodologies: zero-shot multimodal emotion recognition, disentanglement-based multimodal emotion recognition.
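
The early/late fusion sketch referenced in the 18 June entry: a minimal, illustrative PyTorch comparison of the two strategies on audio and video feature vectors. The feature dimensions, hidden size, and class count are assumptions made for the example, not values taken from the course.

```python
import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """Early fusion: concatenate modality features, then apply one classifier."""
    def __init__(self, audio_dim, video_dim, n_classes):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(audio_dim + video_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, audio, video):
        return self.classifier(torch.cat([audio, video], dim=-1))

class LateFusion(nn.Module):
    """Late fusion: one classifier per modality, predictions averaged at the end."""
    def __init__(self, audio_dim, video_dim, n_classes):
        super().__init__()
        self.audio_head = nn.Linear(audio_dim, n_classes)
        self.video_head = nn.Linear(video_dim, n_classes)

    def forward(self, audio, video):
        return (self.audio_head(audio) + self.video_head(video)) / 2

# Toy usage: 4 samples, 40-dim audio features, 512-dim video features, 6 classes
audio, video = torch.randn(4, 40), torch.randn(4, 512)
print(EarlyFusion(40, 512, 6)(audio, video).shape)  # torch.Size([4, 6])
print(LateFusion(40, 512, 6)(audio, video).shape)   # torch.Size([4, 6])
```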
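
The landmark-extraction sketch referenced in the 19 June entry: a minimal example of extracting body-pose cues with MediaPipe, one of the toolboxes listed above, using its legacy Solutions Python API. The input file name is hypothetical, and the API may differ in newer MediaPipe releases.

```python
import cv2
import mediapipe as mp

# "frame.jpg" is a hypothetical image containing a single person
image_bgr = cv2.imread("frame.jpg")
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

# Legacy MediaPipe Solutions pose estimator (33 body landmarks)
with mp.solutions.pose.Pose(static_image_mode=True) as pose:
    results = pose.process(image_rgb)

if results.pose_landmarks:
    # Each landmark has normalised x, y, z coordinates and a visibility score
    for idx, lm in enumerate(results.pose_landmarks.landmark):
        print(idx, round(lm.x, 3), round(lm.y, 3), round(lm.visibility, 3))
```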