Master's degree in Data Science

Studying at the University of Verona

Here you can find information on the organisational aspects of the Programme, lecture timetables, learning activities and useful contact details for your time at the University, from enrolment to graduation.

Study Plan

The Study Plan includes all modules, teaching and learning activities that each student will need to undertake during their time at the University.
Please select your Study Plan based on your enrolment year.

1° Year

Modules	Credits	TAF	SSD
Business organisation and management			SECS-P/08
Probability for Data Science			MAT/06
Programming and database			INF/01

2° Year activated in the A.Y. 2023/2024

Modules	Credits	TAF	SSD
Ethics and law of data protection		B/C	IUS/01, M-FIL/03
Training
Final exam

Modules	Credits	TAF	SSD

Between the years: 1°- 2°

1 module among the following (A.Y. 2023/24: Data protection in business organizations not activated)

Big data epistemology			M-FIL/02
Comparative and transnational law & technology			IUS/02
Cybercrime			IUS/17
Social research			SPS/07
Data protection in business organizations			IUS/04

Between the years: 1°- 2°

2 modules among the following (A.Y. 2023/24: Statistical methods for business intelligence not activated)

Business Analytics (BA)			SECS-P/10
Digital Marketing and Market Research			SECS-P/08
Digital Transformation and IT change			SECS-P/10
Logistics, operations & supply chain			SECS-P/08
Statistical methods for business intelligence			SECS-S/01

Between the years: 1°- 2°

2 modules among the following (A.Y. 2023/24: Complex systems and social physics not activated)

Complex systems and social physics			FIS/02
Discrete optimization and decision making			MAT/09
Marketing research for agrifood and natural resources			AGR/01
Statistical models for Data Science			MAT/06
Continuous optimization for data science			MAT/08
Network science and econophysics (not provided)			FIS/02

Between the years: 1°- 2°

2 modules among the following

Data Security & Privacy			INF/01
Machine Learning for Data Science			ING-INF/05
Mining Massive Datasets			ING-INF/05
Statistical learning			INF/01
Data visualisation			INF/01

Between the years: 1°- 2°

Activities to be chosen by the student

Legend | Type of training activity (TTA)

TAF (Type of Educational Activity): all courses and activities are classified into types of educational activity, each indicated by a letter.

A Basic activities

B Characterizing activities

C Related or complementary activities

D Activities to be chosen by the student

E Final examination

F Other training activities

S Placements in companies, public or private institutions and professional associations

Teaching code

4S009079

Academic staff

Luca Di Persio, Ilaria Boscolo Galazzo

Coordinator

Luca Di Persio

Credits

Also offered in courses:

Statistical Models of the course Master's degree in Artificial intelligence

Language

English

Scientific Disciplinary Sector (SSD)

MAT/06 - PROBABILITY AND STATISTICS

Period

Semester 1, from Oct 3, 2022 to Jan 27, 2023.

Learning objectives

The course covers the mathematical background needed to describe, analyze and derive value from datasets, including large and unstructured ones (Big Data), and to master the main probabilistic models used in data science. Starting from basic models (for example regressions, PCA-based predictors, Bayesian statistics and filters), particular emphasis is placed on mathematically rigorous quantitative approaches aimed at optimizing the data collection, cleaning and organization phases (e.g. historical data series, unstructured data generated on social media, semantic elements). The mathematical tools needed to describe, analyze and forecast time series are also introduced. Throughout, the course content is developed alongside the study of real problems from heterogeneous sectors (industrial, economic, social, etc.), using software oriented to probabilistic modeling such as Knime, ElasticSearch, Kibana, R AnalyticFlow and Orange.

Prerequisites and basic notions

For both modules of the course: basic notions of probability theory; knowledge of the main discrete and continuous random variables (e.g. binomial, Poisson, Gaussian) and their main statistical properties; convergence theorems (e.g. law of large numbers, central limit theorem); basic notions of discrete- and continuous-time stochastic processes (e.g. Markov chains, birth-and-death processes); rudiments of statistical data analysis (e.g. frequency, average, mode, square deviation). Basics of programming in Python, in particular general syntax, data structures, data import / export and the main plot types for data visualization. Rudiments of the main libraries, such as NumPy, Pandas and Matplotlib.

Program

The course program is divided into the following macro-topics:

Part 1 [module 1]
1. Time domain analysis
2. Frequency domain analysis
3. Tools for data analysis and cleaning (e.g. identification of outliers)
4. Maximum likelihood methods, likelihood metrics, probability density fitting
5. Principal Component Analysis (PCA) [PCA-based regressors / predictors]
6. AR, MA, ARMA, ARIMA, Box-Jenkins, ARCH, GARCH models and their generalizations
7. Time series decomposition, ACF / PACF and related visualizations
8. Hypothesis tests, Gaussian and jump processes / compound processes
9. Decomposition of white-noise-type processes
10. Bayesian statistics and applications
11. Forecast evaluation via inferential statistical models based, e.g., on autocovariance and partial autocorrelation, seasonality (SARIMA), analysis of variance (ANOVA, MANOVA), etc.
12. Smoothing techniques, spectral decomposition, polynomial fitting, etc.
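
As an informal illustration of items 6 and 7 above (not taken from the course material), the sketch below simulates an AR(1) process and computes its sample autocorrelation function (ACF). For an AR(1) model the lag-1 autocorrelation is the Yule-Walker estimate of the coefficient. The function names `sample_acf` and `simulate_ar1`, the coefficient 0.7 and the series length are assumptions chosen for the example, and the code uses only the Python standard library rather than the NumPy / Pandas stack mentioned in the prerequisites.

```python
import random


def sample_acf(series, max_lag):
    """Return the sample autocorrelations rho(0), ..., rho(max_lag)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # n times the lag-0 autocovariance; used to normalise every lag
    var = sum(c * c for c in centered)
    acf = []
    for lag in range(max_lag + 1):
        cov = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        acf.append(cov / var)
    return acf


def simulate_ar1(phi, n, seed=0):
    """Simulate x_t = phi * x_{t-1} + eps_t with standard Gaussian noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


# Hypothetical example: true coefficient 0.7, 5000 observations
series = simulate_ar1(phi=0.7, n=5000)
acf = sample_acf(series, max_lag=3)
phi_hat = acf[1]  # Yule-Walker estimate of phi for an AR(1) model
print("sample ACF:", [round(r, 3) for r in acf])
print("estimated phi:", round(phi_hat, 3))
```

For a stationary AR(1) process the theoretical ACF decays geometrically (rho(k) = phi^k), so the printed values should shrink roughly by a constant factor from one lag to the next.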

Part 2 [module 2]
1. Review of programming in Python
2. Managing and visualizing time series
3. Descriptive statistics
4. Analysis in the frequency domain
5. Linear regression for time series
6. Analyzing and decomposing the main components of a time series (trend, cycle, seasonality)
7. Forecasting methods: exponential smoothing (simple, double, triple)
8. Forecasting methods: AR, MA, ARMA, ARIMA, SARIMA
9. Forecasting methods: ARCH, GARCH and generalizations
10. How to evaluate the different forecasting models

All the above topics will be explored in depth through practical exercises requiring their implementation in Python.
In addition, the main forecasting methods will be examined further by working through real case studies of various kinds.
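
To give a flavour of the kind of exercise described above, here is a minimal sketch of simple exponential smoothing (Part 2, item 7) together with two common forecast error measures, MAE and RMSE (item 10). It is an illustration under stated assumptions, not the course's own code: the function names, the smoothing parameter `alpha=0.5`, the choice to initialise the first forecast with the first observation, and the data values are all made up for the example.

```python
import math


def simple_exp_smoothing(series, alpha):
    """One-step-ahead forecasts f[t+1] = alpha * y[t] + (1 - alpha) * f[t].

    Returns len(series) + 1 values: f[t] is the forecast of series[t],
    and the last entry is the forecast of the next, unobserved value.
    The first forecast is set to the first observation (one common convention).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    forecasts = [series[0]]
    for y in series:
        forecasts.append(alpha * y + (1.0 - alpha) * forecasts[-1])
    return forecasts


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Made-up observations, for illustration only
y = [12.0, 13.5, 13.0, 14.2, 15.1, 14.8, 16.0, 16.4]
f = simple_exp_smoothing(y, alpha=0.5)
in_sample = f[:-1]  # forecasts aligned with the observed values
print("MAE:", mae(y, in_sample))
print("RMSE:", rmse(y, in_sample))
print("forecast of the next value:", f[-1])
```

Comparing MAE and RMSE across several values of `alpha`, or across different models, is one simple way to rank forecasting methods; RMSE penalises large errors more heavily than MAE.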

Bibliography

Go to the bibliography

View the bibliography with Leganto, the tool provided by the University Library System for retrieving the texts on the exam syllabus in a simple and innovative way.

Didactic methods

The course consists of lectures, supported by shared slides and notes, and computer simulations / exercises.

Learning assessment procedures

The final exam consists of two parts: a theoretical part followed by a practical / implementation part. The first part verifies the student's grasp of the theoretical concepts behind the statistical methods and the related models and algorithms, which underlie the computational implementations used to carry out a project that the student agrees on with the course teachers.
This "case study", together with a discussion of the code written to complete it, is the subject of the second and final part of the exam.

Students with disabilities or specific learning disorders (SLD), who intend to request the adaptation of the exam, must follow the instructions given HERE

Evaluation criteria

The exam is evaluated by combining the results obtained in the two modules of the course, giving equal weight to the correctness and effectiveness of the computer-implemented solutions adopted when solving concrete problems and to the understanding of the probabilistic / statistical models underlying them.

Criteria for the composition of the final grade

The final grade results from the joint evaluation of the two theoretical tests and the resolution of the "case study" agreed on by the student with the teachers, in accordance with the sections "Learning assessment procedures" and "Evaluation criteria".

Exam language

English
