Master's degree in Data Science

Studying at the University of Verona

Here you can find information on the organisational aspects of the Programme, lecture timetables, learning activities and useful contact details for your time at the University, from enrolment to graduation.


Study Plan

The Study Plan includes all modules, teaching and learning activities that each student will need to undertake during their time at the University.
Please select your Study Plan based on your enrolment year.

1° Year

Modules	Credits	TAF	SSD

Business organisation and management

SECS-P/08

Probability for Data Science

MAT/06

Programming and database

INF/01

2° Year activated in the A.Y. 2021/2022

Modules	Credits	TAF	SSD

Ethics and law of data protection

B/C

IUS/01, M-FIL/03

Training

Final exam

Modules	Credits	TAF	SSD

Between the years: 1°- 2°

1 module among the following (1st year: Big Data epistemology and Social research; 2nd year: Cybercrime, Data protection in business organizations, Comparative and Transnational Law & Technology)

Big data epistemology

M-FIL/02

Comparative and transnational law & technology

IUS/02

Cybercrime

IUS/17

Social research

SPS/07

Data protection in business organizations

IUS/04

Between the years: 1°- 2°

2 courses among the following (1st year: Business analytics, Digital Marketing and market research; 2nd year: Logistics, Operations & Supply Chain, Digital transformation and IT change, Statistical methods for Business intelligence)

Business Analytics (BA)

SECS-P/10

Digital Marketing and Market Research

SECS-P/08

Digital Transformation and IT change

SECS-P/10

Logistics, operations & supply chain

SECS-P/08

Statistical methods for business intelligence

SECS-S/01

Between the years: 1°- 2°

2 courses among the following (1st year: Complex systems and social physics, Discrete Optimization and Decision Making; 2nd year: Statistical models for Data Science, Continuous Optimization for Data Science, Network science and econophysics, Marketing research for agrifood and natural resources)

Complex systems and social physics

FIS/02

Discrete optimization and decision making

MAT/09

Marketing research for agrifood and natural resources

AGR/01

Statistical models for Data Science

MAT/06

Continuous optimization for data science

MAT/08

Network science and econophysics

FIS/02

Between the years: 1°- 2°

2 courses among the following (1st year: Data Visualisation, Data Security & Privacy, Statistical learning, Mining Massive Datasets; 2nd year: Machine Learning for Data Science)

Data Security & Privacy

INF/01

Machine Learning for Data Science

ING-INF/05

Mining Massive Datasets

ING-INF/05

Statistical learning

INF/01

Data visualisation

INF/01

Between the years: 1°- 2°

Activities to be chosen by the student

Legend | Type of training activity (TTA)

TAF (Type of Educational Activity) All courses and activities are classified into different types of educational activities, indicated by a letter.

A Basic activities

B Characterizing activities

C Related or complementary activities

D Activities to be chosen by the student

E Final examination

F Other training activities

S Placements in companies, public or private institutions and professional associations

Teaching code

4S009079

Academic staff

Luca Di Persio, Ilaria Boscolo Galazzo

Coordinator

Luca Di Persio

Credits

Language

English

Scientific Disciplinary Sector (SSD)

MAT/06 - PROBABILITY AND STATISTICS

Period

First semester, from Oct 4, 2021 to Jan 28, 2022.

Lessons timetable


Learning outcomes

The course covers the mathematical background needed to describe, analyze and derive value from datasets, including Big Data and unstructured data, and to master the main probabilistic models used in data science. Starting from basic models (for example regressions, PCA-based predictors, Bayesian statistics and filters), particular emphasis is placed on mathematically rigorous quantitative approaches for optimizing the data collection, cleaning and organization phases (e.g. historical data series, unstructured data generated on social media, semantic elements, etc.). The mathematical tools needed to describe, analyze and forecast time series are also introduced. Throughout, the course content is developed alongside real problems from heterogeneous sectors (industrial, economic, social, etc.), using software oriented to probabilistic modeling, for example Knime, ElasticSearch, Kibana, R AnalyticFlow, Orange, etc.

At the end of the course, students must demonstrate that they have acquired the following skills:
● know and be able to use the basic tools for the treatment of time series and their indicators, e.g.,
● know and be able to develop forecasting solutions based on inferential statistical models, e.g., AR, MA, ARMA, ARIMA, ARIMAX, Box-Jenkins, autocovariance and partial autocorrelation, seasonality (SARIMA), analysis of variance (ANOVA, MANOVA), etc.
● know how to identify the parameters that characterize a given population via methods such as error minimization, maximum likelihood, etc.
● know how to estimate / identify / reconstruct characteristics related to first-order analysis, smoothing techniques, spectral decomposition, polynomial fitting, etc.
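
As an illustration of the parameter identification skill listed above, here is a minimal sketch of maximum likelihood estimation for a Gaussian population. It is not taken from the course material: the function names (`gaussian_mle`, `gaussian_log_likelihood`), the simulated data and the chosen true parameters are all assumptions made for this example, and only the Python standard library is used.

```python
import math
import random


def gaussian_mle(sample):
    """Closed-form maximum likelihood estimates (mu, sigma2) for i.i.d. Gaussian data."""
    n = len(sample)
    mu = sum(sample) / n
    # The ML estimate of the variance divides by n (not n - 1), so it is slightly biased.
    sigma2 = sum((x - mu) ** 2 for x in sample) / n
    return mu, sigma2


def gaussian_log_likelihood(sample, mu, sigma2):
    """Log-likelihood of the sample under a Gaussian with mean mu and variance sigma2."""
    n = len(sample)
    squared_error = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)


# Hypothetical population: mean 5, standard deviation 2 (assumed for illustration only).
random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(5000)]

mu_hat, sigma2_hat = gaussian_mle(data)
ll_at_mle = gaussian_log_likelihood(data, mu_hat, sigma2_hat)
ll_shifted = gaussian_log_likelihood(data, mu_hat + 0.5, sigma2_hat)

print("estimated mean:", mu_hat)
print("estimated variance:", sigma2_hat)
print("log-likelihood is higher at the MLE:", ll_at_mle > ll_shifted)
```

The comparison at the end checks that moving the mean away from the estimate lowers the log-likelihood, which is the defining property of the maximum likelihood estimate.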

Program

The course program is divided into the following macro-topics:
● Time domain analysis
● Frequency domain analysis
● Data analysis and cleaning tools (e.g. identification of outliers)
● Maximum likelihood methods, likelihood metrics, probability density fitting
● Principal Component Analysis (PCA) [PCA-based regressors / predictors]
● AR, MA, ARMA, ARIMA, Box-Jenkins, ARCH, GARCH models and generalizations
● Time series decomposition
● ACF / PACF and related "views"
● Hypothesis testing
● Gaussian / jump / compound processes
● Decomposition of "white noise" processes
● Bayesian statistics and applications
● Forecast evaluation using inferential statistical models based, e.g., on autocovariance and partial autocorrelation, seasonality (SARIMA), analysis of variance (ANOVA, MANOVA), etc.
● Smoothing techniques, spectral decomposition, polynomial fitting, etc.
● Creation of the models referred to in the previous points to solve concrete case studies.
The latter aspect will mainly, but not exclusively, involve Python coding, together with statistical/probabilistic libraries and software such as, e.g., Knime, ElasticSearch, Kibana, R, TensorFlow, Prophet, AnalyticFlow, Orange, etc.
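
To give a flavour of the AR and ACF topics in the program, the following is a minimal sketch in plain Python, not course material. It simulates an AR(1) process, computes the sample autocorrelation function, and recovers the coefficient from the lag-1 autocorrelation (the Yule-Walker estimate for an AR(1) model). The helper names `simulate_ar1` and `sample_acf`, the coefficient 0.7 and the series length are assumptions chosen for illustration; in practice a statistical library would normally be used instead.

```python
import random


def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Simulate n points of x_t = phi * x_{t-1} + eps_t with Gaussian noise eps_t."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x


def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator, divides by n)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
        for k in range(max_lag + 1)
    ]


# Hypothetical example: AR(1) with coefficient 0.7 (assumed for illustration only).
series = simulate_ar1(phi=0.7, n=5000)
acf = sample_acf(series, max_lag=5)

# For an AR(1) process the Yule-Walker estimate of phi is the lag-1 autocorrelation.
phi_hat = acf[1]

# One-step-ahead forecast around the sample mean.
series_mean = sum(series) / len(series)
one_step_forecast = series_mean + phi_hat * (series[-1] - series_mean)

print("sample ACF:", [round(r, 3) for r in acf])
print("estimated phi:", round(phi_hat, 3))
print("one-step forecast:", round(one_step_forecast, 3))
```

For an AR(1) process the theoretical autocorrelation at lag k is phi to the power k, so the sample ACF should decay roughly geometrically; a slowly decaying or oscillating ACF would instead suggest a different model order or a non-stationary series.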

Bibliography

Go to the bibliography

View the bibliography with Leganto, a tool provided by the Library System that offers a simple and innovative way to retrieve the texts on the exam syllabus.

Examination Methods

The final exam consists of two parts: a theoretical part and a practical/coding part.
The first part verifies the student's understanding of the theoretical concepts underlying statistical methods and the associated models/algorithms. These are then applied to a project chosen by the student in agreement with the course lecturers.
This "case study", together with a discussion of the code written to complete it, is the subject of the second and final part of the exam.

Students with disabilities or specific learning disorders (SLD), who intend to request the adaptation of the exam, must follow the instructions given HERE
