Thesis Proposals

On this page, you will find some thesis or summer internship ideas (that can be recognized as “Tirocinio formativo”). Feel free to contact any member of the Machine Learning Group if you want to pursue your own ideas for a Machine Learning thesis. We are usually pretty open to students’ interests and ideas!
Some ideas follow.

Graph p-Laplacian Neural Networks for Leak Prediction/Risk Classification in Water Distribution Networks – Master’s thesis, 6 months.

In recent years, the leak detection problem in Water Distribution Systems (WDS) has gained increasing attention due to climate change and the scarcity of water resources, especially in the southern regions of Italy. One of the most challenging problems is leak localization along the WDS network. When dealing with the mathematical modeling of WDS, it is natural to consider a graph-theoretic setting, in which features such as pipes, intersections, house services, tanks, pumps, and valves are represented by nodes and edges and their related properties. Recent advances in the field of WDS modeling suggest that the dynamics of the main governing physical quantities of a WDS, namely water pressures and fluxes, can be effectively described using weighted graph p-Laplacian-based surrogate modeling. This also suggests that a graph p-Laplacian-based Neural Network could serve as a promising ML/DL strategy for the leak detection problem within the framework of edge/node-based graph classification problems.
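As a rough illustration of the operator mentioned above, the sketch below applies one common definition of the weighted graph p-Laplacian, (Δ_p u)_i = Σ_j w_ij |u_i − u_j|^(p−2) (u_i − u_j), to a node signal. The function name, the toy network, its weights, and the pressure values are all hypothetical; the exact surrogate model used in the WDS literature may differ.

```python
def graph_p_laplacian(weights, u, p=2.0):
    """Apply a weighted graph p-Laplacian to a node signal u.

    weights: dict mapping an undirected edge (i, j) to a positive weight w_ij.
    u: list of node values (here, made-up pressure values).
    Returns (Delta_p u)_i = sum_j w_ij * |u_i - u_j|**(p - 2) * (u_i - u_j).
    """
    out = [0.0] * len(u)
    for (i, j), w in weights.items():
        diff = u[i] - u[j]
        if diff == 0.0:
            continue  # no contribution; also avoids 0 ** negative when p < 2
        flow = w * abs(diff) ** (p - 2.0) * diff
        out[i] += flow  # contribution of edge (i, j) seen from node i
        out[j] -= flow  # same edge seen from node j (opposite sign)
    return out


# Toy 4-node network: path 0-1-2-3 plus a chord 1-3 (hypothetical weights).
edges = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.0, (1, 3): 0.5}
pressure = [10.0, 8.0, 7.5, 7.0]

lap2 = graph_p_laplacian(edges, pressure, p=2.0)  # standard graph Laplacian
lap3 = graph_p_laplacian(edges, pressure, p=3.0)  # nonlinear case
print(lap2, lap3)
```

For p = 2 the operator reduces to the usual linear graph Laplacian; for other values of p the edge contributions depend nonlinearly on the difference between neighbouring nodes.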

The goal of the thesis is to explore and address these challenges by proposing novel techniques to highlight and predict the most critical leaking areas in the network. This thesis is a collaboration with 2F Water Venture s.r.l., a leading company in the field of leak detection.

The candidate will have access to a large variety of graph-structured data, such as more than 12000 leak positions, 150000 km of water distribution network shapefiles, house service densities, traffic data, demographic data, Digital Terrain Models, satellite images, and pressure/flux monitoring and simulation data.

The candidate will be introduced to every aspect of the company and will receive all the technical assistance needed to accomplish the project in collaboration with the Data Science team. Concrete opportunities for future job collaborations will be taken into account depending on the inclinations and specific skills of the candidate.
Contact: Nicolò Navarin

Physics-informed Graph Neural Networks – Master’s thesis, 6 months.

Contact: Nicolò Navarin

Use of generative AI in the context of citizen access to services – Master’s thesis, 6 months starting from April 2024.
How many times have we found ourselves faced with a Public Administration website that is difficult to navigate, unable to understand the logic for finding the service that meets our needs? We have all noticed the distance between the language of Public Administration and that of citizens. The project aims to create, using generative Artificial Intelligence, a specialized mediator that guides citizens in the use of both information services (informed citizen) and interactive services (active citizen) provided by a municipality such as Padua. The first phase of the project will consist of creating a system capable of supporting semantic search scenarios on various types of content (structured data and textual documents), which in this first solution will be indexed in advance using the company’s own semantic “embedding” features of LLMs. The first objective will be to identify the knowledge base (and therefore the use case) whose content format, number of typologies, and number of instances per typology are suitable for the tools currently available for text understanding, semantic indexing, semantic search, and text generation. The user’s requests will be expressed in natural language, and the system will use its linguistic skills to: (I) extract from the context of the individual request (request message, and possibly profile; (II) elements or additional indications generated by the system) the elements that can best characterize and specify the semantics of the question; (III) return the results of the search in natural language (synthesizing them, integrating them, or reporting the most significant ones).
In the next phase, we plan to extend the system’s skills to make it capable of: (I) providing active support in the role of assistant, with search paths that develop interactively in a conversational context; (II) using external data sources (REST endpoints, databases, real-time update services) to improve, where the system identifies the need or opportunity, the information content of the answers provided to the user.
The participants in the internship will be included in the first phase of a structured project which, in subsequent developments, aims to create complete virtual assistant functions targeted (in their respective areas of interest) at both citizens and public administration operators. In this initial phase, the proposed activities will focus on creating semantic search services which, through interactive methods based on simple natural-language queries (for now without the full support of a conversational context), facilitate exploration and search in some domains representative of the information, documentary, and procedural skills assets that the public administration holds. In this context, also taking into account the inclinations and specific skills of the participants, the tasks entrusted may include: (I) acquisition and indexing of heterogeneous data coming from different sources, imported in vector and semantically qualified form; (II) assessment of semantic database solutions and of the available language models (with particular regard to the open-source world and the Italian language), and of the related parameterizations, to evaluate their effectiveness for indexing content, interpreting queries, and synthesizing and returning results; (III) evaluation of possible solutions for recognizing specific areas of search and interest declared or implicitly expressed by the user; (IV) contributions to the development, configuration, and deployment of application components.
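The indexing and query steps described above can be sketched, very roughly, as embedding documents into vectors and ranking them by cosine similarity with the embedded query. In the sketch below, `toy_embed` is a deliberately naive bag-of-words stand-in for a real LLM embedding model, and the document identifiers and texts are invented for illustration; none of this reflects the project's actual tools.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an LLM embedding: a word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents):
    """Embed every document once, ahead of query time."""
    return [(doc_id, toy_embed(text)) for doc_id, text in documents.items()]


def search(index, query, top_k=2):
    """Return the ids of the top_k documents most similar to the query."""
    q = toy_embed(query)
    scored = [(cosine(q, vec), doc_id) for doc_id, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for score, doc_id in scored[:top_k] if score > 0.0]


# Invented examples of municipal service descriptions.
documents = {
    "id_card": "renew your identity card at the registry office",
    "waste_tax": "pay the municipal waste tax online",
    "school": "enrol a child in a municipal nursery school",
}
index = build_index(documents)
results = search(index, "how do I renew my identity card", top_k=1)
print(results)
```

A real system would replace `toy_embed` with a learned embedding model and the list scan with a vector database, which is exactly the kind of choice the assessment tasks above are meant to inform.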
Contact: Alessandro Sperduti

Analytics in the environmental field – Master’s thesis, 6 months starting from April/May 2024.
The smart city is characterized by intelligent management of services and places through the use of technologies that improve the quality of life and well-being of people. Global warming and energy consumption in cities also create critical environmental situations that put health at risk and accelerate urban warming. Heat waves, combined with urbanization that does not mitigate these conditions, can generate real heat bubbles in some areas of the city. These appear as heat islands where temperatures can be up to 4-5 degrees higher than in other parts of the city. The project aims to use data collected by numerous sensors distributed in the city, cartographic and cadastral databases, and satellites to create analytics that support urban regeneration processes aimed at combating urban heating.
In this context, also taking into account the inclinations and specific skills of the participants, the tasks entrusted to the interns may include: (I) Detection and formalization of information needs to support decisions; (II) Definition of analytics; (III) Assessment of the necessary data sources; (IV) Insertion of data sources into the Data Catalog; (V) Analytics development; (VI) Development of predictive simulation models.
Contact: Alessandro Sperduti

Analytics in the context of city mobility – Master’s thesis, 6 months starting from April/May 2024.
The smart city is characterized by intelligent management of services and places through the use of technologies that improve the quality of life and well-being of people. Mobility is certainly one of the most important aspects of living in a city. It affects our days, at work and beyond: good mobility reduces pollution and spares us time that we could instead dedicate to our interests. The project aims, on the basis of data collected by numerous sensors distributed in the city, the databases of mobility ordinances, data on weather conditions, accidents, ongoing road works, and many other sources, to create analytics in the following domains: (I) Accidents, (II) Vehicle traffic models, and (III) Soft mobility flow models.
In this context, also taking into account the inclinations and specific skills of the participants, the tasks entrusted to the interns may include: (I) Detection and formalization of information needs to support decisions; (II) Definition of analytics; (III) Assessment of the necessary data sources; (IV) Insertion of data sources into the Data Catalog; (V) Analytics development; (VI) Development of predictive simulation models.
Contact: Alessandro Sperduti

Analytics in the field of real estate – Master’s thesis, 6 months starting from April/May 2024.
Walking the streets of our city, all of us have noticed the large number of unused properties, both residential and commercial. We have all traveled and have realized, looking at the city maps of booking portals, how many private properties are used for tourist hospitality. Meanwhile, out-of-town students struggle to find and keep quality accommodation at a sustainable cost. The objective of the project is to develop analytics in the area described above in the context of the City of Padua. The aim is to support the municipal administration in defining incentive tools to make unused properties available or, where property management falls under regional or national competence, to support political action that raises awareness of the need for adequate regulatory intervention to overcome this critical issue. Multiple data sets will be available (population distribution, business location, tourist presences, property register, rental contracts, …), appropriately stripped of personal information and therefore usable for the development of analytics.
In this context, also taking into account the inclinations and specific skills of the participants, the tasks entrusted to the interns may include: (I) Detection and formalization of information needs to support decisions; (II) Definition of analytics; (III) Assessment of the necessary data sources; (IV) Insertion of data sources into the Data Catalog; (V) Analytics development; (VI) Development of predictive simulation models.
By way of example, an output could be the creation of dashboards (even with a cartographic component) to analyze the qualitative situation and the degree of use of the real estate assets of the municipal area of Padua.
Contact: Alessandro Sperduti

Internship Aprilia Racing – Master’s thesis with 6-month internship c/o Data Analysis Strategies and Methods Area – Aprilia MotoGP – Noale (VE).
This internship aims to implement machine learning algorithms for the optimization of electronic settings in racing motorcycles. Project description: “Use of Machine Learning techniques to predict real-time parameters currently obtained from fluid dynamic simulations, given a geometric variation of a component of the racing motorbike: analysis of the problem, choice of methodology, in-house creation of algorithms in Python and their validation”.
See poster: https://mlg.math.unipd.it/download/Thesis_Aprilia_UniPD-EN_31-01-2023.pdf
Contact: Alessandro Sperduti, Nicolò Navarin

Mining Users’ Attributes from their Public Spotify Data Using Graph Neural Networks – Master’s thesis, 6 months.
The proliferation of data on digital platforms like Spotify offers a unique opportunity to delve into the intricacies of user preferences and behaviors. Spotify’s extensive interconnected data structures encompassing users, songs, playlists, artists, and more present a wealth of information. However, extracting latent user attributes from these intricate network structures poses a significant challenge. Traditional machine learning techniques struggle to unravel the complex relationships and dependencies inherent in such networks, underscoring the need for advanced methodologies such as Graph Neural Networks (GNNs). This thesis proposes harnessing the power of Graph Neural Networks (GNNs) to uncover sensitive user attributes from Spotify data. Specifically, our focus is on discerning recurring musical characteristics associated with users’ attributes, including demographics, habits, or personality traits.
Contact: Luca Pasa, Mauro Conti, Luca Pajola

Hypernetworks for graph-structured data – Master’s thesis, 6 months.
Graph-structured data is ubiquitous in various domains, including social networks, biological networks, and knowledge graphs. Graphs are dynamic, heterogeneous, and can vary significantly in size, making traditional neural network architectures less suitable for effective learning. Hypernetworks, a class of neural networks that generate weights for other networks, have shown promise in capturing complex patterns in high-dimensional data. Hypernetworks offer a unique approach by generating weights dynamically, potentially addressing these challenges. However, their application to graph-structured data is an emerging area of research that requires in-depth exploration. The goal of the thesis is to explore the application of Hypernetworks to graph-structured data, aiming to develop innovative techniques that enhance scalability, flexibility, and performance in graph learning tasks.
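To make the idea of "a network that generates weights for another network" concrete, here is a minimal sketch in which a linear hypernetwork maps a conditioning vector to the weights and bias of a small target linear layer. The class name, the dimensions, and the use of a two-number graph descriptor as the conditioning vector are all assumptions made for illustration; they are not a method prescribed by the proposal.

```python
import random


def linear(weights, bias, x):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HyperNetwork:
    """Linear hypernetwork: conditioning vector -> parameters of a target layer."""

    def __init__(self, cond_dim, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.in_dim, self.out_dim = in_dim, out_dim
        n_params = out_dim * in_dim + out_dim  # target weights plus target bias
        self.H = [[rng.gauss(0.0, 0.1) for _ in range(cond_dim)]
                  for _ in range(n_params)]
        self.c = [0.0] * n_params

    def generate(self, cond):
        """Return (W, b) of the target layer for the given conditioning vector."""
        flat = linear(self.H, self.c, cond)
        split = self.out_dim * self.in_dim
        W = [flat[r * self.in_dim:(r + 1) * self.in_dim]
             for r in range(self.out_dim)]
        b = flat[split:]
        return W, b


hyper = HyperNetwork(cond_dim=2, in_dim=3, out_dim=2)
node_features = [1.0, 0.5, -1.0]

# Hypothetical graph descriptors (e.g. normalised size and density).
W_small, b_small = hyper.generate([0.1, 0.2])
W_large, b_large = hyper.generate([0.9, 0.4])

out_small = linear(W_small, b_small, node_features)
out_large = linear(W_large, b_large, node_features)
print(out_small, out_large)
```

The same node features are transformed differently depending on the conditioning vector, because only the hypernetwork's own parameters (`H`, `c`) are fixed; the target layer's weights are produced on demand.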
Contact: Luca Pasa

Speech enhancement in multi-talker environments – Master’s thesis, 6 months.
In everyday environments, speech signals are often contaminated by various interfering noises and multiple speakers, making it challenging for speech recognition systems and communication devices to function effectively. Multi-talker environments, where multiple speakers are active simultaneously, pose a significant problem for speech enhancement techniques. Traditional speech enhancement methods struggle to effectively isolate target speech from interfering speakers and background noise in multi-talker environments. The thesis project aims to investigate advanced deep-learning approaches to enhance speech signals in such complex acoustic scenarios. The research focuses on developing innovative methods for speech separation, denoising, and quality improvement in multi-talker environments.
Contact: Luca Pasa

Graph Neural Networks for Large-Scale Datasets – Master’s thesis, 6 months.
Graph Neural Networks (GNNs) have emerged as powerful tools for processing graph-structured data, exhibiting promising performance in various applications, but handling large-scale graphs poses significant challenges due to their size and complexity. Existing GNN models often suffer from scalability issues, limiting their applicability to real-world, large-scale datasets. The thesis goal is to explore and address these challenges by proposing novel techniques to optimize GNNs for processing massive graphs efficiently while maintaining high prediction accuracy.
Contact: Luca Pasa

Deep Learning Models for Neuroimaging Analysis – Master’s thesis, 6 months.
Neuroimaging plays a crucial role in understanding the structure, function, and disorders of the brain. With the advent of Deep Learning techniques, there has been a paradigm shift in the field of neuroimaging analysis. Deep Learning models have demonstrated exceptional capabilities in extracting intricate patterns from neuroimaging data, leading to more accurate diagnoses and personalized treatment strategies. This thesis project aims to explore the application of Deep Learning models in neuroimaging analysis, focusing on developing models that can handle diverse imaging modalities, interpret complex brain structures, and provide reliable diagnostic insights. 
Contact: Luca Pasa

Forecasting Stock Prices Using Deep Learning – Master’s thesis, 6 months.
Deep learning, particularly Recurrent Neural Networks (RNNs), has emerged as a potent tool for various predictive tasks, yet its application to stock price prediction presents unique challenges due to the intricate nature of financial markets. While existing deep learning methods exhibit promise, their effectiveness in capturing the nuanced patterns of stock price movements requires further exploration. This thesis aims to investigate the application of RNNs and other deep learning techniques in stock price prediction and explore novel approaches to enhance their predictive accuracy.
Contact: Luca Pasa

MLP-Mixer for graphs – Master’s thesis, 6 months.
A simple MLP architecture, shown to be competitive with CNNs and Transformers on images, applied to graphs.
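For orientation, the sketch below implements a single simplified Mixer block on a matrix of node features, treating nodes as tokens: a token-mixing MLP acts across nodes and a channel-mixing MLP acts across features, each with a residual connection. It omits biases and learnable normalization parameters, and all names and sizes are hypothetical. Note that the token-mixing weights fix the number of nodes, which is one of the issues a graph adaptation would have to address.

```python
import math
import random


def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def layer_norm(m, eps=1e-5):
    """Normalise each row to zero mean and unit variance (no learned scale)."""
    out = []
    for row in m:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def mlp(m, w1, w2):
    """Two-layer MLP applied row-wise, with a GELU in between."""
    hidden = [[gelu(v) for v in row] for row in matmul(m, w1)]
    return matmul(hidden, w2)


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def mixer_block(x, params):
    """x has shape (n_tokens, n_channels); the output has the same shape."""
    # Token mixing: transpose so the MLP acts across tokens for each channel.
    y = transpose(layer_norm(x))
    y = transpose(mlp(y, params["tok_w1"], params["tok_w2"]))
    x = add(x, y)
    # Channel mixing: the MLP acts across channels for each token.
    z = mlp(layer_norm(x), params["ch_w1"], params["ch_w2"])
    return add(x, z)


rng = random.Random(0)
n_nodes, n_channels = 4, 3
params = {
    "tok_w1": rand_matrix(n_nodes, 5, rng),
    "tok_w2": rand_matrix(5, n_nodes, rng),
    "ch_w1": rand_matrix(n_channels, 6, rng),
    "ch_w2": rand_matrix(6, n_channels, rng),
}
x = rand_matrix(n_nodes, n_channels, rng)  # made-up node features
out = mixer_block(x, params)
print(len(out), len(out[0]))
```

Nothing in this block uses the graph's edges; how to inject the adjacency structure into the mixing steps is left open here.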
Contact: Alessandro Sperduti, Nicolò Navarin

Reinforcement Learning for Robotic Hands – Master’s thesis, 6 months.
This thesis project will start from a previous work in which a robotic hand, mounted on a workbench equipped with a camera, learns to sort objects. The starting point is the development of an end-to-end reinforcement learning system instead of the existing multi-step approach.
Contact: Alessandro Sperduti, Nicolò Navarin

Deception detection – Master’s thesis, 6 months.
In this project, the student will develop a deep neural network to analyze audio/video recordings. The task is to discriminate between people who are telling the truth and people who are lying.
Contact: Nicolò Navarin

Deception detection app using Neural Engine – Bachelor’s or Master’s thesis/internship.
We developed an app that records short videos and sends them to a REST service that runs a neural network and returns the probability that the person in the video is lying. The app displays this probability. The project involves running the neural networks on-device, using the Neural Computing Engines present in recent mobile chips.
Contact: Nicolò Navarin

Natural Language Processing – Master’s thesis, 6 months.
Various projects related to NLP are available.
Contact: Giovanni Da San Martino

Virtual Reality – Master’s thesis, 6 months.
This is an exploratory thesis. It involves the development of a virtual reality environment/character and the study of its integration with machine learning techniques to develop intelligent virtual agents.
Contact: Nicolò Navarin

Interpreting classification of single-cell expression data – Master’s Thesis / Internship.
Deep neural networks based on the autoencoder architecture have recently emerged as a powerful tool to classify cells into different subtypes, depending on the repertoire of genes they express. The major limitation of these classifiers, however, is the lack of interpretability of their results. Specifically, it is not clear which genes contribute the most to the separation among cell types. This project would apply recently developed techniques to disentangle the latent space generated by the encoder and identify attributes (i.e. axes of variation) leading to changes in classification. Each axis would thus correspond to a combination of gene contributions directly related to cellular identities. This work is in collaboration with the Sales Lab (Biology Dept.).
Contact: Nicolò Navarin, Gabriele Sales