SoBigData

Schedule

Explore the full event schedule including session details, times and speakers.

Morning sessions

11:00

Welcome Coffee Break

11:30

Opening Remarks

Antinisca Di Marco, Roberto Trasarti

11:45

Presentation of the SoBigData Research Infrastructure

Daniele Di Pompeo, Michele Tucci, Valerio Grossi

13:00

Lunch Break

Afternoon sessions

14:00

Contributions in VL-XAI to Address and Democratize Software Fairness

Giordano D'Aloisio

14:30

Urban Digital Twin for Territorial Management - VL Disaster

Gennaro Zanfardino

15:00

Entity Extraction in Clinical Summary of Mammary Malignancy Data using Transformer-Based Models - VL Health

Payel Patra

15:30

How to build a research infrastructure. My experience in the legal unit of SoBigData and my research on data sharing

Lucia Ugolino

16:00

Coffee Break

Contributions in VL-XAI to Address and Democratize Software Fairness, Giordano D'Aloisio

AI-based software systems are now employed in every aspect of our lives. However, the wide adoption of these systems raises concerns about their trustworthiness and fairness. Developing fair AI-based systems is a complex task, with many challenges still open. In this talk, we will present our contributions to software fairness in the context of the SoBigData VL-XAI. First, we will introduce a method published in the VL to perform bias mitigation in binary and multi-class classification datasets. Second, we will present an application deployed in the VL to democratize the benchmarking of machine learning models and fairness-enhancing methods.


Urban Digital Twin for Territorial Management - VL Disaster, Gennaro Zanfardino

In the context of the VL Disaster, our research harnesses weather data, air quality data, and walkability metrics to provide insights into the multifaceted impacts of urban planning, emphasizing the importance of emergency preparedness. Despite the plethora of data, including satellite measurements, accessing it is complex due to the lack of inter-departmental collaboration, standardized data-sharing protocols, and incentives to improve data pipelines. Our efforts in data collection and analysis form foundational components of an 'Urban Scale Digital Twin', pushing the boundaries of traditional urban planning towards a more holistic, data-informed approach that encapsulates the evolving needs of urban residents.


Entity Extraction in Clinical Summary of Mammary Malignancy Data using Transformer-Based Models - VL Health, Payel Patra

Artificial intelligence has revolutionized medical diagnostics and achieved remarkable accuracy in identifying diseases, symptoms, and treatments. By enabling personalized treatment plans and advanced medical data analysis, AI facilitates earlier detection of diseases, enhances diagnostic accuracy (particularly from medical imaging), and streamlines healthcare workflows. Early diagnosis and timely intervention for breast cancer can save many lives. AI analyzes data to extract key entities, enabling predictive models that support cancer-free living and improve patient outcomes. We introduce an AI-driven, transformer-based methodology for extracting insights from unstructured Clinical Summary Mammary Malignancy (CSMM) data. We fine-tune BERT-based models, including BioBERT, BioClinicalBERT, RoBERTa, PubMedBERT, and BlueBERT, to annotate clinical notes with breast cancer-specific concepts, constructing a corpus that supports precise Clinical Named Entity Recognition (NER) of age, gender, disease, symptoms, medication, doses, medical history, and cancer stages. A comparative analysis of the pre-trained and fine-tuned models shows that our fine-tuned BioBERT model performs robustly, achieving an F-score of 96.3% in extracting medical entities and confirming its effectiveness in handling domain-specific clinical texts.


How to build a research infrastructure. My experience in the legal unit of SoBigData and my research on data sharing, Lucia Ugolino

SoBigData is the multi-disciplinary research infrastructure for Social Mining and Big Data Analytics. The construction of a research infrastructure engages the jurist from many perspectives. It is a field where legal safeguards are indispensable but still experimental. National legal systems and the European Union are only now trying to meet the requirements of free data sharing by adapting or creating appropriate institutions and safeguards. An infrastructure needs a legal reference model, a statute, and appropriate, detailed governance. Getting there has required a long and gradual path of research and analysis. Here I will describe the main stages and explain how this construction work involved a deep reflection on the circulation and sharing of data.