SoBigData RI @ UNIVAQ

Infrastructure Services and Research for Open Data Science

Overview

The SoBigData.it project aims to strengthen the Italian node of the SoBigData research infrastructure (SoBigData.eu), with the goal of enhancing interdisciplinary and innovative research on the multiple aspects of social complexity by combining data-driven and model-driven approaches. The SoBigData@UnivAQ event will take place on March 25, 2025. It will present the research infrastructure (SoBigData.eu) to PhD students and young researchers of UnivAQ, along with some results achieved by the University of L'Aquila research group participating in the infrastructure. This SoBigData.it training event is part of Activity 5.7 (Educational activities supported by UNIVAQ), which belongs to WP5 (Responsible Data Science and Training). It contributes to the objectives Support high level training initiative (O5.3), Promoting diversity and inclusion - period 1 (O5.5.1), and Cultivating new generation of data scientists (O5.4). To attend the event, you must register using the form at the link or the one in the Registration section.

Schedule

Explore the full event schedule including session details, times and speakers.

Contributions in VL-XAI to Address and Democratize Software Fairness, Giordano D'Aloisio

AI-based software systems are now employed in every aspect of our lives. However, the wide adoption of these systems raises concerns about their trustworthiness and fairness. Developing fair AI-based systems is a complex task, with many challenges still open. In this talk, we will present our contributions to software fairness in the context of the SoBigData VL-XAI. First, we will introduce a method published in the VL to perform bias mitigation in binary and multi-class classification datasets. Second, we will present an application deployed in the VL to democratize the benchmarking of machine learning models and fairness-enhancing methods.

Urban Digital Twin for Territorial Management - VL Disaster, Gennaro Zanfardino

In the context of the VL Disaster, our research harnesses weather data, air quality data, and walkability metrics to provide insights into the multifaceted impacts of urban planning, emphasizing the importance of emergency preparedness. Despite the plethora of available data, including satellite measurements, accessing it is complex due to the lack of inter-departmental collaboration, standardized data-sharing protocols, and incentives to improve data pipelines. Our efforts in data collection and analysis form foundational components of an 'Urban Scale Digital Twin', pushing the boundaries of traditional urban planning towards a more holistic, data-informed approach that encapsulates the evolving needs of urban residents.

Entity Extraction in Clinical Summary of Mammary Malignancy Data using Transformer-Based Models - VL Health, Payel Patra

Artificial intelligence has revolutionized medical diagnostics and achieved remarkable accuracy in identifying diseases, symptoms, and treatments. By enabling personalized treatment plans and advanced medical data analysis, AI facilitates earlier detection of diseases, enhances diagnostic accuracy (particularly from medical imaging), and streamlines healthcare workflows. Early diagnosis and timely intervention for breast cancer can save many lives. AI analyzes data to extract key entities, enabling predictive models that support cancer-free living and improve patient outcomes. We introduce an innovative AI-driven, transformer-based methodology for extracting insights from unstructured Clinical Summary Mammary Malignancy (CSMM) data. We fine-tune BERT-based models, including BioBERT, BioClinicalBERT, RoBERTa, PubMedBERT, and BlueBERT, to annotate clinical notes with breast cancer-specific concepts, constructing a corpus that supports precise Clinical Named Entity Recognition (NER) of entities such as age, gender, disease, symptoms, medication, doses, medical history, and cancer stages. The comparative analysis between pre-trained and fine-tuned models in this study reveals that our fine-tuned BioBERT model demonstrates robust performance, achieving an F-score of 96.3% in extracting medical entities and confirming its effectiveness in handling domain-specific clinical texts.

How to build a research infrastructure. My experience in the legal unit of SoBigData and my research on data sharing, Lucia Ugolino

SoBigData is the multi-disciplinary research infrastructure for Social Mining and Big Data Analytics. The construction of a research infrastructure engages the jurist from many perspectives. It is a field where legal safeguards are indispensable but at the same time still experimental. National legal systems and the European Union are only now trying to adapt to the requirements of free data sharing, through appropriate institutions and safeguards that must be adapted or created. An infrastructure needs a legal reference model, a statute, and appropriate and detailed governance. Getting there has required a long and gradual path of research and analysis. Here I will describe the main stages and try to explain how this construction work involved a deep reflection on the circulation and sharing of data.

Location

Centro Congressi "Luigi Zordan"

Piazza S. Basilio, 3, 67100 L'Aquila AQ - Aula C

Organizing Committee:

Antinisca Di Marco
Associate Professor in Computer Science
Michele Tucci
Assistant Professor at DISIM
Daniele Di Pompeo
Assistant Professor at DISIM
Katiuscia Pellone
Research Scholarship at DISIM
