Course Overview: CINECA's vision is a federated, cloud-enabled infrastructure that makes population-scale genomic and biomolecular data accessible across international borders. Its ultimate goal is to enable large-scale federated data analysis that is responsible and secure. This requires integrating and harmonizing diverse, large human cohort datasets using community standards. Harmonizing data within and across cohorts adds value for downstream analysis and interpretation and facilitates cross-cohort meta-analysis.
This workshop will discuss common challenges in cohort data harmonization, work towards practical steps to address them, and share best practices. We welcome any cohort that plans prospective or retrospective data harmonization and is enthusiastic about sharing its experience and learning from others' perspectives on cohort data discovery and analysis.
Topics to be covered:
● Data cleaning and curation
● ELSI considerations in merging data
● Data collection standards, ontology terminology and interoperability standards, metadata models
● Data storage standards
● Data harmonization
● Sharing cohort summary data
Participants will learn to:
● Do basic data cleaning
● Understand which data standards and ontologies exist for clinical data
● Map their cohort metadata to a data model
● Understand existing approaches to and algorithms for data harmonization
● Prepare summary data from their cohorts
Link to application form: https://bit.ly/cineca-data-harmonization-registration