Difference between revisions of "Project Overview"

From ECRIN-MDR Wiki
Line 1: Line 1:
 +
=The ECRIN Clinical Research Metadata Repository (MDR)=
 +
 
=== The need: To make clinical research data and documents FAIR===
 
=== The need: To make clinical research data and documents FAIR===
 
In recent years there has been a growing acceptance that to accurately assess the results of trials and other clinical research, and in particular to combine the results from different trials in meta-analyses, it is much better to have access to the original source data, the “individual participant data” (IPD), as well as the result summaries found in published papers. <br>
 
In recent years there has been a growing acceptance that to accurately assess the results of trials and other clinical research, and in particular to combine the results from different trials in meta-analyses, it is much better to have access to the original source data, the “individual participant data” (IPD), as well as the result summaries found in published papers. <br>
 
In addition, to make sure that the IPD can be fully understood and properly analysed, a variety of other study documents (protocols, analysis plans, etc.) are required. As a result, under pressure from funders and journal editors, more and more researchers are making such material (generically, “clinical trial data objects”) available for sharing with others. The datasets are rarely freely available - instead a variety of access mechanisms (e.g. individual request and review, membership of pre-authorised groups, or web based self-attestation), are used in combination with different access types (e.g. download versus in-situ perusal). Furthermore the various data objects are stored in a wide variety of different locations: a rapidly growing number of general and specialised data repositories, trial registries, publications, the original researchers’ institutions, etc. <br>
 
In addition, to make sure that the IPD can be fully understood and properly analysed, a variety of other study documents (protocols, analysis plans, etc.) are required. As a result, under pressure from funders and journal editors, more and more researchers are making such material (generically, “clinical trial data objects”) available for sharing with others. The datasets are rarely freely available - instead a variety of access mechanisms (e.g. individual request and review, membership of pre-authorised groups, or web based self-attestation), are used in combination with different access types (e.g. download versus in-situ perusal). Furthermore the various data objects are stored in a wide variety of different locations: a rapidly growing number of general and specialised data repositories, trial registries, publications, the original researchers’ institutions, etc. <br>
The researcher or reviewer wishing to locate relevant data objects for a study is therefore faced with a bewildering mosaic of possible source locations and access mechanisms, and this problem of ‘discoverability’ will almost certainly become much worse in the future as more and more materials are made available for sharing. Systems are therefore required to make the data, and associated documents, generated by clinical research more FAIR: Findable, Accessible, Inter-Operable, and Re-usable. The MDR is designed to be one such system.<br><br>
+
The researcher or reviewer wishing to locate relevant data objects for a study is therefore faced with a bewildering mosaic of possible source locations and access mechanisms, and this problem of ‘discoverability’ will almost certainly become much worse in the future as more and more materials are made available for sharing. Systems are therefore required to make the data and associated documents generated by clinical research more FAIR: Findable, Accessible, Inter-Operable, and Re-usable. The MDR is designed to be one such system.<br><br>
  
=== Aims of the Project ===
+
=== Central Aim of the MDR===
The principal aim of the project is to promote the FAIRness of clinical data, by making the data objects generated from clinical research easier to locate and by describing how each of those data objects can be accessed, providing direct links to them where that is possible. The central idea is to develop systems that can collect the ''metadata'' about the data objects, including object provenance, location and access details, from a variety of source systems (e.g. trial registries, data repositories, bibliographic systems) and aggregate it into a single  '''MetaData Repository''' (or MDR). The MDR is designed to assemble, standardise and display the metadata about clinical studies and the data objects generated by them, on a global scale, and provide access to that metadata through a single system, accessed via a web portal.
+
The principal aim of the MDR project is therefore to promote the FAIRness of clinical data, by making the data objects generated from clinical research easier to locate and by describing how each of those data objects can be accessed, providing direct links to them where that is possible. The central idea is to develop systems that can collect the ''metadata'' about the data objects, including object provenance, location and access details, from a variety of source systems (e.g. trial registries, data repositories, bibliographic systems) and aggregate it into a single  '''MetaData Repository''' (or MDR). The MDR is designed to assemble, standardise and display the metadata about clinical studies and the data objects generated by them, on a global scale, and provide access to that metadata through a single system, accessed via a web portal. It is designed to make that metadata easier to search, allowing filtering and selection of studies and associated data objects.
 
<br><br>
 
<br><br>
  
=== Implementation ===
+
=== Implementation and Partners===
 
The MDR system has been designed and developed by [https://www.ecrin.org/ '''ECRIN'''] (the European Clinical Research Infrastructure Network), in collaboration with ([https://www.onedata.org/#/home '''ONEDATA''' ]) and [http://www.bo.infn.it/ '''INFN'''] (Istituto Nazionale di Fisica Nucleare) at Bologna. Development of the whole project has been within the H2020 [http://www.extreme-datacloud.eu/the-project/ '''eXtreme - DataCloud'''](XDC) project, funded by the EU under grant agreement 777367.
 
The MDR system has been designed and developed by [https://www.ecrin.org/ '''ECRIN'''] (the European Clinical Research Infrastructure Network), in collaboration with ([https://www.onedata.org/#/home '''ONEDATA''' ]) and [http://www.bo.infn.it/ '''INFN'''] (Istituto Nazionale di Fisica Nucleare) at Bologna. Development of the whole project has been within the H2020 [http://www.extreme-datacloud.eu/the-project/ '''eXtreme - DataCloud'''](XDC) project, funded by the EU under grant agreement 777367.
 
<br><br>
 
<br><br>
 
Metadata from a variety of data sources have been collected by ECRIN using different modalities (e.g. DB download, import of XML files through an API, scraping of web pages) and stored in a relational DB on the test bed server at INFN. Data is then exported as json file metadata to the OneData file management system and indexed via Elastic Search to make it available to the web portal. <br>
 
Metadata from a variety of data sources have been collected by ECRIN using different modalities (e.g. DB download, import of XML files through an API, scraping of web pages) and stored in a relational DB on the test bed server at INFN. Data is then exported as json file metadata to the OneData file management system and indexed via Elastic Search to make it available to the web portal. <br>
 
For further details about these processes please see the Methodology and Individual data source sections.
 
For further details about these processes please see the Methodology and Individual data source sections.

Revision as of 11:03, 28 October 2020

The ECRIN Clinical Research Metadata Repository (MDR)

The need: To make clinical research data and documents FAIR

In recent years there has been a growing acceptance that to accurately assess the results of trials and other clinical research, and in particular to combine the results from different trials in meta-analyses, it is much better to have access to the original source data, the “individual participant data” (IPD), as well as the result summaries found in published papers.
In addition, to make sure that the IPD can be fully understood and properly analysed, a variety of other study documents (protocols, analysis plans, etc.) are required. As a result, under pressure from funders and journal editors, more and more researchers are making such material (generically, “clinical trial data objects”) available for sharing with others. The datasets are rarely freely available; instead, a variety of access mechanisms (e.g. individual request and review, membership of pre-authorised groups, or web-based self-attestation) are used in combination with different access types (e.g. download versus in-situ perusal). Furthermore, the various data objects are stored in a wide variety of locations: a rapidly growing number of general and specialised data repositories, trial registries, publications, the original researchers’ institutions, etc.
The researcher or reviewer wishing to locate relevant data objects for a study is therefore faced with a bewildering mosaic of possible source locations and access mechanisms, and this problem of ‘discoverability’ will almost certainly become much worse in the future as more and more materials are made available for sharing. Systems are therefore required to make the data and associated documents generated by clinical research more FAIR: Findable, Accessible, Interoperable, and Reusable. The MDR is designed to be one such system.

Central Aim of the MDR

The principal aim of the MDR project is therefore to promote the FAIRness of clinical data, by making the data objects generated from clinical research easier to locate and by describing how each of those data objects can be accessed, providing direct links to them where that is possible. The central idea is to develop systems that can collect the metadata about the data objects, including object provenance, location and access details, from a variety of source systems (e.g. trial registries, data repositories, bibliographic systems) and aggregate it into a single MetaData Repository (or MDR). The MDR is designed to assemble, standardise and display the metadata about clinical studies and the data objects generated by them, on a global scale, and provide access to that metadata through a single system, accessed via a web portal. It is designed to make that metadata easier to search, allowing filtering and selection of studies and associated data objects.

Implementation and Partners

The MDR system has been designed and developed by ECRIN (the European Clinical Research Infrastructure Network), in collaboration with ONEDATA and INFN (Istituto Nazionale di Fisica Nucleare) at Bologna. The whole project has been developed within the H2020 eXtreme - DataCloud (XDC) project, funded by the EU under grant agreement 777367.

Metadata from a variety of data sources has been collected by ECRIN using different modalities (e.g. database download, import of XML files through an API, scraping of web pages) and stored in a relational database on the test bed server at INFN. The metadata is then exported as JSON file metadata to the OneData file management system and indexed via Elasticsearch to make it available to the web portal.
For further details about these processes please see the Methodology and Individual data source sections.
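
As a rough illustration of the export step described above, the following Python sketch reads study and data object records from a relational database, assembles one JSON document per study, and formats those documents as a newline-delimited body of the kind accepted by the Elasticsearch bulk API. It is only a sketch under stated assumptions: the table names, column names, sample records, index name and the use of an in-memory SQLite database are all hypothetical and are not taken from the MDR's actual schema or code.

```python
import json
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE studies (
            study_id TEXT PRIMARY KEY, title TEXT, source TEXT);
        CREATE TABLE data_objects (
            object_id TEXT PRIMARY KEY, study_id TEXT, object_type TEXT,
            access_type TEXT, url TEXT);
    """)
    # Invented sample records, for illustration only.
    conn.executemany("INSERT INTO studies VALUES (?, ?, ?)", [
        ("S1", "Hypothetical trial A", "trial registry"),
        ("S2", "Hypothetical trial B", "data repository"),
    ])
    conn.executemany("INSERT INTO data_objects VALUES (?, ?, ?, ?, ?)", [
        ("O1", "S1", "protocol", "download", "https://example.org/o1"),
        ("O2", "S1", "IPD dataset", "request and review", None),
        ("O3", "S2", "analysis plan", "download", "https://example.org/o3"),
    ])
    return conn


def export_study_documents(conn):
    """Return one dict per study, with its data objects nested inside."""
    docs = []
    studies = conn.execute(
        "SELECT study_id, title, source FROM studies ORDER BY study_id"
    ).fetchall()
    for study_id, title, source in studies:
        objects = conn.execute(
            "SELECT object_id, object_type, access_type, url "
            "FROM data_objects WHERE study_id = ? ORDER BY object_id",
            (study_id,),
        ).fetchall()
        docs.append({
            "study_id": study_id,
            "title": title,
            "source": source,
            "data_objects": [
                {"object_id": oid, "object_type": otype,
                 "access_type": access, "url": url}
                for oid, otype, access, url in objects
            ],
        })
    return docs


def to_bulk_ndjson(docs, index_name="studies"):
    """Format documents as action/source line pairs for a bulk index request."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index_name, "_id": doc["study_id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # a bulk body ends with a newline


conn = build_demo_db()
documents = export_study_documents(conn)
bulk_body = to_bulk_ndjson(documents)
print(bulk_body)
```

Nesting the data objects inside each study document keeps a study and its associated objects together in one indexed record, which is one simple way to support filtering on either level; the real system may well organise its documents differently.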