MDR Data Sources

From ECRIN-MDR Wiki
Latest revision as of 16:58, 23 November 2020

Introduction

This page lists the data sources used by the MDR and provides a very brief descriptive paragraph about each, as well as a link to the source's web site. In a few cases it also links to more detailed information about the data processing associated with a source. The main purpose of the page, however, is to list the published terms and conditions of use of each source and indicate how the MDR believes itself to be in compliance with them.

ClinicalTrials.gov

ClinicalTrials.gov (CTG, https://clinicaltrials.gov/) is by far the largest trial registry in the world, with about 360,000 records of individual studies (as of December 2020), conducted in over 200 countries, the great majority being interventional clinical trials. It is maintained by the National Library of Medicine (NLM) at the National Institutes of Health (NIH), while the information on ClinicalTrials.gov is provided and updated by the sponsor or principal investigator of the clinical study.
Studies are generally submitted to the Web site (that is, registered) when they begin, and the information on the site is updated throughout the study. In an increasing number of cases, results of the study are submitted after the study ends.
ClinicalTrials.gov was created as a result of the Food and Drug Administration Modernization Act of 1997 (FDAMA). The NIH and the FDA worked together to develop the site, which was made available to the public in February 2000. The results database was made available to the public in September 2008.

ClinicalTrials.gov is changing its API and the associated file structure. The new version is in beta but is expected to replace the current one in the near future (further details at https://clinicaltrials.gov/api/gui). The MDR uses the new API version and file structure, to minimise future changes. The CTG web site provides a description of the new API, as both XML (https://clinicaltrials.gov/api/info/study_structure?fmt=XML) and JSON (https://clinicaltrials.gov/api/info/study_structure?fmt=JSON), but a simpler description of the XML file, in the context of MDR use, is available at ClinicalTrials.gov SourceMetadata. Most of the definitions for the data elements are contained in the definition pages maintained by ClinicalTrials.gov, one for each of protocol (https://prsinfo.clinicaltrials.gov/definitions.html), results (https://prsinfo.clinicaltrials.gov/results_definitions.html) and expanded access (https://prsinfo.clinicaltrials.gov/expanded_access_definitions.html) elements.
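
As an illustration of the two formats mentioned above, the following minimal sketch builds the address of the study structure description for either format. The base address is the one linked from this page (https://clinicaltrials.gov/api/info/study_structure); because the API was in beta when this page was written, the address may since have changed or been retired. The function name `study_structure_url` is hypothetical and is not part of the MDR code.

```python
from urllib.parse import urlencode

# Address as linked from this page; it belongs to the beta API and
# may no longer resolve.
STUDY_STRUCTURE_URL = "https://clinicaltrials.gov/api/info/study_structure"


def study_structure_url(fmt: str = "XML") -> str:
    """Return the URL of the study structure description in XML or JSON."""
    fmt = fmt.upper()
    if fmt not in ("XML", "JSON"):
        raise ValueError(f"unsupported format: {fmt}")
    return f"{STUDY_STRUCTURE_URL}?{urlencode({'fmt': fmt})}"


print(study_structure_url("json"))
```

The sketch only constructs the address and makes no network request, so it can be run without access to the CTG site.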

Terms and Conditions of use

Terms and conditions are specified under Use of ClinicalTrials.gov Data (see: https://clinicaltrials.gov/ct2/about-site/terms-conditions). These are listed below, together with an explanatory response from the ECRIN MDR.

1) "Neither the United States Government, U.S. Department of Health and Human Services, National Institutes of Health, National Library of Medicine, nor any of its agencies, contractors, subcontractors or employees of the United States Government make any warranties, expressed or implied, with respect to data contained in the database, and, furthermore, assume no liability for any party's use, or the results of such use, of any part of the database."
Please note that ECRIN makes a similar disclaimer for all data displayed in the MDR.

2) "In any publication or distribution of these data, you should: a) Attribute the source of the data as ClinicalTrials.gov; b) Update the data such that they are current at all times; c) Clearly display the date the data were processed by ClinicalTrials.gov; d) State any modifications made to the content of the data, along with a complete description of the modifications."
Please note: a) data in the MDR is updated on a periodic basis, with the frequency of update increasing as the system develops; b) the intention is to provide a 'pop-up' over displayed data with provenance information, including source and date of data capture. This is planned for version 1.0 of the MDR.

3) "You shall not assert any proprietary rights to any portion of the database, or represent the database or any part thereof to anyone as other than a United States Government database."
We do not, and instead intend to make the provenance of all data clear (see 2 above).

4) "You shall not use any email addresses extracted from our database for marketing or other promotional purposes."
We do not.

5) "The ClinicalTrials.gov data carry an international copyright outside the United States and its Territories or Possessions. Some ClinicalTrials.gov data may be subject to the copyright of third parties; you should consult these entities for any additional terms of use."
We recognise that copyright, but point out that all the data displayed in the system is obtained from files made available publicly and expressly for the purpose of sharing ClinicalTrials.gov data.
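
The provenance information referred to in the responses above (source and date of data capture) could be held in a small record such as the one sketched below. This is an illustration only: the class name, its fields and the wording of the label are assumptions and do not describe the MDR's actual implementation. The `modified` flag is included because condition 2(d) asks that any modifications to the data be stated.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record for one displayed data item."""
    source_name: str        # e.g. "ClinicalTrials.gov"
    captured_on: date       # date the data were captured from the source
    modified: bool = False  # whether the content was changed after capture

    def label(self) -> str:
        """Text that a 'pop-up' over the displayed data might show."""
        text = f"Source: {self.source_name}; data captured {self.captured_on.isoformat()}"
        if self.modified:
            text += "; content modified after capture"
        return text


note = Provenance("ClinicalTrials.gov", date(2020, 11, 23))
print(note.label())
```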

PubMed

PubMed (https://www.ncbi.nlm.nih.gov/pubmed) is run by the US National Library of Medicine (itself part of the National Institutes of Health) and comprises more than 30 million citations for biomedical literature, covering a huge range of life science journals and online books.
Citations may include links to full-text content stored in PubMed Central and, through a 'link-out' mechanism, on publisher web sites.
In the context of the MDR, data collection from PubMed is concerned with identifying and harvesting those citations that can be explicitly linked to clinical studies, and which can therefore be listed as data objects associated with those studies. Fortunately, the National Library of Medicine (NLM) makes available an API that can be used to identify the relevant data and download it as XML.
The processing of PubMed data is different in several ways from the processing of other data sources. This is explained in greater detail on the Processing PubMed Data page, which also includes reference to the details of the source data's structure.
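
The following sketch illustrates the general idea of identifying linked citations and reading the downloaded XML. It assumes that the API in question is the NLM's E-utilities service, that linked citations are found by searching the secondary source ID field (`[si]`) for a registry name, and that study identifiers are read from the `DataBankList` element of each record. These are assumptions made for illustration; the MDR's actual procedure is described on the Processing PubMed Data page. The sample record is invented, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def search_url(registry_name: str, retmax: int = 100) -> str:
    """URL listing PubMed IDs whose secondary source ID names the registry."""
    params = {"db": "pubmed", "term": f"{registry_name}[si]", "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def fetch_url(pmids: list[str]) -> str:
    """URL downloading the full records for the given PubMed IDs as XML."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


# Invented record, reduced to the elements used below.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>00000001</PMID>
      <Article>
        <ArticleTitle>An invented trial report</ArticleTitle>
        <Abstract><AbstractText>Not harvested.</AbstractText></Abstract>
        <DataBankList>
          <DataBank>
            <DataBankName>ClinicalTrials.gov</DataBankName>
            <AccessionNumberList>
              <AccessionNumber>NCT00000000</AccessionNumber>
            </AccessionNumberList>
          </DataBank>
        </DataBankList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def linked_citations(xml_text: str) -> list[dict]:
    """Return citations that name at least one study; abstracts are never read."""
    citations = []
    for article in ET.fromstring(xml_text).iter("PubmedArticle"):
        study_ids = [
            acc.text
            for bank in article.iter("DataBank")
            for acc in bank.iter("AccessionNumber")
        ]
        if study_ids:
            citations.append({
                "pmid": article.findtext("MedlineCitation/PMID"),
                "title": article.findtext("MedlineCitation/Article/ArticleTitle"),
                "study_ids": study_ids,
            })
    return citations


print(search_url("ClinicalTrials.gov"))
print(linked_citations(SAMPLE))
```

Only the PubMed ID, title and study identifiers are extracted here; the abstract is deliberately left untouched.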

Terms and conditions of use

Terms and conditions can be found under National Library of Medicine Terms and Conditions (see: https://www.nlm.nih.gov/databases/download/terms_and_conditions.html) and NLM Copyright Information (see: https://www.nlm.nih.gov/copyright.html). The key points, and an explanatory response from the ECRIN MDR, are listed below.

1) "Attribution and Rights for Government Works: Works produced by the U.S. government are not subject to copyright protection in the United States. Any such works found on National Library of Medicine (NLM) Web sites may be freely used or reproduced without permission in the U.S. Please acknowledge NLM as the source of the information by including the phrase “Courtesy of the U.S. National Library of Medicine” or “Source: U.S. National Library of Medicine.” "
Please note that a) all data retrieved using the NLM's API is publicly available, and b) the intention is to provide a 'pop-up' over displayed data with provenance information, including source and date of data capture. This is planned for version 1.0 of the MDR.

2) "Copyright Protections for Non-Government Works: When using NLM Web sites, you may encounter documents, illustrations, photographs, or other content contributed by or licensed from private individuals, companies, or organizations that may be protected by U.S. and international copyright laws. You can sometimes tell if content is copyrighted if it has the copyright symbol, the name of the copyright holder, or the statement "All rights reserved." However, a copyright notice is not required by law and therefore not all copyrighted content is necessarily marked in this way."
Please note that we do not extract or display any content related to the Abstract sections of a PubMed record, as the Abstract material is protected by the copyright of the journal publisher.

3) "Users who republish or redistribute the data (services, products or raw data) agree to: a) maintain the most current version of all distributed data, or b) make known in a clear and conspicuous manner that the products/services/applications do not reflect the most current/accurate data available from NLM."
Please note: a) data in the MDR is updated on a periodic basis, with the frequency of update increasing as the system develops; b) the intention is to provide a 'pop-up' over displayed data with provenance information, including source and date of data capture. This is planned for version 1.0 of the MDR.

4) "These data are produced with a reasonable standard of care, but NLM makes no warranties express or implied, including no warranty of merchantability or fitness for particular purpose, regarding the accuracy or completeness of the data. Users agree to hold NLM and the U.S. Government harmless from any liability resulting from errors in the data. NLM disclaims any liability for any consequences due to use, misuse, or interpretation of information contained or not contained in the data."
Please note that ECRIN makes a similar disclaimer for all data displayed in the MDR.


EUCTR

The EU Clinical Trials Register (EU CTR) is run by the European Medicines Agency (EMA) and contains information on interventional clinical trials on medicines that started after 1 May 2004 and were conducted in the European Union (EU) or the European Economic Area (EEA). The Register enables searches for information in the EudraCT database. It does not provide information on non-interventional clinical trials of medicines, or clinical trials for surgical procedures, medical devices or psychotherapeutic procedures.
The information that appears on the EU Clinical Trials Register is originally provided by the company or organisation responsible for the clinical trial. The information is loaded into the EudraCT database by the national competent authority. The data on the results of these trials are entered into the database by the sponsors themselves and are published in this Register once the sponsors have validated the data.
As of April 2020, the EU Clinical Trials Register displays about 37,000 clinical trials.

Terms and conditions of use

The EMA's legal terms and conditions are available at https://www.ema.europa.eu/en/about-us/legal-notice#european-medicines-web-portal-section.
The key statements are listed below, together with a response from the ECRIN MDR indicating compliance.

1) "...the Agency accepts no responsibility or liability whatsoever (including, but not limited to, any direct or consequential loss or damage that might occur to you and/or any other third party) arising out of, or in connection with, the information on this site, including information relating to the documents it publishes."
Please note that ECRIN makes a similar disclaimer for all data displayed in the MDR.

2) "The contents of these webpages are © EMA [1995-2020]. In particular, unless otherwise stated, the Agency, according to current European Union and international legislation, is the owner of copyright and other intellectual property rights for documents and other content published on this website. Information and documents made available on the Agency's webpages are public and may be reproduced and/or distributed, totally or in part, irrespective of the means and/or the formats used, for non-commercial and commercial purposes, provided that the Agency is always acknowledged as the source of the material. Such acknowledgement must be included in each copy of the material. Citations may be made from such material without prior permission, provided the source is always acknowledged."
Please note the intention is to provide a 'pop-up' over displayed data with provenance information, including source and date of data capture. This is planned for version 1.0 of the MDR.

3) "The Agency encourages organisations and individuals to create links to its website under the following conditions: Links must not be used in a defamatory context; Linked information must not be changed in any way; Linked information to ema.europa.eu should not be displayed in a manner which suggests endorsement of any commercial product or service; Linked information to ema.europa.eu should not be displayed alongside advertising."
In compliance with this, the ECRIN MDR provides only simple links back to the relevant EU CTR page and adds no further comment or information, other than the provenance data to be added, as described above.

ISRCTN

ISRCTN is a registry and curated database containing the basic set of data items deemed essential to describe a study at inception, as per the requirements set out by the World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) and the International Committee of Medical Journal Editors (ICMJE) guidelines. All study records in the database are freely accessible and searchable and have been assigned an ISRCTN ID.
The registry (ISRCTN) was launched in 2000, in response to the growing body of opinion in favour of prospective registration of randomised controlled trials (RCTs). Originally ISRCTN stood for 'International Standard Randomised Controlled Trial Number'; however, over the years the scope of the registry has widened beyond randomised controlled trials to include any study designed to assess the efficacy of health interventions in a human population. This includes both observational and interventional trials. It is now called simply ISRCTN, and is run by the publishing company BioMed Central (BMC), part of Springer Nature. As of April 2020 it includes about 20,000 records.

Terms and conditions of use

Although the terms and conditions are not easy to find on the ISRCTN website, the home page makes the explicit statement:
"The Terms and Conditions enable anyone to cite with attribution the details in each study record, and encourage unrestricted use of all metadata generated during the process of registration, updating or reporting."
To provide the attribution, the intention is to provide a 'pop-up' over displayed data in the MDR with provenance information, including source and date of data capture. This is planned for version 1.0 of the MDR.

WHO

The mission of the WHO International Clinical Trials Registry Platform (ICTRP) is to ensure that a complete view of research is accessible to all those involved in health care decision making.
It does this by collecting and aggregating clinical trial registry records from all major registries, globally, and then making them all accessible through a single web portal. This ICTRP Search Portal therefore provides a searchable database containing the trial registration data sets made available by data providers around the world meeting criteria for content and quality control. There are currently 18 such providers.

Terms and conditions of use

The terms and conditions of data use are provided at https://www.who.int/ictrp/search/download/en/. The statements are listed below, together with a comment indicating the compliance, where necessary, of the ECRIN MDR.

1) "WHO ICTRP Database: The World Health Organization, through its International Clinical Trials Registry Platform, has developed the ICTRP database to provide patients, family members and members of the public current information about clinical research studies. The ICTRP database contains information about clinical trials being conducted throughout the world. This data is provided to the ICTRP by registries conforming to WHO standards. ICTRP is updated weekly. Trial data posted on this search portal are not endorsed by WHO, but are provided as a service to our users. In no event shall the World Health Organization be liable for any damages arising from the use of the data in the ICTRP database. None of the information obtained through use of the search portal should in any way be used in clinical care without consulting a physician or licensed health professional. WHO is not responsible for the accuracy, completeness and/or use made of the content displayed for any trial record."
Please note that ECRIN makes a similar disclaimer for all data displayed in the MDR.

2) "Availability: ICTRP data are publicly available to all requesters, at no charge."
The data is available on the web site, as XML files and as CSV files, although in April 2020 the web site was not operating because of the demand caused by the COVID-19 pandemic. The CSV files are freely available for download on the WHO ICTRP web site.

3) "Neither the WHO ICTRP, nor any of its data providers, make any warranties, expressed or implied, with respect to data contained in the database, and, furthermore, assume no liability for any party's use, or the results of such use, of any part of the database;"
Please note that, as in 1) above, ECRIN makes a similar disclaimer for all data displayed in the MDR.

4) "In any publication or distribution of these data (1) you should, to the extent possible and appropriate, attribute the source of the data as WHO ICTRP; (2) you should update the data such that they are current at all times; (3) you should clearly display the date the data were processed by WHO ICTRP;"
Please note that the intention is to provide a 'pop-up' over displayed data with provenance information, including source and date of data capture. This is planned for version 1.0 of the MDR.

5) "You shall not assert any proprietary rights to any portion of the ICTRP database."
We do not.

6) "You shall not use information extracted from the ICTRP database for marketing, promotional or commercial purposes."
We do not.

7) "You shall not use the name or emblem of WHO in association with the use of data from the ICTRP database."
We do not.

BioLINCC

BioLINCC, the Biologic Specimen and Data Repository Information Coordinating Center of the National Heart, Lung and Blood Institute (NHLBI), brings together two entities: the NHLBI Biologic Specimen Repository (NHLBI Biorepository), managed by the NHLBI Division of Blood Diseases and Resources (DBDR), Transfusion Medicine and Cellular Therapeutics Branch, and the NHLBI Data Repository, managed by the NHLBI Division of Cardiovascular Sciences (DCVS), Epidemiology Branch. These two programs have always had a similar mission, namely to enhance and facilitate further research in cardiovascular, pulmonary and hematologic conditions by giving qualified investigators access to stored biospecimen and data collections.

Terms and conditions of use

Terms & conditions are summarised under Trademark, Branding, and Logo (see: https://www.nhlbi.nih.gov/about/contact/trademark-branding-and-logo). This states that:

1) "Most of the information on the NHLBI website is in the public domain and can be used without restriction. The NHLBI asks only that no changes be made to the publications, videos, images, or other formatted multimedia products and that the material, as well any NHLBI webpage links, not be used in any direct or indirect product endorsement or advertising. Although organizations may add their own logos to published NHLBI materials, organizations may not use the NHLBI logo on other print or digital materials without the Institute’s permission. (...)
Please use the following language to cite the source of the materials: Source: National Heart, Lung, and Blood Institute; National Institutes of Health; U.S. Department of Health and Human Services."
In compliance with this, the ECRIN MDR makes no changes to the data and is not involved in any 'direct or indirect product endorsement or advertising'. We do not include the NHLBI logo. Please note that the intention is also to provide a 'pop-up' over displayed data with provenance information, including source and date of data capture. This is planned for version 1.0 of the MDR.

2) The same web page also states that:
"Unless noted otherwise, information posted on the NHLBI website within the nhlbi.nih.gov domain is considered to be in the public domain. You may link to the NHLBI website from your website without permission.
Other organizations are free to establish links to NHLBI online resources. When establishing links to the Institute’s website, you may not position or create the possible impression that the NHLBI is endorsing or promoting any particular organization, material, product, service, content, or information."
In compliance with this, the ECRIN MDR provides only simple links back to the relevant NHLBI page and adds no further comment or information, other than the provenance data to be added, as described above.

YODA

The Yale University Open Data Access (YODA) Project is an effort by a group of academically-based clinical researchers to facilitate access to participant-level clinical research data and/or comprehensive reports of clinical research, such as full Clinical Study Reports (CSRs), a level of detail not customarily found in journal publications, with the aim of promoting scientific research that may advance science or lead to improvements in individual and public health and healthcare delivery. The YODA Project is guided by the following core principles, which reflect the overall mission of the project to promote open science by:

  • Promoting the sharing of clinical research data to advance science and improve public health and healthcare
  • Promoting the responsible conduct of research
  • Ensuring good stewardship of clinical research data
  • Protecting the rights of research participants

The YODA Project is currently collaborating with the following Data Partners to facilitate access to their clinical trial program data: Johnson & Johnson, Medtronic, Inc., Queen Mary University of London, SI-BONE, Inc.
To participate, each Data Partner must transfer full jurisdiction over data access to the YODA Project.

Terms and conditions of use

The YODA website does not seem to include any general 'terms and conditions' about the use of the data displayed on the web site, although it has very full material about applying for and using the data described there.
To ensure compliance with data usage policies, the YODA Project team was contacted directly to check that the web scraping procedures in use were acceptable to them. They confirmed that this was the case.