ClinicalTrials.gov SourceMetadata
Contents
- 1 Introduction
- 2 Overall structure
- 3 The Protocol Section
- 3.1 The Identification Module
- 3.2 The Status Module
- 3.3 The SponsorCollaborators Module
- 3.4 The Oversight Module
- 3.5 The Description Module
- 3.6 The Conditions Module
- 3.7 The Design Module
- 3.8 The ArmsInterventions Module
- 3.9 The Outcomes Module
- 3.10 The Eligibility Module
- 3.11 The ContactsLocations Module
- 3.12 The References Module
- 3.13 The IPDSharingStatement Module
- 4 The Results Section
- 5 The Document Section
- 6 The Derived Section
Introduction
The structure of the XML provided by the new API varies slightly, depending on whether the XML files are downloaded in a block or retrieved using an API query. Most elements of the structure are the same, but
a) The downloaded files do not have an XML declaration (<?xml version="1.0"?>) at their head. One needs to be added to each file before most tools will recognise the file as valid XML.
b) The downloaded files start at the level of an individual study, i.e. the root element is FullStudy.
c) The API retrieved files contain one or more FullStudy records within a FullStudyList, which is simply a sequence of FullStudy elements. The FullStudyList is itself embedded in the FullStudyResponse root element. This has additional elements that hold information about the query and the records found against it.
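The differences listed above can be handled in one place before any further processing. The following is a minimal sketch using Python's standard library, not the method actually used for the MDR. The function names are hypothetical, and it assumes each file fits comfortably in memory. (ElementTree itself happens to tolerate a missing declaration, but other tools may not.)

```python
import xml.etree.ElementTree as ET

XML_DECLARATION = '<?xml version="1.0"?>'


def ensure_xml_declaration(text):
    """Prepend an XML declaration if the text does not already start with one."""
    stripped = text.lstrip()
    if stripped.startswith("<?xml"):
        return stripped
    return XML_DECLARATION + "\n" + stripped


def iter_full_studies(text):
    """Yield every FullStudy element, whichever root element the text has."""
    root = ET.fromstring(ensure_xml_declaration(text))
    if root.tag == "FullStudy":
        # Downloaded file: the root is the study itself.
        yield root
    else:
        # API response: FullStudyResponse > FullStudyList > FullStudy
        yield from root.iter("FullStudy")


downloaded = "<FullStudy></FullStudy>"
api_result = (
    "<FullStudyResponse><FullStudyList>"
    "<FullStudy></FullStudy><FullStudy></FullStudy>"
    "</FullStudyList></FullStudyResponse>"
)
print(len(list(iter_full_studies(downloaded))))
print(len(list(iter_full_studies(api_result))))
```

Downstream code can then work on FullStudy elements without needing to know how the file was obtained.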
This section describes the structure of the FullStudy element, which contains only three element types: Struct, List and Field.
Fields are simple name-value pairs. A Field element has a Name attribute that indicates what the value is about, and the element's content (always a string) gives the Field's value, e.g.
<Field Name="Gender">All</Field>
<Field Name="MinimumAge">70 Years</Field>
<Field Name="MaximumAge">82 Years</Field>
Lists are exactly that, i.e. sequences of Fields, or Structs, or Lists. They have a name attribute to indicate what the List is about and then the list of child elements, e.g.
<List Name="ConditionMeshList">
  <Struct Name="ConditionMesh"> … … </Struct>
  <Struct Name="ConditionMesh"> … … </Struct>
  <Struct Name="ConditionMesh"> … … </Struct>
</List>
Structs are complex elements that contain one or more fields, and / or one or more lists, and / or one or more structs. Again a name attribute indicates what the Struct is about, e.g.
<Struct Name="DesignModule">
  <Field Name="StudyType">Interventional</Field>
  <List Name="PhaseList">
    <Field Name="Phase">Not Applicable</Field>
  </List>
  <Struct Name="DesignInfo">
    <Field Name="DesignAllocation">Randomized</Field>
    <Field Name="DesignInterventionModel">Parallel Assignment</Field>
    <Field Name="DesignPrimaryPurpose">Prevention</Field>
    <Struct Name="DesignMaskingInfo">
      <Field Name="DesignMasking">Triple</Field>
      <List Name="DesignWhoMaskedList">
        <Field Name="DesignWhoMasked">Participant</Field>
        <Field Name="DesignWhoMasked">Care Provider</Field>
        <Field Name="DesignWhoMasked">Investigator</Field>
      </List>
    </Struct>
  </Struct>
  <Struct Name="EnrollmentInfo">
    <Field Name="EnrollmentCount">1400</Field>
    <Field Name="EnrollmentType">Anticipated</Field>
  </Struct>
</Struct>
Structs and Lists together therefore create the structure and hierarchy within the xml document, in which the Field values are inserted.
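Because only three element types occur, the whole hierarchy can be walked with one small recursive function. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the names of the children within any one Struct are unique (a repeated name would overwrite the earlier value).

```python
import xml.etree.ElementTree as ET


def to_python(element):
    """Convert a Struct, List or Field element into dicts, lists and strings."""
    if element.tag == "Field":
        return element.text or ""
    if element.tag == "List":
        return [to_python(child) for child in element]
    if element.tag == "Struct":
        # Assumption: child names are unique within a single Struct.
        return {child.get("Name"): to_python(child) for child in element}
    raise ValueError(f"Unexpected element type: {element.tag}")


SAMPLE = """
<Struct Name="DesignModule">
  <Field Name="StudyType">Interventional</Field>
  <Struct Name="DesignInfo">
    <Field Name="DesignAllocation">Randomized</Field>
    <Struct Name="DesignMaskingInfo">
      <Field Name="DesignMasking">Triple</Field>
      <List Name="DesignWhoMaskedList">
        <Field Name="DesignWhoMasked">Participant</Field>
        <Field Name="DesignWhoMasked">Care Provider</Field>
        <Field Name="DesignWhoMasked">Investigator</Field>
      </List>
    </Struct>
  </Struct>
  <Struct Name="EnrollmentInfo">
    <Field Name="EnrollmentCount">1400</Field>
    <Field Name="EnrollmentType">Anticipated</Field>
  </Struct>
</Struct>
"""

design = to_python(ET.fromstring(SAMPLE))
print(design["DesignInfo"]["DesignMaskingInfo"]["DesignWhoMaskedList"])
print(design["EnrollmentInfo"]["EnrollmentCount"])
```

Note that every Field value stays a string, so a count such as 1400 needs converting explicitly if a number is wanted.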
Overall structure
The FullStudy element has 4 top level Struct elements, designated as Sections…
<Struct Name="ProtocolSection"> … … </Struct>
<Struct Name="ResultsSection"> … … </Struct>
<Struct Name="DocumentSection"> … … </Struct>
<Struct Name="DerivedSection"> … … </Struct>
The protocol section is present for all records; the presence of the other three depends on whether relevant content is available. Each of the top level sections is split up into further Structs, called Modules. The top level structure of FullStudy is shown below:
<Struct Name="ProtocolSection">
  <Struct Name="IdentificationModule"> … … </Struct>
  <Struct Name="StatusModule"> … … </Struct>
  <Struct Name="SponsorCollaboratorsModule"> … … </Struct>
  <Struct Name="OversightModule"> … … </Struct>
  <Struct Name="DescriptionModule"> … … </Struct>
  <Struct Name="ConditionsModule"> … … </Struct>
  <Struct Name="DesignModule"> … … </Struct>
  <Struct Name="ArmsInterventionsModule"> … … </Struct>
  <Struct Name="OutcomesModule"> … … </Struct>
  <Struct Name="EligibilityModule"> … … </Struct>
  <Struct Name="ContactsLocationsModule"> … … </Struct>
  <Struct Name="ReferencesModule"> … … </Struct>
  <Struct Name="IPDSharingStatementModule"> … … </Struct>
</Struct>
<Struct Name="ResultsSection">
  <Struct Name="ParticipantFlowModule"> … … </Struct>
  <Struct Name="BaselineCharacteristicsModule"> … … </Struct>
  <Struct Name="OutcomeMeasuresModule"> … … </Struct>
  <Struct Name="AdverseEventsModule"> … … </Struct>
  <Struct Name="MoreInfoModule"> … … </Struct>
</Struct>
<Struct Name="DocumentSection">
  <Struct Name="LargeDocumentModule"> … … </Struct>
</Struct>
<Struct Name="DerivedSection">
  <Struct Name="MiscInfoModule"> … … </Struct>
  <Struct Name="ConditionBrowseModule"> … … </Struct>
  <Struct Name="InterventionBrowseModule"> … … </Struct>
</Struct>
Each of these modules is described in more detail below, with the emphasis on those that are worth extracting for the MDR (if not always for direct mapping). The Field descriptions are taken (or adapted) from the official ClinicalTrials.gov descriptions (especially at https://prsinfo.clinicaltrials.gov/definitions.html).
For each module, module level Fields are described first, then the Lists and Structures and the fields within them.
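Since only the protocol section is guaranteed to be present, it can be useful to list which sections and modules a given record actually contains before trying to read them. A minimal sketch follows; the function name is hypothetical, and the sample record is a cut-down illustration rather than real data.

```python
import xml.etree.ElementTree as ET

SECTION_NAMES = (
    "ProtocolSection",
    "ResultsSection",
    "DocumentSection",
    "DerivedSection",
)


def modules_by_section(full_study):
    """Map each section present in the record to the names of its modules."""
    found = {}
    for struct in full_study.iter("Struct"):
        name = struct.get("Name")
        if name in SECTION_NAMES:
            # Modules are the Structs that are direct children of a section.
            found[name] = [m.get("Name") for m in struct.findall("Struct")]
    return found


SAMPLE = """<FullStudy>
  <Struct Name="ProtocolSection">
    <Struct Name="IdentificationModule"></Struct>
    <Struct Name="StatusModule"></Struct>
  </Struct>
  <Struct Name="DerivedSection">
    <Struct Name="MiscInfoModule"></Struct>
  </Struct>
</FullStudy>"""

present = modules_by_section(ET.fromstring(SAMPLE))
print(present)
print("ResultsSection" in present)
```

Searching by name rather than by fixed position means the function does not depend on exactly how deeply the sections sit beneath the FullStudy root.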
The Protocol Section
The Identification Module
Field Name = "NCTId"
The unique code ('NCT' plus an 8 digit number) assigned by ClinicalTrials.gov to each clinical study. Required; it acts as the source data identifier (sd_id), linking the records of each study in the extracted data tables. It will also be combined with various base URLs to construct the links to the study's protocol entry on ClinicalTrials.gov, and to the results entry if one exists.
Field Name = "BriefTitle"
A short title of the clinical study written in language intended for the lay public. The title should include information on the participants, condition being evaluated, and intervention(s) studied. (Limit: 300 characters.). Needs to be extracted as an 'alternate title'.
Field Name = "OfficialTitle"
The title of the clinical study, corresponding to the title of the protocol. (Limit: 600 characters.). Needs to be extracted as the default study title.
Field Name = "Acronym"
An acronym or abbreviation used publicly to identify the clinical study, if any. (Limit: 14 characters.). Needs to be extracted as an 'alternate title'.
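The handling of the identifier and title Fields described above might be sketched as follows. This is an illustration under stated assumptions, not the actual extraction code: the function name, the title type labels and the output dictionary keys are all hypothetical, and the NCT Id in the sample is a placeholder.

```python
import re
import xml.etree.ElementTree as ET

# 'NCT' followed by exactly 8 digits.
NCT_ID_PATTERN = re.compile(r"NCT\d{8}")

# (Field name, title type label, is default title). Labels are placeholders.
TITLE_FIELDS = [
    ("OfficialTitle", "official title", True),
    ("BriefTitle", "alternate title", False),
    ("Acronym", "alternate title", False),
]


def read_identification(module):
    """Return the sd_id and the list of title records for one study."""
    fields = {f.get("Name"): (f.text or "").strip() for f in module.findall("Field")}
    sd_id = fields.get("NCTId", "")
    if not NCT_ID_PATTERN.fullmatch(sd_id):
        raise ValueError(f"Unexpected NCT Id: {sd_id!r}")
    titles = [
        {"sd_id": sd_id, "text": fields[name], "type": label, "is_default": is_default}
        for name, label, is_default in TITLE_FIELDS
        if fields.get(name)  # skip titles that are absent or empty
    ]
    return sd_id, titles


SAMPLE = """<Struct Name="IdentificationModule">
  <Field Name="NCTId">NCT00000000</Field>
  <Field Name="BriefTitle">An Example Brief Title</Field>
  <Field Name="OfficialTitle">An Example Official Protocol Title</Field>
</Struct>"""

sd_id, titles = read_identification(ET.fromstring(SAMPLE))
for title in titles:
    print(title)
```

What should happen when OfficialTitle is missing (for example, promoting BriefTitle to the default) is left open here, as the description above does not specify it.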
Struct Name = "OrgStudyIdInfo"
This includes the following Fields:
- OrgStudyId – defined as any unique identifier assigned to the protocol by the sponsor. (Limit: 30 characters.)
- OrgStudyIdType – type will always be sponsor's protocol id. Rarely used. Extract to check content.
- OrgStudyIdDomain – rarely used, as domain would be the organisation identified elsewhere
- OrgStudyIdLink - for research funded by NIH grants, appears to link to the reporting page on the NIH website where further grant information is displayed.
The data needs to be extracted, as an 'additional identifier'.
Struct Name = "Organization"
This provides information on the organisation (the trial sponsor) that provided the Org Study Id. It has two Fields:
- OrgFullName: The name of the organisation. (Limit: 160 characters.)
- OrgClass: e.g. 'INDUSTRY', 'OTHER'
Both these should be extracted (although the value of the OrgClass may be limited), even though the information should also be supplied elsewhere, as the lead sponsor organisation is given in the sponsor collaborators module. Comparison of the two versions of this data may be instructive.
List Name = "SecondaryIdInfoList"
If it exists, this list comprises one or more Structs called "SecondaryIdInfo", each of which is very similar in structure to the 'primary' identifier described in the 'OrgStudyIdInfo' Struct. Each Struct contains the fields:
- SecondaryId – The Id itself (Limit: 30 characters.) This includes any study identifiers assigned by other clinical trial registries. If the clinical study is funded in whole or in part by a U.S. Federal Government agency, the complete grant or contract number must be submitted as a Secondary ID.
- SecondaryIdType - A description of the type of Secondary ID. Can be one of:
- U.S. National Institutes of Health (NIH) Grant/Contract Award Number: Include activity code, institute code, and 6-digit serial number. Other components of the full award number (type code, support year, and suffix) are optional.
- Other Grant/Funding Number: Identifier assigned by a funding organization other than the U.S. NIH; also required to enter the name of the funding organization.
- Registry Identifier: Number assigned by a clinical trial registry (for example, a registry that is part of the World Health Organization [WHO] Registry Network); also required to enter the name of the clinical trial registry.
- EudraCT Number: Identifier assigned by the European Medicines Agency Clinical Trials Database (EudraCT).
- Other Identifier: Also required to enter a brief description of the identifier (for example, the name of organization that issued the identifier).
- SecondaryIdDomain - If a Secondary ID Type of "Other Grant/Funding Number," "Registry Identifier," or "Other Identifier" is selected, the name of the funding organization, clinical trial registry, or organization that issued the identifier. (Limit: 119 characters).
- SecondaryIdLink – for research funded by NIH grants, appears to link to the reporting page on the NIH website where further grant information is displayed.
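Reading the secondary identifiers is a matter of iterating over the Structs in the List and collecting their Fields. The sketch below assumes the List sits directly inside the identification module; the function name and output keys are hypothetical, and the identifier values in the sample are invented placeholders.

```python
import xml.etree.ElementTree as ET


def secondary_ids(identification_module):
    """Return one dict per SecondaryIdInfo Struct; missing Fields become None."""
    path = "./List[@Name='SecondaryIdInfoList']/Struct[@Name='SecondaryIdInfo']"
    records = []
    for struct in identification_module.findall(path):
        fields = {f.get("Name"): (f.text or "").strip() for f in struct.findall("Field")}
        records.append(
            {
                "id_value": fields.get("SecondaryId"),
                "id_type": fields.get("SecondaryIdType"),
                "id_org": fields.get("SecondaryIdDomain"),
                "id_link": fields.get("SecondaryIdLink"),
            }
        )
    return records


SAMPLE = """<Struct Name="IdentificationModule">
  <List Name="SecondaryIdInfoList">
    <Struct Name="SecondaryIdInfo">
      <Field Name="SecondaryId">0000-000000-00</Field>
      <Field Name="SecondaryIdType">EudraCT Number</Field>
    </Struct>
    <Struct Name="SecondaryIdInfo">
      <Field Name="SecondaryId">EXAMPLE-123</Field>
      <Field Name="SecondaryIdType">Registry Identifier</Field>
      <Field Name="SecondaryIdDomain">Example Registry</Field>
    </Struct>
  </List>
</Struct>"""

ids = secondary_ids(ET.fromstring(SAMPLE))
for record in ids:
    print(record)
```

If the list is absent the function simply returns an empty list, which matches the "if it exists" nature of this element.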
List Name = "NCTIdAliasList"
This refers to NCT 'Alias identifiers', which were used historically for some studies. It does not need to be extracted.
The Status Module
Field Name = "StatusVerifiedDate"
The date on which the responsible party last verified the clinical study information in the entire ClinicalTrials.gov record for the clinical study, even if no additional or updated information is being submitted. Extract for possible use in provenance data.
Field Name = "OverallStatus"
Overall Recruitment Status - The recruitment status for the clinical study as a whole, based upon the status of the individual sites. If at least one facility in a multi-site clinical study has an Individual Site Status of "Recruiting," then the Overall Recruitment Status for the study must be "Recruiting." One of:
- Not yet recruiting: Participants are not yet being recruited
- Recruiting: Participants are currently being recruited, whether or not any participants have yet been enrolled
- Enrolling by invitation: Participants are being (or will be) selected from a predetermined population
- Active, not recruiting: Study is continuing, meaning participants are receiving an intervention or being examined, but new participants are not currently being recruited or enrolled
- Completed: The study has concluded normally; participants are no longer receiving an intervention or being examined (that is, last participant’s last visit has occurred)
- Suspended: Study halted prematurely but potentially will resume
- Terminated: Study halted prematurely and will not resume; participants are no longer being examined or receiving intervention
- Withdrawn: Study halted prematurely, prior to enrollment of first participant
Needs to be extracted and then coded as study status
Field Name = "LastKnownStatus"
No definition available – not expected to be extracted (but may need to be investigated first)
Field Name = "DelayedPosting"
No definition available – not expected to be extracted (but may need to be investigated first)
Field Name = "WhyStopped"
A brief explanation of the reason(s) why the clinical study was stopped (for a clinical study that is "Suspended," "Terminated," or "Withdrawn" prior to its planned completion as anticipated by the protocol). Not extracted.
Field Name = "StudyFirstSubmitDate"
Field Name = "StudyFirstSubmitQCDate"
Field Name = "ResultsFirstSubmitDate"
Field Name = "ResultsFirstSubmitQCDate"
Field Name = "DispFirstSubmitDate"
Field Name = "DispFirstSubmitQCDate"
Field Name = "LastUpdateSubmitDate"
Most of these internally generated dates, relating to submission and QC checks, do not need to be extracted.
The exception is the StudyFirstSubmitDate, which has been taken as the date the NCT Id is assigned (though this is not known for certain).
Struct Name = "StartDateStruct"
Includes the two fields
- StartDate – The estimated date on which the clinical study will be open for recruitment of participants, or the actual date on which the first participant was enrolled.
- StartDateType – 'Actual' or 'Anticipated'.
Not extracted as the MDR is not designed to keep study dates, just the data object dates.
Struct Name = "PrimaryCompletionDateStruct"
Includes the two fields
- PrimaryCompletionDate – The date that the final participant was examined or received an intervention for the purposes of final collection of data for the primary outcome, whether the clinical study concluded according to the pre-specified protocol or was terminated. In the case of clinical studies with more than one primary outcome measure with different completion dates, this term refers to the date on which data collection is completed for all of the primary outcomes.
- PrimaryCompletionDateType – 'Actual' or 'Anticipated'.
Neither data point extracted.
Struct Name = "CompletionDateStruct"
Includes the two fields
- CompletionDate – The date the final participant was examined or received an intervention for purposes of final collection of data for the primary and secondary outcome measures and adverse events (for example, last participant’s last visit), whether the clinical study concluded according to the pre-specified protocol or was terminated.
- CompletionDateType – 'Actual' or 'Anticipated'.
Neither data point extracted.
Struct Name = "StudyFirstPostDateStruct"
Includes the two fields
- StudyFirstPostDate – Date the registry entry first posted on the CTG site.
- StudyFirstPostDateType – 'Actual' or 'Anticipated'. Not extracted but used to decide if the associated date should be used.
The date should be extracted if it has 'Actual' status, as the date the data object was made available.
Struct Name = "ResultsFirstPostDateStruct"
Includes the two fields
- ResultsFirstPostDate – For the Results registry entry, needs to be extracted if Actual as the date the data object was made available.
- ResultsFirstPostDateType – 'Actual' or 'Anticipated'. Not extracted but used to decide if the associated date should be used.
The existence of a results section should be checked if this date is extracted.
Struct Name = "DispFirstPostDateStruct"
Includes the two fields
- DispFirstPostDate – Unclear what this date refers to.
- DispFirstPostDateType – 'Actual' or 'Anticipated'.
Needs to be extracted for investigation but unlikely to be mapped.
Struct Name = "LastUpdatePostDateStruct"
Includes the two fields
- LastUpdatePostDate – Date of last update of record.
- LastUpdatePostDateType – 'Actual' or 'Anticipated'. Not extracted but used to decide if the associated date should be used.
The date should be extracted if Actual, as the date the record was last updated. Also needs to be used when filtering records to identify new edits, but this should probably be done in the context of an API call.
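The "use the date only if its type is Actual" rule applies to several of the post date Structs above, so it can be written once. The following is a sketch only: the function name is hypothetical, it relies on the naming pattern visible above (Struct name = date Field name + "Struct", type Field name = date Field name + "Type"), and the date strings in the sample are illustrative and returned unparsed.

```python
import xml.etree.ElementTree as ET


def actual_date(status_module, struct_name):
    """Return the date string from the named Struct, or None unless it is 'Actual'."""
    struct = status_module.find(f".//Struct[@Name='{struct_name}']")
    if struct is None or not struct_name.endswith("Struct"):
        return None
    # e.g. 'StudyFirstPostDateStruct' -> 'StudyFirstPostDate'
    date_name = struct_name[: -len("Struct")]
    fields = {f.get("Name"): (f.text or "").strip() for f in struct.findall("Field")}
    if fields.get(date_name + "Type") != "Actual":
        return None
    return fields.get(date_name)


SAMPLE = """<Struct Name="StatusModule">
  <Struct Name="StudyFirstPostDateStruct">
    <Field Name="StudyFirstPostDate">March 5, 2020</Field>
    <Field Name="StudyFirstPostDateType">Actual</Field>
  </Struct>
  <Struct Name="LastUpdatePostDateStruct">
    <Field Name="LastUpdatePostDate">April 2021</Field>
    <Field Name="LastUpdatePostDateType">Anticipated</Field>
  </Struct>
</Struct>"""

module = ET.fromstring(SAMPLE)
print(actual_date(module, "StudyFirstPostDateStruct"))
print(actual_date(module, "LastUpdatePostDateStruct"))
print(actual_date(module, "ResultsFirstPostDateStruct"))
```

Returning None both for a missing Struct and for a non-Actual date keeps the calling code simple; if the two cases need to be distinguished, the function would have to return more than a single value.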
Struct Name = "ExpandedAccessInfo"
Includes the three fields
- HasExpandedAccess – Whether there is expanded access to the investigational product for patients who do not qualify for enrollment in a clinical trial. One of Yes, No, Unknown.
- ExpandedAccessNCTId – If expanded access is available, the NCT number of the expanded access record.
- ExpandedAccessStatusForNCTId – No definition available but unlikely to need extraction.
May be useful to extract to cross check against those NCT Ids and ensure a study is not counted twice, if 'HasExpandedAccess' is yes. The other CTG record would represent a potential 'related data object'.
The SponsorCollaborators Module
Struct Name = "ResponsibleParty"
This Struct contains the following fields
- ResponsiblePartyType – An indication of whether the responsible party (i.e. for the CTG data) is the sponsor, the sponsor-investigator, or a principal investigator designated by the sponsor to be the responsible party. One of Sponsor, Principal Investigator, or Sponsor-Investigator.
The sponsor may designate a principal investigator as the responsible party if such principal investigator meets all of the following requirements: is responsible for conducting the study; has access to and control over the data from the study; has the right to publish the results of the study; and has the ability to meet all of the requirements for submitting and updating clinical study information.
If "Responsible Party, by Official Title" is either "Principal Investigator" or "Sponsor-Investigator", the following are required:
- ResponsiblePartyInvestigatorFullName – Name of the investigator, including first and last name
- ResponsiblePartyInvestigatorTitle – The official title of the investigator at the primary organizational affiliation (Limit: 254 characters.)
- ResponsiblePartyInvestigatorAffiliation – Primary organizational affiliation of the individual; (Limit: 160 characters.)
- ResponsiblePartyOldNameTitle – Unclear if the 'Old' refers to system usage or means 'Previous'. Requires investigation. Listed as the official title, and name of the investigator, including first and last name
- ResponsiblePartyOldOrganization – Unclear if the 'Old' refers to system usage or means 'Previous'. Requires investigation. Listed as Primary organizational affiliation of the individual; (Limit: 160 characters.)
This data should be extracted, though further processing and mapping will need to be developed after examining data.
Struct Name = "LeadSponsor"
- LeadSponsorName – The name of the entity or the individual who is the sponsor of the clinical study. (Limit: 160 characters.).
- LeadSponsorClass – A very broad classification of the organisation type. May be too coarse a category to be useful but extract for now.
Both data points need to be extracted and compared with the organisation listed as providing the sponsor's Id.
List Name = "CollaboratorList"
This is a list of Structs named "Collaborator", representing other organizations (if any) providing support. Support may include funding, design, implementation, data analysis or reporting. The responsible party is responsible for confirming all collaborators before listing them. (Limit: 160 characters.). Each Collaborator Struct has the same fields as those of the LeadSponsor Struct:
- CollaboratorName – The name of the entity providing support. Unfortunately the nature of the support is not required so this data may be of limited value.
- CollaboratorClass – A very broad classification of the organisation type. May be too coarse a category to be useful but extract for now.
Extract the data for now and investigate.
The Oversight Module
Field Name = "OversightHasDMC"
Indicates whether a data monitoring committee has been appointed for this study. Not extracted.
Field Name = "IsFDARegulatedDrug"
Indicates that a clinical study is studying a drug product (including a biological product) regulated by the FDA. Does not appear to be a consistent indicator of a CTIMP study and in any case intervention keyword data is a better guide to studies involving drugs. Not extracted.
Field Name = "IsFDARegulatedDevice"
Indicates that a clinical study is studying an approved device product. As above, does not appear to be a consistent indicator of a device study and intervention keyword data is more informative. Not extracted.
Field Name = "IsUnapprovedDevice"
Indication that at least one device product studied in the clinical study has not been previously approved or cleared by the U.S. Food and Drug Administration (FDA) for one or more uses. Not extracted.
Field Name = "IsPPSD"
Indicates that a study which includes a U.S. FDA-regulated device product is a pediatric postmarket surveillance of a device product. Not extracted.
Field Name = "IsUSExport"
Whether any drug product (including a biological product) or device product studied in the clinical study is manufactured in the United States or one of its territories and exported for study in a clinical study in another country. Not extracted.
The Description Module
Field Name = "BriefSummary"
A short description of the clinical study, including a brief statement of the clinical study's hypothesis, written in language intended for the lay public.
(Limit: 5000 characters.). Extract as possible description element of the Registry entry data object (i.e. "The registry entry relates to...").
Field Name = "DetailedDescription"
Extended description of the protocol, including more technical information (as compared to the Brief Summary), if desired. Do not include the entire protocol; do not duplicate information recorded in other data elements, such as Eligibility Criteria or outcome measures.
(Limit: 32,000 characters.). Not extracted as this level of detail not required.
The Conditions Module
List Name = "ConditionList"
A list of a single field
- Condition
Representing the name(s) of the disease(s) or condition(s) studied in the clinical study, or the focus of the clinical study. Appropriate descriptors from NLM's Medical Subject Headings (MeSH) controlled vocabulary thesaurus or terms should be used. Extracted as a study topic of type 'condition', unless the same term has already been found in the MeSH browse list.
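The rule that a Condition is skipped when the same term already appears in the MeSH browse list could be sketched as below. The MeSH terms are passed in as plain strings, because how they are read from the browse module is not covered here. The function name is hypothetical, and comparing terms case-insensitively is an assumption rather than a documented rule.

```python
import xml.etree.ElementTree as ET


def condition_topics(conditions_module, mesh_terms):
    """Return Condition values not already present among the MeSH browse terms."""
    # Assumption: terms are treated as equal if they match ignoring case.
    seen = {term.strip().lower() for term in mesh_terms}
    topics = []
    path = "./List[@Name='ConditionList']/Field[@Name='Condition']"
    for field in conditions_module.findall(path):
        term = (field.text or "").strip()
        if term and term.lower() not in seen:
            topics.append(term)
            seen.add(term.lower())  # also drops repeats within the list itself
    return topics


SAMPLE = """<Struct Name="ConditionsModule">
  <List Name="ConditionList">
    <Field Name="Condition">Hypertension</Field>
    <Field Name="Condition">Sleep Quality</Field>
    <Field Name="Condition">sleep quality</Field>
  </List>
</Struct>"""

topics = condition_topics(ET.fromstring(SAMPLE), ["hypertension"])
print(topics)
```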
List Name = "KeywordList"
A list of a single field
- Keyword
Representing words or phrases that best describe the protocol. Keywords help users find studies in the database. Appropriate descriptors from NLM's Medical Subject Headings (MeSH) controlled vocabulary thesaurus or terms should be used. Extract as a study topic.
The Design Module
Field Name = "StudyType"
The nature of the investigation or investigational use for which clinical study information is being submitted. Can be
- Interventional (clinical trial): Participants are assigned prospectively to an intervention or interventions according to a protocol to evaluate the effect of the intervention(s) on biomedical or other health related outcomes.
- Observational: Studies in human beings in which biomedical and/or health outcomes are assessed in pre-defined groups of individuals. Participants in the study may receive diagnostic, therapeutic, or other interventions, but the investigator does not assign specific interventions to the study participants. This includes when participants receive interventions as part of routine medical care, and a researcher studies the effect of the intervention.
- Patient Registry: An observational study that is also considered to be a Patient Registry. This type of study should only be registered once in the Protocol Registration and Results System (PRS), by the sponsor responsible for the primary data collection and analysis.
- Expanded Access: An investigational drug product (including biological product) available through expanded access for patients who do not qualify for enrollment in a clinical trial. Expanded Access includes all expanded access types: (1) for individual patients, including emergency use; (2) for intermediate-size patient populations; and (3) under a treatment IND or treatment protocol.
For extraction as study type.
Field Name = "PatientRegistry"
Not clear if this is a Yes / No or the name of a Registry (no documentation) – for investigation and therefore initially for extraction.
Field Name = "TargetDuration"
For Patient Registries, the anticipated time period over which each participant is to be followed. A number is required plus a unit of time (years, months, weeks, days). Not extracted.
Struct Name = "ExpandedAccessTypes"
Has 3 fields but usage unclear, possibly Yes / No
- ExpAccTypeIndividual – For individual participants, including for emergency use
- ExpAccTypeIntermediate – For intermediate-size participant populations
- ExpAccTypeTreatment – Under a treatment IND or treatment protocol
Very rarely used, and not extracted.
List Name = "PhaseList"
A list in the XML structure, although the completion instructions say to select one. Contains a single field
- Phase
For a clinical trial of a drug product (including a biological product), the numerical phase of such clinical trial, taken from
- N/A: Trials without phases (for example, studies of devices or behavioral interventions).
- Early Phase 1 (Formerly listed as "Phase 0"): Exploratory trials, involving very limited human exposure, with no therapeutic or diagnostic intent (e.g., screening studies, microdose studies). See FDA guidance on exploratory IND studies for more information.
- Phase 1: Includes initial studies to determine the metabolism and pharmacologic actions of drugs in humans, the side effects associated with increasing doses, and to gain early evidence of effectiveness; may include healthy participants and/or patients.
- Phase 1/Phase 2: Trials that are a combination of phases 1 and 2.
- Phase 2: Includes controlled clinical studies conducted to evaluate the effectiveness of the drug for a particular indication or indications in participants with the disease or condition under study and to determine the common short-term side effects and risks.
- Phase 2/Phase 3: Trials that are a combination of phases 2 and 3.
- Phase 3: Includes trials conducted after preliminary evidence suggesting effectiveness of the drug has been obtained, and are intended to gather additional information to evaluate the overall benefit-risk relationship of the drug.
- Phase 4: Studies of FDA-approved drugs to delineate additional information including the drug's risks, benefits, and optimal use.
For extraction, for possible use within Description and / or keywords element
Struct Name = "DesignInfo"
Includes a variety of Fields, Lists and Structs…
- DesignAllocation – The method by which participants are assigned to arms in a clinical trial. Can be
- N/A (not applicable): For a single-arm trial
- Randomized: Participants are assigned to intervention groups by chance
- Nonrandomized: Participants are expressly assigned to intervention groups through a non-random method, such as physician choice
For extraction, for possible use within Description and / or keywords element
- DesignInterventionModel – The strategy for assigning interventions to participants. Can be
- Single Group: Clinical trials with a single arm
- Parallel: Participants are assigned to one of two or more groups in parallel for the duration of the study
- Crossover: Participants receive one of two (or more) alternative interventions during the initial phase of the study and receive the other intervention during the second phase of the study
- Factorial: Two or more interventions, each alone and in combination, are evaluated in parallel against a control group
- Sequential: Groups of participants are assigned to receive interventions based on prior milestones being reached in the study, such as in some dose escalation and adaptive design studies
For extraction, for possible use within Description and / or keywords element
- DesignInterventionModelDescription – Provide details about the Interventional Study Model. Not extracted.
- DesignPrimaryPurpose – The main objective of the intervention(s) being evaluated by the clinical trial. Can be one of
- Treatment: One or more interventions are being evaluated for treating a disease, syndrome, or condition.
- Prevention: One or more interventions are being assessed for preventing the development of a specific disease or health condition.
- Diagnostic: One or more interventions are being evaluated for identifying a disease or health condition.
- Supportive Care: One or more interventions are evaluated for maximizing comfort, minimizing side effects, or mitigating against a decline in the participant's health or function.
- Screening: One or more interventions are assessed or examined for identifying a condition, or risk factors for a condition, in people who are not yet known to have the condition or risk factor.
- Health Services Research: One or more interventions for evaluating the delivery, processes, management, organization, or financing of healthcare.
- Basic Science: One or more interventions for examining the basic mechanism of action (for example, physiology or biomechanics of an intervention).
- Device Feasibility: An intervention of a device product is being evaluated in a small clinical trial (generally fewer than 10 participants) to determine the feasibility of the product; or a clinical trial to test a prototype device for feasibility and not health outcomes. Such studies are conducted to confirm the design and operating specifications of a device before beginning a full clinical trial.
- Other: None of the other options applies.
For extraction, for possible use within Description and / or keywords element
List Name = "DesignObservationalModelList" (in DesignInfo)
Includes the single field
- DesignObservationalModel – The Primary strategy for participant identification and follow-up. One of
- Cohort: Group of individuals, initially defined and composed, with common characteristics (for example, condition, birth year), who are examined or traced over a given time period.
- Case-Control: Group of individuals with specific characteristics (for example, conditions or exposures) compared to group(s) with different characteristics, but otherwise similar.
- Case-Only: Single group of individuals with specific characteristics.
- Case-Crossover: Characteristics of case immediately prior to disease onset (sometimes called the hazard period) compared to characteristics of same case at a prior time (that is, control period).
- Ecologic or Community Studies: Geographically defined populations, such as countries or regions within a country, compared on a variety of environmental (for example, air pollution intensity, hours of sunlight) and/or global measures not reducible to individual level characteristics (for example, healthcare system, laws or policies, median income, average fat intake, disease rate).
- Family-Based: Studies conducted among family members, such as genetic studies within families or twin studies and studies of family environment.
- Other: Explain in Detailed Description.
For extraction, for possible use within Description and / or keywords element
List Name = "DesignTimePerspectiveList" (in DesignInfo)
Has a single field
- DesignTimePerspective – For observational studies, describes the temporal relationship of observation period to time of participant enrollment. One of:
- Retrospective: Look back using observations collected predominantly prior to subject selection and enrollment
- Prospective: Look forward using periodic observations collected predominantly following subject enrollment
- Cross-sectional: Observations or measurements made at a single point in time, usually at subject enrollment
- Other: Explain in Detailed Description
For extraction, for possible use within Description and / or keywords element
Struct Name = "DesignMaskingInfo" (in DesignInfo)
A Struct providing Masking (blinding) information with various fields and Lists. Includes
- DesignMasking – The party or parties involved in the clinical trial who are prevented from having knowledge of the interventions assigned to individual participants.
- DesignMaskingDescription – Information about other parties who may be masked in the clinical trial, if any. Limit: 1000 characters.
- DesignWhoMaskedList (List) with single Field DesignWhoMasked. May be Participant, Care Provider, Investigator, Outcomes Assessor (The individual who evaluates the outcome(s) of interest), No Masking.
Extract DesignMasking as potentially useful within a constructed description
Struct Name = "BioSpec"
Includes the fields
- BioSpecRetention – Indicates whether samples of material from research participants are retained in a biorepository. May be: None Retained / Samples With DNA / Samples Without DNA
- BioSpecDescription – Specify all types of biospecimens to be retained (e.g., whole blood, serum, white cells, urine, tissue). Limit: 1000 characters.
Extract BioSpecRetention as potentially useful within a constructed description
Struct Name = "EnrollmentInfo"
Includes the fields
- EnrollmentCount – The estimated total number of participants to be enrolled (target number) or the actual total number of participants that are enrolled in the clinical study. Note: "Enrolled" means a participant’s, or their legally authorized representative’s, agreement to participate in a clinical study following completion of the informed consent process.
- EnrollmentType – Actual or Anticipated
Extract both elements as potentially useful within a constructed description
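As a minimal sketch of how the Design Module values marked for extraction above might be pulled out, the Python below looks up Fields by their Name attribute wherever they sit in the record. The sample XML is a hypothetical fragment (the value "Double" and the exact nesting are assumptions for illustration), and the helper names `field_text` and `design_summary` are not part of any existing code.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: real records nest these Structs inside the Design Module.
SAMPLE = """<FullStudy>
  <Struct Name="DesignInfo">
    <Struct Name="DesignMaskingInfo">
      <Field Name="DesignMasking">Double</Field>
      <List Name="DesignWhoMaskedList">
        <Field Name="DesignWhoMasked">Participant</Field>
        <Field Name="DesignWhoMasked">Investigator</Field>
      </List>
    </Struct>
  </Struct>
  <Struct Name="EnrollmentInfo">
    <Field Name="EnrollmentCount">120</Field>
    <Field Name="EnrollmentType">Anticipated</Field>
  </Struct>
</FullStudy>"""


def field_text(root, name):
    """Return the text of the first Field with the given Name, or None."""
    el = root.find(f".//Field[@Name='{name}']")
    return el.text.strip() if el is not None and el.text else None


def design_summary(root):
    """Collect the Design Module values flagged for extraction."""
    wanted = ["DesignMasking", "BioSpecRetention",
              "EnrollmentCount", "EnrollmentType"]
    return {name: field_text(root, name) for name in wanted}


summary = design_summary(ET.fromstring(SAMPLE))
print(summary)
```

Fields that are absent from a record (here BioSpecRetention) come back as None, so a constructed description can simply skip them.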
The ArmsInterventions Module
List Name = "ArmGroupList"
Includes a Struct called "ArmGroup", that includes the fields
- ArmGroupLabel – The short name used to identify the arm. (Limit: 62 characters.)
- ArmGroupType – The role of each arm in the clinical trial. May be one of Experimental, Active Comparator, Placebo Comparator, Sham Comparator, No Intervention, Other
- ArmGroupDescription – If needed, additional descriptive information (including which interventions are administered in each arm) to differentiate each arm from other arms in the clinical trial. (Limit: 999 characters.)
- ArmGroupInterventionList – A List with the single Field "ArmGroupInterventionName", presumably acting as the cross-reference between Arm Groups and Interventions.
Not extracted – too much detail for easy summarising
List Name = "InterventionList"
Includes a Struct called "Intervention", that includes the fields
- InterventionType – For each intervention studied in the clinical study, the general type of intervention. One of: Drug (including placebo); Device (including sham); Biological/Vaccine; Procedure/Surgery; Radiation; Behavioral (for example, psychotherapy, lifestyle counselling); Genetic (including gene transfer, stem cell and recombinant DNA); Dietary Supplement (for example, vitamins, minerals); Combination Product (combining a drug and device, a biological product and device, a drug and biological product, or a drug, biological product, and device); Diagnostic Test (for example, imaging, in-vitro); Other.
- InterventionName – Definition: A brief descriptive name used to refer to the intervention(s) studied in each arm of the clinical study. A non-proprietary name of the intervention must be used, if available. If a non-proprietary name is not available, a brief descriptive name or identifier must be used. (Limit: 200 characters.)
- InterventionDescription – Details that can be made public about the intervention, other than the Intervention Name(s) and Other Intervention Name(s), sufficient to distinguish the intervention from other, similar interventions studied in the same or another clinical study. For example, interventions involving drugs may include dosage form, dosage, frequency, and duration. (Limit: 1000 characters.)
- InterventionArmGroupLabelList – A List with a single Field, "InterventionArmGroupLabel". Not clear what this indicates.
- InterventionOtherNameList – A List with a single Field, "InterventionOtherName" – presumed to be alternate names for the intervention.
This data was investigated for possible use, as a source of key words, but was found to be too detailed and often expressed in technical / internal terminology. Its usefulness was therefore limited and it was decided not to extract it.
The Outcomes Module
Consists of 3 lists, each of which is comprised of similar Structs
List Name = "PrimaryOutcomeList"
A sequence of "PrimaryOutcome" Structs, each of which has the Fields
- PrimaryOutcomeMeasure – Name of the specific primary outcome measure.
- PrimaryOutcomeDescription – Description of the metric used to characterize the specific primary outcome measure, if not included in the primary outcome measure title.
- PrimaryOutcomeTimeFrame – Time point(s) at which the measurement is assessed for the specific metric used. The description of the time point(s) of assessment must be specific to the outcome measure and is generally the specific duration of time over which each participant is assessed (not the overall duration of the study).
Too much detail for the MDR's purpose - not extracted.
List Name = "SecondaryOutcomeList"
A sequence of "SecondaryOutcome" Structs, each of which has the Fields
- SecondaryOutcomeMeasure – Name of the specific secondary outcome measure.
- SecondaryOutcomeDescription – Description of the metric used to characterize the specific secondary outcome measure, if not included in the secondary outcome measure title.
- SecondaryOutcomeTimeFrame – Time point(s) at which the measurement is assessed for the specific metric used. The description of the time point(s) of assessment must be specific to the outcome measure and is generally the specific duration of time over which each participant is assessed (not the overall duration of the study).
Not for extraction.
List Name = "OtherOutcomeList"
A sequence of "OtherOutcome" Structs, each of which has the Fields
- OtherOutcomeMeasure – Name of the specific other outcome measure.
- OtherOutcomeDescription – Description of the metric used to characterize the specific other outcome measure, if not included in the other outcome measure title.
- OtherOutcomeTimeFrame – Time point(s) at which the measurement is assessed for the specific metric used. The description of the time point(s) of assessment must be specific to the outcome measure and is generally the specific duration of time over which each participant is assessed (not the overall duration of the study).
Not for extraction.
The Eligibility Module
Field Name = "EligibilityCriteria"
A limited list of criteria for selection of participants in the clinical study, provided in terms of inclusion and exclusion criteria and suitable for assisting potential participants in identifying clinical studies of interest. Use a bulleted list for each criterion below the headers "Inclusion Criteria" and "Exclusion Criteria".
Too much detail for extraction.
Field Name = "HealthyVolunteers"
Indication that participants who do not have a disease or condition, or related conditions or symptoms, under study in the clinical study are permitted to participate in the clinical study. Yes or No. Not extracted.
Field Name = "Gender"
The sex of the participants eligible to participate in the clinical study. Extract to assess usefulness as possibly part of description element.
Field Name = "GenderBased"
If applicable, indicate whether participant eligibility is based on gender. Note: "Gender" means a person's self-representation of gender identity.
- Yes: Eligibility is based on gender
- No: Eligibility is not based on gender
Not for extraction.
Field Name = "GenderDescription"
If eligibility is based on gender, provide descriptive information about Gender criteria. Not for extraction.
Field Name = "MinimumAge"
The numerical value, if any, for the minimum age a potential participant must meet to be eligible for the clinical study. Unit of Time – one of Years, Months, Weeks, Days, Hours, Minutes, N/A (=No limit). Extract to assess usefulness as possibly part of description element.
Field Name = "MaximumAge"
The numerical value, if any, for the maximum age a potential participant can be to be eligible for the clinical study. Unit of Time – one of Years, Months, Weeks, Days, Hours, Minutes, N/A (=No limit). Extract to assess usefulness as possibly part of description element.
List Name = "StdAgeList"
Consists of a list of a single Field
- StdAge – but no documentation on what this might be. Not extracted.
Field Name = "StudyPopulation"
(For observational studies only) A description of the population from which the groups or cohorts will be selected (for example, primary care clinic, community sample, residents of a certain town). Not for extraction.
Field Name = "SamplingMethod"
(For observational studies only) Indicates the method used for the sampling approach and explain in the Detailed Description. One of Probability Sample or Non-Probability Sample. Not for extraction.
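To illustrate how the Gender, MinimumAge and MaximumAge Fields described above could feed a description element, here is a small Python sketch. It assumes age values arrive as a number followed by a unit (as in "70 Years") or as "N/A"; the Struct name "EligibilityModule", the helper names and the wording of the output fragment are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Field values taken from the example in the Introduction; the enclosing
# Struct name is an assumption.
SAMPLE = """<Struct Name="EligibilityModule">
  <Field Name="Gender">All</Field>
  <Field Name="MinimumAge">70 Years</Field>
  <Field Name="MaximumAge">82 Years</Field>
</Struct>"""


def parse_age(text):
    """Split '70 Years' into (70, 'Years'); return None if missing or 'N/A'."""
    if not text or text.strip().upper() == "N/A":
        return None
    parts = text.split()
    if len(parts) != 2 or not parts[0].isdigit():
        return None  # unexpected format: leave for manual inspection
    return int(parts[0]), parts[1]


def eligibility_fragment(module):
    """Build a short text fragment from the three extracted Fields."""
    def get(name):
        el = module.find(f".//Field[@Name='{name}']")
        return el.text if el is not None else None

    pieces = []
    gender = get("Gender")
    if gender:
        pieces.append(f"Gender: {gender.strip()}")
    low = parse_age(get("MinimumAge"))
    if low:
        pieces.append(f"minimum age {low[0]} {low[1]}")
    high = parse_age(get("MaximumAge"))
    if high:
        pieces.append(f"maximum age {high[0]} {high[1]}")
    return "; ".join(pieces)


fragment = eligibility_fragment(ET.fromstring(SAMPLE))
print(fragment)
```

Keeping the number and unit separate makes it possible to compare or normalise ages later, should that prove useful.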
The ContactsLocations Module
List Name = "CentralContactList"
Contains a Struct called "CentralContact", which has Fields…
- CentralContactName – First Name & Middle Initial & Last Name or Official Title & Degree
- CentralContactRole – The role, i.e. being the contact for enrolment data
- CentralContactPhone – Not extracted
- CentralContactPhoneExt – Not extracted
- CentralContactEMail – Extracted for internal use but not mapped to the MDR
No part of the list is extracted
List Name = "OverallOfficialList"
Contains a Struct called "OverallOfficial", which has Fields…
- OverallOfficialName – First Name & Middle Initial & Last Name & Degree
- OverallOfficialAffiliation – Full name of the official's organization. If none, specify Unaffiliated.
- OverallOfficialRole – One of Study Chair, Study Director, Study Principal Investigator
This data should be extracted, though may require further processing
List Name = "LocationList"
Contains a Struct called "Location", for the location of clinical sites involved in the study.
It has Fields…
- LocationFacility – Full name of the organization where the clinical study is being conducted.
- LocationStatus – The recruitment status of each participating facility in a clinical study. One of Not yet recruiting, Recruiting, Enrolling by invitation
- LocationCity
- LocationState – Required for U.S. locations (including territories of the United States)
- LocationZip – Required for U.S. locations (including territories of the United States)
- LocationCountry
- LocationContactList – A list with structured contact details
No part of the list is extracted
The References Module
This module has 3 Lists, each of which contains a sequence of Structs.
List Name = "ReferenceList"
Contains a Struct called "Reference", which has Fields…
- ReferencePMID – PMID for the citation in MEDLINE
- ReferenceType – Indicates if the reference provided reports on results from this clinical study. Types include 'background', 'derived' and 'result'.
- ReferenceCitation – A bibliographic reference in NLM's MEDLINE format
- And a List called "RetractionList", which contains a Struct called "Retraction". This has Fields
- RetractionPMID
- RetractionSource
The references of type 'result' should be extracted and then the PMID IDs used to access further details. Any retraction lists should also be extracted (although these are rare).
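The selection rule above (keep only references of type 'result', together with any retractions) might be sketched as follows in Python. The PMIDs and citation text in the sample are placeholders, and the function names are hypothetical; only the List, Struct and Field names come from the descriptions above.

```python
import xml.etree.ElementTree as ET

# Placeholder data: PMIDs and citations are invented for illustration.
SAMPLE = """<List Name="ReferenceList">
  <Struct Name="Reference">
    <Field Name="ReferencePMID">11111111</Field>
    <Field Name="ReferenceType">background</Field>
    <Field Name="ReferenceCitation">Placeholder background citation.</Field>
  </Struct>
  <Struct Name="Reference">
    <Field Name="ReferencePMID">22222222</Field>
    <Field Name="ReferenceType">result</Field>
    <Field Name="ReferenceCitation">Placeholder result citation.</Field>
    <List Name="RetractionList">
      <Struct Name="Retraction">
        <Field Name="RetractionPMID">33333333</Field>
        <Field Name="RetractionSource">Placeholder source</Field>
      </Struct>
    </List>
  </Struct>
</List>"""


def struct_fields(struct):
    """Map the direct child Fields of a Struct to {Name: text}."""
    return {f.get("Name"): (f.text or "").strip()
            for f in struct.findall("Field")}


def result_references(ref_list):
    """Return references of type 'result', each with its retractions."""
    out = []
    for ref in ref_list.findall("Struct[@Name='Reference']"):
        fields = struct_fields(ref)
        if fields.get("ReferenceType") != "result":
            continue
        retractions = ref.findall(
            "List[@Name='RetractionList']/Struct[@Name='Retraction']")
        fields["Retractions"] = [struct_fields(r) for r in retractions]
        out.append(fields)
    return out


refs = result_references(ET.fromstring(SAMPLE))
print(refs)
```

The extracted ReferencePMID values could then be passed to whatever PubMed lookup is used to obtain further details.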
List Name = "SeeAlsoLinkList"
Contains a Struct called "SeeAlsoLink", which has Fields…
- SeeAlsoLinkLabel – Title or brief description of the linked page
- SeeAlsoLinkURL – Complete URL, including http:// or https://
The Link and Label can be extracted, though only a small proportion point to useful objects, e.g. study websites.
The free text nature of the link label makes it difficult to select the useful links automatically - it may have to be done manually after extraction.
List Name = "AvailIPDList"
Contains a Struct called "AvailIPD", which lists the IPD and supporting documents currently available. It has the Fields…
- AvailIPDId – No definition available
- AvailIPDType – The type of data set or supporting information being shared. May be Individual Participant Data Set, Study Protocol, Statistical Analysis Plan, Informed Consent Form, Clinical Study Report, Analytic Code, Other (specify)
- AvailIPDURL – The web address used to request or access the data set or supporting information. (Limit: 3999 characters.)
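- AvailIPDComment – Additional information about the data set or supporting information, such as the name of the data repository or instructions for obtaining access, particularly if a URL is not provided.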
This data needs extracting as it indicates a possible data object. In many cases detailed access arrangements are not provided - the user is directed to a web site instead (often CSDR).
The IPDSharingStatement Module
Refers to future plans for IPD sharing - and is a series of related statements.
When an IPD sharing statement is made the data can be usefully extracted as concatenated statements, that together could form an additional study attribute. The record's last verified date can be used to add an '(as of...)' statement at the end.
Field Name = "IPDSharing"
Yes, No or Undecided.
Field Name = "IPDSharingDescription"
An overall statement describing planned approach. Often the only statement completed (if this section is completed at all).
Field Name = "IPDSharingTimeFrame"
Description of probable time frame.
Field Name = "IPDSharingAccessCriteria"
Description of possible criteria to be applied.
Field Name = "IPDSharingURL"
Actual or planned URL for data sharing information and / or access.
List Name = "IPDSharingInfoTypeList"
Has a single field, repeated as necessary within the list
- IPDSharingInfoType
Lists the information types to be made available - needs to be concatenated into a list.
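One possible way of concatenating the IPD sharing Fields into a single study attribute is sketched below in Python. The labels, the "; " separator, the Struct name "IPDSharingStatementModule" and the sample values are assumptions; the last verified date is passed in as a plain string because it is held elsewhere in the record.

```python
import xml.etree.ElementTree as ET

# Hypothetical module content; the values are illustrative only.
SAMPLE = """<Struct Name="IPDSharingStatementModule">
  <Field Name="IPDSharing">Yes</Field>
  <Field Name="IPDSharingDescription">Placeholder description of the plan</Field>
  <List Name="IPDSharingInfoTypeList">
    <Field Name="IPDSharingInfoType">Study Protocol</Field>
    <Field Name="IPDSharingInfoType">Analytic Code</Field>
  </List>
</Struct>"""

# Field name -> label used in the concatenated statement (labels are assumed).
LABELS = [
    ("IPDSharing", "IPD sharing"),
    ("IPDSharingDescription", "Description"),
    ("IPDSharingTimeFrame", "Time frame"),
    ("IPDSharingAccessCriteria", "Access criteria"),
    ("IPDSharingURL", "URL"),
]


def ipd_statement(module, verified_date=None):
    """Concatenate the completed IPD sharing Fields into one string."""
    parts = []
    for name, label in LABELS:
        el = module.find(f"Field[@Name='{name}']")
        if el is not None and el.text and el.text.strip():
            parts.append(f"{label}: {el.text.strip()}")
    types = [f.text.strip() for f in module.findall(
        "List[@Name='IPDSharingInfoTypeList']/Field[@Name='IPDSharingInfoType']")
        if f.text]
    if types:
        parts.append("Information types: " + ", ".join(types))
    if not parts:
        return None  # section not completed
    text = "; ".join(parts)
    if verified_date:
        text += f" (as of {verified_date})"
    return text


statement = ipd_statement(ET.fromstring(SAMPLE), verified_date="March 2020")
print(statement)
```

Fields that were left empty are skipped, which matches the observation that the description is often the only statement completed.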
The Results Section
The results section includes a lot of very detailed information concerning the study's results, but none of this is required for the MDR.
It would be worth checking the presence of the various modules for those studies with Result dates posted.
It is therefore suggested that a simple flag is set to indicate the presence (or not) of each of:
- The ParticipantFlowModule
- The BaselineCharacteristicsModule
- The OutcomeMeasuresModule
- The AdverseEventsModule
- The MoreInfoModule
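The suggested presence flags could be derived along the following lines. This Python sketch assumes that each module appears as a Struct whose Name attribute matches the module names listed above, and that they sit inside a Struct named "ResultsSection"; both points would need checking against real records.

```python
import xml.etree.ElementTree as ET

RESULT_MODULES = [
    "ParticipantFlowModule",
    "BaselineCharacteristicsModule",
    "OutcomeMeasuresModule",
    "AdverseEventsModule",
    "MoreInfoModule",
]

# Hypothetical record with only two of the five results modules present.
SAMPLE = """<FullStudy>
  <Struct Name="ResultsSection">
    <Struct Name="ParticipantFlowModule"/>
    <Struct Name="AdverseEventsModule"/>
  </Struct>
</FullStudy>"""


def results_flags(root):
    """Return {module name: True/False} for each results module."""
    present = {s.get("Name") for s in root.iter("Struct")}
    return {name: name in present for name in RESULT_MODULES}


flags = results_flags(ET.fromstring(SAMPLE))
print(flags)
```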
The Document Section
The LargeDocument Module
This deals with uploaded documents, and comprises a single List ("LargeDocList") of the Struct "LargeDoc". The Struct has the Fields:
- LargeDocTypeAbbrev – Type of uploaded study document. One of Study Protocol, Statistical Analysis Plan (SAP), Informed Consent Form (ICF), Study Protocol with SAP and/or ICF
- LargeDocHasProtocol – Presumed to be a boolean, needs checking
- LargeDocHasSAP – Presumed to be a boolean, needs checking
- LargeDocHasICF – Presumed to be a boolean, needs checking
- LargeDocLabel – Not documented, presumed to be the document title (needs checking)
- LargeDocDate – The date on which the uploaded document was most recently updated and, if needed, approved by a human subjects protection review board.
- LargeDocUploadDate – The date the document was uploaded
- LargeDocFilename – The File Name (format needs checking)
All of this information needs to be extracted and examined further.
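Since several of these Fields still need checking, a first pass might simply extract every Field as a string and tabulate the distinct values seen. The Python sketch below does that; the sample values (including "Yes" / "No" for the presumed booleans and the file names) are invented for illustration and say nothing about what the real data contains.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample values, for illustration only.
SAMPLE = """<List Name="LargeDocList">
  <Struct Name="LargeDoc">
    <Field Name="LargeDocTypeAbbrev">Study Protocol</Field>
    <Field Name="LargeDocHasProtocol">Yes</Field>
    <Field Name="LargeDocHasSAP">No</Field>
    <Field Name="LargeDocHasICF">No</Field>
    <Field Name="LargeDocFilename">placeholder_000.pdf</Field>
  </Struct>
  <Struct Name="LargeDoc">
    <Field Name="LargeDocTypeAbbrev">Statistical Analysis Plan (SAP)</Field>
    <Field Name="LargeDocHasProtocol">No</Field>
    <Field Name="LargeDocHasSAP">Yes</Field>
    <Field Name="LargeDocHasICF">No</Field>
    <Field Name="LargeDocFilename">placeholder_001.pdf</Field>
  </Struct>
</List>"""


def large_docs(doc_list):
    """Return one {Field name: text} dict per LargeDoc Struct."""
    return [
        {f.get("Name"): (f.text or "").strip() for f in doc.findall("Field")}
        for doc in doc_list.findall("Struct[@Name='LargeDoc']")
    ]


def observed_values(docs, names):
    """Collect the distinct values seen for each of the named Fields."""
    seen = defaultdict(set)
    for doc in docs:
        for name in names:
            if name in doc:
                seen[name].add(doc[name])
    return dict(seen)


docs = large_docs(ET.fromstring(SAMPLE))
observed = observed_values(
    docs, ["LargeDocHasProtocol", "LargeDocHasSAP", "LargeDocHasICF"])
print(observed)
```

Running `observed_values` over a larger set of records would show whether the three "Has" Fields really are boolean-like.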
The Derived Section
Unfortunately documentation for this section seems to be hard to find, and the interpretation of the Fields may need to change.
There are three modules, which may include some useful information.
The MiscInfo Module
Field Name = "VersionHolder"
Includes a date. May be worth extracting as a check on whether a registry entry has been changed.
List Name = "RemovedCountryList"
Seems unlikely to be useful - Not extracted.
The ConditionBrowse Module
This allows Conditions to be more formally defined using MESH terms and codes.
The ConditionMeshList is extracted before the 'ordinary' condition list as it gives more standardised information. Terms in the non-MESH coded list are not extracted unless they are additional to the terms listed here.
List Name = "ConditionMeshList"
A List of a Struct called "ConditionMesh", which has the fields
- ConditionMeshId – Mesh id for condition
- ConditionMeshTerm – Mesh term for condition
Extracted but needs to be merged with the Condition data from elsewhere in the record
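The merge described above (MESH-coded terms first, then any 'ordinary' condition terms not already covered) might look like the following Python sketch. The ordinary conditions are passed in as a plain list of strings, and matching is done case-insensitively on the term text; both choices, the placeholder MESH id and the function name are assumptions.

```python
import xml.etree.ElementTree as ET

# Placeholder MESH id; the term is illustrative.
SAMPLE = """<List Name="ConditionMeshList">
  <Struct Name="ConditionMesh">
    <Field Name="ConditionMeshId">D000000001</Field>
    <Field Name="ConditionMeshTerm">Hypertension</Field>
  </Struct>
</List>"""


def merge_conditions(mesh_list, plain_conditions):
    """Return MESH terms first, then plain terms not already present."""
    merged, seen = [], set()
    for mesh in mesh_list.findall("Struct[@Name='ConditionMesh']"):
        term = mesh.findtext("Field[@Name='ConditionMeshTerm']", default="").strip()
        mesh_id = mesh.findtext("Field[@Name='ConditionMeshId']", default="").strip()
        if term and term.lower() not in seen:
            seen.add(term.lower())
            merged.append({"term": term, "mesh_id": mesh_id})
    for term in plain_conditions:
        key = term.strip().lower()
        if key and key not in seen:
            seen.add(key)
            merged.append({"term": term.strip(), "mesh_id": None})
    return merged


conditions = merge_conditions(
    ET.fromstring(SAMPLE),
    ["hypertension", "High blood pressure in pregnancy"],
)
print(conditions)
```

An exact text match will miss synonyms and spelling variants, so some manual review of the additional terms is likely to remain necessary.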
List Name = "ConditionAncestorList"
A List of a Struct called "ConditionAncestor", which has the fields
- ConditionAncestorId – Mesh Id of the immediate ancestors of condition terms
- ConditionAncestorTerm – Name of the immediate ancestors of condition terms
Not extracted
List Name = "ConditionBrowseLeafList"
A List of a Struct called "ConditionBrowseLeaf", which has the fields
- ConditionBrowseLeafId – Mesh id of the listed condition
- ConditionBrowseLeafName – Mesh name of the listed condition
- ConditionBrowseLeafAsFound – Original name as given in the submission
- ConditionBrowseLeafRelevance – high or low
Not extracted - the data is too detailed and its significance is unclear.
List Name = "ConditionBrowseBranchList"
A List of a Struct called "ConditionBrowseBranch", which has the fields
- ConditionBrowseBranchAbbrev – A Mesh abbreviation of the main Condition branches covered by the conditions
- ConditionBrowseBranchName – Name of the main Condition branches covered by the conditions
Not extracted
The InterventionBrowse Module
This allows Interventions to be more formally defined using MESH terms and codes.
List Name = "InterventionMeshList"
A List of a Struct called "InterventionMesh", which has the fields
- InterventionMeshId – Mesh id for Intervention
- InterventionMeshTerm – Mesh term for Intervention
Extracted and used as the source of Intervention data for the study (data from the ArmsInterventions module being too detailed).
List Name = "InterventionAncestorList"
A List of a Struct called "InterventionAncestor", which has the fields
- InterventionAncestorId – Mesh Id of the immediate ancestors of intervention terms
- InterventionAncestorTerm – Name of the immediate ancestors of intervention terms
Not extracted
List Name = "InterventionBrowseLeafList"
A List of a Struct called "InterventionBrowseLeaf", which has the fields
- InterventionBrowseLeafId – Mesh id of the listed intervention
- InterventionBrowseLeafName – Mesh name of the listed intervention
- InterventionBrowseLeafAsFound – Original name as given in the submission
- InterventionBrowseLeafRelevance – high or low
Not extracted - the data is too detailed and its significance is unclear.
List Name = "InterventionBrowseBranchList"
A List of a Struct called "InterventionBrowseBranch", which has the fields
- InterventionBrowseBranchAbbrev – A Mesh abbreviation of the main Intervention branches covered by the interventions
- InterventionBrowseBranchName – Name of the main Intervention branches covered by the interventions
Not extracted