Difference between revisions of "Dataset de-identification levels"

From ECRIN-MDR Wiki

Revision as of 18:05, 11 September 2020

These categories indicate the level of de-identification that has been applied to a data set. The possible values are:

id | name | description | source
0 | Not known | No clear information available about the de-identification, if any, applied to the data. | ECRIN
1 | No de-identification | Confirmed that no de-identification measures have been applied to the data set. | ECRIN
2 | De-identification applied | Some de-identification measures have been applied. Details should be described in comments and / or indicated in the linked boolean fields, or in separate documents. | ECRIN
3 | De-identification applied, primary outcomes re-assessed | Some de-identification measures have been applied and are described. In addition, the data has been re-analysed against the primary outcomes and the results described. | ECRIN

The categories are used alongside 5 more specific boolean data points that allow individual de-identification measures to be indicated, and a textual description that should be used to give details of the de-identification process (or to reference a URL or other data object where such details can be found). The 5 specific data items are:

  • Direct Identifiers removed?
  • US HIPAA de-identification criteria applied?
  • Dates rebased or replaced by integers?
  • Narrative text fields removed?
  • k-anonymisation achieved?
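
As an illustration only, the category value, the five boolean data points and the textual description could be held together in a single record. The sketch below is not the ECRIN schema: the class names, field names and the consistency rule are all hypothetical, chosen only to mirror the table and list above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class DeidentLevel(IntEnum):
    """Category ids as listed in the table above."""
    NOT_KNOWN = 0
    NO_DEIDENTIFICATION = 1
    DEIDENTIFICATION_APPLIED = 2
    DEIDENTIFICATION_APPLIED_OUTCOMES_REASSESSED = 3


@dataclass
class DeidentRecord:
    """Hypothetical container; field names are illustrative, not ECRIN's."""
    level: DeidentLevel = DeidentLevel.NOT_KNOWN
    direct_ids_removed: bool = False
    hipaa_applied: bool = False
    dates_rebased: bool = False
    narrative_removed: bool = False
    k_anonymised: bool = False
    details: str = ""  # free text, or a URL / reference to another data object

    def measures(self) -> List[str]:
        """Return labels of the specific measures flagged as applied."""
        flags = [
            (self.direct_ids_removed, "Direct identifiers removed"),
            (self.hipaa_applied, "US HIPAA de-identification criteria applied"),
            (self.dates_rebased, "Dates rebased or replaced by integers"),
            (self.narrative_removed, "Narrative text fields removed"),
            (self.k_anonymised, "k-anonymisation achieved"),
        ]
        return [label for is_set, label in flags if is_set]

    def is_consistent(self) -> bool:
        """Assumed rule (not stated in the source): a record marked
        'No de-identification' should not flag any specific measure."""
        if self.level == DeidentLevel.NO_DEIDENTIFICATION:
            return not self.measures()
        return True
```

A record for a data set with direct identifiers removed and dates rebased might then be built as `DeidentRecord(level=DeidentLevel.DEIDENTIFICATION_APPLIED, direct_ids_removed=True, dates_rebased=True, details="...")`, with `measures()` listing the two flagged items.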