---
layout: default
title: Appendices
parent: Decision Tree
has_children: false
nav_order: 5
---

# Open Data Process Diagram - Appendices
{: .fs-8 }

All you need! (we think)
{: .fs-6 .fw-300 }


<!-- [Process 1 - Ethics, consent and data management](./decision-tree.md#process-1-ethics-consent-and-data-management)
[Process 2 - Protected features of the data](./decision-tree.md#process-2-protected-features-of-the-data)
[Process 3 - Deidentification](./decision-tree.md#process-3-deidentification)
[Process 4 - Metadata](./decision-tree.md#process-4-metadata)
[Process 5 - Sharing and attribution](./decision-tree.md#process-5-sharing-and-attribution) -->



## Contents
- [Appendix 1 - DICOM Headers to be scrubbed](#dicom-scrub)
- [Appendix 2 - BIDS Compliance](#BIDS)
- [Appendix 3 - Scrubbing .json files](#json-scrub)
- [Appendix 4 - Quality Control](#qc)
- [Appendix 5 - Defacing](#deface)
- [Appendix 6 - Upload to XNAT](#upload)
- [Appendix 7 - Data freeze](#data-freeze)
- [Appendix 8 - Digital object identifiers (DOI)](#doi)
- [Appendix 9 - Data usage agreement (DUA)](#dua)
- [Appendix 10 - Advice from Research Data Oxford](#rdo)
- [Relevant policy and guidance](#policy)
- [Acknowledgements](#acknowledgements)


<a name="dicom-scrub"></a>
## Appendix 1 - DICOM Headers to be scrubbed


> Relevant processes:
> - [Process 3 - Deidentification](./decision-tree.md#process-3-deidentification)


The DICOM identifying fields that need to be scrubbed are listed in the [McGill Centre for Integrative Neuroscience (MCIN) Open Science Guidance: Data Preparation Checklist](http://loris.ca/MCINOpenScienceGuidance_DataPrepChecklist.pdf).

Fields to be scrubbed include anything which is identifiable as participant information or unique to the scanning session or data acquisition site.

The fields identified by McGill have been checked against the DICOM fields appearing in data acquired from a human
participant on the WIN 3T scanner (OHBA). Identifiable values have been censored here using "x" placeholders.


|**Field name**|**Value in WIN DICOM example (n/a if field not present)**|
|--------------|-------------------------------------------|
| PatientAddress                      | n/a |
| PatientAge                          | 042Y |
| PatientBirthDate                    | 1975xxxx |
| PatientBirthName                    | n/a |
| PatientID                           | W3T\_2016\_xx\_xxx |
| PatientMotherBirthName              | n/a |
| PatientName                         | W3T\_2016\_xx\_xxx |
| PatientReligiousPreference          | n/a |
| PatientSex                          | F |
| PatientTelephoneNumbers             | n/a |
| PatientSize                         | 1.62 |
| PatientWeight                       | 64 |
| OtherPatientIDs                     | n/a |
| OtherPatientName                    | n/a |
| OtherPatientNames                   | n/a |
| AcquisitionDate                     | 2018xxxx |
| AcquisitionTime                     | 90931.46 |
| ContentDate                         | 2018xxxx |
| ContentTime                         | 384 values: 091507.445000, 091507.448000, 091507.450000, 091507.452000, ... |
| DeviceSerialNumber                  | 66093 |
| FrameOfReferenceUID                 | 1.3.12.2.1107.5.2.43.66093.1.2018xxxx090420048.0.0.0 |
| InstanceCreationDate                | 2018xxxx |
| InstanceCreationTime                | 384 values: 091507.445000, 091507.448000, 091507.450000, 091507.452000, ... |
| InstitutionAddress                  | Warneford Lane StreetNo,Oxford,District,GB,OX3 7JX |
| InstitutionalDepartmentName         | BE3597 |
| InstitutionName                     | Warneford Hospital |
| IssuerOfPatientID                   | n/a |
| MediaStorageSOPInstanceUID          | n/a |
| OperatorsName                       | n/a |
| PerformedProcedureStepDescription   | n/a |
| PerformedProcedureStepID            | 0 |
| PerformedProcedureStepStartDate     | 2018xxxx |
| PerformedProcedureStepStartTime     | 85830.014 |
| PerformingPhysicianName             | Blank |
| PhysiciansOfRecord                  | n/a |
| ReferencedSOPInstanceUID            | n/a |
| ReferringPhysicianName              | Blank |
| RequestedProcedureDescription       | n/a |
| RequestingPhysician                 | Blank |
| SeriesDate                          | 2018xxxx |
| SeriesInstanceUID                   | 2 values: 1.3.12.2.1107.5.2.43.66093.2018xxxx09150468746204326.0.0.0, 1.3.12.2.1107.5.2.43.66093.2018xxxx09093178111004205.0.0.0 |
| SeriesTime                          | 2 values: 091507.443000, 091508.436000 |
| SOPInstanceUID                      | 384 values: 1.3.12.2.1107.5.2.43.66093.2018xxxx09150469069804346, ... |
| StationName                         | AWP66093 |
| StudyDate                           | 2018xxxx |
| StudyDescription                    | OHBA Projects\^2018\_108 RESTAND |
| StudyInstanceUID                    | MR2018xxxx085649 |
| StudyTime                           | 85829.922 |
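
As a rough illustration of how the list above might be applied, the sketch below drops the listed fields from a header that has already been read into a plain Python dictionary. Reading and writing real DICOM files needs a dedicated DICOM library, which is not shown here. The function name `scrub_header`, the abbreviated field set and the example values are assumptions made for illustration only. Whether a field should be deleted or merely emptied may also depend on your tools and on the DICOM standard.

```python
# Illustrative subset of the field names in the table above; extend it to the full list.
FIELDS_TO_SCRUB = {
    "PatientName", "PatientID", "PatientBirthDate", "PatientAge", "PatientSex",
    "AcquisitionDate", "StudyDate", "SeriesDate", "DeviceSerialNumber",
    "StationName", "InstitutionName", "InstitutionAddress",
}


def scrub_header(header, fields=FIELDS_TO_SCRUB):
    """Return a copy of `header` without the listed fields, plus the names removed."""
    scrubbed = {name: value for name, value in header.items() if name not in fields}
    removed = sorted(name for name in header if name in fields)
    return scrubbed, removed


# Hypothetical header with made-up values.
example_header = {
    "PatientName": "W3T_2016_xx_xxx",
    "PatientSex": "F",
    "RepetitionTime": 2.0,
    "EchoTime": 0.03,
}
clean_header, removed_fields = scrub_header(example_header)
print(clean_header)    # {'RepetitionTime': 2.0, 'EchoTime': 0.03}
print(removed_fields)  # ['PatientName', 'PatientSex']
```

Returning the removed names alongside the scrubbed header gives a simple record of what was taken out, which may help when documenting deidentification.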

<a name="BIDS"></a>
## Appendix 2 - BIDS Compliance

> Relevant processes:
> - [Process 3 - Deidentification](./decision-tree.md#process-3-deidentification)
> - [Process 4 - Metadata](./decision-tree.md#process-4-metadata)

Data should be confirmed as BIDS compliant using the BIDS-validator Docker image available from
[https://github.com/bids-standard/bids-validator#docker-image](https://github.com/bids-standard/bids-validator#docker-image).
This could be installed on XNAT. *to-do: check whether the BIDS validator is already installed*
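
For running the validator outside XNAT, the sketch below wraps the Docker invocation in Python. It assumes Docker is installed locally and that the image is published as `bids/validator`, as described in the validator's README. The helper names `validator_command` and `run_validator` and the dataset path are hypothetical. Treating a zero exit status as "valid" is also an assumption about the validator's behaviour.

```python
import subprocess
from pathlib import Path


def validator_command(bids_dir):
    """Build a docker command that mounts `bids_dir` read-only and validates it."""
    data = Path(bids_dir).resolve()
    return ["docker", "run", "--rm", "-v", f"{data}:/data:ro", "bids/validator", "/data"]


def run_validator(bids_dir):
    """Run the validator; assumes a zero exit status means no errors were reported."""
    result = subprocess.run(validator_command(bids_dir), check=False)
    return result.returncode == 0


# Print the command for a placeholder path rather than running it.
print(" ".join(validator_command("/path/to/bids_dataset")))
```

Mounting the dataset read-only (`:ro`) means the validator cannot modify the data it is checking.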

<a name="json-scrub"></a>
## Appendix 3 - Scrubbing .json files

> Relevant processes:
> - [Process 3 - Deidentification](./decision-tree.md#process-3-deidentification)

A .json sidecar file is generated for each NIfTI image to conform with the BIDS specification. These files contain essential metadata describing how the data were acquired. They may also contain potentially identifiable information about the participant, which should not be distributed openly.
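
A minimal sketch of sidecar scrubbing in Python is given below. The set of keys to remove is a hypothetical starting point, not an agreed list: check the sidecars produced by your own conversion tool and adjust it. The function name `scrub_sidecar` is likewise an assumption. The sketch rewrites the file in place, so work on a copy of the data.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical keys to drop; adapt this set to your own data and scanner.
SIDECAR_KEYS_TO_REMOVE = {
    "AcquisitionTime", "AcquisitionDateTime", "DeviceSerialNumber", "StationName",
    "InstitutionName", "InstitutionAddress", "InstitutionalDepartmentName",
    "PatientName", "PatientID", "PatientBirthDate", "PatientSex", "PatientWeight",
}


def scrub_sidecar(path, keys=SIDECAR_KEYS_TO_REMOVE):
    """Remove `keys` from one .json sidecar in place and return the keys removed."""
    path = Path(path)
    metadata = json.loads(path.read_text(encoding="utf-8"))
    removed = sorted(key for key in metadata if key in keys)
    kept = {key: value for key, value in metadata.items() if key not in keys}
    path.write_text(json.dumps(kept, indent=2) + "\n", encoding="utf-8")
    return removed


# Demonstration on a temporary sidecar with made-up values.
with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / "sub-01_T1w.json"
    sidecar.write_text(
        json.dumps({"RepetitionTime": 2.0, "DeviceSerialNumber": "00000",
                    "InstitutionName": "Example Hospital"}),
        encoding="utf-8",
    )
    demo_removed = scrub_sidecar(sidecar)
    demo_kept = json.loads(sidecar.read_text(encoding="utf-8"))

print(demo_removed)  # ['DeviceSerialNumber', 'InstitutionName']
print(demo_kept)     # {'RepetitionTime': 2.0}
```

To cover a whole dataset, the same function could be applied to every file matched by `Path(bids_dir).rglob("*.json")`.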

<a name="qc"></a>
## Appendix 4 - Quality Control

> Relevant processes:
> - [Process 3 - Deidentification](./decision-tree.md#process-3-deidentification)

Quality control (QC) assessments should be completed as a matter of course for research data, and as a courtesy to those wishing to reuse your data. QC can be performed on BIDS-formatted data using MRIQC. MRIQC is available as a container on XNAT. *to-do: check whether the MRIQC container is already installed*

<a name="deface"></a>
## Appendix 5 - Defacing

> Relevant processes:
> - [Process 3 - Deidentification](./decision-tree.md#process-3-deidentification)

Consider using fsl\_deface. Other defacing tools are available.

Consider running QC on defaced images using VisualQC, as described in the training manual:
[https://github.com/raamana/visualqc/blob/master/docs/VisualQC_TrainingManual_v1p4.pdf](https://github.com/raamana/visualqc/blob/master/docs/VisualQC_TrainingManual_v1p4.pdf)
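
If fsl\_deface is the chosen tool, a batch run over a BIDS dataset might be prepared as sketched below. This assumes FSL is installed, that `fsl_deface` takes an input image followed by an output image, and that the T1-weighted images follow the `*_T1w.nii.gz` naming pattern. The function name `deface_commands` and the directory layout are hypothetical. The sketch only builds the commands and does not run them.

```python
import tempfile
from pathlib import Path


def deface_commands(bids_dir, out_dir):
    """Pair each T1w image under `bids_dir` with an fsl_deface command writing to `out_dir`."""
    bids_dir, out_dir = Path(bids_dir), Path(out_dir)
    commands = []
    for image in sorted(bids_dir.rglob("*_T1w.nii.gz")):
        # Keep the same sub-XX/anat/ layout in the output directory.
        target = out_dir / image.relative_to(bids_dir)
        commands.append(["fsl_deface", str(image), str(target)])
    return commands


# Demonstration on an empty placeholder file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    anat = Path(tmp) / "bids" / "sub-01" / "anat"
    anat.mkdir(parents=True)
    (anat / "sub-01_T1w.nii.gz").touch()
    demo_commands = deface_commands(Path(tmp) / "bids", Path(tmp) / "defaced")

for command in demo_commands:
    print(" ".join(command))
```

Each command could then be executed with `subprocess.run(command, check=True)` once the output folder exists. Keeping command construction separate from execution makes it easy to review what will be run first.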

<a name="upload"></a>
## Appendix 6 - Upload to XNAT

> Relevant processes:
> - [Process 5 - Sharing and attribution](./decision-tree.md#process-5-sharing-and-attribution)

TBD (not exhaustive)

1.  Default access groups (e.g. uploading researcher, PI and XNAT Admin)

2.  Verification and management of external accounts (the whole process is still to be defined for external users). Look for acceptable external models of this.

3.  Process for verifying that the mandatory procedures have been completed (e.g. DICOMs scrubbed) before data are made publicly available.

<a name="data-freeze"></a>
## Appendix 7 - Data freeze

> Relevant processes:
> - [Process 5 - Sharing and attribution](./decision-tree.md#process-5-sharing-and-attribution)

Data which are shared publicly should remain unchanged once they have been released. Projects which are longitudinal or which comprise large cohorts may want to consider generating multiple separate releases.

The data freeze process as applied for the Dementias Platform UK (DPUK) is computationally expensive, because each freeze generates a duplicate of the data. Alternative methods could be investigated (e.g. ZFS snapshots).

Consider developing a cost model of the platform to indicate to researchers the cost of snapshots carried by WIN. A cost model would also be useful for project and infrastructure grant applications.
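
Whatever freeze mechanism is chosen, one lightweight way to check that a release has stayed unchanged is to record a checksum for every file at release time and compare against it later. The sketch below is an illustration under that assumption. It is not the DPUK process and not a replacement for a freeze, and the function names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(release_dir):
    """Map each file's path (relative to the release folder) to its SHA-256 checksum."""
    root = Path(release_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(release_dir, manifest):
    """List files that were added, removed or modified since `manifest` was built."""
    current = build_manifest(release_dir)
    names = set(current) | set(manifest)
    return sorted(name for name in names if current.get(name) != manifest.get(name))


# Demonstration with a made-up file in a temporary release folder.
with tempfile.TemporaryDirectory() as tmp:
    release = Path(tmp)
    (release / "participants.tsv").write_text("participant_id\nsub-01\n")
    manifest = build_manifest(release)
    demo_before = changed_files(release, manifest)
    (release / "participants.tsv").write_text("participant_id\nsub-02\n")
    demo_after = changed_files(release, manifest)

print(demo_before)  # []
print(demo_after)   # ['participants.tsv']
```

A manifest like this is small enough to store alongside the release record, so it adds little to the storage cost of a freeze.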

<a name="doi"></a>
## Appendix 8 - Digital object identifiers (DOI)

> Relevant processes:
> - [Process 5 - Sharing and attribution](./decision-tree.md#process-5-sharing-and-attribution)

A DOI is essential to ensure that shared data can be effectively attributed to their authors, in line with the [FAIR
principles](https://www.go-fair.org/fair-principles/). The Oxford University Research Archive (ORA) is the University's preference for
generating a DOI.

To create a DOI for your data, go to [Deposit Data in ORA](https://deposit.ora.ox.ac.uk/dashboard/my/works?locale=en) and select "Add a New Work". Fill in as many fields as you feel are needed to identify your data. Please ensure the following are included:
- In the field "bibliographic detail" > "Digital Storage location": enter the URL of your project on XNAT.
- In the field "contributors": add a new contributor of the type "data steward" and enter the email address win.xnat-admin@psych.ox.ac.uk

Major updates to XNAT (likely over the medium to long term) may lead to changes in URL. If and when this occurs, we will work with ORA to update the "Digital Storage Location" of your record to reflect the new URL. This is only possible if the XNAT admin email address has been added as a contributor.

<a name="dua"></a>
## Appendix 9 - Data usage agreement (DUA)

> Relevant processes:
> - [Process 5 - Sharing and attribution](./decision-tree.md#process-5-sharing-and-attribution)


Research Services (<research.services@admin.ox.ac.uk>) can provide template agreements for data sharing.

A range of template data sharing and data access agreements (suitable for research from all four academic divisions) are available. While these agreements are designed to address key issues, they will need to be tailored to the specific needs of a particular project. Research Services can advise on this, and also offer to check drafts as they are prepared before producing a final signature-ready copy.

### Example DUA from [Donders Repository](https://data.donders.ru.nl/)

I request access to the data collected in the digital repository of the \<DEPARTMENT\>, part of the \<INSTITUTION\>, established at \<CITY\>, \<COUNTRY\> (hereinafter referred to as the \<INSTITUTION SHORTNAME\>).

By accepting this agreement, I become the data controller (as defined under the GDPR) of the data that I have access to, and am responsible that I access these data under the following terms:

1.  I will comply with all relevant rules and regulations imposed by my institution and my government. This agreement never has prevalence over existing general data protection regulations that are applicable in my country.

2.  I will not attempt to establish or retrieve the identity of the study participants. I will not link these data to any other database in a way that could provide identifying information. I shall not request the pseudonymisation key that would link these data to an individual's personal information, nor will I accept any additional information about individual participants under this Data Use Agreement.

3.  I will not redistribute these data or share access to these data with others, unless they have independently applied and been granted access to these data, i.e., signed this Data Use Agreement. This includes individuals in my institution.

4.  \[OPTIONAL\] When sharing secondary or derivative data (e.g. group statistical maps or templates), I will only do so if they are on a group level, and information from individual participants cannot be deduced.

5.  I will reference the specific source of the accessed data when publicly presenting any results or algorithms that benefited from their use:

    a.  Papers, book chapters, books, posters, oral presentations, and all other presentations of results derived from the data should acknowledge the origin of the data as follows: "Data were provided (in part) by \<Research centre/University Department\> \<University, Country\>".

    b.  Authors of publications or presentations using the data should cite relevant publications describing the methods developed and used by the \<Research centre/University Department\> to acquire and process the data. The specific publications that are appropriate to cite in any given study will depend on which data were used and for what purposes. When applicable, a list of publications will be included in the collection. Neither the \<Research centre/University Department\> or \<University\>, nor the researchers that provide this data will be liable for any results and/or derived data. They shall not be included as an author of publications or presentations without consent.

6.  Failure to abide by these guidelines will result in termination of my privileges to access these data.

<a name="rdo"></a>
## Appendix 10 - Advice from Research Data Oxford


-   Define what a breach looks like (how, why, impact) and demonstrate that you have shown due diligence in preventing it.

-   How will you manage accusations of disclosure?

-   The consent you have is the cornerstone of your agreement with the participant. Make sure consent appropriately describes all relevant procedures including deidentification and managed access. It is agreed that (brain) MRI data cannot be considered anonymous and therefore fall under GDPR.

-   The DUA should preclude commercial use.

-   Access to data should be monitored and reported on (for accountability and auditing).

Research Data Oxford would be pleased to partner with us on this project. They are interested in growing their experience of working directly with specific examples.

<a name="policy"></a>
## Relevant policy and guidance


### University of Oxford Policy on the Management of Data Supporting Research Outputs

<https://researchdata.ox.ac.uk/university-of-oxford-policy-on-the-management-of-data-supporting-research-outputs/>

3.0 Responsibilities of the researcher

3.1 Principal Investigators hold day-to-day responsibility for the effective management of research data generated within or obtained from their research, including by their research groups. This shall include understanding and complying with the requirements of any relevant contract with or grant to the University that includes provisions regarding the ownership, preservation and dissemination of research data.

3.2 Researchers will protect confidential, personal and sensitive personal research data in accordance with legal and ethical requirements related to the research they conduct.

3.3 Researchers will make every reasonable effort to keep an accurate and comprehensive record of their research, including documenting clear procedures for the collection, storage, use, reuse, access and retention or deletion of the research data associated with their records.

3.6 Researchers should strongly consider depositing their data supporting outputs in an appropriate data repository along with sufficient descriptive metadata (a data record) to ensure that it can be found and understood. Where data is deposited somewhere other than the University's institutional data repository (the Oxford Research Archive for Data, or ORA-Data), a data record should also be created in ORA-Data which describes and points to the data.

<a name="acknowledgements"></a>
## Acknowledgements

Created using materials from: Das et al, 2019, \"MCIN Open Science Guidance: Data Preparation Checklist\"
[https://loris.ca/MCINOpenScienceGuidance\_DataPrepChecklist.pdf](https://loris.ca/MCINOpenScienceGuidance_DataPrepChecklist.pdf)

Inspired by Huijser, Dorien, Achterberg, Michelle, Wierenga, Lara, Van \'t Veer, Anna, Klapwijk, Eduard, Van Erkel, Raymond, & Hettne, Kristina. (2020, June 19). MRI data sharing guide. Zenodo. http://doi.org/10.5281/zenodo.3822290