
IO Working Group Meeting Notes

Stephen R. Aylward edited this page Jul 6, 2020 · 6 revisions

Focus

The goal of the I/O working group is to define how data is read into and written out from memory in MONAI. Such input and output requires consideration of (a) the research and application workflows in which MONAI will operate, (b) the importance of effectively utilizing all available data for deep learning development and evaluation, and (c) the critical significance of understanding and preserving the physical space that is represented by a medical image so that its clinical validity is preserved.

We have defined three broad sets of requirements / use-cases for input and output for MONAI:

  1. research I/O,
  2. reproducible I/O, and
  3. clinical I/O.

Research I/O requirements concern common image and data file formats and libraries. Reproducible I/O requirements concern forming a comprehensive description of the training and testing data, parameters, and models used in an experiment, so that the experiment can be repeated. Clinical I/O requirements concern interfacing MONAI with clinical systems such as PACS and health records systems. To address these requirements, we are investigating how MONAI should integrate with and contribute to existing “third-party” libraries, rather than develop new solutions. Example third-party libraries and standards being considered include ITK, MLflow, PyDICOM, GDCM, XNAT, and FHIR. Our guiding principles during those considerations are (a) don’t reinvent the wheel, (b) focus MONAI's energy and contributions on deep learning methods, not on supporting I/O methods, and (c) provide clinical relevance.

Members

Background

The motivation for the following proposals is based on the MONAI 0.2 release. In this release, MONAI supports PNG and NIFTI, but those readers have two major shortcomings. First, the readers are file-format specific: the file type being read must be known when the Python script is written. If you want to read a PNG instead of a NIFTI, you must change the Python code to use the PNG reader. Second, a MONAI image read via PNG will have different metadata than a MONAI image read via NIFTI. NIFTI tags and PNG tags are different, and no effort is made to enforce a standard dictionary when a file is read. If you want to know, for example, the date an image was acquired, you will likely need to access one tag if the file was a NIFTI and a different tag if the file was a PNG. As a result, changing the file format being read may require changes throughout a Python script.

Tasks and Proposed Solutions

1. Research I/O

Goal: Support reading and writing common research image and data file formats: NRRD, NIFTI, DICOM (objects), JPG, PNG, TIFF, MetaIO, ...

Proposal: MONAI should offer an extensible framework for I/O, and the default reader within that framework should use ITK.

The framework should allow researchers to define their own I/O methods. The framework should make use of ITK for standard research image formats such as NIFTI, NRRD, GIPL, JPG, TIFF, MetaIO, IPL, VTK, Stimulate, SiemensVision, GIF, GE, and DICOM (via GDCM).

Extensions to the framework should (a) use the data dictionary terms commonly used by ITK (based on DICOM images read by GDCM), and (b) preserve an image’s clinical relevance (orientation, spacing, origin).
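To illustrate why orientation, spacing, and origin must travel with the pixel data, here is a minimal sketch of the index-to-physical-space mapping that ITK uses (physical point = origin + direction × (spacing ⊙ index)). The `ImageGeometry` class and its field names are hypothetical and only serve the illustration; they are not part of MONAI or ITK.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vector3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ImageGeometry:
    """Hypothetical container for the spatial metadata a reader should keep."""

    origin: Vector3                  # physical position of voxel (0, 0, 0)
    spacing: Vector3                 # voxel size along each index axis
    direction: Tuple[Vector3, ...]   # 3x3 direction cosines, row-major

    def index_to_physical(self, index: Sequence[float]) -> Tuple[float, ...]:
        """Map a voxel index to a physical point.

        Computes origin + direction @ (spacing * index).
        """
        scaled = [s * i for s, i in zip(self.spacing, index)]
        return tuple(
            o + sum(d * v for d, v in zip(row, scaled))
            for o, row in zip(self.origin, self.direction)
        )


# Example values are made up for illustration.
geometry = ImageGeometry(
    origin=(-90.0, -126.0, -72.0),
    spacing=(2.0, 2.0, 3.0),
    direction=((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
)
point = geometry.index_to_physical((10, 20, 5))
```

If a reader drops any of the three fields, the same voxel index maps to a different physical location, which is exactly the loss of clinical validity the requirement is meant to prevent.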

Implementation considerations:

  1. An object factory based on filename suffix will work for some formats, but not all. Asking ITK if it can read the image and on failure trying other methods may be a viable approach.
  2. ITK provides functions for in-place conversion of ITK images to numpy arrays or PyTorch tensors. See this link.

import itk

# filename: path to any image file that ITK can read
image = itk.imread(filename)

# View only of the itk.Image; the pixel data is not copied
np_view = itk.array_view_from_image(image)
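The first implementation consideration above (ask ITK first, and on failure try other methods) could be sketched roughly as follows. This is only one possible shape for such a framework: the `Reader` signature, `read_image`, and the made-up `.vals` format are assumptions for illustration, not existing MONAI APIs. The only ITK call used is `itk.imread`, as in the snippet above.

```python
from typing import Any, Callable, List, Sequence

Reader = Callable[[str], Any]  # hypothetical reader signature: filename -> image


def read_with_itk(filename: str) -> Any:
    """Default reader: let ITK detect the format and decode the file."""
    import itk  # imported lazily so the chain still loads without ITK installed

    return itk.imread(filename)


def read_custom_text(filename: str) -> List[float]:
    """Hypothetical researcher-defined reader for a made-up '.vals' text format."""
    if not filename.endswith(".vals"):
        raise ValueError("not a .vals file")
    with open(filename) as handle:
        return [float(token) for token in handle.read().split()]


def read_image(filename: str, readers: Sequence[Reader]) -> Any:
    """Try each reader in order and return the first successful result."""
    errors: List[str] = []
    for reader in readers:
        try:
            return reader(filename)
        except Exception as exc:  # a failed reader just means "try the next one"
            name = getattr(reader, "__name__", repr(reader))
            errors.append(f"{name}: {exc}")
    raise RuntimeError(
        f"No reader could load {filename!r}:\n" + "\n".join(errors)
    )
```

A script would then call something like `read_image(path, [read_with_itk, read_custom_text])` and would not need to change when the input format changes. Catching every exception keeps the sketch short; a real implementation would probably distinguish "format not supported" from genuine read errors.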

Future extensions to consider:

  • BIDS (Brain Imaging Data Structure)
  • NWB (Neurodata Without Borders)

2. Reproducible I/O

Goal: Generate comprehensive descriptions of the training and testing data, parameters, and models used in an experiment, so that the experiment can be repeated.

Proposal: To be determined

This proposal is a work in progress. Tentatively, MONAI's dataset design may address these requirements. It may be necessary to incorporate checksums (e.g., MD5) to ensure reproducibility.
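As one illustration of the checksum idea, the sketch below records a digest for each data file in a manifest and later reports which files no longer match. It uses only the Python standard library. The function names and the manifest layout are assumptions made for this example, not a proposed MONAI design.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def file_digest(path: Path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths: Iterable[Path], algorithm: str = "md5") -> Dict[str, str]:
    """Map each file path to its digest, in sorted order for stable output."""
    return {str(path): file_digest(path, algorithm) for path in sorted(paths)}


def changed_files(manifest: Dict[str, str], algorithm: str = "md5") -> List[str]:
    """Return the files that are missing or whose content no longer matches."""
    changed = []
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file() or file_digest(path, algorithm) != expected:
            changed.append(name)
    return changed
```

The manifest could be stored alongside the experiment's parameters (for example as JSON), and `changed_files` run before repeating the experiment. Passing `algorithm="sha256"` switches to a stronger hash if MD5 is not acceptable.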

Tools / examples to consider include: HDF5 extensions, ...

3. Clinical I/O

Goal: Simplify interfacing MONAI with clinical systems such as PACS and health records systems: DICOM communications, HL7, ...

Proposal (tentative): Provide examples of integrations, but do not attempt to provide a definitive / comprehensive solution.

Consider, in particular, the FHIR (Fast Healthcare Interoperability Resources) standard as a driving example.

Meeting Notes
