Skip to content

A robust general framework for I/O #166

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
May 14, 2015
Merged

A robust general framework for I/O #166

merged 23 commits into from
May 14, 2015

Conversation

ctrueden
Copy link
Member

This branch introduces a framework for I/O, particularly random access I/O. Where data comes from is now fully extensible: create a new Location type—and optionally a corresponding DataHandle plugin that wraps it—to define new paradigms of data access and storage.

Main points of the design:

  • All data is identified by Location objects of a particular type.
  • The new DataTypeService is a WrapperService that matches up Location objects with DataHandle plugin wrappers that know how to provide random access into the location's bytes etc.
    • Initially, a FileHandle is provided for access to FileLocations (which are in turn backed by java.io.File).
    • Adding the other DataHandle plugins later will be relatively straightforward, especially after porting over SCIFIO's io.scif.io.StreamHandle abstract class layer.
  • For better type richness, all concrete IOPlugin implementations will operate on Location objects instead of String. This branch holds off on that change for the moment though, since it is a breaking API change.

Benefits for SCIFIO:

  • SCIFIO Format plugins will be able to operate on Location objects instead of plain Strings.
  • For Format plugins that prefer to operate on the DataHandle level, they can ask the DataHandleService for a DataHandle corresponding to a particular Location, then inspect the data itself at various phases (any or all of Checker, Parser, Reader and Writer).
  • Some Format plugins may opt to accept only a specific type of Location, ignoring the DataHandle API. For example, the OMEROFormat of ImageJ-OMERO will accept only OMEROLocation objects, extracting the relevant information from the OMEROLocation (server name, user name, session ID, etc.) and using it to make the specified OMERO connection and extract the pixels and metadata that way.
  • All of io.scif.io can likely be removed from the codebase, in favor of this framework. (And in general: eventually, all of io.scif.common should be migrated to SJC.)

See also these issues, some of which will be resolved by this work:

ctrueden added 16 commits May 4, 2015 13:54
This very same class has a handy method for achieving
the "caller's responsibility", so let's advertise it.
Surprisingly, the PluginService API does not have a method to create a
plugin from its class. You can only create one from its associated
PluginInfo. So instantiating a plugin becomes a two-step process:

1. Ask the PluginService for the PluginInfo of a given plugin Class.
2. Tell the PluginService to create an instance from that PluginInfo.

And this process is not fully type-safe, since the
PluginInfo#createInstance(PluginInfo) method only knows the object is
of type PT, not P.

To help address this limitation, each PTService now has a handy
create(Class<P extends PT>) method for creating an instance of that
plugin class. This is useful, since using "new" directly is not enough;
the plugin will not have the application context injected as needed.
Instead, throw an exception with a hint.

This will hopefully avoid logic errors in downstream code,
which attempt to create new instances of singleton plugins.
The private method called "wrap" really just located an appropriate
wrapper object. It did not actually populate it with the data.
When calling the create method, after a match is found, the data needs
to actually get populated by calling the set method. Otherwise, the
create method will return an unpopulated wrapper object.

This bug was not discovered until now because the only core type of
WrapperPlugin, the InputWidget, overrode the create method with its
own implementation that populated the data properly.
Instead, let's just return null, since null can only mean one thing:
no suitable wrapper was found.

This commit also makes the appropriate changes downstream to adjust
for the update in behavior.
A data "location" is a path (source/destination) that points to data.
Many data locations, such as files on disk and remote URLs, can be
represented as URIs, so the Location interface provides a getURI()
method to easily obtain such a representation.
@ctrueden ctrueden added this to the 3.0.0 milestone May 12, 2015
@ctrueden
Copy link
Member Author

@hinerm I would greatly appreciate your feedback. This work was the main blocker for merging the other 3.0.0 PRs, so if this looks good we can go ahead and finish up the SJC3 release very soon.

ctrueden added 7 commits May 14, 2015 12:06
This implementation is a restructuring of SCIFIO's io.scif.io package,
to fit within the SciJava plugin framework.

This commit introduces a new plugin type, DataHandle, that defines all
the methods for random access reading and writing. The DataHandle
interface is a rich SciJava plugin type that extends the
java.io.DataInput, java.io.DataOutput and java.io.Closeable interfaces.

Furthermore, the DataHandle interface also defines additional API
methods that make it very straightforward to adapt DataHandle objects
into InputStreams and OutputStreams (it cannot extend InputStream and
OutputStream, because they are abstract classes, not interfaces).

The actual adaptation to streams is done using the DataHandleInputStream
and DataHandleOutputStream wrapper objects, simply by delegating to the
various DataHandle methods of the same name.

Overall, the mapping from the old io.scif.io API is as follows:

* IRandomAccess + IStreamAccess -> org.scijava.io.DataHandle
* RandomAccessInputStream -> org.scijava.io.DataHandleInputStream
* RandomAccessOutputStream -> org.scijava.io.DataHandleOutputStream
* Location -> subsumed into the org.scijava.io.Location type hierarchy
* HandleException -> unnecessary; IOException et. al are good enough
* FileHandle, URLHandle, etc. -> same names, implementing DataHandle

Note that the mapping is rather loose, with several layers consolidated
or eliminated. Additional commits will readd features, such as specific
DataHandle plugins, as appropriate to achieve needed functionality.
The IOService is high-level; the DataHandleService is low-level.
The io.scif.common.DataTools class (originally from Bio-Formats at
loci.common.DataTools) contains many useful methods for working with
primitive numerical types, and arrays of those types. In particular,
it includes methods for converting between types, endianness, and
signedness. These utility methods will continue to be extremely
useful for the various DataHandle implementations, especially for
proper support for both big- and little-endian byte ordering.

The class was split as follows:
* org.scijava.util.Bytes - primitive numeric methods
* org.scijava.util.StringUtils - utility methods for strings
* org.scijava.util.ArrayUtils - utility methods for arrays

Regarding provenance of the code: I originally wrote much of it for
the Bio-Formats library; Melissa Linkert wrote a lot as well, with
additional contributions by Chris Allan and probably others on the
OME team. The relevant code is all BSD-2 licensed; I have added an
appropriate extra legal notice at the top of each affected file.
These were migrated from SCIFIO, just like the methods they are testing.
This is a redesigned version of SCIFIO's old FileHandle class.

It delegates to java.io.RandomAccessFile for the bulk of its
implementation.
These tests are agnostic to the type of DataHandle, which
should make it easier to get full test coverage of every
concrete DataHandle implementation, as they are introduced.
@hinerm
Copy link
Member

hinerm commented May 14, 2015

Looks good to me. Such nice atomic changes!

ctrueden added a commit that referenced this pull request May 14, 2015
A robust general framework for I/O
@ctrueden ctrueden merged commit ae75cfa into master May 14, 2015
@ctrueden ctrueden deleted the robust-io branch May 14, 2015 18:04
@ctrueden ctrueden removed this from the 3.0.0 milestone May 14, 2015
@ctrueden
Copy link
Member Author

I merged it, without the breaking IOPlugin API changes. I have started a separate PR for those: #167.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants