Integration Options

Overview

This page proposes how teams will integrate. Based on a brief meeting with team World, we see three levels of integration that we could offer:

Level 1: CSV files only
Level 2: JSON files
Level 3: Service calls

The minimum requirement is Level 1. Component A from one team will export the prescribed format, and component B from another team will parse that format and use the data.

CSV Format

The following rules must be supported by all CSV parsers and formatters:

lines beginning with a # are comments and are not part of the data
all rows must have the same number of columns
a null value is represented by a cell that is empty or contains only whitespace
a value read from a cell is trimmed before it becomes data
timestamps are represented as ISO 8601 date-time values

Example

# This is a comment
#
# The next two rows have the same data since whitespace is trimmed
0,  0,  1,  2, 3 , Compound-1 ,  2015-01-02T10:30:20Z
0,0,1,2,3,Compound-1,2015-01-02T10:30:20Z
#
# The next row doesn't have a compound, just empty space for that col
0,  0,  1,  2, 3 ,  ,  2015-01-02T10:30:20Z
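
The rules above can be sketched in Java roughly as follows. This is a minimal illustration, not the WE99 implementation: the class `CsvRules` and its methods are hypothetical names, and it splits on plain commas without any quoting support, which the rules do not mention. The rules also do not say how blank lines are handled, so the sketch does not special-case them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating the shared CSV rules; not part of the WE99 codebase. */
final class CsvRules {

    private CsvRules() {
    }

    /** A line is a comment when its first character is '#'. */
    static boolean isComment(String line) {
        return line.startsWith("#");
    }

    /** Splits a data line on commas, trims every cell, and maps empty or all-whitespace cells to null. */
    static List<String> parseCells(String line) {
        List<String> cells = new ArrayList<>();
        // The -1 limit keeps trailing empty cells, e.g. "2, 49, EMPTY,,".
        for (String raw : line.split(",", -1)) {
            String trimmed = raw.trim();
            cells.add(trimmed.isEmpty() ? null : trimmed);
        }
        return cells;
    }

    /** Parses every non-comment line and rejects rows whose column count differs from the first row. */
    static List<List<String>> parse(List<String> lines) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : lines) {
            if (isComment(line)) {
                continue;
            }
            List<String> cells = parseCells(line);
            if (!rows.isEmpty() && rows.get(0).size() != cells.size()) {
                throw new IllegalArgumentException("Expected " + rows.get(0).size()
                        + " columns but found " + cells.size() + ": " + line);
            }
            rows.add(cells);
        }
        return rows;
    }
}
```

A timestamp cell that uses the UTC `Z` form shown in the example can then be passed to `java.time.Instant.parse`, which accepts ISO 8601 instants of that shape.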

Plate Map

Models the user's layout of a plate. This format is the intended output of another team's Plate Map Wizard feature, before that output is merged with the Compound Mappings File. The plate mapping process lets the user create a logical mapping between the wells on a plate and symbolic names for their contents.

There will be a special label name, "compound", whose value tells us which compound to use.

| Column | Data Type | Required | Description |
| --- | --- | --- | --- |
| row | int | yes | row field for the well coordinate |
| col | int | yes | col field for the well coordinate |
| well-type | COMP, POS, NEG, EMPTY | yes | indicates the type of the substance in the well |
| label-name | string | no | user supplied label name |
| label-value | string | no | user supplied label value |

You can specify multiple labels for a well either by repeating the well's coordinates in two or more rows or by repeating the label name/value pair within a single row.

Example

    #row, col, well-type, (label-name, label-value)+
    #
    # This is a sparse CSV. We don't need to define entries for every single cell.
    #
    0, 0, NEG, compound, A1
    0, 0, NEG, compound, B1
    0, 1, COMP, compound, C1, other, foo
    0, 2, COMP, compound, C1
    0, 3, COMP, compound, E1
    0, 49, EMPTY, , 
    2, 0, COMP, compound, G1
    2, 1, COMP, compound, G1
    2, 2, COMP, compound, H1
    2, 3, COMP, compound, H1
    2, 49, EMPTY,,
    50,50,EMPTY,,
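
Reading one row with its repeating label pairs could look roughly like the sketch below. The class `PlateMapRowSketch` and its nested types are hypothetical and are not the `PlateMapCSVReader` listed under Implementation Details. Skipping pairs with a blank name and rejecting a name that has no value cell are assumptions, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of reading one Plate Map row; not the real PlateMapCSVReader. */
final class PlateMapRowSketch {

    /** One label-name/label-value pair. */
    static final class Label {
        final String name;
        final String value;

        Label(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Well coordinate, well type, and the labels found on a single row. */
    static final class Row {
        final int row;
        final int col;
        final String wellType;
        final List<Label> labels = new ArrayList<>();

        Row(int row, int col, String wellType) {
            this.row = row;
            this.col = col;
            this.wellType = wellType;
        }
    }

    private PlateMapRowSketch() {
    }

    /** Parses "row, col, well-type, (label-name, label-value)+" into a Row. */
    static Row parse(String line) {
        String[] cells = line.split(",", -1);
        if (cells.length < 3) {
            throw new IllegalArgumentException("row, col and well-type are required: " + line);
        }
        Row parsed = new Row(Integer.parseInt(cells[0].trim()),
                Integer.parseInt(cells[1].trim()), cells[2].trim());
        // Labels follow as repeating name/value pairs starting at the fourth cell.
        for (int i = 3; i < cells.length; i += 2) {
            String name = cells[i].trim();
            if (name.isEmpty()) {
                continue; // assumption: a blank name means "no label here"
            }
            if (i + 1 >= cells.length) {
                throw new IllegalArgumentException("label '" + name + "' has no value: " + line);
            }
            parsed.labels.add(new Label(name, cells[i + 1].trim()));
        }
        return parsed;
    }
}
```

Calling `PlateMapRowSketch.parse("0, 1, COMP, compound, C1, other, foo")` yields a row with two labels, while the `EMPTY` rows from the example yield none.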

Implementation Details

| Resource | Description |
| --- | --- |
| `edu.harvard.we99.services.io.PlateMapCSVReader` | Reads the Plate Map format |
| `edu.harvard.we99.services.io.PlateMapCSVReaderTest` | Unit test converting the CSV format into WE99 Domain Objects |
| `/PlateMapCSVReaderTest/plate-mappings.csv` | Test input file |
| `platemapping.xml` | Bean IO mapping config file |

Plate Map with Doses

TBD - update the wiki to define this format

Plate with Doses and Compounds

Models the user's layout of a plate and also includes details on the contents of the wells. This format is the intended output of the Plate Map Wizard feature from another team. The plate mapping process allows the user to create a logical mapping of wells on a plate with symbolic names for their contents. A secondary data file provides the mapping of the symbolic names to actual compounds and doses. This interchange format represents the combination of those two elements in order to provide an input file that is suitable for creating a new Plate instance for an experiment.

| Column | Data Type | Required | Description |
| --- | --- | --- | --- |
| row | int | yes | row field for the well coordinate |
| col | int | yes | col field for the well coordinate |
| well-type | COMP, POS, NEG, EMPTY | yes | indicates the type of the substance in the well |
| label-name | string | no | user supplied label name |
| label-value | string | no | user supplied label value |
| compound | string | yes | name of the compound in the well |
| quantity | int | yes | works with units to specify the amount of the substance |
| units | MILLIMOLAR, MICROMOLAR, NANOMOLAR, PICOMOLAR | yes | works with quantity to specify the amount of the substance |

TBD:

the only standard label name is 'compound'
rows are additive
coordinate and well type are required and well type cannot change
labels and compounds are optional and additive to a map of labels/compounds each keyed by their names
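
One way to read these draft rules is sketched below: each row adds to the well at its coordinate, and a row that changes an existing well's type is rejected. Since the rules are still marked TBD, everything here is an assumption. `PlateAccumulator`, `Well` and `Dose` are hypothetical names, and the quantity is held as a `double` only because the examples use values such as `5.0`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical accumulator for the draft "rows are additive" rules; not the WE99 implementation. */
final class PlateAccumulator {

    /** Amount of a compound: a quantity plus its units. */
    static final class Dose {
        final double quantity;
        final String units;

        Dose(double quantity, String units) {
            this.quantity = quantity;
            this.units = units;
        }
    }

    /** A well with a fixed type plus labels and compounds, each keyed by name. */
    static final class Well {
        final String wellType;
        final Map<String, String> labels = new LinkedHashMap<>();
        final Map<String, Dose> compounds = new LinkedHashMap<>();

        Well(String wellType) {
            this.wellType = wellType;
        }
    }

    private final Map<String, Well> wells = new LinkedHashMap<>();

    /** Adds one row; the label and the compound may each be null because both are optional. */
    void add(int row, int col, String wellType,
             String labelName, String labelValue, String compound, Dose dose) {
        if (wellType == null) {
            throw new IllegalArgumentException("well type is required");
        }
        String key = row + "," + col;
        Well well = wells.get(key);
        if (well == null) {
            well = new Well(wellType);
            wells.put(key, well);
        } else if (!well.wellType.equals(wellType)) {
            throw new IllegalArgumentException("well " + key + " is already " + well.wellType
                    + " and cannot change to " + wellType);
        }
        if (labelName != null) {
            well.labels.put(labelName, labelValue);
        }
        if (compound != null) {
            well.compounds.put(compound, dose);
        }
    }

    /** Returns the accumulated well, or null when no row mentioned that coordinate. */
    Well get(int row, int col) {
        return wells.get(row + "," + col);
    }
}
```

Because the maps are keyed by name, a repeated label or compound name overwrites the earlier entry in this sketch. Whether that is the intended behavior is one of the points the TBD list leaves open.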

Example

    #row, col, well-type, label-name, label-value, Compound, quantity, [units, defaults to MICROMOLAR]
    #
    # This is a sparse CSV. We don't need to define entries for every single cell.
    #
    0, 0, NEG, temp, 20, H20, 5.0, MICROMOLAR
    0, 0, NEG, temp, 20, NaCl, 1.0, MICROMOLAR
    0, 1, COMP, A1, 123, Cx, 1.0, MICROMOLAR
    0, 2, COMP, A1, 123, Cx, 1.0, MICROMOLAR
    0, 3, COMP, A1, 123, Cx, 1.0, MICROMOLAR
    0, 49, EMPTY,,,,
    2, 0, COMP, B, 123, Cx, 1.0, MICROMOLAR
    2, 1, COMP, B, 123, Cx, 1.0, MICROMOLAR
    2, 2, COMP, B, 123, Cx, 1.0, MICROMOLAR
    2, 3, COMP, B, 123, Cx, 1.0, MICROMOLAR
    2, 49, EMPTY,,,,
    50,50,EMPTY,,,,

Example with repeated values

There are two options for specifying multiple Compounds or multiple Labels within a well mapping. The first is shown above where we repeat the entry for the 0,0 well with different compounds.

The second option is shown below. In this option the CSV repeats the entries for the Compound fields to specify both water and salt for the well at 0,0.

    #row, col, well-type, label-name, label-value, Compound, quantity, units
    0, 0, NEG, A, 123, H20, 5, PPM, NaCl, 1, PPM
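
The repeated Compound fields in a row like the one above could be read as in the following sketch. `CompoundFieldsSketch` and `Dose` are hypothetical names. The sketch assumes the compound groups start at the sixth cell, keeps units as plain text, skips groups whose compound name is blank, and does not implement any default for missing units.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of reading repeated (compound, quantity, units) groups from one row. */
final class CompoundFieldsSketch {

    /** One compound with its amount. */
    static final class Dose {
        final String compound;
        final double quantity;
        final String units;

        Dose(String compound, double quantity, String units) {
            this.compound = compound;
            this.quantity = quantity;
            this.units = units;
        }
    }

    private CompoundFieldsSketch() {
    }

    /** Returns every compound group found after row, col, well-type, label-name, label-value. */
    static List<Dose> parseDoses(String line) {
        String[] cells = line.split(",", -1);
        List<Dose> doses = new ArrayList<>();
        // Compound groups start at index 5 and repeat every three cells.
        for (int i = 5; i < cells.length; i += 3) {
            String compound = cells[i].trim();
            if (compound.isEmpty()) {
                continue; // assumption: a blank compound name means "no compound here"
            }
            if (i + 2 >= cells.length) {
                throw new IllegalArgumentException(
                        "compound '" + compound + "' needs a quantity and units: " + line);
            }
            doses.add(new Dose(compound,
                    Double.parseDouble(cells[i + 1].trim()), cells[i + 2].trim()));
        }
        return doses;
    }
}
```

For the row shown above, `parseDoses` returns two entries, one for `H20` and one for `NaCl`. The `EMPTY` rows from the earlier example return an empty list.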

Implementation Details

| Resource | Description |
| --- | --- |
| `edu.harvard.we99.services.io.PlateCSVReader` | Reads the Plate format |
| `edu.harvard.we99.services.io.PlateCSVReaderTest` | Unit test converting the CSV format into WE99 Domain Objects |
| `/PlateCSVReaderTest/input.csv` | Test input file for a single plate |
| `/PlateCSVReaderTest/input-multi.csv` | Test input file for multiple plates |
| `plate.xml` | Bean IO mapping config file |

Plate Result: Assay Result Interchange Format (ARIF)

TODO:

split label into label-name, label-value
remove measuredAt

Models the output from a device. The device would have accepted the PlateMap CSV above and output something like the records defined here.

| Column | Data Type | Required | Description |
| --- | --- | --- | --- |
| row | int | yes | row field for the well coordinate |
| col | int | yes | col field for the well coordinate |
| value | double | yes | value computed by the device |
| label | string | no | label for the value. Some devices may compute multiple values for a well so the label is useful for disambiguating. |
| measuredAt | iso8601 | no | timestamp for when the sample was taken |

Example

    # row, col, value, labels, measuredAt
    0, 0, 0.0, 0.1, 0.2, A, 2015-01-02T10:20:30.100Z
    0, 0, 0.0, 0.1, 0.2, , 2015-01-02T10:20:30.100Z
    0, 0, 0.0, 0.1, 0.2, , 2015-01-02T10:20:30.100Z
    #
    0, 1, 1.0, ,2015-01-02T10:20:30.100Z
    0, 2, 2.0, ,2015-01-02T10:20:30.100Z
    0, 3, 3.0, ,2015-01-02T10:20:30.100Z
    0, 4, 4.0, ,2015-01-02T10:20:30.100Z
    #
    1, 0, 10.0, ,2015-01-02T10:20:30.100Z
    1, 1, 11.0, ,2015-01-02T10:20:30.100Z
    1, 2, 12.0, ,2015-01-02T10:20:30.100Z
    1, 3, 13.0, ,2015-01-02T10:20:30.100Z
    1, 4, 14.0, ,2015-01-02T10:20:30.100Z
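
A record in the five-column shape from the table (row, col, value, label, measuredAt) could be read as in the sketch below. `ArifRowSketch` is a hypothetical name, not the `PlateResultCSVReader`. It handles only rows with exactly five cells, so it does not cover rows that carry several values, and it assumes timestamps use the UTC `Z` form.

```java
import java.time.Instant;

/** Hypothetical sketch of one ARIF record in its five-column form. */
final class ArifRowSketch {
    final int row;
    final int col;
    final double value;
    final String label;       // null when the cell is blank
    final Instant measuredAt; // null when the cell is blank

    private ArifRowSketch(int row, int col, double value, String label, Instant measuredAt) {
        this.row = row;
        this.col = col;
        this.value = value;
        this.label = label;
        this.measuredAt = measuredAt;
    }

    /** Parses "row, col, value, label, measuredAt"; label and measuredAt are optional. */
    static ArifRowSketch parse(String line) {
        String[] cells = line.split(",", -1);
        if (cells.length != 5) {
            throw new IllegalArgumentException("expected 5 cells but found " + cells.length + ": " + line);
        }
        String label = cells[3].trim();
        String timestamp = cells[4].trim();
        return new ArifRowSketch(
                Integer.parseInt(cells[0].trim()),
                Integer.parseInt(cells[1].trim()),
                Double.parseDouble(cells[2].trim()),
                label.isEmpty() ? null : label,
                // Instant.parse accepts ISO 8601 instants such as 2015-01-02T10:20:30.100Z
                timestamp.isEmpty() ? null : Instant.parse(timestamp));
    }
}
```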

Implementation Details

| Resource | Description |
| --- | --- |
| `edu.harvard.we99.services.io.PlateResultCSVReader` | Reads the ARIF format |
| `edu.harvard.we99.services.io.PlateResultCSVReaderTest` | Unit test converting ARIF format into WE99 Domain Objects |
| `/PlateResultServiceCSVTest/results-single.csv` | Test input file for a single plate |
| `/PlateResultServiceCSVTest/results-multi.csv` | Test input file for multiple plates |
| `resultsmapping.xml` | Bean IO mapping config file |

Plate Result: Matrix Format

This format is based on the sample files from the course web site. Note that these files are not necessarily CSV formatted. See the table below for how they are laid out.

| Format | Plates | Rows | Cols | Delim | Description |
| --- | --- | --- | --- | --- | --- |
| Envision | single | 16 | 24 | whitespace | Lots of metadata at the top of the file. Each of the wells is identified by a letter row (A-P) and a column header (01-24) |
| HTS | single | 16 | 24 | whitespace | Data only with row identifier (a-p) and column header (1-24) |
| Kinase | single | 16 | 24 | comma | Data only with row identifier (A-P) and column header (1-24) |
| Multiplate | multi | 16 | 24 | whitespace | Data only with row identifier (A-P) and column header (1-24) |
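
The common shape across these formats, a row identifier letter followed by delimited values, could be handled as in the sketch below. `MatrixRowSketch` is a hypothetical name and is not the `MatrixParser`. Mapping row letters to a zero-based index is an assumption, and the sketch covers a single data line only, not the header row, the Envision metadata, or the split into multiple plates.

```java
/** Hypothetical sketch of reading one data line of a matrix-style result file. */
final class MatrixRowSketch {

    private MatrixRowSketch() {
    }

    /** Converts a row identifier such as "A" or "p" to a zero-based row index (A is 0, P is 15). */
    static int rowIndex(String id) {
        if (id.length() != 1) {
            throw new IllegalArgumentException("row identifier must be a single letter: " + id);
        }
        char letter = Character.toUpperCase(id.charAt(0));
        if (letter < 'A' || letter > 'P') {
            throw new IllegalArgumentException("row identifier must be between A and P: " + id);
        }
        return letter - 'A';
    }

    /** Splits a data line on the format's delimiter and parses the values after the row identifier. */
    static double[] parseValues(String line, boolean commaDelimited) {
        String[] cells = line.trim().split(commaDelimited ? "\\s*,\\s*" : "\\s+");
        double[] values = new double[cells.length - 1];
        for (int i = 1; i < cells.length; i++) {
            values[i - 1] = Double.parseDouble(cells[i]);
        }
        return values;
    }
}
```

A caller would pass `true` for the comma-delimited Kinase files and `false` for the whitespace-delimited formats, and could check that each line yields 24 values.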

Implementation Details

| Resource | Description |
| --- | --- |
| `edu.harvard.we99.services.io.MatrixParser` | Reads the source file into a PlateResult |
| `edu.harvard.we99.services.io.MatrixParserTest` | Unit test converting the raw file into WE99 Domain Objects |
| `/MatrixParserTest` | See this folder for samples of each of the files above and the resulting JSON |
| `edu.harvard.we99.services.io.PlateResultCollector` | Interface that works in conjunction with the MatrixParser to collect the results into a single plate or multiple plates, depending on the implementation that is passed in. This allows us to load multiple sample results into a single plate or to load multiple plate results at once. |
