This repository was archived by the owner on Jan 15, 2024. It is now read-only.

Adding New Instrument File Types

sq-intersect edited this page Nov 7, 2012 · 2 revisions

Background

ACData can extract metadata from the output of various instruments, such as the RamanStation, NMR and Potentiostat. Each Instrument record has associated Instrument File Types, e.g. a Potentiostat instrument outputs a .txt, .ifi, .ifw, .ofw and a .frp file. Each of these files has its own file type, and each file type can be parsed in its own way to extract the relevant metadata for the Potentiostat instrument.

Explaining the Parser

Superusers can easily add new instruments via the admin interface in ACData. However, adding new instrument file types requires developing new parsers. The current parsers are stored in the lib folder.

A parser will have to be written for each file type that requires metadata extraction. The parsers can extend KeyValuePairFileParser, which looks for metadata that have the key and value on the same line. Some of these parsers include:

The above parsers use the lists CORE_TAGS, EXTENDED_TAGS, SUPPLIED_TAGS and TAG_MAPPINGS to generate the different types of metadata. Given an instrument's file output:

  • CORE_TAGS is the list of tags to be displayed as primary metadata for the dataset.
  • EXTENDED_TAGS is the list of tags to be displayed as secondary metadata for the dataset.
  • SUPPLIED_TAGS is the list of tags not present in the file that should be prompted for in ACData when creating a dataset.
  • TAG_MAPPINGS is a hash which maps the specified tags into a human-readable form. For example, in the NMR parser, '##$TI' => "Sample Name". Core or extended tags without tag mappings will be imported into ACData exactly as they appear in the uploaded file.
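
The roles of the tag lists can be illustrated with a minimal, self-contained sketch. This is not the KeyValuePairFileParser implementation, which is not shown on this page; extract_tags is a hypothetical helper, and the "Exp Code" and "Operator" lines in the sample are made-up data. It assumes "Key: Value" lines and ignores SUPPLIED_TAGS, since those are entered by the user rather than read from the file.

```ruby
# Hypothetical sketch, not the real KeyValuePairFileParser.
# Sorts "Key: Value" lines into core and extended metadata and applies tag mappings.
def extract_tags(text, core_tags, extended_tags, tag_mappings)
  result = { :core => {}, :extended => {} }
  text.each_line do |line|
    key, value = line.split(":", 2)   # split on the first colon only
    next if value.nil?                # not a key/value line
    key = key.strip
    group = if core_tags.include?(key)
              :core
            elsif extended_tags.include?(key)
              :extended
            end
    next if group.nil?                # tag is not listed, so it is ignored
    # Use the human-readable name when a mapping exists, otherwise keep the tag as-is.
    result[group][tag_mappings.fetch(key, key)] = value.strip
  end
  result
end

sample = "Date: 01-10-09\nTime: 13:10:54\nExp Code: CV-7\nOperator: someone\n"
parsed = extract_tags(sample, ["Date", "Time"], ["Exp Code"],
                      { "Exp Code" => "Experiment Code" })
p parsed
```

Splitting on the first colon only keeps values such as 13:10:54 intact.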

An example of Potentiostat metadata that will be parsed by the PotentiostatFileParser is:

Date: 01-10-09
Time: 13:10:54

Exp. Conditions:

Cyclic voltammetry

which would output as:

Date      Time      Experiment Mode
01-10-09  13:10:54  Cyclic voltammetry

As seen in PotentiostatFileParser, "Date", "Time" and "Experiment Mode" are in CORE_TAGS. "Experiment Mode" is produced by a tag mapping from "Exp. Conditions" to "Experiment Mode". Date and Time are parsed directly because each key and its value are on the same line. The value of "Exp. Conditions", however, is on a different line from its key, so the do_file_specific_parsing method uses a regular expression to find it.
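
A value that sits on a later line than its key could be captured along the following lines. The regular expression actually used in PotentiostatFileParser is not shown on this page, so experiment_mode and its pattern are assumptions made for illustration only. The pattern skips any blank lines between the key and the value.

```ruby
# Hypothetical helper: returns the first non-blank line after "Exp. Conditions:",
# or nil when the key is not present.
def experiment_mode(text)
  pattern = /^Exp\. Conditions:[ \t]*\r?\n(?:[ \t]*\r?\n)*[ \t]*(\S[^\r\n]*)/
  match = text.match(pattern)
  match && match[1].strip
end

sample = "Date: 01-10-09\nTime: 13:10:54\n\nExp. Conditions:\n\nCyclic voltammetry\n"
mode = experiment_mode(sample)
puts mode
```

In a real parser, the result would be added to the metadata hash returned by do_file_specific_parsing.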

In addition, some of the instrument files uploaded to ACData may share the same extension but have a different instrument file type, such as a Potentiostat .txt file and a Capillary Porometer .txt file. To ensure the metadata is extracted correctly, each parser has a method recognise? to find a unique element of the file to identify what type of instrument file it is.
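
As a rough sketch of how recognise? can separate two file types that share an extension, the example below tries each candidate parser in turn. The classes and pick_parser are hypothetical stand-ins, not ACData code. The "Start potential" signature is borrowed from the CustomParser example on this page, while the "Porometer" signature and the sample file contents are invented for illustration.

```ruby
require "tempfile"

# Hypothetical stand-ins for two parsers whose file types both use .txt.
class SketchPotentiostatParser
  def recognise?(file_path)
    header = File.open(file_path) { |f| f.read(1000) }.to_s
    !header.match(/Start\spotential/).nil?
  end
end

class SketchPorometerParser
  def recognise?(file_path)
    header = File.open(file_path) { |f| f.read(1000) }.to_s
    !header.match(/Porometer/).nil?   # invented signature, for illustration only
  end
end

# Returns the first candidate that recognises the file, or nil if none does.
def pick_parser(file_path, candidates)
  candidates.find { |parser| parser.recognise?(file_path) }
end

file = Tempfile.new(["sample", ".txt"])
file.write("Date: 01-10-09\nStart potential: 0.1\n")
file.close
chosen = pick_parser(file.path, [SketchPorometerParser.new, SketchPotentiostatParser.new])
puts chosen.class
file.unlink
```

Reading only the first part of the file keeps the check cheap for large instrument outputs.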

Below are some examples of instrument file type descriptions. The parser_name is the class name of the parser, filter is the list of extensions delimited by ';' and visualisation_handler is used to display the visualisation on the dataset page.

In config/instrument_file_types.yml:

    - 
      name: JCAMP-DX (v4)
      filter: dx; jdx
      parser_name: JDX4FileParser
      visualisation_handler: JCAMPDXVisualisation
    -
      name: JCAMP-DX (v5)
      filter: dx; jdx
      parser_name: JDX5FileParser
      visualisation_handler: JCAMPDXVisualisation

There are different versions of JCAMP-DX files. Because each entry names its own parser class, ACData is able to detect which version an uploaded file is. It will only try to parse files with the extension .dx or .jdx.
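
The ';'-delimited filter can be checked against a file name with a few lines of Ruby. How ACData itself interprets the filter is not shown on this page, so extension_in_filter? is a hypothetical helper, and the whitespace trimming and case-insensitive comparison are assumptions.

```ruby
# Hypothetical helper: true when the file's extension appears in a ';'-delimited filter.
# Trimming whitespace and ignoring case are assumptions, not confirmed ACData behaviour.
def extension_in_filter?(file_name, filter)
  extensions = filter.split(";").map { |ext| ext.strip.downcase }
  extensions.include?(File.extname(file_name).delete(".").downcase)
end

puts extension_in_filter?("spectrum.jdx", "dx; jdx")
puts extension_in_filter?("notes.txt", "dx; jdx")
```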

    -
      name: SP (RamanStation)
      filter: sp; spa; spc; spg
      parser_name: SpFileParser

Note that the above does not have a visualisation_handler, so it will not be a candidate for displaying a visualisation on the dataset page.

    -
      name: Potentiostat (.txt)
      filter: txt
      parser_name: PotentiostatFileParser
      visualisation_handler: PotentiostatVisualisation
    -
      name: Potentiostat (.ifi)
      filter: ifi

The Potentiostat instrument has unique files such as the .ifi, which do not have metadata extracted to ACData, but can still be associated with the instrument so that duplicates are not uploaded.

Creating a Parser

If you are creating a parser for a simple metadata file where all the metadata is a key and a value on the same line, create a new Ruby class in the lib folder, like so:

lib/custom_parser.rb

class CustomParser < KeyValuePairFileParser

  HEADER_BUFFER = 1000

  # these are the tags CustomParser will use to extract metadata into ACData.
  CORE_TAGS = [
      "Date",
      "Time",
      "Experiment Name",
      "Exp Code"
  ]
  
  EXTENDED_TAGS = []

  # these are tags that aren't found in the metadata and are expected to be filled in by the user at dataset creation.
  SUPPLIED_TAGS = [
      "Electrolyte Medium",
      "Concentration of the Ionic Medium"
  ]

  # stores Exp Code as Experiment Code in ACData
  TAG_MAPPINGS = {"Exp Code" => "Experiment Code"}

  def initialize
    super(CORE_TAGS, EXTENDED_TAGS, SUPPLIED_TAGS, TAG_MAPPINGS)
  end

  def recognise?(file_path)
    File.open(file_path) do |file|
      # read returns nil for an empty file, so fall back to an empty string
      header = file.read(HEADER_BUFFER).to_s
      !header.match(/Start\spotential/).nil?
    end
  end

  ...

  def do_file_specific_parsing(iostream)
    metadata = {}
    # parse some extra metadata here
    metadata
  end
end

Adding File Types to ACData

On a First Time Deploy

The config/instrument_file_types.yml file is only used during the initial seed process. To include new instrument file types in a first-time deployment, define the file types in that file and their respective instruments in the config/instruments.yml file.

When you deploy the server for the first time, it will seed all the values into the database.
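
To show what the seed data looks like once loaded, the sketch below parses two of the entries shown earlier into attribute hashes. The actual seed code in ACData is not shown on this page, so this is only an assumption about its general shape. In the application, each hash would presumably be passed to something like InstrumentFileType.create!, which is left as a comment here so the example runs without Rails.

```ruby
require "yaml"

yaml_text = <<YAML
-
  name: SP (RamanStation)
  filter: sp; spa; spc; spg
  parser_name: SpFileParser
-
  name: Potentiostat (.ifi)
  filter: ifi
YAML

entries = YAML.safe_load(yaml_text)

# Build one attribute hash per entry; keys missing from the YAML become nil.
records = entries.map do |entry|
  {
    :name                  => entry["name"],
    :filter                => entry["filter"],
    :parser_name           => entry["parser_name"],
    :visualisation_handler => entry["visualisation_handler"]
  }
end

records.each do |attrs|
  # In the application (assumption): InstrumentFileType.create!(attrs)
  p attrs
end
```

An entry without a parser_name, such as the .ifi file type, ends up with a nil parser.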

On Deployed Systems

However, to add a new file type to an already deployed server, you have to create a new InstrumentFileType record in the database manually. If parsing is required, the parser file will need to be copied to the lib folder.

One way you can do this is by running the following:

# change the path to where the code is stored on the server.
> cd /path/to/acdata/
> RAILS_ENV=production rails console
# In the rails console
$> InstrumentFileType.create!(:name => "Custom File Type", :filter => "custom ; cstm", :parser_name => "CustomParser", :visualisation_handler => nil)
# This creates an instrument file type called "Custom File Type" that matches files such as `file.custom` and `file.cstm`,
# uses the `CustomParser` shown in the example above and does not have a visualisation.
$> exit

Restart the server. After restarting, the instrument file type can then be associated with an instrument via the admin interface.
