Register new tool

Maria Paola Ferri edited this page Feb 28, 2025 · 7 revisions

Installing and Instantiating a Non-Interactive Tool in openVRE

This documentation outlines the process for adding a non-interactive tool to the openVRE platform. In this case, we will use the SEQIO Tool as an example. The SEQIO Tool processes a FASTA file and extracts sequences based on a list of IDs provided in a separate text file, generating a new FASTA file with the selected sequences.

The tool is designed to be dockerized and non-interactive, meaning that the user provides the required input files (FASTA file and ID list file) and submits the job to the system. Once the job completes, the user can download the resulting FASTA file that contains the extracted sequences.
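To make the tool's behaviour concrete, here is a minimal Python sketch of the kind of extraction described above. It is not the actual SEQIO Tool code: the function names `read_fasta` and `extract_by_ids` are hypothetical, and it assumes that a sequence ID is the first whitespace-delimited token of each FASTA header line.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def extract_by_ids(fasta_lines: Iterable[str], id_lines: Iterable[str]) -> str:
    """Return FASTA text holding only the records whose ID is listed."""
    wanted = {line.strip() for line in id_lines if line.strip()}
    records = []
    for header, sequence in read_fasta(fasta_lines):
        parts = header.split()
        # Assumption: the ID is the first token of the header line.
        seq_id = parts[0] if parts else ""
        if seq_id in wanted:
            records.append(f">{header}\n{sequence}\n")
    return "".join(records)
```

With real files, the two arguments could simply be open file handles for the FASTA file and the ID list, since both are iterated line by line.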

The installation process includes the following steps:

  1. Setting up the tool directory.
  2. Modifying input and output PHP files for proper configuration.
  3. Updating asset paths and images for the tool interface.
  4. Adding the tool configuration to the MongoDB database.
  5. Ensuring consistency across file_type and data_type collections in MongoDB.

The working directory for this tutorial is dockerized_vre, as set up in the previous steps: installing the Dockerized version of VRE and creating a Dockerized VRE adapted tool.


Step 1: Set Up the Tool Directory

  1. Navigate to the tools directory in your openVRE system:
cd volumes/openVRE/tools/
  2. Create a new directory for your tool (e.g., seqio_tool):
mkdir seqio_tool
cd seqio_tool

  2a. Alternatively, rename the seqio_tool_template directory

If a seqio_tool_template directory already exists, rename it to seqio_tool instead of creating a new directory, so that the tool becomes available on the platform:

mv seqio_tool_template seqio_tool
  3. Copy the necessary files:
  • Copy the input.php and output.php files from the tool_skeleton directory into your new seqio_tool directory.
  • Copy the assets directory from tool_skeleton to your seqio_tool directory as well.
  4. Your directory structure should look like this:
volumes/openVRE/tools/seqio_tool/
├── input.php
├── output.php
└── assets/
    └── home/
        ├── index.html
        ├── img.png
        └── logo.png
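The layout above can be checked before moving on. The following is a small sketch rather than part of openVRE: `missing_tool_files` is a hypothetical helper, and the list of expected files is taken directly from the directory tree shown in this step.

```python
from pathlib import Path
from typing import List

# Relative paths taken from the directory tree in this step.
EXPECTED_FILES = [
    "input.php",
    "output.php",
    "assets/home/index.html",
    "assets/home/img.png",
    "assets/home/logo.png",
]


def missing_tool_files(tool_dir: Path) -> List[str]:
    """Return the expected files that are absent from tool_dir."""
    return [rel for rel in EXPECTED_FILES if not (tool_dir / rel).is_file()]


if __name__ == "__main__":
    # Path assumed relative to the dockerized_vre working directory.
    missing = missing_tool_files(Path("volumes/openVRE/tools/seqio_tool"))
    print("missing:", missing or "nothing")
```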

Step 2: Modify input.php for Tool-Specific Requirements

  1. Modify the input.php file to suit your tool’s input parameters. This file handles the configuration of the input files for the tool.
  2. Update the tool ID at the beginning of the file:

// get tool details
$toolId = "seqio_tool";
  3. Modify the input file sections based on the tool’s requirements. The SEQIO Tool requires two files:
  • A FASTA file (input_fasta).
  • A text file with IDs (ids_txt).

  4. **Modify the file input handling** (e.g., input_fasta and ids_txt). Update the code like this:

<?php if ($_REQUEST["op"] == 0) { ?>
    <div class="col-md-6">
        <?php $ff = matchFormat_File($tool['input_files']['input_fasta']['file_type'], $inPaths); ?>
        <?php InputTool_printSelectFile($tool['input_files']['input_fasta'], $rerunParams['input_fasta'], $ff[0], false, true); ?>
    </div>
    <div class="col-md-6">
        <?php $ff = matchFormat_File($tool['input_files']['ids_txt']['file_type'], $inPaths); ?>
        <?php InputTool_printSelectFile($tool['input_files']['ids_txt'], $rerunParams['ids_txt'], $ff[0], false, true); ?>
    </div>
<?php } ?>

  5. **Modify the file types and data types** used for the FASTA and TXT input files, as well as for the outputs. These changes are made in MongoDB, as described in Step 4.

Step 3: Modify assets and HTML (index.html)

  1. **Modify index.html** in the assets/home/ directory:

  • Update all paths from tool_skeleton to the actual tool directory (seqio_tool).
  • Modify the image files, if necessary. If the tool has specific logos or images, replace img.png and logo.png with the corresponding files. If no new images are provided, the default ones will be used.

  2. Example changes in index.html:

<img src="/assets/seqio_tool/img.png" alt="SEQIO Tool">
<img src="/assets/seqio_tool/logo.png" alt="SEQIO Tool Logo">
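If index.html contains many references to tool_skeleton, the replacement can be scripted. This is a sketch under the assumption that every occurrence of the string tool_skeleton in the file should become seqio_tool; `retarget_index` is a hypothetical helper, so review the result before deploying it.

```python
from pathlib import Path


def retarget_index(index_path: Path,
                   old_id: str = "tool_skeleton",
                   new_id: str = "seqio_tool") -> int:
    """Replace old_id with new_id in the file and return the number of hits."""
    text = index_path.read_text(encoding="utf-8")
    count = text.count(old_id)
    if count:
        # Only rewrite the file when something actually changes.
        index_path.write_text(text.replace(old_id, new_id), encoding="utf-8")
    return count


if __name__ == "__main__":
    # Location assumed from the directory layout in Step 1.
    index = Path("volumes/openVRE/tools/seqio_tool/assets/home/index.html")
    print(retarget_index(index), "occurrence(s) replaced")
```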

Step 4: Add Tool Configuration to MongoDB

  1. Create a new entry for the tool in the MongoDB tools collection. You can either add this entry directly using MongoDB Compass or manually edit the tools.json file located in front_end/openVRE/install/database/tools.json.

  2. Insert the tool configuration. Start from the tool_skeleton entry below and modify it based on your tool's requirements:

{
  "_id": "tool_skeleton",
  "name": "Tool Skeleton",
  "title": "Generic Tool for Task Execution",
  "short_description": "A tool for performing a specific computational task.",
  "long_description": "This tool is designed to perform a task based on the provided input files and user arguments. The task can be anything from data processing to analysis, depending on tool configuration.",
  "url": "https://example.com/tool_skeleton",
  "publication": "",
  "owner": {
    "author": "Author Name",
    "institution": "Institution Name",
    "contact": "author@example.com",
    "url": "https://institution.com"
  },
  "status": 1,
  "external": true,
  "keywords": ["task", "data", "analysis"],
  "infrastructure": {
    "memory": 4,
    "cpus": 2,
    "executable": "/path/to/executable",
    "container_image": "container_image_name",
    "clouds": {
      "my_cloud": {
        "launcher": "docker_SGE",
        "default_cloud": true,
        "queue": "default.q",
        "executable": "/path/to/executable",
        "executable_type": "docker"
      }
    }
  },
  "input_files": [
    {
      "name": "input_file_1",
      "description": "Description of the first input file.",
      "file_type": ["FILE_TYPE_1"],
      "data_type": ["DATA_TYPE_1"],
      "required": true,
      "allow_multiple": false
    },
    {
      "name": "input_file_2",
      "description": "Description of the second input file.",
      "file_type": ["FILE_TYPE_2"],
      "data_type": ["DATA_TYPE_2"],
      "required": true,
      "allow_multiple": false
    }
  ],
  "output_files": [
    {
      "name": "output_file_1",
      "file_type": "FILE_TYPE_1",
      "data_type": "OUTPUT_DATA_TYPE_1",
      "meta_data": {
        "visible": true,
        "description": "Description of the output file.",
        "tool": "tool_skeleton"
      },
      "required": true,
      "allow_multiple": false
    }
  ],
  "arguments": [
    {
      "name": "argument_name_1",
      "description": "Description of the argument.",
      "type": "integer",
      "min": 1,
      "max": 100
    }
  ],
  "sites": [
    {
      "site_id": "hpc_environment",
      "status": 0
    },
    {
      "site_id": "local",
      "status": 0
    }
  ]
}

For the seqio_tool case:

{
    "_id": "seqio_tool",
    "name": "SEQIO Tool",
    "title": "SEQIO Tool for FASTA Extraction",
    "short_description": "A tool for extracting sequences from FASTA files based on IDs.",
    "long_description": "SEQIO Tool is designed to extract specific sequences from a FASTA file using a list of IDs provided in a separate text file.",
    "url": "https://example.com/seqio_tool",
    "publication": "",
    "owner": {
      "author": "Maria Paola Ferri",
      "institution": "Barcelona Supercomputing Center",
      "contact": "maria.ferri@bsc.es",
      "url": ""
    },
    "status": 0,
    "external": true,
    "keywords": [
      "fasta",
      "extraction",
      "sequence"
    ],
    "infrastructure": {
      "memory": 12,
      "cpus": 1,
      "executable": "/home/vre_template_tool/VRE_RUNNER",
      "container_image": "seqio_tool",
      "clouds": {
        "local": {
          "launcher": "docker_SGE",
          "default_cloud": true,
          "queue": "local.q",
          "executable": "/home/vre_template_tool/VRE_RUNNER",
          "executable_type": "docker"
        }
      }
    },
    "input_files": [
      {
        "name": "input_fasta",
        "description": "Input FASTA file containing sequences.",
        "help": "Provide the FASTA file from which to extract sequences.",
        "file_type": [
          "FASTA"
        ],
        "data_type": [
          "sequence_data"
        ],
        "required": true,
        "allow_multiple": false
      },
      {
        "name": "ids_txt",
        "description": "Text file containing the IDs of sequences to extract.",
        "help": "Provide a text file with sequence IDs, one per line.",
        "file_type": [
          "TXT"
        ],
        "data_type": [
          "id_list"
        ],
        "required": true,
        "allow_multiple": false
      }
    ],
    "input_files_public_dir": [],
    "input_files_combinations": [],
    "arguments": [
      {
        "name": "min_length",
        "description": "Minimum length for sequences to be extracted.",
        "help": "Specify a number between 30 and 100.",
        "type": "integer",
        "min": 30,
        "max": 100
      }
    ],
    "has_custom_viewer": false,
    "output_files": [
      {
        "name": "output_fasta",
        "required": true,
        "allow_multiple": false,
        "file": {
          "file_type": "FASTA",
          "data_type": "extracted_sequences",
          "meta_data": {
            "visible": true,
            "description": "FASTA file containing the extracted sequences based on the provided IDs.",
            "tool": "seqio_tool"
          },
          "file_path": "output_fasta.fasta"
        }
      }
    ],
    "sites": [
      {
        "site_id": "local",
        "status": 1
      }
    ]
}
  3. This configuration includes:
  • Tool metadata such as the name, description, and owner.
  • Infrastructure details, including cloud configuration.
  • Input and output files with associated file types and data types.
  • Arguments for the tool, such as min_length, which are shown directly in the workspace.
  4. If necessary, **update the file_type and data_type collections** in MongoDB:
  • Add new file types for FASTA and TXT if they are not already present:
{
  "_id": "FASTA",
  "extension": ["fasta"]
}
{
  "_id": "TXT",
  "extension": ["txt"]
}
  • Add new data types such as sequence_data, id_list, and extracted_sequences:
{
  "_id": "sequence_data",
  "name": "sequence_data",
  "file_types": ["FASTA"]
}
{
  "_id": "id_list",
  "name": "id_list",
  "file_types": ["TXT"]
}
{
  "_id": "extracted_sequences",
  "name": "extracted_sequences",
  "file_types": [
    "FASTA"
  ]
}
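The consistency between the tool entry and the two collections can be checked mechanically. The sketch below works on plain Python dictionaries shaped like the JSON documents in this step. `find_type_problems` is a hypothetical helper and is not part of openVRE. Treating a data type's file_types list as the set of file types it allows is an assumption about how the platform uses that field.

```python
from typing import Any, Dict, List


def as_list(value: Any) -> List[Any]:
    """Inputs use lists for types, outputs use single strings; accept both."""
    return value if isinstance(value, list) else [value]


def find_type_problems(tool: Dict[str, Any],
                       file_types: List[Dict[str, Any]],
                       data_types: List[Dict[str, Any]]) -> List[str]:
    """List file_type/data_type references that the collections do not cover."""
    known_file_types = {doc["_id"] for doc in file_types}
    data_type_by_id = {doc["_id"]: doc for doc in data_types}

    entries = [(f["name"], f) for f in tool.get("input_files", [])]
    # Output types may sit directly on the entry or under a nested "file" key.
    entries += [(f["name"], f.get("file", f)) for f in tool.get("output_files", [])]

    problems = []
    for name, spec in entries:
        spec_file_types = as_list(spec.get("file_type", []))
        for ft in spec_file_types:
            if ft not in known_file_types:
                problems.append(f"{name}: unknown file_type {ft!r}")
        for dt in as_list(spec.get("data_type", [])):
            if dt not in data_type_by_id:
                problems.append(f"{name}: unknown data_type {dt!r}")
                continue
            allowed = data_type_by_id[dt].get("file_types", [])
            for ft in spec_file_types:
                if ft not in allowed:
                    problems.append(f"{name}: data_type {dt!r} does not list {ft!r}")
    return problems
```

An empty result means every file type and data type named by the tool has a matching document. Each string in a non-empty result names the input or output that needs attention.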

Step 5: Restart the MongoDB Service

To apply the new MongoDB configuration, restart the service:

docker-compose up -d mongo_seed

Step 6: Final Checks and Testing

  1. Ensure that the tool directory is correctly set up with the necessary files and configurations.
  2. Test the tool by submitting a job with a sample FASTA file and ID list, verifying that the output FASTA file is generated correctly.
  3. Check MongoDB to ensure the tool configuration, file types, and data types are consistent.
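As part of these checks, the argument definitions in the tool entry can be linted for internal contradictions before a test job is submitted. This is a sketch only: `lint_arguments` is a hypothetical helper, and the set of type names treated as numeric is an assumption, since this page only shows "integer" being used together with min and max.

```python
from typing import Any, Dict, List

# Assumption: these are the type names for which min/max bounds make sense.
NUMERIC_TYPES = {"integer", "number"}


def lint_arguments(tool: Dict[str, Any]) -> List[str]:
    """Report argument definitions whose type and bounds contradict each other."""
    problems = []
    for arg in tool.get("arguments", []):
        name = arg.get("name", "<unnamed>")
        has_bounds = "min" in arg or "max" in arg
        if has_bounds and arg.get("type") not in NUMERIC_TYPES:
            problems.append(f"{name}: min/max set but type is {arg.get('type')!r}")
        if "min" in arg and "max" in arg and arg["min"] > arg["max"]:
            problems.append(f"{name}: min {arg['min']} exceeds max {arg['max']}")
    return problems
```

Load the tool entry with `json.load` and pass the resulting dictionary to the function. An empty list means no contradictions were found.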

MongoDB JSON schema

{
    "_id": "seqio_tool",
    "name": "SEQIO Tool",
    "title": "SEQIO Tool for FASTA Extraction",
    "short_description": "A tool for extracting sequences from FASTA files based on IDs.",
    "long_description": "SEQIO Tool is designed to extract specific sequences from a FASTA file using a list of IDs provided in a separate text file.",
    "url": "https://example.com/seqio_tool",
    "publication": "",
    "owner": {
      "author": "Maria Paola Ferri",
      "institution": "Barcelona Supercomputing Center",
      "contact": "maria.ferri@bsc.es",
      "url": ""
    },
    "status": {
      "$numberInt": "1"
    },
    "external": true,
    "keywords": [
      "fasta",
      "extraction",
      "sequence"
    ],
    "infrastructure": {
      "memory": {
        "$numberInt": "12"
      },
      "cpus": {
        "$numberInt": "4"
      },
      "executable": "/home/vre_template_tool/VRE_RUNNER",
      "container_image": "mapoferri/techton-seq-tool",
      "clouds": {
        "my_on_premises_cloud": {
          "launcher": "docker_SGE",
          "default_cloud": true,
          "queue": "local.q",
          "executable": "/home/vre_template_tool/VRE_RUNNER",
          "executable_type": "docker"
        }
      }
    },
    "input_files": [
      {
        "name": "input_fasta",
        "description": "Input FASTA file containing sequences.",
        "help": "Provide the FASTA file from which to extract sequences.",
        "file_type": [
          "FASTA"
        ],
        "data_type": [
          "sequence_data"
        ],
        "required": true,
        "allow_multiple": false
      },
      {
        "name": "ids_txt",
        "description": "Text file containing the IDs of sequences to extract.",
        "help": "Provide a text file with sequence IDs, one per line.",
        "file_type": [
          "TXT"
        ],
        "data_type": [
          "id_list"
        ],
        "required": true,
        "allow_multiple": false
      }
    ],
    "input_files_public_dir": [],
    "input_files_combinations": [],
    "arguments": [
      {
        "name": "min_length",
        "description": "Minimum length for sequences to be extracted.",
        "help": "Specify a number between 30 and 100.",
        "type": "integer",
        "min": 30,
        "max": 100
      }
    ],
    "has_custom_viewer": false,
    "output_files": [
      {
        "name": "output_fasta",
        "required": true,
        "allow_multiple": false,
        "file": {
          "file_type": "FASTA",
          "data_type": "extracted_sequences",
          "meta_data": {
            "visible": true,
            "description": "FASTA file containing the extracted sequences based on the provided IDs.",
            "tool": "seqio_tool"
          }
        }
      }
    ],
    "sites": [
      {
        "site_id": "local",
        "status": {
          "$numberInt": "1"
        }
      }
    ]
  }
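The `{"$numberInt": "1"}` wrappers in the document above are MongoDB Extended JSON, the form in which integers appear when a document is exported in canonical mode. When comparing such an export with the plain JSON from Step 4, it can help to unwrap them first. The following sketch uses only the standard library. `unwrap_extended` is a hypothetical helper that handles `$numberInt` only and ignores the other Extended JSON wrappers.

```python
import json
from typing import Any


def unwrap_extended(value: Any) -> Any:
    """Recursively turn {"$numberInt": "N"} wrappers into plain integers."""
    if isinstance(value, dict):
        if set(value) == {"$numberInt"}:
            return int(value["$numberInt"])
        return {key: unwrap_extended(item) for key, item in value.items()}
    if isinstance(value, list):
        return [unwrap_extended(item) for item in value]
    return value  # strings, booleans, numbers, and None pass through


if __name__ == "__main__":
    exported = json.loads('{"status": {"$numberInt": "1"}, "external": true}')
    print(json.dumps(unwrap_extended(exported), indent=2))
```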
