Register new tool
This documentation outlines the process for adding a non-interactive tool to the openVRE platform. In this case, we will use the SEQIO Tool as an example. The SEQIO Tool processes a FASTA file and extracts sequences based on a list of IDs provided in a separate text file, generating a new FASTA file with the selected sequences.
The tool is designed to be dockerized and non-interactive, meaning that the user provides the required input files (FASTA file and ID list file) and submits the job to the system. Once the job completes, the user can download the resulting FASTA file that contains the extracted sequences.
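To make the tool's behaviour concrete, the extraction described above can be sketched in a few lines of Python. This is an illustration only, not the SEQIO Tool's actual implementation: the function name `extract_sequences` is hypothetical, and the rule that a record's ID is the first whitespace-delimited token of its header line is an assumption.

```python
def extract_sequences(fasta_text, wanted_ids):
    """Return the FASTA records whose ID appears in wanted_ids.

    Assumption: a record's ID is the first whitespace-delimited token
    after '>' on its header line.
    """
    wanted = set(wanted_ids)
    kept = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # A new record starts: decide whether to keep it.
            header = line[1:].split()
            keep = bool(header) and header[0] in wanted
        if keep:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


fasta = ">seq1 first\nACGT\nTTGA\n>seq2\nGGCC\n>seq3\nAAAA\n"
ids = ["seq1", "seq3"]
print(extract_sequences(fasta, ids), end="")
```

With the sample input, only the `seq1` and `seq3` records (headers and sequence lines) are kept, in their original order.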
The installation process includes the following steps:
- Setting up the tool directory.
- Modifying input and output PHP files for proper configuration.
- Updating asset paths and images for the tool interface.
- Adding the tool configuration to the MongoDB database.
- Ensuring consistency across file_type and data_type collections in MongoDB.
The working directory for this tutorial is dockerized_vre, set up in the previous steps (installing the Dockerized version of VRE and creating a Dockerized VRE adapted tool).
- Navigate to the tools directory in your openVRE system:
cd volumes/openVRE/tools/
- Create a new directory for your tool (e.g., `seqio_tool`):
mkdir seqio_tool
cd seqio_tool
2a. Move the seqio_tool_template directory
If a seqio_tool_template directory already exists, rename it instead so that the tool becomes available on the platform:
mv seqio_tool_template seqio_tool
- Copy the necessary files:
  - Copy the `input.php` and `output.php` files from the tool_skeleton directory into your new seqio_tool directory.
  - Copy the `assets` directory from tool_skeleton to your seqio_tool directory as well.
- Your directory structure should look like this:
volumes/openVRE/tools/seqio_tool/
├── input.php
├── output.php
└── assets/
└── home/
└── index.html
└── img.png
└── logo.png
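Before moving on, it can help to confirm that the copied files are where the platform expects them. The following is a minimal sketch, assuming the layout shown above; the helper name `missing_tool_files` is hypothetical, and the image files are left out of the check because their exact location depends on your assets.

```python
from pathlib import Path

# Files the steps above place in the tool directory.
REQUIRED = ["input.php", "output.php", "assets/home/index.html"]


def missing_tool_files(tool_dir):
    """Return the required relative paths that do not exist under tool_dir."""
    root = Path(tool_dir)
    return [rel for rel in REQUIRED if not (root / rel).is_file()]


# Run from the dockerized_vre working directory; an empty list means
# every required file was found.
print(missing_tool_files("volumes/openVRE/tools/seqio_tool"))
```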
1. Modify the input.php file to suit your tool’s input parameters. This file handles the configuration of the input files for the tool.
2. Update the tool ID at the beginning of the file:
// get tool details
$toolId = "seqio_tool";
3. Modify the input file sections based on the tool’s requirements. The SEQIO Tool requires two files:
  - A FASTA file (`input_fasta`).
  - A text file with IDs (`ids_txt`).
4. **Modify the file input handling** (e.g., input_fasta and ids_txt). Update the code like this:
<?php if( $_REQUEST["op"] == 0 ) { ?>
<div class="col-md-6">
<?php $ff = matchFormat_File($tool['input_files']['input_fasta']['file_type'], $inPaths); ?>
<?php InputTool_printSelectFile($tool['input_files']['input_fasta'], $rerunParams['input_fasta'], $ff[0], false, true); ?>
</div>
<div class="col-md-6">
<?php $ff = matchFormat_File($tool['input_files']['ids_txt']['file_type'], $inPaths); ?>
<?php InputTool_printSelectFile($tool['input_files']['ids_txt'], $rerunParams['ids_txt'], $ff[0], false, true); ?>
</div>
<?php } ?>
5. **Modify the file types and data types** used for the FASTA and TXT input files, as well as for the outputs. These changes are made in MongoDB (described in Step 4).
1. **Modify index.html** in the assets/home/ directory:
  - Update all paths from `tool_skeleton` to the actual tool directory (`seqio_tool`).
  - Modify the image files, if necessary. If the tool has specific logos or images, replace `img.png` and `logo.png` with the corresponding files. If no new images are provided, the default ones will be used.
2. Example changes in index.html:
<img src="/assets/seqio_tool/img.png" alt="SEQIO Tool">
<img src="/assets/seqio_tool/logo.png" alt="SEQIO Tool Logo">
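If index.html contains many references, the path update can be scripted rather than done by hand. The sketch below assumes that the skeleton's paths contain the literal string `tool_skeleton` (for example `/assets/tool_skeleton/img.png`); the function names `retarget_paths` and `retarget_index` are hypothetical.

```python
from pathlib import Path


def retarget_paths(html, old="tool_skeleton", new="seqio_tool"):
    """Replace every occurrence of the skeleton's directory name in html."""
    return html.replace(old, new)


def retarget_index(index_path, old="tool_skeleton", new="seqio_tool"):
    """Rewrite index.html in place; return how many occurrences were replaced."""
    path = Path(index_path)
    html = path.read_text(encoding="utf-8")
    count = html.count(old)
    path.write_text(retarget_paths(html, old, new), encoding="utf-8")
    return count


sample = '<img src="/assets/tool_skeleton/img.png" alt="Tool">'
print(retarget_paths(sample))
```

A plain string replacement also rewrites any visible text that mentions `tool_skeleton`, so review the file afterwards.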
- Create a new entry for the tool in the MongoDB `tools` collection. You can either add this entry directly using MongoDB Compass or manually edit the `tools.json` file located in `front_end/openVRE/install/database/tools.json`.
- Insert the tool configuration. Start from the `tool_skeleton` entry and modify it based on your tool's requirements:
{
"_id": "tool_skeleton",
"name": "Tool Skeleton",
"title": "Generic Tool for Task Execution",
"short_description": "A tool for performing a specific computational task.",
"long_description": "This tool is designed to perform a task based on the provided input files and user arguments. The task can be anything from data processing to analysis, depending on tool configuration.",
"url": "https://example.com/tool_skeleton",
"publication": "",
"owner": {
"author": "Author Name",
"institution": "Institution Name",
"contact": "author@example.com",
"url": "https://institution.com"
},
"status": 1,
"external": true,
"keywords": ["task", "data", "analysis"],
"infrastructure": {
"memory": 4,
"cpus": 2,
"executable": "/path/to/executable",
"container_image": "container_image_name",
"clouds": {
"my_cloud": {
"launcher": "docker_SGE",
"default_cloud": true,
"queue": "default.q",
"executable": "/path/to/executable",
"executable_type": "docker"
}
}
},
"input_files": [
{
"name": "input_file_1",
"description": "Description of the first input file.",
"file_type": ["FILE_TYPE_1"],
"data_type": ["DATA_TYPE_1"],
"required": true,
"allow_multiple": false
},
{
"name": "input_file_2",
"description": "Description of the second input file.",
"file_type": ["FILE_TYPE_2"],
"data_type": ["DATA_TYPE_2"],
"required": true,
"allow_multiple": false
}
],
"output_files": [
{
"name": "output_file_1",
"file_type": "FILE_TYPE_1",
"data_type": "OUTPUT_DATA_TYPE_1",
"meta_data": {
"visible": true,
"description": "Description of the output file.",
"tool": "tool_skeleton"
},
"required": true,
"allow_multiple": false
}
],
"arguments": [
{
"name": "argument_name_1",
"description": "Description of the argument.",
"type": "integer",
"min": 1,
"max": 100
}
],
"sites": [
{
"site_id": "hpc_environment",
"status": 0
},
{
"site_id": "local",
"status": 0
}
]
}
For the seqio_tool case:
{
"_id": "seqio_tool",
"name": "SEQIO Tool",
"title": "SEQIO Tool for FASTA Extraction",
"short_description": "A tool for extracting sequences from FASTA files based on IDs.",
"long_description": "SEQIO Tool is designed to extract specific sequences from a FASTA file using a list of IDs provided in a separate text file.",
"url": "https://example.com/seqio_tool",
"publication": "",
"owner": {
"author": "Maria Paola Ferri",
"institution": "Barcelona Supercomputing Center",
"contact": "maria.ferri@bsc.es",
"url": ""
},
"status": 0,
"external": true,
"keywords": [
"fasta",
"extraction",
"sequence"
],
"infrastructure": {
"memory": 12,
"cpus": 1,
"executable": "/home/vre_template_tool/VRE_RUNNER",
"container_image": "seqio_tool",
"clouds": {
"local": {
"launcher": "docker_SGE",
"default_cloud": true,
"queue": "local.q",
"executable": "/home/vre_template_tool/VRE_RUNNER",
"executable_type": "docker"
}
}
},
"input_files": [
{
"name": "input_fasta",
"description": "Input FASTA file containing sequences.",
"help": "Provide the FASTA file from which to extract sequences.",
"file_type": [
"FASTA"
],
"data_type": [
"sequence_data"
],
"required": true,
"allow_multiple": false
},
{
"name": "ids_txt",
"description": "Text file containing the IDs of sequences to extract.",
"help": "Provide a text file with sequence IDs, one per line.",
"file_type": [
"TXT"
],
"data_type": [
"id_list"
],
"required": true,
"allow_multiple": false
}
],
"input_files_public_dir": [],
"input_files_combinations": [],
"arguments": [
{
"name": "min_length",
"description": "Minimum length for sequences to be extracted.",
"help": "Specify a number between 30 and 100.",
"type": "integer",
"min": 30,
"max": 100
}
],
"has_custom_viewer": false,
"output_files": [
{
"name": "output_fasta",
"required": true,
"allow_multiple": false,
"file": {
"file_type": "FASTA",
"data_type": "extracted_sequences",
"meta_data": {
"visible": true,
"description": "FASTA file containing the extracted sequences based on the provided IDs.",
"tool": "seqio_tool"
},
"file_path": "output_fasta.fasta"
}
}
],
"sites": [
{
"site_id": "local",
"status": 1
}
]
}
- This configuration includes:
- Tool metadata such as the name, description, and owner.
- Infrastructure details, including cloud configuration.
- Input and output files with associated file types and data types.
- Required arguments for the tool, such as `min_length`, which are displayed directly in the workspace.
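The `min` and `max` fields of an argument describe the range a user may enter. How openVRE enforces them is not shown here; the following is only a sketch of what such a check could look like, assuming that arguments of type `integer` carry inclusive bounds. The function name `check_argument` is hypothetical.

```python
def check_argument(definition, value):
    """Return an error message, or None if value satisfies definition.

    Assumption: "integer" arguments carry inclusive "min"/"max" bounds,
    as the min_length argument does.
    """
    if definition.get("type") == "integer":
        try:
            number = int(value)
        except (TypeError, ValueError):
            return f"{definition['name']}: expected an integer, got {value!r}"
        low, high = definition.get("min"), definition.get("max")
        if low is not None and number < low:
            return f"{definition['name']}: {number} is below the minimum {low}"
        if high is not None and number > high:
            return f"{definition['name']}: {number} is above the maximum {high}"
    return None


min_length = {"name": "min_length", "type": "integer", "min": 30, "max": 100}
print(check_argument(min_length, "50"))  # within range, so no error
print(check_argument(min_length, 10))    # below the minimum
```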
- If necessary, **update the `file_type` and `data_type` collections** in MongoDB:
- Add new file types for `FASTA` and `TXT` if they are not already present:
{
"_id": "FASTA",
"extension": ["fasta"]
},
{
"_id": "TXT",
"extension": ["txt"]
}
- Add new data types such as `sequence_data`, `id_list`, and `extracted_sequences`:
{
"_id": "sequence_data",
"name": "sequence_data",
"file_types": ["FASTA"]
}
{
"_id": "id_list",
"name": "id_list",
"file_types": ["TXT"]
}
{
"_id": "extracted_sequences",
"name": "extracted_sequences",
"file_types": [
"FASTA"
]
}
To apply the new MongoDB configuration, restart the mongo_seed service:
docker-compose up -d mongo_seed
- Ensure that the tool directory is correctly set up with the necessary files and configurations.
- Test the tool by submitting a job with a sample FASTA file and ID list, verifying that the output FASTA file is generated correctly.
- Check MongoDB to ensure the tool configuration, file types, and data types are consistent.
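The consistency check in the last item can be done by hand in MongoDB Compass, or scripted. The sketch below works on plain Python dictionaries shaped like the documents on this page, so it needs no database connection; loading them from MongoDB is left out. The function name `find_inconsistencies` is hypothetical, and the rule that a data type must list every file type it is paired with is an assumption about how the collections are meant to relate.

```python
def find_inconsistencies(tool, file_types, data_types):
    """List file_type/data_type references in tool that the collections lack.

    tool        -- a tool document shaped like the seqio_tool example
    file_types  -- documents from the file_type collection
    data_types  -- documents from the data_type collection
    """
    known_ft = {doc["_id"] for doc in file_types}
    known_dt = {doc["_id"]: set(doc.get("file_types", [])) for doc in data_types}
    problems = []

    def check(name, ftypes, dtypes):
        for ft in ftypes:
            if ft not in known_ft:
                problems.append(f"{name}: unknown file_type {ft!r}")
        for dt in dtypes:
            if dt not in known_dt:
                problems.append(f"{name}: unknown data_type {dt!r}")
                continue
            # Assumption: a data type should list each paired file type.
            for ft in ftypes:
                if ft not in known_dt[dt]:
                    problems.append(f"{name}: data_type {dt!r} does not list {ft!r}")

    for f in tool.get("input_files", []):
        check(f["name"], f["file_type"], f["data_type"])
    for f in tool.get("output_files", []):
        spec = f.get("file", f)  # seqio_tool nests these fields under "file"
        check(f["name"], [spec["file_type"]], [spec["data_type"]])
    return problems


file_types = [{"_id": "FASTA", "extension": ["fasta"]},
              {"_id": "TXT", "extension": ["txt"]}]
data_types = [{"_id": "sequence_data", "file_types": ["FASTA"]},
              {"_id": "id_list", "file_types": ["TXT"]}]
tool = {
    "input_files": [
        {"name": "input_fasta", "file_type": ["FASTA"], "data_type": ["sequence_data"]},
        {"name": "ids_txt", "file_type": ["TXT"], "data_type": ["id_list"]},
    ],
    "output_files": [
        {"name": "output_fasta",
         "file": {"file_type": "FASTA", "data_type": "extracted_sequences"}},
    ],
}
for problem in find_inconsistencies(tool, file_types, data_types):
    print(problem)
```

In this sample the `extracted_sequences` data type is deliberately left out of `data_types`, so the check reports it for `output_fasta`; adding that entry makes the list of problems empty.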
{
"_id": "seqio_tool",
"name": "SEQIO Tool",
"title": "SEQIO Tool for FASTA Extraction",
"short_description": "A tool for extracting sequences from FASTA files based on IDs.",
"long_description": "SEQIO Tool is designed to extract specific sequences from a FASTA file using a list of IDs provided in a separate text file.",
"url": "https://example.com/seqio_tool",
"publication": "",
"owner": {
"author": "Maria Paola Ferri",
"institution": "Barcelona Supercomputing Center",
"contact": "maria.ferri@bsc.es",
"url": ""
},
"status": {
"$numberInt": "1"
},
"external": true,
"keywords": [
"fasta",
"extraction",
"sequence"
],
"infrastructure": {
"memory": {
"$numberInt": "12"
},
"cpus": {
"$numberInt": "4"
},
"executable": "/home/vre_template_tool/VRE_RUNNER",
"container_image": "mapoferri/techton-seq-tool",
"clouds": {
"my_on_premises_cloud": {
"launcher": "docker_SGE",
"default_cloud": true,
"queue": "local.q",
"executable": "/home/vre_template_tool/VRE_RUNNER",
"executable_type": "docker"
}
}
},
"input_files": [
{
"name": "input_fasta",
"description": "Input FASTA file containing sequences.",
"help": "Provide the FASTA file from which to extract sequences.",
"file_type": [
"FASTA"
],
"data_type": [
"sequence_data"
],
"required": true,
"allow_multiple": false
},
{
"name": "ids_txt",
"description": "Text file containing the IDs of sequences to extract.",
"help": "Provide a text file with sequence IDs, one per line.",
"file_type": [
"TXT"
],
"data_type": [
"id_list"
],
"required": true,
"allow_multiple": false
}
],
"input_files_public_dir": [],
"input_files_combinations": [],
"arguments": [
{
"name": "min_length",
"description": "Minimum length for sequences to be extracted.",
"help": "Specify a number between 30 and 100.",
"type": "integer",
"min": 30,
"max": 100
}
],
"has_custom_viewer": false,
"output_files": [
{
"name": "output_fasta",
"required": true,
"allow_multiple": false,
"file": {
"file_type": "FASTA",
"data_type": "extracted_sequences",
"meta_data": {
"visible": true,
"description": "FASTA file containing the extracted sequences based on the provided IDs.",
"tool": "seqio_tool"
}
}
}
],
"sites": [
{
"site_id": "local",
"status": {
"$numberInt": "1"
}
}
]
}
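The document above is written in MongoDB Extended JSON, where 32-bit integers appear as `{"$numberInt": "1"}` instead of bare numbers; this is the form typically seen when a document is exported from MongoDB. If you want to compare such an export with a hand-written configuration, the wrappers can be unwrapped first. The following is a minimal sketch that handles only `$numberInt`; the function name `plain_numbers` is hypothetical.

```python
import json


def plain_numbers(value):
    """Recursively replace {"$numberInt": "n"} wrappers with Python ints."""
    if isinstance(value, dict):
        if set(value) == {"$numberInt"}:
            return int(value["$numberInt"])
        return {key: plain_numbers(item) for key, item in value.items()}
    if isinstance(value, list):
        return [plain_numbers(item) for item in value]
    return value


exported = json.loads(
    '{"status": {"$numberInt": "1"},'
    ' "infrastructure": {"cpus": {"$numberInt": "4"}},'
    ' "sites": [{"site_id": "local", "status": {"$numberInt": "1"}}]}'
)
print(json.dumps(plain_numbers(exported)))
```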