|
|
|
|
|
Virtual Research Environments integrate tools and pipelines that support a research community. We offer the possibility to integrate your application in one of these analytical platforms. Please, read through this documentation and **contact us for any doubt** or suggestion you may have.
|
|
|
|
|
|
#### Table of contents
|
|
|
|
|
|
* [Why?](#why)
|
|
|
* [How it works?](#how-it-works)
|
|
|
* [How to bring a new tool?](#how-to-bring-a-new-tool)
|
|
|
|
|
|
|
|
|
|
|
|
## Why?
|
|
|
The open Virtual Research Environment offers a number of **benefits** to developers willing to integrate their tools in the platform:
|
|
|
|
|
|
- An open-access platform, publicly available
|
|
|
- A full web-based workbench with user support utilities, OIDC authentication service, and data handlers for local and remote datasets or repositories.
|
|
|
|
|
|
#### Requirements
|
|
|
The application or pipeline to be integrated should:
|
|
|
|
|
|
- Be free and **open source**
|
|
|
- Run in **non-interactive mode** on a Linux-based operating system:
|
|
|
- Ubuntu 18.04
|
|
|
- Ubuntu 20.04
|
|
|
- CentOS 8 Stream
|
|
|
- *consult with us for other operating systems*
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## How it works?
|
|
|
|
|
|
|
|
|
There are some steps to follow for achieving the integration of your application as a new VRE tool. As a result, the VRE is able to control the whole **tool execution cycle**:
|
|
|
1. Preparation of the run web form where the user will specify the arguments and input files for each run.

2. Validation of the input files and argument requirements.

3. Stage-in of the input files to the run working directory (if required).

4. Execution of the tool in the cloud in a scalable manner.

5. Monitoring and interactive reporting of the tool progress during the execution.

6. Stage-out of the output files from the run working directory (if required).

7. Registration of the resulting output files at the VRE to display them in the user's workspace.
|
|
|
|
|
|
|
|
|
|
|
|
## How to bring a new tool?
|
|
|
Essentially, VRE will need two elements: (1) your application or workflow wrapped within a **VRE RUNNER**, and (2) **metadata** annotating it (*i.e.* input file requirements, descriptions). The following steps describe how to achieve it.
|
|
|
|
|
|
|
|
|
- [Step 1](#step-1-pre-define-your-tool-to-build-a-test-set-of-vre-job-execution-files) Define your tool's input and output files. The information is used to build job execution files for testing the RUNNER.

- [Step 2](#step-2-prepare-a-vre-runner-wrapping-your-application) Prepare a new VRE RUNNER wrapping your application.

- [Step 3](#step-3-annotate-and-submit-the-new-vre-tool) Annotate and submit the new VRE tool.
|
|
|
- [Step 4](#step-4) Submit the RUNNER code and the tool-specification file to VRE administrators, who will install and register the new tool.
|
|
|
- [Step 5](#step-5) Test and debug the new tool from the VRE user interface
|
|
|
- [Step 6](#step-6) (optional) Prepare a web page to display a summary report on each execution
|
|
|
- [Step 7](#step-7) Provide documentation for the new tool
|
|
|
|
|
|
|
|
|
<br/>
|
|
|
|
|
|
|
|
|
#### STEP 1: Pre-define your tool to build a test set of VRE job execution files
|
|
|
|
|
|
|
|
|
VRE **job execution files** are two JSON files. In production, these will be generated by the VRE server on each execution initiated by the user at the web interface. This data should be consumed by the RUNNER wrapping your application.
|
|
|
|
|
|
|
|
|
| VRE job execution files | Description |
| ------------- | ------------- |
| Run configuration file <br/>*i.e.* `config.json` | Contains the list of input files selected by the user for a particular run, the values of the arguments, and the list of expected output files. |
| Infiles' metadata file <br/>*i.e.* `in_metadata.json` | Contains the metadata of the input files listed in `config.json`, including information like the absolute file path. |
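For orientation, the content of these two files looks roughly like the sketch below, written here as Python literals. The field names are illustrative assumptions based on the descriptions above and on the attributes referenced later in this page (`output_files[].file.path`, the argument with key `execution`); the authoritative structure is given by the JSON schemas and example files linked further down.

```python
# Illustration only -- follow the JSON schemas and the example config.json /
# in_metadata.json linked below for the real field names and accepted values.

# config.json: what the user selected for one particular run
config = {
    "input_files": [    # input files chosen by the user
        {"name": "input_dataset", "value": "unique_file_id_1", "required": True},
    ],
    "arguments": [      # argument values for this run
        {"name": "execution", "value": "/absolute/path/to/run_working_dir"},
        {"name": "my_tool_argument", "value": "some_value"},
    ],
    "output_files": [   # expected output files
        {"name": "results", "required": True,
         "file": {"file_type": "TXT", "path": "results.txt"}},
    ],
}

# in_metadata.json: metadata of the input files listed in config.json
in_metadata = [
    {"_id": "unique_file_id_1",
     "file_path": "/absolute/path/to/test_data/input_dataset.txt",
     "data_type": "my_data_type",
     "file_type": "TXT"},
]
```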
|
|
|
|
|
|
|
|
|
Additionally, it is handy to have a shell script with the execution command line of the RUNNER (*i.e.* `test.sh`) that passes the two previous files in as arguments.
|
|
|
|
|
|
|
|
|
Defining the input files and arguments that your tool will consume is essential to build the VRE job execution files. There are two ways of creating them:
|
|
|
|
|
|
|
|
|
- **Manual approach**:
|
|
|
|
|
|
|
|
|
Manually generate the two files following the corresponding JSON schemas and taking as reference the examples below:
|
|
|
|
|
|
|
|
|
- Examples:
|
|
|
- template RUNNER: [config.json & in_metadata.json](https://github.com/inab/vre_template_tool/tree/master/tests/basic)
|
|
|
- dpfrep RUNNER (example of a R-based tool): [config.json & in_metadata.json](https://github.com/inab/vre_dpfrep_executor/tree/master/tests/basic)
|
|
|
- JSON schemas:
|
|
|
- euCanSHare VRE tool schemas: [tool schemas](https://github.com/euCanSHare/vre/tree/master/tool_schemas/tool_specification)
|
|
|
|
|
|
|
|
|
- **VRE web interface approach**:
|
|
|
|
|
|
|
|
|
Use the tool developer admin panel to create these files. The user interface includes web forms that allow the edition and validation of a JSON document gathering data about the input files and arguments. If you provide data about your local development environment (*i.e.* working directories or the location of test input files), VRE will generate a `config.json` and an `in_metadata.json` ready for download.
|
|
|
- Where: in the left navigation menu of the VRE (*e.g.* https://vre.eucanshare.bsc.es/), Admin → My Tools → Development → (+) Add new tool
|
|
|
- Requirements: user account with "tool developer" rights
|
|
|
|
|
|
|
|
|
> Note:<br/>
|
|
|
> schemas are being adapted to each VRE project. If the list of accepted values for `data-type` or `file-type` does not cover your use case, just contact us. We'll extend the supported metadata.
|
|
|
|
|
|
|
|
|
<br/>
|
|
|
|
|
|
#### STEP 2: Prepare a VRE RUNNER wrapping your application
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
VRE RUNNERs are pieces of code that work as adapters between the VRE server and each of the integrated applications or pipelines. The RUNNER (1) consumes the VRE job execution files generated when a user submits a new job at the web interface, (2) runs the wrapped application, and (3) generates the list of output files, data that the VRE server will eventually register and display at the users' workspace.
|
|
|
|
|
|
|
|
|
For preparing the RUNNER, the easiest option is to take as reference the RUNNER template repository and adapt some of its methods. The template includes a couple of Python classes that parse VRE job execution files into Python dictionaries. These are passed to the `run` method of the `VRE_Tool` class, a function that you can customize at your convenience to call the application, module or pipeline to be integrated.
|
|
|
| RUNNER template | Links |
| ------------- | ------------- |
| [repository](https://github.com/inab/vre_template_tool) | [documentation](https://vre-template-tool.readthedocs.io/en/latest/reference/classes.html) |

**Step-by-step**
|
|
|
|
|
|
|
|
|
1. **Fork or clone** the repository of the RUNNER template in your local development environment.
|
|
|
|
|
|
|
|
|
2. (optional) Run the `hello_word` example. The RUNNER template is initially configured to "wrap" an application called `hello.py`. It demonstrates the overall flow of a VRE RUNNER.
|
|
|
- How to: https://github.com/inab/vre_template_tool#run-the-wrapper
|
|
|
3. **Include your own job execution files** in the repository. Copy the JSON files generated in *STEP 1* into the `test/` folder, replacing the basic `hello_word` example. They should contain the input files and arguments for a test execution of your tool. If you run the RUNNER again as above, it will now fail, as the RUNNER is still expecting the arguments and the input files of the `hello_word` example.
|
|
|
|
|
|
|
|
|
Make sure that the absolute path of the working directory and the input files defined in these JSON files are accessible.
|
|
|
|
|
|
|
|
|
4. **Implement the `run` method of the `VRE_Tool` class** so that the function executes the application, module or pipeline to be integrated. The input file locations and argument values defined in the job execution files are going to be the content of the parameters received by the `run` method (a minimal sketch is shown after this list).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
5. The RUNNER will be ready when the wrapped application is properly executed and the output files are generated at the location specified in `output_files[].file.path`. These paths are usually defined in the `config.json` file.
|
|
|
Alternatively, if the name and number of the output files cannot be known before the execution, you should extend the `VRE_Tool.run` method to **define the `file.path` attribute** in the `output_files` dictionary. The RUNNER will then write it into the output files' metadata file (*i.e.* `out_metadata.json`).
|
|
|
|
|
|
|
|
|
Make sure your output files are generated at the root of the working directory (the path defined by the argument with key `execution` in `config.json`).
|
|
|
|
|
|
|
|
|
6. **Save your RUNNER** in a publicly available GIT repository. In the same way as the template RUNNER, document the installation and include some test datasets, considering also the installation of the wrapped application itself: extra modules, dependencies, libraries, etc. Remember that `VRE_RUNNER` should have executable permissions. VRE administrators will eventually install this repository in the VRE cloud.
|
|
|
|
|
|
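To make steps 4 and 5 more concrete, here is a minimal, hypothetical sketch of the kind of logic a RUNNER ends up implementing. It does not reproduce the template's actual classes or signatures (see the template repository and its documentation for those); the field names, the `my_wrapped_tool` command and the function signature are assumptions used only for illustration.

```python
import json
import os
import subprocess
import sys


def run(config_path, in_metadata_path, out_metadata_path):
    """Hypothetical sketch of a RUNNER run step (not the template's real API)."""
    # 1. Load the VRE job execution files into plain dictionaries.
    with open(config_path) as fh:
        config = json.load(fh)
    with open(in_metadata_path) as fh:
        in_metadata = json.load(fh)

    # 2. Resolve the run working directory and the input file locations.
    arguments = {a["name"]: a["value"] for a in config["arguments"]}
    working_dir = arguments["execution"]                  # run working directory
    input_paths = [m["file_path"] for m in in_metadata]   # illustrative field name

    # 3. Call the wrapped application non-interactively inside the working directory.
    output_path = os.path.join(working_dir, "results.txt")
    subprocess.run(
        ["my_wrapped_tool", "--output", output_path, *input_paths],  # hypothetical CLI
        cwd=working_dir,
        check=True,
    )

    # 4. Record the produced output files so the VRE server can register them.
    out_metadata = [
        {"name": "results", "file_path": output_path, "file_type": "TXT"},
    ]
    with open(out_metadata_path, "w") as fh:
        json.dump(out_metadata, fh, indent=2)


if __name__ == "__main__":
    # Same spirit as test.sh: the job execution files are passed in as arguments.
    run(sys.argv[1], sys.argv[2], sys.argv[3])
```

Note that in the actual template the parsing of the job execution files is already handled by the provided Python classes; normally you only need to customize the part that launches your application and declares its outputs.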
|
|
|
<br/>
|
|
|
|
|
|
|
|
|
#### STEP 3: Annotate and submit the new VRE tool
|
|
|
|
|
|
Once the RUNNER is successfully executing the application in your local development environment, it is time to register the new tool on the corresponding VRE server. To do so, some descriptive metadata on the new application is required, *i.e.*, tool description and title, ownership, references, keywords, etc.
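For orientation only, the kind of information gathered in this file might look roughly like the sketch below, written as a Python literal with hypothetical field names; the tool specification JSON schema and the dpfrep example linked below define the real structure.

```python
# Illustration only -- the tool specification JSON schema linked below is the
# authoritative reference for field names and required values.
tool_specification = {
    "name": "my_new_tool",                    # hypothetical tool identifier
    "title": "My new tool",
    "short_description": "One-line summary shown to VRE users.",
    "owner": {
        "author": "Jane Doe",
        "institution": "Example Institute",
        "contact": "jane.doe@example.org",
    },
    "keywords": ["example", "analysis"],      # help users find the tool in the VRE
    "references": ["https://doi.org/10.xxxx/example"],   # publications, docs, etc.
}
```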
|
|
|
|
|
|
|
|
|
Again, two approaches are supported:
|
|
|
|
|
|
|
|
|
- **Manual approach**:
|
|
|
|
|
|
Generate the `tool specification file` to fully annotate the new tool, taking as reference the JSON schema and the examples below:
|
|
|
- JSON schemas:
|
|
|
- [tool specification](https://github.com/euCanSHare/vre/tree/master/tool_schemas/tool_specification)
|
|
|
- Examples:
|
|
|
- dpfrep RUNNER (example of a R-based tool): [tool_specification.json](https://github.com/inab/vre_dpfrep_executor/tree/master/tool_specification)
|
|
|
|
|
|
- **VRE web interface approach**: fill in the tool metadata from the tool developer admin panel described in STEP 1.

Finally, save your tool specification file in your repository and send it all together to the VRE administrators, who will install and register the new tool.