Standalone Importer w/Spring Batch #68

@ao508

Background:
The cBioPortal is a valuable resource for navigating, downloading, visualizing, and analyzing cancer genomics data. The portal supports a variety of data types and file formats, and we provide a meta import script to simplify importing this data. To learn more about the overall cBioPortal data loading process, please refer to the data loading documentation.

The meta import script is written in Python and wraps the Java importer classes from the cBioPortal core-scripts package. However, in an effort to make the cBioPortal backend a full Spring MVC application, we wish to refactor the importer and move it to an external dependency that can run as a simple standalone tool. An overview of the expected workflow is detailed here.

Goal:
The overall goals of this project are:

  1. Refactor the cBioPortal importer (scripts) package into an external repository that can be run as a standalone Java program.
  2. Migrate the cBioPortal model and persistence layers to an external repository, which will be brought into the importer as an external dependency.

Approach:

The importer should be built with Spring Batch and the SQL statements should be implemented with MyBatis-Spring.

The bulk of the importer logic has already been implemented and can be reviewed in this pull request. However, when this work began, MyBatis-Spring was not yet compatible with the latest Spring Batch version available at the time, so the SQL statements were implemented using JDBC instead.
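
For applicants unfamiliar with Spring Batch, the chunk-oriented read/process/write pattern it is built around can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use Spring Batch or MyBatis-Spring, and every name in it (`ChunkImportSketch`, `parse`, `runStep`, the three-column tab-delimited layout) is hypothetical rather than taken from the cBioPortal code base or the pull request mentioned above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java illustration of chunk-oriented processing; no Spring dependency.
// All names are hypothetical and not taken from the cBioPortal importer.
final class ChunkImportSketch {

    // "Processor" step: split a tab-delimited line into columns.
    // Returns null for blank lines, comment lines, and lines without
    // exactly three columns, so the step below can skip them.
    public static List<String> parse(String line) {
        if (line.isBlank() || line.startsWith("#")) {
            return null;
        }
        String[] cols = line.split("\t", -1);
        if (cols.length != 3) {
            return null;
        }
        return List.of(cols);
    }

    // Reads items one at a time, processes each, and hands the results
    // to the writer in groups of at most chunkSize. Returns the number
    // of items passed to the writer.
    public static <I, O> int runStep(Iterator<I> reader,
                                     Function<I, O> processor,
                                     Consumer<List<O>> writer,
                                     int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int written = 0;
        List<O> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            O out = processor.apply(reader.next());
            if (out == null) {
                continue; // filtered item, not written
            }
            chunk.add(out);
            if (chunk.size() == chunkSize) {
                writer.accept(List.copyOf(chunk));
                written += chunk.size();
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) { // flush the final partial chunk
            writer.accept(List.copyOf(chunk));
            written += chunk.size();
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "#SAMPLE_ID\tATTRIBUTE\tVALUE",
            "S1\tAGE\t61",
            "S2\tAGE\t47",
            "bad line",
            "S3\tAGE\t55");
        // Stand-in for a database writer: collect each chunk in memory.
        List<List<List<String>>> batches = new ArrayList<>();
        int n = runStep(lines.iterator(), ChunkImportSketch::parse, batches::add, 2);
        System.out.println(n + " rows written in " + batches.size() + " chunks");
        // prints "3 rows written in 2 chunks"
    }
}
```

In an actual Spring Batch job, the reader, processor, and writer would be framework components, and the in-memory writer here would be replaced by one that issues the SQL statements (via JDBC in the existing work, or MyBatis-Spring as proposed).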

Initial development efforts on migrating the cBioPortal model and persistence layers can be reviewed in this fork of the cBioPortal/common repository. The importer should be able to pull in any version of the cbioportal/common dependency through JitPack.

Needed skills:
Java, SQL, working knowledge of command line tools

Possible mentors:
@ao508 @n1zea144
