Add Omics functions #48
base: main
I'm okay with merging this in. There are a couple of TODOs in the code, but it seems to be working okay.

I think this PR is almost ready to go. The one thing I don't like is that launchers can be run with the '--project_name' option, but submitting tasks expects a 'project_id' (that is, a run group ID), so there is still a small discrepancy that needs to be addressed.

The last thing I notice is that there are references to both 'Output' and 'output'. Additionally, files are uploaded to /outputs/jobid/inputs/.... Maybe we shouldn't use "Outputs" as a folder name, since 'output' is used by Omics for the workflow outputs.
Unlike SBG and Arvados, Omics allows two projects (run groups) to have identical names, so the get_project_by_name() function may find multiple run group IDs. I think using the project ID would be better than using the project name when using Omics. I can change get_project_by_name() to return the first match so that the launcher can run, but this remains a potential issue when another run group with the same name exists.
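A minimal sketch of the "return the first match" behavior described above, assuming the run groups have already been listed as dicts with 'id' and 'name' keys. The `strict` flag and the helper's signature are hypothetical and not part of this PR; they only illustrate how duplicate names could be surfaced instead of silently ignored.

```python
from typing import Dict, List, Optional


def get_project_by_name(run_groups: List[Dict[str, str]], name: str,
                        strict: bool = False) -> Optional[Dict[str, str]]:
    """Return the first run group called `name`, or None if there is none.

    `run_groups` is assumed to be a list of dicts with 'id' and 'name'
    keys (an assumption for this sketch). With strict=True, duplicate
    names raise instead of being resolved to the first match.
    """
    matches = [group for group in run_groups if group.get("name") == name]
    if not matches:
        return None
    if strict and len(matches) > 1:
        ids = ", ".join(group["id"] for group in matches)
        raise ValueError(f"Run group name {name!r} is ambiguous: {ids}")
    return matches[0]
```

Passing the run group ID directly avoids the ambiguity altogether, which is why the ID is the safer handle on Omics.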
Currently, the output files of a run go into s3://{Bucket}/Outputs/{RunGroupID}/{WorkflowID}/{TaskName}/{RunID}/ , and upload_file() uploads files into s3://{Bucket}/Outputs/{RunGroupID}/launcher_inputs/ . I can change the output folder to s3://{Bucket}/Project/{RunGroupID}/{WorkflowID}/{TaskName}/{RunID}/ and upload files to s3://{Bucket}/Project/{RunGroupID}/launcher_inputs/ .
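The proposed layout can be sketched as two small path builders. The function names and the `root` parameter are hypothetical, introduced only for illustration; the path templates themselves are the ones given in the comment above.

```python
def run_output_prefix(bucket: str, run_group_id: str, workflow_id: str,
                      task_name: str, run_id: str, root: str = "Project") -> str:
    """Build the S3 prefix under which a run's output files would land."""
    return f"s3://{bucket}/{root}/{run_group_id}/{workflow_id}/{task_name}/{run_id}/"


def launcher_inputs_prefix(bucket: str, run_group_id: str,
                           root: str = "Project") -> str:
    """Build the S3 prefix that upload_file() would write launcher inputs to."""
    return f"s3://{bucket}/{root}/{run_group_id}/launcher_inputs/"
```

Keeping the top-level folder in a single `root` argument means a rename such as "Outputs" to "Project" touches one default value rather than every call site.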
    '''
    Create project
    '''
    self.api.create_run_group(name=project_name)
Should this return a dict of the form { 'runGroupId': XYZ }?
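One way the suggested return value could look, as a sketch only. The stub client, the class name, and the assumption that the create call's response exposes the new run group's identifier under an 'id' key are all illustrative and not taken from this PR.

```python
class StubOmicsApi:
    """Stand-in for the real client so the sketch runs without AWS access."""

    def create_run_group(self, name):
        # Assumed response shape: the new run group's ID under 'id'.
        return {"id": f"rg-{name}"}


class OmicsPlatformSketch:
    """Hypothetical, trimmed-down platform class for illustration."""

    def __init__(self, api):
        self.api = api

    def create_project(self, project_name):
        '''
        Create project and return its run group ID as a dict
        '''
        response = self.api.create_run_group(name=project_name)
        return {"runGroupId": response["id"]}
```

For example, `OmicsPlatformSketch(StubOmicsApi()).create_project("demo")` gives back a dict with a single 'runGroupId' key that callers can pass on when submitting tasks.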
golharam left a comment
This looks good. Two remaining items:
- Change references to ProjectId to runGroupId to remove any confusion of terms
- Ask Claude 3.7 to implement unit tests
After that, we are good.
I have replaced ProjectId with RunGroupId in omics_platform.py.
    SUPPORTED_PLATFORMS = {
        'Arvados': ArvadosPlatform,
    #   'Omics': OmicsPlatform,
        'Omics': OmicsPlatform,
Can we change the name from 'Omics' to 'AWSHealthOmics' to better reflect the real name? I think 'Omics' was just easier to say and type.
    self.s3_client = boto3.client('s3')

    # WES API connection parameters
    self.wes_url = kwargs.get('wes_url', os.getenv('WES_URL'))
Make this WES_API_ENDPOINT
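A minimal sketch of the requested rename, assuming the comment refers to the environment variable and that the 'wes_url' keyword argument keeps its name. The helper function is hypothetical; in the PR the lookup sits inline in the constructor.

```python
import os


def resolve_wes_url(**kwargs):
    """Prefer an explicit 'wes_url' kwarg, else fall back to WES_API_ENDPOINT."""
    # Assumption: only the environment variable is renamed from WES_URL.
    return kwargs.get('wes_url', os.getenv('WES_API_ENDPOINT'))
```

An explicit keyword argument still takes precedence, so existing callers that pass 'wes_url' would be unaffected by the environment variable rename.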
This PR includes changes to add Omics support.