Skip to content

Multisession modeling functionality for AutoLFADS#152

Open
claytonwashington wants to merge 27 commits intocunningham-lab:masterfrom
claytonwashington:lfads
Open

Multisession modeling functionality for AutoLFADS#152
claytonwashington wants to merge 27 commits intocunningham-lab:masterfrom
claytonwashington:lfads

Conversation

@claytonwashington
Copy link

No description provided.

…r update at 2023-10-03 16:52:49.268505. Purpose: test. AMI: ami-05ddab33dd696e7d0
…r update at 2023-10-04 11:00:56.540733. Purpose: setup before first single run. AMI: ami-076b9ecc29502b344
…r update at 2023-10-09 17:07:12.925453. Purpose: upgraded CUDA. AMI: ami-0295578a40774d483
…r update at 2023-10-09 17:40:31.179587. Purpose: updated cuda and got sklearn. AMI: ami-0163b17cdfa51a9f5
…r update at 2023-10-10 17:08:18.822096. Purpose: final test on local single and multi. AMI: ami-0ccfff108ae8de464
@cellistigs
Copy link
Collaborator

Confirmed unit tests passed on local machine. Moving on to tests with AWS SAM.

@cellistigs
Copy link
Collaborator

cellistigs commented Oct 27, 2023

Sam package built/Template validated.

$ cd ~/neurocaas/ncap_iac/ncap_blueprints/iac_utils/
$ bash fullbuild.sh ../autolfads-torch
$ sam validate --template-file 

@cellistigs
Copy link
Collaborator

cellistigs commented Oct 27, 2023

Built AWS sam package successfully (docker failing before- updated fullbuild.sh to use appropriate context.)

@cellistigs
Copy link
Collaborator

Need multisession parameter for this analysis to pass in config file.

@cellistigs
Copy link
Collaborator

cellistigs commented Oct 27, 2023

Tested standard implementation. Across various errors (different environment variables, misspecified arns) preserved behavior of shutting down instances as expected.

@cellistigs
Copy link
Collaborator

At the moment, ncap_iac/protocols/submit_start.py has both process_upload_dev and process_upload_multisession. It appears these are identical save for the submission class passed to them. It would be good to refactor this so we can pass a submission object- since these functions take care of error handling having a single function wherever possible will simplify testing in the future.

@cellistigs
Copy link
Collaborator

The following functionality has been tested for reliability of cloud management systems, validated through sam local invoke calls. Note that we only check MainLambda here, as the behavior of the postprocessing lambda file does not have to change.

  • failing to pass a multisession parameter
  • passing multisession:False with a single dataset in the inputs folder.
  • passing multisession:True with a single dataset in the inputs folder.
  • passing multisession:True with datasets in multisession_test (directory passed to submit file) (launches one instance)
  • passing multisession:False with datasets in multisession_test (directory passed to submit file) (launches one instance)
  • passing multisession:True with datasets in multisession_test (individual files passed to submit file) (launches two instances)
  • passing multisession:False with datasets in multisession_test (individual files passed to submit file) (launches two instance)

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

We have some issues with logging- at least on AWS-SAM, we only ever see messages from logging start of the first instance. Check this in deployment. Note we have only checked the above functions for cloud reliability, so the above formats should be tested for expected functionality in all combinations of inputs listed above.

We should also refactor: At the moment, ncap_iac/protocols/submit_start.py has both process_upload_dev and process_upload_multisession. It appears these are identical save for the submission class passed to them. It would be good to refactor this so we can pass a submission object- since these functions take care of error handling having a single function wherever possible will simplify testing in the future.

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

1 similar comment
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

…r update at 2023-10-31 14:18:09.494446. Purpose: multisession entry no longer needed in config. AMI: ami-0699e307bc81482db
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

…r update at 2023-11-10 15:49:37.826420. Purpose: fixed lfads-torch import. AMI: ami-0893ac69ba200a546
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

…r update at 2023-11-16 13:55:30.184299. Purpose: fixed issues with access for root/SSM user. AMI: ami-017465330c3757649
…r update at 2023-11-16 14:25:10.865461. Purpose: root path fix fix?. AMI: ami-0754772739da3a57d
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

…r update at 2023-11-28 17:12:13.330582. Purpose: fixed single dataset run; command broken for multisession. AMI: ami-024a73c9849ee99e4
…r update at 2023-12-01 21:09:51.895815. Purpose: new multisession post-tutorial. AMI: ami-0825772e0ac2e67c8
…r update at 2023-12-05 22:16:10.678782. Purpose: successful ssm test on multisession via nc_contrib hotfix 51. AMI: ami-0cebcfea203997fcf
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

Copy link
Collaborator

@cellistigs cellistigs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes here look good- however I have some suggestions regarding code structure and logging.

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants