
Benchmark model usage with AutoPeptideML-1 vs 2 #163

@elizabethmcd

Description


Hello,

Thank you again for creating such a great and user-friendly tool! I'm following up on Issue #162 with some additional context about the work I'm doing, to understand the best path forward regarding updates to AutoPeptideML.

I'm a computational biologist at Microcosm Foods, where we are working to make the benefits of fermented foods accessible to all through research and open datasets. Our first preprint, which we will release soon, predicts bioactive peptides from genome-encoded peptides and from proteomics experiments on fermented food datasets.

I've been primarily using AutoPeptideML v1.0.1, integrated in a Nextflow workflow (https://github.com/MicrocosmFoods/peptide-bioactivity-prediction), to predict different bioactivities. This includes one custom model that I built with v1.0.1 of the webserver (a while back, over a year ago) and the benchmarked models from the original AutoPeptideML release available on Zenodo. I saw issue #24 and realized that the benchmarked models had been updated; combined with the issue in #162, I'm not sure how to move forward given the development plans for AutoPeptideML.

I see a couple of options for releasing our work and the workflow, and wanted to get your input on whether there is a preferred one. The other issue is that I need to update my custom model with some changes to the training data, which is why I opened issue #162.

  1. I could keep the v1.0.1 dependency in my workflow. My understanding is that the updated benchmarked models (with the .onnx extension) won't work with this version, though, so I'd have to use the original versions of the models. I understand there could be some slight accuracy issues with those, given the duplicate sequences in the train/test splits, but that these effects were pretty minimal. However, it's my understanding that a custom model built with v1.0.6, or with a fixed version of the webserver, wouldn't be compatible with this version.
  2. I could update the dependency in my workflow to v1.0.6+, since my understanding is that this would be compatible with the updated benchmark models, and once #162 is fixed (so that autopeptideml-1 on the webserver behaves as it did in the past) I could re-create my custom model.
  3. I could update the dependency in my workflow to v2.0.4 and build my custom model with v2.0.4. The good news is that I've confirmed my custom model builds successfully with this version (both in the webserver and on the command line). However, I want to include results from the benchmark models, and I don't think those model formats are compatible with v2+ versions. I could remake these models, but I wanted to check whether there were plans on your end to do this anyway. And I don't want to publish results where the benchmark model predictions come from v1.0.1 while my custom model predictions come from v2.0.4, if that makes sense.
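Whichever option is chosen, one way to guard against accidentally mixing model generations in a workflow release is a quick format check before running predictions. This is a hypothetical sketch, not part of AutoPeptideML: the helper name and the assumed extensions (.onnx for the updated models, .pkl/.joblib for the original v1-era models) are my own assumptions about the file layout.

```python
from pathlib import Path

# Assumed extensions (not AutoPeptideML's documented layout):
# updated benchmark models as ONNX, original models as pickles.
ONNX_EXTS = {".onnx"}
LEGACY_EXTS = {".pkl", ".joblib"}

def check_model_formats(model_dir: str) -> str:
    """Return 'onnx' or 'legacy'; raise if the directory mixes formats."""
    formats = set()
    for path in Path(model_dir).rglob("*"):
        if path.suffix in ONNX_EXTS:
            formats.add("onnx")
        elif path.suffix in LEGACY_EXTS:
            formats.add("legacy")
    if len(formats) != 1:
        raise ValueError(f"Expected exactly one model format, found: {formats or 'none'}")
    return formats.pop()
```

Running a check like this at the start of the Nextflow workflow would fail fast if a v1-format model ever ends up alongside the updated ones in a published release.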

Personally, I think option 3 is the cleanest, as it would give users of my workflow and readers of our preprint access to the most up-to-date version of AutoPeptideML, without having to make sure older v1.x versions still work. I'm just not sure what to do about the benchmark models.

Apologies for the lengthy issue - I really love this tool and very much appreciate your active work to keep it maintained!! I appreciate any guidance on how I should move forward given your plans for the tool.

Thanks,
Elizabeth

Labels: question (Further information is requested)
