AlphaPeptDeep_ms2_generic error #174

@5h4ng

Hello,

When using the AlphaPeptDeep_ms2_generic model, I get a ValueError during prediction when the input DataFrame contains more than 1000 peptides and the peptides do not all have the same length.

  1. Case that works:
    2000 peptides, all of the same length:

    import pandas as pd
    import numpy as np
    from koinapy import Koina
    
    inputs = pd.DataFrame()
    inputs['peptide_sequences'] = np.array(["AAA"] * 1000 + ["AAA"] * 1000)
    inputs['precursor_charges'] = np.array([2] * 1000 + [2] * 1000)
    inputs['collision_energies'] = np.array([25] * 1000 + [25] * 1000)
    inputs['instrument_types'] = np.array(["QE"] * 1000 + ["QE"] * 1000)
    
    model = Koina("AlphaPeptDeep_ms2_generic", "koina.wilhelmlab.org:443")
    predictions = model.predict(inputs, debug=True)
    # Prediction succeeds.
  2. Case that fails:
    1000 peptides of one length and another 1000 peptides of a different length:

    import pandas as pd
    import numpy as np
    from koinapy import Koina
    
    inputs = pd.DataFrame()
    inputs['peptide_sequences'] = np.array(["AAA"] * 1000 + ["AAAA"] * 1000)
    inputs['precursor_charges'] = np.array([2] * 1000 + [2] * 1000)
    inputs['collision_energies'] = np.array([25] * 1000 + [25] * 1000)
    inputs['instrument_types'] = np.array(["QE"] * 1000 + ["QE"] * 1000)
    
    model = Koina("AlphaPeptDeep_ms2_generic", "koina.wilhelmlab.org:443")
    predictions = model.predict(inputs, debug=True)

    This results in the following error:

    ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 8 and the array at index 1 has size 12
    

I suspect the error arises because the model sizes its output for each batch from the longest peptide in that batch, while Koina then concatenates the outputs of multiple batches without handling the mismatched dimensions.
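
A minimal sketch of the suspected failure mode, plus a possible client-side workaround. The array shapes (8 and 12 columns) are taken from the error message; that they correspond to the "AAA" and "AAAA" batches is an assumption. `predict_by_length` is a hypothetical helper, not part of koinapy, and it assumes the prediction function returns a pandas DataFrame and that modifications, if any, are written in square brackets.

```python
import numpy as np
import pandas as pd

# Two per-batch outputs whose second dimension differs, as in the error message.
batch_a = np.zeros((1000, 8))   # assumed: batch of length-3 peptides
batch_b = np.zeros((1000, 12))  # assumed: batch of length-4 peptides

mismatch_error = None
try:
    np.concatenate([batch_a, batch_b], axis=0)
except ValueError as err:
    mismatch_error = err
    print(err)  # NumPy reports the mismatch along dimension 1


def predict_by_length(predict_fn, inputs, seq_col="peptide_sequences"):
    """Hypothetical helper: call predict_fn once per peptide length.

    Every call then sees peptides of a single length, so all batches
    inside that call should produce outputs of the same width.
    """
    # Strip bracketed modifications and terminal dashes before counting residues
    # (assumption: modifications look like "[UNIMOD:4]").
    bare = inputs[seq_col].str.replace(r"\[.*?\]", "", regex=True)
    lengths = bare.str.replace("-", "", regex=False).str.len()
    parts = [predict_fn(group) for _, group in inputs.groupby(lengths)]
    # Rows come back ordered by peptide length, not in the input order.
    return pd.concat(parts, ignore_index=True)
```

With the failing inputs above this could be called as `predict_by_length(model.predict, inputs)`. It only sidesteps the problem on the client side; padding the per-batch outputs to a common width before concatenation would be the more general fix.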

Thanks,
Shang
