Firstly, thank you for putting together this useful resource.
I have been accessing the phosphoproteomics data for some analyses and have found what looks like peptide duplication.
I am using cptac version 1.5.14.
I found the duplication by running the following:
import cptac
import pandas as pd

luad = cptac.Luad()
test = luad.get_phosphoproteomics("bcm")
# Flatten each MultiIndex column label into a single string
flat_columns = ['_'.join(map(str, col)) for col in test.columns]
duplicates = pd.Series(flat_columns).duplicated()
len(pd.Series(flat_columns)[duplicates])
This gives 61807 duplicates.
Inspection of a specific peptide confirmed duplication:
test.loc[:, [col for col in test.columns if "SCPIKEDSFLQRYSS" in col]]
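To help narrow this down, it may be worth checking whether the repeated columns hold identical values (true duplicates) or different values under the same label. Below is a minimal sketch on toy data, using pandas only. The helper name `summarize_duplicate_columns` and the toy column labels are hypothetical and not part of cptac; the same function could be applied to `test` from above.

```python
import pandas as pd


def summarize_duplicate_columns(df: pd.DataFrame) -> pd.DataFrame:
    """For each repeated column label, report the number of copies and
    whether every copy holds exactly the same values (NaNs in the same
    positions count as equal)."""
    # Flatten tuple (MultiIndex) labels the same way as in the snippet above.
    flat = pd.Index(
        ['_'.join(map(str, col)) if isinstance(col, tuple) else str(col)
         for col in df.columns]
    )
    rows = []
    for label in flat[flat.duplicated()].unique():
        # Positions of every column that shares this flattened label.
        positions = [i for i, name in enumerate(flat) if name == label]
        first = df.iloc[:, positions[0]]
        identical = all(first.equals(df.iloc[:, p]) for p in positions[1:])
        rows.append(
            {"column": label, "copies": len(positions), "identical_values": identical}
        )
    return pd.DataFrame(rows, columns=["column", "copies", "identical_values"])


# Toy data with made-up labels: one exact duplicate pair and one pair
# that shares a label but differs in its values.
toy_columns = pd.MultiIndex.from_tuples([
    ("GENE1", "S10", "PEPA"),
    ("GENE1", "S10", "PEPA"),
    ("GENE2", "T5", "PEPB"),
    ("GENE2", "T5", "PEPB"),
    ("GENE3", "Y7", "PEPC"),
])
toy = pd.DataFrame(
    [[1.0, 1.0, 2.0, 2.5, 3.0],
     [float("nan"), float("nan"), 0.5, 0.5, 1.0]],
    columns=toy_columns,
)

summary = summarize_duplicate_columns(toy)
print(summary)
```

If `identical_values` is true for every repeated label, the columns are plain copies; if not, the same label is being attached to different measurements.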
I have not yet checked whether this is present in other tumour types. Do you know why I could be seeing this?
Many thanks