@henninggaertner henninggaertner commented Dec 2, 2024

Description

fixes #41
log(0) evaluates to negative infinity, and the logarithm of a negative value is undefined; neither is a desired outcome for an intensity value. Zero and negative intensities are therefore filtered out and dropped from the data before the log transformation. Users are notified about the dropped values and urged to adapt their preprocessing pipeline.
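
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in transformation.py: the function name `log_transform_dropping_nonpositive`, the `intensity_column` parameter, and the return shape are assumptions made for the example. Only the wording of the warning is taken from the diff under review.

```python
import numpy as np
import pandas as pd


def log_transform_dropping_nonpositive(
    df: pd.DataFrame, intensity_column: str = "Intensity", log_base: str = "log2"
):
    """Hypothetical helper: drop rows with intensity <= 0, then log-transform.

    Returns the transformed frame and a list of message dicts.
    """
    messages = []

    # Rows whose intensity cannot be log-transformed (zero or negative).
    # NaN compares as False here, so missing values are kept, not dropped.
    untransformable_mask = df[intensity_column] <= 0
    untransformable_df = df[untransformable_mask]

    if not untransformable_df.empty:
        messages.append(
            dict(
                msg=f"Warning: {len(untransformable_df)} data points with zero or "
                f"negative intensity values were found and will be dropped. "
                f"Please adapt your preprocessing pipeline if this is unexpected.",
            )
        )

    # Work on a copy so the caller's dataframe is left untouched.
    transformed_df = df[~untransformable_mask].copy()
    log_function = np.log2 if log_base == "log2" else np.log10
    transformed_df[intensity_column] = log_function(transformed_df[intensity_column])

    return transformed_df, messages
```

Returning the messages alongside the data, rather than printing them, lets the caller decide how to surface the warning to the user.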

Changes

Add filtering, dropping and message in transformation.py
Adjust and add tests in test_transformation.py

Testing

Create a run with both protein and peptide import. Then manually change some intensity values to be 0 or negative in the run's dataframe folder. Then do the log transformation with the "dirty" data.
You should see a message about the dropped data.
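
As an alternative to editing the stored files by hand, "dirty" data can also be produced in memory. The sketch below is an assumption-laden illustration: `make_dirty` and the `Intensity` column name are hypothetical, and it does not read or write the run's dataframe folder, whose on-disk format is not described here.

```python
import numpy as np
import pandas as pd


def make_dirty(
    df: pd.DataFrame,
    intensity_column: str = "Intensity",
    n_zero: int = 1,
    n_negative: int = 1,
    seed: int = 0,
) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with some intensities set to 0 or -1."""
    rng = np.random.default_rng(seed)
    dirty = df.copy()

    # Pick distinct row positions so zero and negative entries do not overlap.
    positions = rng.choice(len(dirty), size=n_zero + n_negative, replace=False)
    column_position = dirty.columns.get_loc(intensity_column)

    dirty.iloc[positions[:n_zero], column_position] = 0.0
    dirty.iloc[positions[n_zero:], column_position] = -1.0
    return dirty
```

Running the log transformation on the result should then report `n_zero + n_negative` dropped data points, provided the input contained only positive intensities to begin with.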

PR checklist

Development

  • If necessary, I have updated the documentation (README, docstrings, etc.)
  • If necessary, I have created / updated tests.

Mergeability

  • main-branch has been merged into local branch to resolve conflicts
  • The tests and linter have passed AFTER local merge
  • The code has been formatted with black

Code review

  • I have self-reviewed my code.
  • At least one other developer reviewed and approved the changes

Add message warning the user of filtered data in transformation.py.
@henninggaertner henninggaertner linked an issue Dec 2, 2024 that may be closed by this pull request
3 tasks
github-actions bot commented Dec 2, 2024

Coverage report

File | Statements | Missing | Coverage | Coverage (new stmts) | Lines missing
  protzilla
  disk_operator.py
  run.py
  run_helper.py
  steps.py 76-77, 161, 192
  protzilla/data_analysis
  plots.py
  protein_graphs.py
  ptm_analysis.py 56, 123-125
  protzilla/data_integration
  database_query.py 113-119
  di_plots.py
  enrichment_analysis.py
  protzilla/data_preprocessing
  imputation.py
  plots.py
  protzilla/importing
  ms_data_import.py 122, 276
  protzilla/methods
  data_analysis.py 21, 41, 123, 177, 200, 249, 284, 597, 615, 636, 651
  importing.py 123
  protzilla/utilities
  utilities.py 184
  ui/main
  settings.py 131
  ui/runs
  fields.py
  views.py 73-74, 180, 678, 683-687
  views_helper.py
  ui/runs/forms
  base.py 107
  custom_fields.py 119
  data_analysis.py 323-327, 384, 414, 450, 473, 1188, 1226, 1248
  data_integration.py
  ui/settings
  plot_template.py 84
Project Total  

The report is truncated to 25 files out of 60. To see the full report, please visit the workflow summary page.

This report was generated by python-coverage-comment-action

msg.append(
    dict(
        msg=f"Warning: {len(untransformable_peptide_data_df)} data points of peptide data with zero or negative intensity values were found and will be dropped. "
        f"Please adapt your preprocessing pipeline if this is unexpected.",

I think we should use "workflow" here instead of "pipeline" to stay consistent? :)

Development

Successfully merging this pull request may close these issues.

#TODO log transform of 0 values in df

3 participants