SEArCH is a Python tool for automated identification of compound pairs in LC/MS data that differ by a user-defined mass.

JonCasC/Search2

Search2: Mass Spectrometry Data Analysis

Search2 (the Spectral Elementally-Altered Compound Highlighter, or SEArCH) is a Python tool for analyzing LC/MS data. It identifies pairs of molecules separated by a specific mass difference (e.g., SO₃, gain or loss of a functional group, or isotope labeling). It accepts single .mzML files, lists of files, or entire directories, and offers flexible output options including CSV, heatmaps, and consensus feature maps.


Features

  • Automated search for mass differences (default: SO₃, 79.9569 Da)
  • Supports single files, file lists, or directories
  • Flexible filtering by retention time, relative intensity, and mass
  • Optional feature map analysis using pyOpenMS
  • Export results as CSV (single experiment), TSV (combined results), heatmaps, and consensusXML
  • MS2 spectra analysis
  • Comprehensive logging
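
To illustrate the core idea of a mass-difference search, here is a minimal sketch that pairs peaks whose m/z values differ by a target mass within a tolerance. It is not taken from the Search2 source: the function name `find_mass_pairs` and the example peak list are hypothetical, and the real tool additionally applies retention time, intensity, and deisotoping steps that are omitted here.

```python
import numpy as np

SO3_MASS = 79.9569  # default mass difference listed above (Da)


def find_mass_pairs(mz, delta=SO3_MASS, tolerance=0.001):
    """Return (lighter, heavier) m/z pairs whose difference is delta +/- tolerance.

    Illustrative helper only; not part of the Search2 code base.
    """
    mz = np.sort(np.asarray(mz, dtype=float))
    pairs = []
    for light in mz:
        # Binary search for the window [light + delta - tol, light + delta + tol].
        lo = np.searchsorted(mz, light + delta - tolerance, side="left")
        hi = np.searchsorted(mz, light + delta + tolerance, side="right")
        pairs.extend((float(light), float(heavy)) for heavy in mz[lo:hi])
    return pairs


# Made-up peak list: two of the five peaks have a partner 79.9569 Da heavier.
peaks = [200.1000, 280.0569, 310.2000, 390.1575, 500.0000]
print(find_mass_pairs(peaks))
```

With the default tolerance of 0.001 Da, both the 200.1000/280.0569 and 310.2000/390.1575 pairs are reported. Tightening the tolerance to 0.0001 Da drops the second pair, whose difference is off by about 0.0006 Da.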

Requirements

Install dependencies with:

pip install pyopenms numpy matplotlib tqdm

Usage

Run the main script from the command line:

python Search.py [options] <filename>

Arguments

  • filename : Path to a .mzML file, a .txt file with file paths, or a directory containing .mzML files.
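
The three accepted input forms could be expanded into a list of files roughly as follows. This is a sketch under stated assumptions, not the tool's actual logic: `resolve_inputs` is a hypothetical helper, and it assumes the .txt file lists one path per line and that directories are not searched recursively.

```python
from pathlib import Path


def resolve_inputs(filename):
    """Expand the positional argument into a list of .mzML paths (illustrative)."""
    path = Path(filename)
    if path.is_dir():
        # Directory: collect every .mzML file directly inside it.
        return sorted(p for p in path.iterdir() if p.suffix.lower() == ".mzml")
    if path.suffix.lower() == ".txt":
        # Text file: assumed to hold one file path per line; blank lines are skipped.
        lines = path.read_text().splitlines()
        return [Path(line.strip()) for line in lines if line.strip()]
    # Anything else is treated as a single .mzML file.
    return [path]
```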

Options

| Option | Description | Default |
|---|---|---|
| -l, --lower | Lower retention time limit in seconds. | 30 |
| -u, --upper | Upper retention time limit in seconds. | 3000 |
| -t, --tolerance | Mass tolerance for matching peaks (Da). | 0.001 |
| -c, --cutoff | Minimum intensity threshold for peaks. | 0.05 |
| -m, --mass | Mass difference to search for (Da). | 79.9569 |
| -o, --csvoff | Disable CSV output for candidate pairs. | (enabled) |
| -g, --graph | Enable heatmap output. Use twice for helper lines. | (disabled) |
| -n, --minmass | Minimum m/z value for unsulfated species. | 0 |
| -d, --deisotopingdisable | Disable deisotoping of spectra. | (enabled) |
| -e, --merger | Enable merging of spectra before analysis. | (disabled) |
| -s, --subfolder | Name of subfolder for output files. | None |
| -r, --retlim | Disable retention time limit for sulfated species. | (enabled) |
| -q, --combine | Export combined TSV summary of all experiments. | (disabled) |
| -j, --msmsOn | Enable MS2 spectra analysis. | (disabled) |
| -f, --featuremap | Use pyOpenMS feature map algorithm instead of standard SEArCH. | (disabled) |
| -p, --scope | Comparison scope for the SEArCH algorithm. | 1 |
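
For readers who want to see how flags like these are commonly wired up, the sketch below mirrors a subset of the table with `argparse`. It is an assumption about how such a command line could be built, not a copy of Search.py: the argument types and the use of `action="count"` for the repeatable `-g` flag are guesses, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser covering a subset of the options above (illustrative)."""
    parser = argparse.ArgumentParser(prog="Search.py")
    parser.add_argument("filename")
    parser.add_argument("-l", "--lower", type=float, default=30)
    parser.add_argument("-u", "--upper", type=float, default=3000)
    parser.add_argument("-t", "--tolerance", type=float, default=0.001)
    parser.add_argument("-m", "--mass", type=float, default=79.9569)
    parser.add_argument("-n", "--minmass", type=float, default=0)
    parser.add_argument("-s", "--subfolder", default=None)
    # Counting occurrences lets "-g" enable heatmaps and "-gg" add helper lines.
    parser.add_argument("-g", "--graph", action="count", default=0)
    parser.add_argument("-q", "--combine", action="store_true")
    return parser


args = build_parser().parse_args(
    ["data/example.mzML", "-n", "100", "-u", "1200", "-s", "Example_Data", "-gg", "-q"]
)
print(args.graph, args.upper, args.combine)
```

Parsing these arguments yields a graph level of 2, an upper retention time limit of 1200 seconds, and the combined TSV export switched on, while unspecified options such as the mass difference keep their defaults.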

Output

  • CSV files: Candidate pairs found in each experiment.
  • Heatmaps: Visual representation of retention times.
  • Combined TSV: Summary of all experiments (if enabled).
  • ConsensusXML: Consensus feature map (if enabled).
  • MS2 TSV: MS2 analysis results (if enabled).
  • Log files: Full record of arguments, progress, and results.

Example

python Search.py data/example.mzML -n 100 -u 1200 -s Example_Data -gg -q 

Project Structure

Search2/
├── Search.py
├── search/
│   ├── __init__.py
│   ├── analyse.py
│   ├── export.py
│   ├── setup.py
│   ├── misc_functions.py
│   └── logs/
└── search_output/

License

MIT License


Note: This tool is intended for research use. Please validate results before using in production or clinical settings.
