Skip to content

Conversation

@jon-myers
Copy link
Contributor

Summary

Implements Issue #46 - Adds comprehensive spectrogram data access and visualization capabilities to the IDTAP Python API, specifically designed for computational musicology workflows like DN-Book-Figures.

Key Features

  • SpectrogramData class with full CQT spectrogram support
  • Matplotlib integration for overlaying pitch contours on spectrograms
  • Intensity transformation with power-law contrast enhancement (1.0-5.0)
  • 35+ colormap support from matplotlib
  • Frequency/time cropping for targeted analysis
  • Image export to PNG, JPEG, WebP, and other formats
  • Logarithmic frequency scale for proper CQT representation

Implementation Details

New Files

  • idtap/spectrogram.py (643 lines) - Complete SpectrogramData class
  • idtap/tests/spectrogram_test.py (450 lines) - 36 comprehensive tests
  • docs/api/spectrogram.rst - Full API documentation with examples

Modified Files

  • idtap/client.py - Added download_spectrogram_data() and download_spectrogram_metadata()
  • idtap/__init__.py - Exported SpectrogramData and SUPPORTED_COLORMAPS
  • pyproject.toml - Added dependencies: numpy, pillow, matplotlib
  • Pipfile - Synced dependencies
  • docs/api/index.rst - Added spectrogram to API reference
  • docs/index.rst - Added spectrogram to features list

Key Methods

Loading:

  • SpectrogramData.from_audio_id(audio_id, client) - Load from audio ID
  • SpectrogramData.from_piece(piece, client) - Load from Piece object

Visualization:

  • plot_on_axis(ax, power, cmap, alpha, log_freq) - Plot on matplotlib axis for overlays
  • to_matplotlib(figsize, power, cmap, show_colorbar, log_freq) - Generate standalone figure
  • get_plot_data(power, apply_cmap, cmap) - Low-level data access

Processing:

  • apply_intensity(power) - Power-law contrast enhancement
  • apply_colormap(data, cmap) - Apply matplotlib colormaps
  • crop_frequency(min_hz, max_hz) - Crop to frequency range
  • crop_time(start_time, end_time) - Crop to time segment

Export:

  • save(filepath, width, height, power, cmap) - Save as image file
  • to_image(width, height, power, cmap) - Generate PIL Image

Technical Specifications

  • Algorithm: Essentia NSGConstantQ (CQT)
  • Default Frequency Range: 75-2400 Hz
  • Default Bins Per Octave: 72 (high resolution for microtonal analysis)
  • Data Format: uint8 grayscale, gzip-compressed
  • Time Resolution: ~0.0116 seconds per frame
  • Frequency Scale: Logarithmic (log_freq=True by default)

Testing

  • All 401 tests passing (365 existing + 36 new)
  • Comprehensive test coverage:
    • Initialization and parameter validation
    • Loading from server (with mocking)
    • Intensity transforms
    • Colormap application
    • Frequency/time cropping
    • Matplotlib integration
    • Image generation
    • Properties and constants

Dependencies

Added to pyproject.toml:

"numpy>=1.20.0",
"pillow>=9.0.0",
"matplotlib>=3.5.0"

Example Usage

from idtap import SwaraClient, SpectrogramData
import matplotlib.pyplot as plt

# Load spectrogram
client = SwaraClient()
spec = SpectrogramData.from_audio_id("audio_id", client)

# Create matplotlib overlay with pitch contour
fig, ax = plt.subplots(figsize=(12, 6))
im = spec.plot_on_axis(ax, power=2.0, cmap='viridis', alpha=0.7, zorder=0)

# Overlay pitch data
ax.plot(times, pitches, 'r-', linewidth=2, zorder=1)
ax.set_xlabel('Time (s)')
ax.set_ylabel('Frequency (Hz)')
plt.colorbar(im, ax=ax, label='Intensity')

# Crop and save
cropped = spec.crop_frequency(200, 800).crop_time(0, 10)
cropped.save("output.png", power=2.5, cmap='magma')

Design Decisions

  • No caching by default - Simpler implementation, users can cache if needed
  • Logarithmic frequency scale default - Proper CQT representation
  • Client instance optional - Creates client automatically if not provided
  • from_piece() returns None if no audio_id - Graceful handling
  • Power-law transform range 1.0-5.0 - Prevents over-processing

Documentation

  • Complete Sphinx/RST documentation in docs/api/spectrogram.rst
  • Usage examples for basic visualization, matplotlib overlays, and cropping
  • Technical details and supported colormaps
  • Integrated into main API documentation

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

jon-myers and others added 3 commits November 6, 2025 13:47
…port

Resolves Issue #46

Add complete spectrogram support to enable programmatic access to the same
high-quality constant-Q spectrograms used in the IDTAP web application, with
extensive matplotlib integration for computational musicology research.

## New Features:

### SpectrogramData Class (`idtap/spectrogram.py`)
- **Data Loading**:
  - `from_audio_id(audio_id, client)` - Download from server
  - `from_piece(piece, client)` - Load from Piece object
  - Auto-decompresses gzipped spectrogram data from swara.studio

- **Transformations**:
  - `apply_intensity(power)` - Power-law contrast enhancement (1.0-5.0)
  - `apply_colormap(data, cmap)` - 35+ matplotlib colormaps
  - `crop_frequency(min_hz, max_hz)` - Frequency range cropping
  - `crop_time(start_time, end_time)` - Time range cropping

- **Matplotlib Integration** (for research workflows):
  - `plot_on_axis(ax, ...)` - Plot on existing axis for overlays
  - `get_plot_data(power, apply_cmap, cmap)` - Get processed data + extent
  - `get_extent()` - Get matplotlib extent [left, right, bottom, top]

- **Image Generation**:
  - `to_image(width, height, power, cmap)` - Generate PIL Image
  - `to_matplotlib(figsize, power, cmap)` - Generate standalone figure
  - `save(filepath, ...)` - Save to file (PNG, JPG, etc.)

- **Properties**:
  - `shape`, `duration`, `time_resolution`, `freq_bins`

### SwaraClient Updates (`idtap/client.py`)
- `download_spectrogram_data(audio_id)` - Download compressed data
- `download_spectrogram_metadata(audio_id)` - Download shape metadata

### Dependencies (`pyproject.toml`, `Pipfile`)
- Added `numpy>=1.20.0` for array processing
- Added `pillow>=9.0.0` for image generation
- Added `matplotlib>=3.5.0` for visualization

## Testing:
- 36 comprehensive test cases covering:
  - Data loading and initialization
  - Intensity transforms and colormap application
  - Frequency and time cropping
  - Matplotlib integration methods
  - Image generation and saving
  - All properties and edge cases
- **All 401 tests pass** (365 existing + 36 new)

## Usage Example:

```python
from idtap import SpectrogramData, get_piece
import matplotlib.pyplot as plt

# Load spectrogram
piece = get_piece("transcription_id")
spec = SpectrogramData.from_piece(piece)

# Create visualization with spectrogram underlay
fig, ax = plt.subplots(figsize=(12, 6))
spec.plot_on_axis(ax, power=2.5, cmap='viridis', alpha=0.6, zorder=0)

# Overlay pitch contour
ax.plot(times, freqs, 'r-', linewidth=2, zorder=1)
ax.set_xlabel('Time (s)')
ax.set_ylabel('Frequency (Hz)')

plt.savefig('figure.png', dpi=300)
```

## Documentation Updates:
- Updated CLAUDE.md with testing warning about browser authorization

## Design Decisions:
- Follows librosa/matplotlib patterns for research workflows
- Optional client parameter (creates if not provided)
- No caching in MVP (users can cache manually)
- Uses matplotlib colormaps (close enough to D3, simpler)
- Loads entire spectrogram into memory (suitable for typical sizes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add log_freq parameter to plot_on_axis() and to_matplotlib() (default: True)
- Fixes issue where CQT log-spaced frequency bins were displayed with linear scale
- Sets y-axis to log scale automatically for proper CQT representation
- Create docs/api/spectrogram.rst with SpectrogramData class documentation
- Add usage examples for visualization and matplotlib overlays
- Include technical details (CQT, frequency range, bins per octave)
- Update docs/api/index.rst to include spectrogram in toctree and quick reference
- Update docs/index.rst to list spectrogram feature
@github-actions
Copy link
Contributor

github-actions bot commented Nov 6, 2025

📦 Test Package Built Successfully!

This PR has been automatically built and uploaded to TestPyPI for testing.

🔗 TestPyPI Link: https://test.pypi.org/project/idtap/

To test this version:

pip install --index-url https://test.pypi.org/simple/ idtap

✅ All tests passed and package builds successfully.

@jon-myers
Copy link
Contributor Author

@claude review

@claude
Copy link
Contributor

claude bot commented Nov 6, 2025

Claude encountered an error —— View job


I'll analyze this and get back to you.

@jon-myers
Copy link
Contributor Author

@claude review

@claude
Copy link
Contributor

claude bot commented Nov 6, 2025

Claude finished @jon-myers's task —— View job


Code Review for Spectrogram Support Implementation

Review Tasks

  • Review SpectrogramData class implementation
  • Examine client.py modifications
  • Analyze test coverage and quality
  • Check dependency management
  • Assess documentation
  • Look for potential bugs and security issues
  • Evaluate performance considerations

Code Quality Assessment

Excellent Implementation Quality

This is a very well-designed and thoroughly implemented feature. The code demonstrates excellent software engineering practices:

Strong Architecture:

  • Clear separation of concerns between data loading, processing, and visualization
  • Consistent API design following Python conventions
  • Proper type hints throughout (idtap/spectrogram.py:7-17)
  • Defensive programming with comprehensive parameter validation

Robust Error Handling:

  • Input validation for numpy arrays (idtap/spectrogram.py:64-69)
  • Power transform range validation (idtap/spectrogram.py:142-143)
  • Colormap validation with helpful error messages (idtap/spectrogram.py:177-180)

🎯 Design Decisions - Well Justified

Logarithmic frequency scale default (idtap/spectrogram.py:316-360):

  • Excellent choice for CQT spectrograms - mathematically correct
  • Proper implementation with reasonable y-axis limits

No caching by default - Simplified architecture, appropriate for research workflows

Client optional pattern (idtap/spectrogram.py:93-95) - Good UX design

Test Coverage Analysis

Comprehensive Testing

The test suite is exemplary with 36 tests covering:

  • Parameter validation and edge cases
  • Data loading with proper mocking (idtap/tests/spectrogram_test.py:79-98)
  • All transformation methods (intensity, colormap, cropping)
  • Matplotlib integration
  • Image generation

Test Quality Highlights:

  • Proper mocking of HTTP client to avoid network dependencies
  • Edge case testing (invalid inputs, boundary conditions)
  • Synthetic data generation for consistent testing

Dependencies & Security

Well-Managed Dependencies

Added dependencies are appropriate:

  • numpy>=1.20.0 - Essential for array operations
  • pillow>=9.0.0 - Image processing, good version floor
  • matplotlib>=3.5.0 - Visualization, reasonable version requirement

Dependency sync maintained between pyproject.toml and Pipfile

🔒 Security Assessment - Good

Data Handling Security:

  • Input validation prevents buffer overflows (idtap/spectrogram.py:64-69)
  • Type checking prevents injection attacks
  • Gzip decompression is safe (using stdlib gzip)

HTTP Client Security:

  • Reuses existing authenticated client - no new attack vectors
  • No direct URL construction from user input

Performance Considerations

Performance Design

Efficient Processing:

  • Vectorized numpy operations for intensity transforms (idtap/spectrogram.py:148-152)
  • In-place operations where appropriate
  • PIL integration for efficient image handling

Memory Management:

  • Creates new instances for cropping (immutable pattern) - good for research workflows
  • No unnecessary data copying in transforms

Potential Improvements:

  • Could add lazy loading for very large spectrograms
  • Consider chunked processing for memory-constrained environments

Minor Issues Found

🔍 Small Areas for Improvement

  1. Magic Number Documentation (idtap/spectrogram.py:509):

    return 0.0116  # seconds per frame (approximate)

    The comment explains it's approximate, but could benefit from more detail about the calculation source.

  2. Error Message Consistency (idtap/spectrogram.py:177-180):
    Could reference the constant SUPPORTED_COLORMAPS more explicitly in error messages.

  3. PIL Import Pattern (idtap/spectrogram.py:384, 411):
    Multiple PIL imports could be consolidated, but this is very minor.

Client.py Integration

Clean Integration

New methods are well-designed (idtap/client.py:687-709):

  • Clear method names and documentation
  • Consistent return types
  • Proper endpoint construction
  • Reuses existing _get method

API Design:

  • Separate data and metadata endpoints - good separation
  • Binary data handling for compressed spectrograms

Documentation Quality

Excellent Documentation

Sphinx/RST documentation (docs/api/spectrogram.rst):

  • Complete API reference
  • Usage examples
  • Technical specifications
  • Well-integrated into main docs

Code Documentation:

  • Comprehensive docstrings with examples
  • Clear parameter descriptions
  • Return type documentation

Overall Assessment

🌟 Outstanding Implementation

This is production-ready code that demonstrates:

  • Excellent software engineering practices
  • Comprehensive testing (401 tests total, all passing)
  • Thoughtful API design for research workflows
  • Proper security considerations
  • Thorough documentation

🎯 Recommendation: Approve

This PR implements a complex feature with exceptional quality. The code is:

  • Well-architected and maintainable
  • Thoroughly tested with comprehensive coverage
  • Properly documented for users and developers
  • Secure with appropriate input validation
  • Performance-conscious with efficient algorithms

The implementation successfully balances simplicity for researchers with the power needed for computational musicology workflows.


@jon-myers jon-myers merged commit 870b1c3 into main Nov 6, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants