
Correlate two profiles of striation marks #98

Open

laurensWe wants to merge 46 commits into main from feature/migrate_ProfileCorrelateSingle.m

Conversation


@laurensWe laurensWe commented Jan 23, 2026

Migrates ProfileCorrelateSingle.m from MATLAB to Python, implementing a profile correlation algorithm for comparing striated marks on bullets.

Key changes:

  • New profile_correlator module with clean separation of concerns:
    • correlator.py - Main correlate_profiles() function using brute-force alignment
    • data_types.py - Profile, AlignmentParameters, ComparisonResults dataclasses
    • statistics.py - Roughness metrics (Sa, Sq), signature differences, overlap ratio
    • transforms.py - Pixel equalization and scaling using scipy interpolation
  • Algorithm: Uses brute-force grid search over shifts and scales (instead of MATLAB's fminsearchbnd 2D optimization). Tries all shift positions with 7 scale factors (±5% range), selecting the maximum correlation.
  • Minimum overlap: 350 μm (enforced with np.ceil to prevent rounding issues)
  • Simplified data model: Profile contains only depth_data (1D array) and pixel_size
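
A rough sketch of that brute-force search (illustrative only: the function name, signature, and use of np.interp as the resampler are assumptions, not the module's actual API):

```python
import numpy as np

def correlate_profiles_sketch(
    heights_ref: np.ndarray,
    heights_comp: np.ndarray,
    pixel_size: float,
    min_overlap_um: float = 350.0,
    max_scaling: float = 0.05,
    n_scales: int = 7,
) -> tuple[int, float, float]:
    """Brute-force grid search over integer shifts and discrete scales."""
    # Minimum overlap in samples, with np.ceil to avoid rounding issues.
    min_overlap = int(np.ceil(min_overlap_um / pixel_size))
    best_shift, best_scale, best_r = 0, 1.0, -np.inf
    for scale in np.linspace(1.0 - max_scaling, 1.0 + max_scaling, n_scales):
        # Resample the comparison profile by the trial scale factor.
        n = max(2, round(len(heights_comp) * scale))
        comp = np.interp(
            np.linspace(0.0, len(heights_comp) - 1.0, n),
            np.arange(len(heights_comp)),
            heights_comp,
        )
        # Try every integer shift that leaves at least the minimum overlap.
        for shift in range(-(len(comp) - min_overlap),
                           len(heights_ref) - min_overlap + 1):
            lo_ref, lo_comp = max(0, shift), max(0, -shift)
            overlap = min(len(heights_ref) - lo_ref, len(comp) - lo_comp)
            if overlap < min_overlap:
                continue
            a = heights_ref[lo_ref:lo_ref + overlap]
            b = comp[lo_comp:lo_comp + overlap]
            r = np.corrcoef(a, b)[0, 1]
            if r > best_r:
                best_shift, best_scale, best_r = shift, float(scale), r
    return best_shift, best_scale, best_r
```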

Test plan

  • Unit tests for correlator, statistics, and transforms modules
  • Synthetic profile tests (shifted, scaled, partial, flipped)
  • Regression tests with 17 real-like profile pairs

laurensWe and others added 25 commits January 20, 2026 13:44
…zation

The brute-force correlator now uses the same logic for all profile lengths,
so there's no need to distinguish between partial and full profile matching.

- Remove is_partial_profile and partial_start_position from ComparisonResults
- Update visualization to always use position_shift for alignment
- Simplify test assertions to not check partial profile flag
- Delete old correlate_profiles_old output directory
- Update test images with new visualization format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add generate_matlab_visualizations.py script to generate images from
  MATLAB test data in resources/profile_correlator/
- Fix NaN handling in plot_correlation_result using nanmin/nanmax
- Generate 20 test case visualizations in outputs/matlab_test_cases/

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use each profile's own pixel size for x-axis calculation instead of
  assuming both profiles have the same pixel size
- Only apply partial profile offset logic when pixel sizes match

This fixes the different_sampling visualization where ref (3.5 μm/pixel)
and comp (5.0 μm/pixel) have different physical lengths.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@laurensWe

Comparison with MATLAB

This implementation deliberately simplifies the MATLAB ProfileCorrelateSingle.m
algorithm for maintainability (~300 lines vs 3000+ lines).

Intentional Simplifications (divergence from MATLAB):

  1. Global search: MATLAB uses multi-scale coarse-to-fine search with bounded
    ranges at each level. This implementation searches all positions globally,
    which can find different (sometimes better) alignments for repetitive patterns.

  2. No Nelder-Mead optimization: MATLAB uses fminsearch for sub-sample
    precision. This implementation uses discrete sample shifts only.

  3. No low-pass filtering: MATLAB filters profiles at each scale level.
    This implementation operates on the original profiles.

  4. Discrete scale factors: Instead of continuous optimization, we try
    a fixed set of scale factors (e.g., 0.95, 0.97, ..., 1.05).
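
The fixed scale grid of point 4 might look like this (a minimal sketch; the helper name is hypothetical):

```python
import numpy as np

def discrete_scale_factors(max_scaling: float = 0.05,
                           n_steps: int = 7) -> np.ndarray:
    """Evenly spaced trial scale factors around 1.0 (e.g. 0.95 ... 1.05)."""
    scales = np.linspace(1.0 - max_scaling, 1.0 + max_scaling, n_steps)
    # Including reciprocals keeps the search symmetric in which profile
    # is treated as reference: scaling comp by s ~ scaling ref by 1/s.
    return np.unique(np.concatenate((scales, 1.0 / scales)))
```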



def correlate_profiles(
profile_ref: Profile,
Collaborator

Please write out full names

Collaborator

Laurens and Peter think that names are self-explanatory enough


github-actions bot commented Feb 6, 2026

Diff Coverage

Diff: origin/main..HEAD, staged and unstaged changes

  • packages/scratch-core/src/conversion/preprocess_impression/preprocess_impression.py (100%)
  • packages/scratch-core/src/conversion/profile_correlator/__init__.py (100%)
  • packages/scratch-core/src/conversion/profile_correlator/correlator.py (96.2%): Missing lines 112,174,189
  • packages/scratch-core/src/conversion/profile_correlator/data_types.py (98.0%): Missing lines 42
  • packages/scratch-core/src/conversion/profile_correlator/statistics.py (96.9%): Missing lines 108
  • packages/scratch-core/src/conversion/profile_correlator/transforms.py (87.5%): Missing lines 45,56
  • packages/scratch-core/src/conversion/resample.py (100%)

Summary

  • Total: 196 lines
  • Missing: 7 lines
  • Coverage: 96%

packages/scratch-core/src/conversion/profile_correlator/correlator.py

Lines 108-116

  108     alignment = _find_best_alignment(
  109         heights_ref, heights_comp, scale_factors, min_overlap_samples
  110     )
  111     if alignment is None:
! 112         return None
  113 
  114     # Step 3: Compute and return metrics
  115     return _compute_metrics(alignment, pixel_size, len_ref, len_comp)

Lines 170-178

  170                 shift, len_comp, len_ref
  171             )  # Calculate overlap region for this shift
  172 
  173             if overlap_length < min_overlap_samples:
! 174                 continue
  175 
  176             partial_ref = heights_ref[idx_ref_start : idx_ref_start + overlap_length]
  177             partial_comp = heights_comp_scaled[
  178                 idx_comp_start : idx_comp_start + overlap_length

Lines 185-193

  185                 best_shift = shift
  186                 best_scale = scale
  187 
  188     if best_correlation is None:
! 189         return None
  190 
  191     # Redo computations for best_scale and best_shift (instead of copying partial_ref and partial_comp above multiple times; this saves time.)
  192     heights_comp_scaled = resample_array_1d(heights_comp, best_scale)
  193     len_comp = len(heights_comp_scaled)

packages/scratch-core/src/conversion/profile_correlator/data_types.py

Lines 38-46

  38         Get the number of samples in the profile.
  39 
  40         :returns: Number of samples in heights.
  41         """
! 42         return len(self.heights)
  43 
  44 
  45 @dataclass(frozen=True)
  46 class AlignmentParameters:

packages/scratch-core/src/conversion/profile_correlator/statistics.py

Lines 104-112

  104         or if overlap_length exceeds shorter_length (invalid input).
  105     """
  106     shorter_length = min(ref_length, comp_length)
  107     if np.isclose(shorter_length, 0.0):
! 108         return np.nan
  109     if overlap_length > shorter_length:
  110         return np.nan
  111     return 0.5 * (overlap_length / ref_length + overlap_length / comp_length)

packages/scratch-core/src/conversion/profile_correlator/transforms.py

Lines 41-49

  41     # Downsample the higher-resolution profile to match the lower-resolution one
  42     if pixel_1 > pixel_2:
  43         to_downsample, target_pixel_size = profile_2, profile_1.pixel_size
  44     else:
! 45         to_downsample, target_pixel_size = profile_1, profile_2.pixel_size
  46 
  47     factor = target_pixel_size / to_downsample.pixel_size
  48     downsampled = Profile(
  49         heights=resample_array_1d(to_downsample.heights, factor),

Lines 52-57

  52 
  53     if pixel_1 > pixel_2:
  54         return profile_1, downsampled
  55     else:
! 56         return downsampled, profile_2


github-actions bot commented Feb 6, 2026

Code Coverage

| Package | Line Rate | Branch Rate |
| --- | --- | --- |
| . | 96% | 88% |
| comparators | 100% | 100% |
| computations | 100% | 100% |
| container_models | 99% | 100% |
| conversion | 97% | 86% |
| conversion.export | 100% | 100% |
| conversion.filter | 92% | 83% |
| conversion.leveling | 100% | 100% |
| conversion.leveling.solver | 100% | 75% |
| conversion.plots | 98% | 85% |
| conversion.preprocess_impression | 99% | 91% |
| conversion.preprocess_striation | 89% | 58% |
| conversion.profile_correlator | 96% | 80% |
| extractors | 98% | 75% |
| mutations | 100% | 100% |
| parsers | 98% | 80% |
| parsers.patches | 89% | 60% |
| preprocessors | 95% | 75% |
| processors | 100% | 100% |
| renders | 98% | 50% |
| utils | 91% | 75% |
| Summary | 96% (2201 / 2281) | 82% (237 / 290) |

Minimum allowed line rate is 50%

shift: int, len_small: int, len_large: int
) -> tuple[int, int, int]:
"""
Find starting idx for both striations, and compute overlap length
Collaborator

Suggested change:
- Find starting idx for both striations, and compute overlap length
+ Find starting index for both striations, and compute overlap length

idx_small_start = -shift
overlap_length = min(len_large, len_small + shift)

return idx_small_start, idx_large_start, overlap_length

@cfs-data cfs-data Feb 6, 2026


I understand it was like this in MATLAB, so this is not a suggestion for this PR, but: shouldn't the overlap length be a variable as well (ranging from min_overlap_samples to the maximum possible overlap)? Now the comparison profile is cut either from the beginning or from the end, which allows for "small" overlapping regions with high correlation at the ends of the profiles, but not anywhere else. This seems arbitrary?

Collaborator

@laurensWe @vergep What do you think?

)

# make scaling symmetric with respect to which profile you choose as ref or comp
scale_factors = np.unique(np.concatenate((scale_factors, 1 / scale_factors)))

@cfs-data cfs-data Feb 6, 2026


Comments in code usually suggest you should perhaps use more functions:

def _get_scale_factors(max_scaling: float, n_steps: int) -> NDArray:
    """Generate scale factors."""
    scales = np.linspace(1.0 - max_scaling, 1.0 + max_scaling, n_steps)
    scales = np.unique(np.concatenate((scales, 1 / scales)))
    return scales


# Redo computations for best_scale and best_shift (instead of copying partial_ref and partial_comp above multiple times; this saves time.)
heights_comp_scaled = resample_array_1d(heights_comp, best_scale)
len_comp = len(heights_comp_scaled)
Collaborator

This line can be removed


if best_correlation is None:
return None


@cfs-data cfs-data Feb 6, 2026


You could consider splitting the function here into two separate functions by returning the parameters found here, since now your function is doing more than one thing (which is generally undesirable)


@cfs-data cfs-data left a comment


Nice!

return result


def resample_array_2d(

@cfs-data cfs-data Feb 6, 2026


I think you should update every usage of this function in the codebase?
