feat: detect and display source IPs of port scans by pirate · Pull Request #54 · pirate/security-growler

pirate · 2025-12-06T00:08:19Z

Fixes #12

Enhances port scan detection to identify and display the source IP address(es) when a scan is detected.

Changes

Added get_recent_connections() function to capture active network connections using lsof
Enhanced parse_portscan_events() to extract and display source IPs
Smart formatting for single, multiple, or many sources
Falls back to "(source unknown)" when connections aren't captured

Implementation

The solution addresses the timing challenge by capturing a snapshot of all active connections at the moment the kernel detects port scan rate limiting. While port scans are rapid and may not always leave connections open long enough, this approach maximizes the chances of identifying the scanner.

Generated with Claude Code) | View branch | [View job run](https://github.com/pirate/security-growler/actions/runs/19909355036

When a port scan is detected, the app now attempts to identify the source IP address(es) by capturing active network connections using lsof. This addresses the issue where port scan alerts didn't show who was scanning. Implementation: - Added get_recent_connections() to capture active connections via lsof - Enhanced parse_portscan_events() to extract and display source IPs - Smart formatting: shows single IP, multiple IPs, or 'X more' notation - Falls back to '(source unknown)' if no connections are captured Fixes #12 Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>

claude · 2025-12-06T00:09:54Z

Code Review for PR #54: Detect and display source IPs of port scans

Thank you for implementing this feature! Overall, this is a solid implementation that addresses the issue well. Here are my findings:

✅ Strengths

Good problem-solving approach: The implementation cleverly captures active connections at the moment of detection to work around the timing challenge of rapid port scans.
Smart formatting logic: The display logic (lines 573-583) handles different scenarios elegantly:
- Single source
- Multiple sources (up to 3)
- Many sources with "(+N more)" notation
- Fallback to "(source unknown)"
Deduplication: The code properly deduplicates source IPs using a set (line 514), preventing redundant entries.
Consistent error handling: Uses try/except pattern consistent with other functions in the codebase (lines 504-543).
Good documentation: Docstring clearly explains the purpose and approach.

🐛 Issues & Concerns

1. Logic Bug in `get_recent_connections()` (Line 575)

The most significant issue is on line 575:

source_ips = list(set(conn["remote_ip"] for conn in recent_connections[:5]))

Problem: This slices the connections list BEFORE extracting unique IPs. This means:

If you have 10 connections from the same IP, you'd still limit to first 5 connections
If those 5 are all from the same IP, you'd show 1 source (correct)
But you'd miss potentially unique IPs from connections 6-10

Fix: Extract unique IPs first, THEN slice:

source_ips = list(set(conn["remote_ip"] for conn in recent_connections))[:5]

Or if you want to show up to 5 unique IPs but limit checking to first N connections:

# Get unique IPs from recent connections (checking more connections)
all_source_ips = list(set(conn["remote_ip"] for conn in recent_connections[:20]))
source_ips = all_source_ips[:5]  # Show up to 5

2. Potential Command Injection Vulnerability

While this specific instance is safe (no user input), using shell=True (line 507) with subprocess is generally discouraged. The codebase uses it elsewhere, but it's worth noting:

Current code (line 502):

cmd = "lsof -i -n -P 2>/dev/null | grep -v '^COMMAND' | grep -v 'LISTEN'"
subprocess.run(cmd, shell=True, ...)

Safer alternative (if you want to avoid shell=True):

# This would require parsing differently, but is more secure
result = subprocess.run(
    ["lsof", "-i", "-n", "-P"],
    capture_output=True,
    text=True,
    timeout=10
)
# Then filter the results in Python

However, since this pattern is used consistently throughout the codebase and there's no user input here, it's acceptable for consistency.

3. Performance Consideration: lsof on Every Port Scan Event

The lsof command can be somewhat expensive, especially on systems with many open connections. However:

✅ Port scan events are relatively rare
✅ The 10-second timeout prevents hanging
✅ The impact is minimal for this use case

Recommendation: Current implementation is acceptable, but if performance becomes an issue, consider caching results for 1-2 seconds.

4. Parsing Robustness

Lines 516-538 parse lsof output by splitting on whitespace:

parts = line.split()
if len(parts) >= 9:
    name = parts[-1] if parts else ""

Issues:

lsof output format can vary slightly depending on system state
Process names with spaces might throw off the parsing
The if parts check is redundant since you already verified len(parts) >= 9

Recommendations:

The parts[-1] approach (using last field) is good and more robust
Remove redundant check: Change name = parts[-1] if parts else "" to just name = parts[-1]
Consider adding a check for empty remote_ip: if remote_ip and remote_ip not in seen_sources and not remote_ip.startswith("*"):

5. IPv6 Support

The current parsing logic (line 528) uses rsplit(":", 1) which works for IPv4 but could break with IPv6 addresses that contain multiple colons:

IPv4: 192.168.1.1:443 → works ✅
IPv6: [2001:db8::1]:443 → might work ✅
IPv6 without brackets: 2001:db8::1:443 → breaks ❌

Recommendation: Add IPv6 handling:

if "->" in name:
    local, remote = name.split("->", 1)
    # Handle IPv6 addresses in brackets [::1]:port
    if remote.startswith("[") and "]" in remote:
        remote_ip = remote.split("]")[0][1:]
    elif ":" in remote:
        remote_ip = remote.rsplit(":", 1)[0]
    else:
        remote_ip = remote

⚠️ Test Coverage

There are no automated tests for this new functionality. While the project doesn't appear to have a test suite, consider:

Manual testing with actual port scans (e.g., using nmap)
Testing edge cases:
- No active connections during scan
- Multiple simultaneous scans
- Very fast scans that leave no connections
- IPv6 connections

📝 Minor Suggestions

Line 521: Comment could be more accurate:

# Parse lsof output format (9+ fields):
# COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME

Line 575: Consider increasing the limit from 5 to a higher number for internal tracking:

# Get up to 20 unique source IPs, but display max 5
source_ips = list(set(conn["remote_ip"] for conn in recent_connections))[:20]

Error logging: The function silently fails on timeout/errors. Consider logging for debugging:

except (subprocess.TimeoutExpired, subprocess.SubprocessError) as e:
    # Optionally log error for debugging
    # log_debug(f"Failed to get connections: {e}")
    return []

🎯 Summary

Must Fix:

✅ Fix the logic bug on line 575 (slice after dedup, not before)

Should Consider:

Add IPv6 support
Remove redundant checks
Test manually with real port scans

Nice to Have:

Add debug logging
Document expected lsof output format
Consider caching if performance becomes an issue

Verdict

This is a good implementation that solves the problem effectively. With the line 575 fix, this is ready to merge. The IPv6 issue is a nice-to-have improvement that could be addressed in a follow-up PR.

Great work addressing a 9-year-old issue! 🎉

pirate · 2025-12-06T00:14:20Z

@claude can you fix all the issues addressed in this review, we dont need IPv6 support, forget about that for now

claude · 2025-12-06T00:14:31Z

Claude finished @pirate's task —— View job • claude/pr-54-20251206-0014 • Create PR ➔

Fixed Issues from Code Review

All issues identified in the code review have been addressed (excluding IPv6 support as requested).

Completed Tasks:

Read the current implementation
Fix logic bug on line 575 (slice after dedup, not before)
Remove redundant if parts check
Add check for empty remote_ip
Improve comment accuracy on line 521
Commit and push changes
Provide PR link

Changes Made

Fixed critical logic bug (line 575): Changed from list(set(...))[:5] on first 5 connections to deduplicating all connections first, then limiting display to 5 unique IPs. This ensures we don't miss unique IPs from connections beyond the first 5.
Removed redundant check (line 521): Eliminated the unnecessary if parts condition since we already validate len(parts) >= 9.
Added IP filtering (line 530): Now checks for empty remote_ip and filters out wildcard addresses (starting with *) to avoid invalid entries.
Improved comment accuracy (line 519): Updated the comment to clearly document the expected lsof output format with proper field names.

These fixes improve the robustness and correctness of the port scan detection feature without changing its core functionality.

pirate merged commit 4cc1b54 into master Dec 6, 2025
2 checks passed

pirate mentioned this pull request Feb 8, 2026

fix: address code review issues in port scan detection #58

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: detect and display source IPs of port scans#54

feat: detect and display source IPs of port scans#54
pirate merged 1 commit intomasterfrom
claude/issue-12-20251203-2125

pirate commented Dec 6, 2025

Uh oh!

Uh oh!

claude bot commented Dec 6, 2025

Uh oh!

pirate commented Dec 6, 2025

Uh oh!

claude bot commented Dec 6, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

pirate commented Dec 6, 2025

Changes

Implementation

Uh oh!

Uh oh!

claude bot commented Dec 6, 2025

Code Review for PR #54: Detect and display source IPs of port scans

✅ Strengths

🐛 Issues & Concerns

1. Logic Bug in get_recent_connections() (Line 575)

2. Potential Command Injection Vulnerability

3. Performance Consideration: lsof on Every Port Scan Event

4. Parsing Robustness

5. IPv6 Support

⚠️ Test Coverage

📝 Minor Suggestions

🎯 Summary

Verdict

Uh oh!

pirate commented Dec 6, 2025

Uh oh!

claude bot commented Dec 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fixed Issues from Code Review

Changes Made

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

1. Logic Bug in `get_recent_connections()` (Line 575)

claude bot commented Dec 6, 2025 •

edited

Loading