Skip to content

feat: detect and display source IPs of port scans#54

Merged
pirate merged 1 commit intomasterfrom
claude/issue-12-20251203-2125
Dec 6, 2025
Merged

feat: detect and display source IPs of port scans#54
pirate merged 1 commit intomasterfrom
claude/issue-12-20251203-2125

Conversation

@pirate
Copy link
Owner

@pirate pirate commented Dec 6, 2025

Fixes #12

Enhances port scan detection to identify and display the source IP address(es) when a scan is detected.

Changes

  • Added get_recent_connections() function to capture active network connections using lsof
  • Enhanced parse_portscan_events() to extract and display source IPs
  • Smart formatting for single, multiple, or many sources
  • Falls back to "(source unknown)" when connections aren't captured

Implementation

The solution addresses the timing challenge by capturing a snapshot of all active connections at the moment the kernel detects port scan rate limiting. While port scans are rapid and may not always leave connections open long enough, this approach maximizes the chances of identifying the scanner.

Generated with Claude Code) | View branch | [View job run](https://github.com/pirate/security-growler/actions/runs/19909355036

When a port scan is detected, the app now attempts to identify the
source IP address(es) by capturing active network connections using lsof.
This addresses the issue where port scan alerts didn't show who was
scanning.

Implementation:
- Added get_recent_connections() to capture active connections via lsof
- Enhanced parse_portscan_events() to extract and display source IPs
- Smart formatting: shows single IP, multiple IPs, or 'X more' notation
- Falls back to '(source unknown)' if no connections are captured

Fixes #12

Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>
@pirate pirate merged commit 4cc1b54 into master Dec 6, 2025
2 checks passed
@claude
Copy link

claude bot commented Dec 6, 2025

Code Review for PR #54: Detect and display source IPs of port scans

Thank you for implementing this feature! Overall, this is a solid implementation that addresses the issue well. Here are my findings:

✅ Strengths

  1. Good problem-solving approach: The implementation cleverly captures active connections at the moment of detection to work around the timing challenge of rapid port scans.

  2. Smart formatting logic: The display logic (lines 573-583) handles different scenarios elegantly:

    • Single source
    • Multiple sources (up to 3)
    • Many sources with "(+N more)" notation
    • Fallback to "(source unknown)"
  3. Deduplication: The code properly deduplicates source IPs using a set (line 514), preventing redundant entries.

  4. Consistent error handling: Uses try/except pattern consistent with other functions in the codebase (lines 504-543).

  5. Good documentation: Docstring clearly explains the purpose and approach.

🐛 Issues & Concerns

1. Logic Bug in get_recent_connections() (Line 575)

The most significant issue is on line 575:

source_ips = list(set(conn["remote_ip"] for conn in recent_connections[:5]))

Problem: This slices the connections list BEFORE extracting unique IPs. This means:

  • If you have 10 connections from the same IP, you'd still limit to first 5 connections
  • If those 5 are all from the same IP, you'd show 1 source (correct)
  • But you'd miss potentially unique IPs from connections 6-10

Fix: Extract unique IPs first, THEN slice:

source_ips = list(set(conn["remote_ip"] for conn in recent_connections))[:5]

Or if you want to show up to 5 unique IPs but limit checking to first N connections:

# Get unique IPs from recent connections (checking more connections)
all_source_ips = list(set(conn["remote_ip"] for conn in recent_connections[:20]))
source_ips = all_source_ips[:5]  # Show up to 5

2. Potential Command Injection Vulnerability

While this specific instance is safe (no user input), using shell=True (line 507) with subprocess is generally discouraged. The codebase uses it elsewhere, but it's worth noting:

Current code (line 502):

cmd = "lsof -i -n -P 2>/dev/null | grep -v '^COMMAND' | grep -v 'LISTEN'"
subprocess.run(cmd, shell=True, ...)

Safer alternative (if you want to avoid shell=True):

# This would require parsing differently, but is more secure
result = subprocess.run(
    ["lsof", "-i", "-n", "-P"],
    capture_output=True,
    text=True,
    timeout=10
)
# Then filter the results in Python

However, since this pattern is used consistently throughout the codebase and there's no user input here, it's acceptable for consistency.

3. Performance Consideration: lsof on Every Port Scan Event

The lsof command can be somewhat expensive, especially on systems with many open connections. However:

  • ✅ Port scan events are relatively rare
  • ✅ The 10-second timeout prevents hanging
  • ✅ The impact is minimal for this use case

Recommendation: Current implementation is acceptable, but if performance becomes an issue, consider caching results for 1-2 seconds.

4. Parsing Robustness

Lines 516-538 parse lsof output by splitting on whitespace:

parts = line.split()
if len(parts) >= 9:
    name = parts[-1] if parts else ""

Issues:

  • lsof output format can vary slightly depending on system state
  • Process names with spaces might throw off the parsing
  • The if parts check is redundant since you already verified len(parts) >= 9

Recommendations:

  • The parts[-1] approach (using last field) is good and more robust
  • Remove redundant check: Change name = parts[-1] if parts else "" to just name = parts[-1]
  • Consider adding a check for empty remote_ip: if remote_ip and remote_ip not in seen_sources and not remote_ip.startswith("*"):

5. IPv6 Support

The current parsing logic (line 528) uses rsplit(":", 1) which works for IPv4 but could break with IPv6 addresses that contain multiple colons:

  • IPv4: 192.168.1.1:443 → works ✅
  • IPv6: [2001:db8::1]:443 → might work ✅
  • IPv6 without brackets: 2001:db8::1:443 → breaks ❌

Recommendation: Add IPv6 handling:

if "->" in name:
    local, remote = name.split("->", 1)
    # Handle IPv6 addresses in brackets [::1]:port
    if remote.startswith("[") and "]" in remote:
        remote_ip = remote.split("]")[0][1:]
    elif ":" in remote:
        remote_ip = remote.rsplit(":", 1)[0]
    else:
        remote_ip = remote

⚠️ Test Coverage

There are no automated tests for this new functionality. While the project doesn't appear to have a test suite, consider:

  • Manual testing with actual port scans (e.g., using nmap)
  • Testing edge cases:
    • No active connections during scan
    • Multiple simultaneous scans
    • Very fast scans that leave no connections
    • IPv6 connections

📝 Minor Suggestions

  1. Line 521: Comment could be more accurate:

    # Parse lsof output format (9+ fields):
    # COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
  2. Line 575: Consider increasing the limit from 5 to a higher number for internal tracking:

    # Get up to 20 unique source IPs, but display max 5
    source_ips = list(set(conn["remote_ip"] for conn in recent_connections))[:20]
  3. Error logging: The function silently fails on timeout/errors. Consider logging for debugging:

    except (subprocess.TimeoutExpired, subprocess.SubprocessError) as e:
        # Optionally log error for debugging
        # log_debug(f"Failed to get connections: {e}")
        return []

🎯 Summary

Must Fix:

  • ✅ Fix the logic bug on line 575 (slice after dedup, not before)

Should Consider:

  • Add IPv6 support
  • Remove redundant checks
  • Test manually with real port scans

Nice to Have:

  • Add debug logging
  • Document expected lsof output format
  • Consider caching if performance becomes an issue

Verdict

This is a good implementation that solves the problem effectively. With the line 575 fix, this is ready to merge. The IPv6 issue is a nice-to-have improvement that could be addressed in a follow-up PR.

Great work addressing a 9-year-old issue! 🎉

@pirate
Copy link
Owner Author

pirate commented Dec 6, 2025

@claude can you fix all the issues addressed in this review, we dont need IPv6 support, forget about that for now

@claude
Copy link

claude bot commented Dec 6, 2025

Claude finished @pirate's task —— View jobclaude/pr-54-20251206-0014Create PR ➔


Fixed Issues from Code Review

All issues identified in the code review have been addressed (excluding IPv6 support as requested).

Completed Tasks:

  • Read the current implementation
  • Fix logic bug on line 575 (slice after dedup, not before)
  • Remove redundant if parts check
  • Add check for empty remote_ip
  • Improve comment accuracy on line 521
  • Commit and push changes
  • Provide PR link

Changes Made

  1. Fixed critical logic bug (line 575): Changed from list(set(...))[:5] on first 5 connections to deduplicating all connections first, then limiting display to 5 unique IPs. This ensures we don't miss unique IPs from connections beyond the first 5.

  2. Removed redundant check (line 521): Eliminated the unnecessary if parts condition since we already validate len(parts) >= 9.

  3. Added IP filtering (line 530): Now checks for empty remote_ip and filters out wildcard addresses (starting with *) to avoid invalid entries.

  4. Improved comment accuracy (line 519): Updated the comment to clearly document the expected lsof output format with proper field names.

These fixes improve the robustness and correctness of the port scan detection feature without changing its core functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Detect source of portscans

1 participant