A comprehensive serverless solution for tracking IAM, STS, and Console signin activities using AWS-native services, designed to operate within AWS free tier limits for most use cases. Features real-time security alerting and advanced analytics capabilities.
Critical Understanding:
- IAM events (iam.amazonaws.com) are recorded ONLY in us-east-1 regardless of where the API call originates
- STS events (sts.amazonaws.com) are recorded in the region where the assume role call was made
- Signin events (signin.amazonaws.com) are recorded ONLY in us-east-1 for global console authentication
- All three event types are essential for complete activity tracking
- Table Name: `{stack-name}-events`
- Purpose: Store IAM, STS, signin, and SSO/Identity Center events efficiently with fast query capabilities
- Schema Design:
  - Partition Key: `event_date` (format: YYYY-MM-DD)
  - Sort Key: `event_id` (CloudTrail Event ID)
  - Attributes:
    - `event_time` (ISO timestamp)
    - `event_name` (CloudTrail action)
    - `event_source` (iam/sts/signin.amazonaws.com/sso.amazonaws.com)
    - `event_type` (parsed: iam/sts/signin/sso)
    - `user_name` (extracted user/role name)
    - `aws_region` (important for STS and SSO events)
    - `source_ip`
    - `user_agent`
    - `request_parameters` (JSON)
    - `response_elements` (JSON)
    - `error_code` (if failed)
    - `error_message` (if failed)
- Global Secondary Indexes:
  - GSI on `user_name` for user-specific queries
  - GSI on `event_name` for action-specific queries
- TTL: Optional - set on items older than configured retention period
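The schema above can be illustrated with a minimal sketch that maps one raw CloudTrail record onto the table attributes. The function name `build_event_item` and the `EVENT_TYPES` mapping are hypothetical and are not taken from the project code; only the CloudTrail field names (`eventTime`, `eventID`, `sourceIPAddress`, and so on) are standard.

```python
import json
from datetime import datetime, timezone

# Assumed mapping from CloudTrail eventSource to the parsed event_type attribute.
EVENT_TYPES = {
    "iam.amazonaws.com": "iam",
    "sts.amazonaws.com": "sts",
    "signin.amazonaws.com": "signin",
    "sso.amazonaws.com": "sso",
}


def build_event_item(event: dict, user_name: str) -> dict:
    """Map a raw CloudTrail record onto the events-table attributes (sketch)."""
    # CloudTrail eventTime is UTC, e.g. "2025-08-10T18:26:07Z".
    event_time = datetime.strptime(
        event["eventTime"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    item = {
        "event_date": event_time.strftime("%Y-%m-%d"),  # partition key
        "event_id": event["eventID"],                   # sort key
        "event_time": event_time.isoformat(),
        "event_name": event["eventName"],
        "event_source": event["eventSource"],
        "event_type": EVENT_TYPES.get(event["eventSource"], "unknown"),
        "user_name": user_name,
        "aws_region": event.get("awsRegion", ""),
        "source_ip": event.get("sourceIPAddress", ""),
        "user_agent": event.get("userAgent", ""),
        # Complex structures are stored as JSON strings.
        "request_parameters": json.dumps(event.get("requestParameters")),
        "response_elements": json.dumps(event.get("responseElements")),
    }
    # error_code / error_message are only set for failed calls.
    if "errorCode" in event:
        item["error_code"] = event["errorCode"]
        item["error_message"] = event.get("errorMessage", "")
    return item
```

Keying on `event_date` keeps one day's events in one partition, which suits date-range exports; user- and action-specific lookups go through the two GSIs instead.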
- Table Name: `{stack-name}-control`
- Purpose: Track processing state and checkpoints per region/source
- Schema Design:
  - Partition Key: `region` (e.g., "us-east-1-iam", "eu-west-1-sts", "us-east-1-signin")
  - Attributes:
    - `last_processed_timestamp` (ISO timestamp of last event processed)
    - `last_execution_time` (when Lambda last ran for this source)
    - `events_processed_count` (running total)
    - `last_error` (if any)
    - `processing_status` (active/paused/error)
- Separate checkpoints for each region/source combination
- Table Name: `{stack-name}-alerts`
- Purpose: Prevent duplicate alert notifications
- Schema Design:
  - Partition Key: `event_id` (CloudTrail Event ID)
  - Sort Key: `alert_type` (CRITICAL/HIGH/WARNING)
  - Attributes:
    - `alert_title` (Alert description)
    - `message_id` (SNS message ID)
    - `sent_timestamp` (when alert was sent)
    - `ttl` (30-day expiration)
- TTL Enabled: 30-day retention keeps recent alert history without storing it indefinitely
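The deduplication idea can be sketched as follows, with a plain dictionary standing in for the alerts table. The function name `should_send_alert` and the in-memory store are assumptions for illustration; against DynamoDB the same effect could plausibly be achieved with a conditional put (`attribute_not_exists`) on the `event_id`/`alert_type` key.

```python
import time

ALERT_TTL_SECONDS = 30 * 24 * 60 * 60  # 30-day retention, as in the schema above


def should_send_alert(sent_alerts: dict, event_id: str, alert_type: str, now=None) -> bool:
    """Return True and record the alert if this (event_id, alert_type) pair is new.

    `sent_alerts` is a hypothetical in-memory stand-in for the alerts table,
    keyed by (partition key, sort key).
    """
    now = int(time.time()) if now is None else int(now)
    key = (event_id, alert_type)
    existing = sent_alerts.get(key)
    # An unexpired record means the alert was already sent: suppress it.
    if existing is not None and existing["ttl"] > now:
        return False
    sent_alerts[key] = {"sent_timestamp": now, "ttl": now + ALERT_TTL_SECONDS}
    return True
```

The `ttl` value is an epoch timestamp in seconds, which is the format DynamoDB TTL expects.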
- Handler: `handler.lambda_handler`
- Purpose: Query CloudTrail across regions and sources for events
- Runtime: Python 3.13
- Memory: 1024MB (increased for multi-threading performance)
- Timeout: 5 minutes (300 seconds)
- Environment Variables:
  - `PROCESS_IAM_EVENTS=true`
  - `PROCESS_STS_EVENTS=true`
  - `PROCESS_SIGNIN_EVENTS=true`
  - `FILTER_AWS_SERVICE_EVENTS=true`
  - `FILTERED_ROLES=''` (comma-separated role patterns to filter)
  - `ALERTS_ENABLED=true`
  - `MAX_WORKERS=16`
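As a rough illustration of how these environment variables might be read at cold start, the sketch below parses them into a small config object. `TrackerConfig`, `load_config`, and the accepted truthy strings are assumptions, not the project's actual code; the variable names and defaults come from the list above.

```python
import os
from dataclasses import dataclass, field


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else is False."""
    return value.strip().lower() in {"true", "1", "yes"}


@dataclass
class TrackerConfig:
    process_iam: bool = True
    process_sts: bool = True
    process_signin: bool = True
    filter_aws_service_events: bool = True
    filtered_roles: list = field(default_factory=list)
    alerts_enabled: bool = True
    max_workers: int = 16


def load_config(env=None) -> TrackerConfig:
    """Build the config from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # FILTERED_ROLES is comma-separated; drop empty entries and whitespace.
    roles = [p.strip() for p in env.get("FILTERED_ROLES", "").split(",") if p.strip()]
    return TrackerConfig(
        process_iam=_as_bool(env.get("PROCESS_IAM_EVENTS", "true")),
        process_sts=_as_bool(env.get("PROCESS_STS_EVENTS", "true")),
        process_signin=_as_bool(env.get("PROCESS_SIGNIN_EVENTS", "true")),
        filter_aws_service_events=_as_bool(env.get("FILTER_AWS_SERVICE_EVENTS", "true")),
        filtered_roles=roles,
        alerts_enabled=_as_bool(env.get("ALERTS_ENABLED", "true")),
        max_workers=int(env.get("MAX_WORKERS", "16")),
    )
```

Passing a mapping explicitly, rather than reading `os.environ` directly, makes the parsing easy to exercise in unit tests.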
Modules:
- `handler.py`: Main Lambda entry point with multi-threading coordination
- `cloudtrail_processor.py`: CloudTrail API queries and event parsing
- `dynamodb_operations.py`: DynamoDB batch writes and checkpoint management
- `security_alerts.py`: Real-time security alert analysis and SNS notifications
Key Functions:
- Query us-east-1 CloudTrail for IAM events (iam.amazonaws.com)
- Query us-east-1 CloudTrail for Signin events (signin.amazonaws.com)
- Query all active regions for STS events using multi-threading
- Process up to 32 regions/sources in parallel
- Smart user name extraction (role names from AssumeRole, root from ConsoleLogin)
- Role filtering: Filter out noisy CSPM/security tool roles using configurable patterns
- Real-time security alert checking
- Transform and batch store in DynamoDB
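Role filtering could look something like the sketch below. It assumes the configured patterns are shell-style globs matched case-sensitively against the extracted `user_name`; the source only says "configurable patterns", so the matching semantics and both function names are hypothetical.

```python
from fnmatch import fnmatchcase


def is_filtered_role(role_name: str, patterns: list) -> bool:
    """True if the role name matches any configured glob pattern (assumed semantics)."""
    return any(fnmatchcase(role_name, pattern) for pattern in patterns)


def drop_filtered(events: list, patterns: list) -> list:
    """Remove events whose user_name matches a noisy-role pattern."""
    return [
        event for event in events
        if not is_filtered_role(event.get("user_name", ""), patterns)
    ]
```

For example, with patterns `["cspm-*", "*SecurityAudit*"]`, an event for `cspm-scanner` is dropped while one for `DeployRole` is kept.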
Performance Optimizations:
- ThreadPoolExecutor with configurable max_workers
- Batch write to DynamoDB (25 items per batch)
- Connection pooling for AWS SDK clients
- Separate processing threads for each region/source
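The 25-item batch size reflects the per-request limit of DynamoDB's `BatchWriteItem`. A minimal sketch of the chunking step is shown below; `chunked` and `build_batch_requests` are hypothetical helpers, and a real writer would also need to retry any `UnprocessedItems` returned by the API.

```python
from itertools import islice

BATCH_SIZE = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def chunked(items, size=BATCH_SIZE):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_batch_requests(table_name: str, items: list) -> list:
    """Build one RequestItems payload per batch, in BatchWriteItem's shape."""
    return [
        {table_name: [{"PutRequest": {"Item": item}} for item in batch]}
        for batch in chunked(items)
    ]
```

Each returned dictionary could be passed as the `RequestItems` argument of a `batch_write_item` call.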
- Handler: `export_handler.lambda_handler`
- Purpose: Export DynamoDB data to S3 in Parquet format
- Runtime: Python 3.13
- Memory: 2048MB (for Pandas/PyArrow processing)
- Timeout: 15 minutes (900 seconds)
- Layer: AWSSDKPandas-Python313
Modules:
- `export_handler.py`: Main export logic and date range management
- `parquet_processor.py`: Pandas/PyArrow Parquet conversion
- `s3_operations.py`: S3 bucket operations and path generation
- `dynamodb_operations.py`: DynamoDB scan operations for export
Features:
- Daily synchronization of missing dates
- Partitioned Parquet output by year/month/day/region
- Optimized compression and columnar storage
- Incremental export (only new/missing data)
- Purpose: Send real-time security notifications
- Features:
- Email subscriptions (configurable)
- Message attributes for filtering
- KMS encryption for sensitive alerts
From security_alerts.py:
- `check_root_activity`: Root account login/failed login (CRITICAL)
- `check_user_creation`: IAM user creation (HIGH)
- `check_admin_policy_attachment`: Admin policy attachments (CRITICAL)
  - Detects: AdministratorAccess, IAMFullAccess, PowerUserAccess, AWSSSOMasterAccountAdministrator, AWSIdentityCenterFullAccess, AWSSSOMemberAccountAdministrator
- `check_dangerous_inline_policy`: Policies with `*`, `iam:*`, `sts:*` (CRITICAL)
- `check_access_key_creation`: New access key generation (CRITICAL)
- `check_role_trust_policy`: External account/wildcard principals (CRITICAL)
- `check_access_key_update`: Access key status changes (HIGH)
- `check_mfa_deletion`: MFA device deletion/deactivation (CRITICAL)
- `check_sso_permission_set_creation`: SSO permission set creation (CRITICAL)
- `check_sso_permission_set_update`: SSO permission set updates (CRITICAL)
- `check_sso_admin_policy_attachment`: Admin policy attached to SSO permission set (CRITICAL)
- `check_sso_account_assignment`: SSO account assignment created (CRITICAL)
- `check_sso_app_creation`: SSO managed application instance creation (HIGH)
- `check_sso_app_deletion`: SSO managed application instance deletion (HIGH)
Alert Processing:
- Real-time analysis of each stored event
- Deduplication via alerts table (30-day TTL)
- Non-blocking processing (failures don't stop event collection)
- SNS notifications with detailed context
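To make the alert flow concrete, here is a minimal sketch of one check plus a non-blocking runner. It loosely mirrors the admin policy check listed above but is not the project's implementation: `admin_policy_alert`, `run_checks`, and the returned dictionary shape are assumptions, and the input is assumed to be a raw CloudTrail record with `requestParameters.policyArn`.

```python
# Managed policy names treated as administrative, taken from the list above.
ADMIN_POLICIES = {
    "AdministratorAccess", "IAMFullAccess", "PowerUserAccess",
    "AWSSSOMasterAccountAdministrator", "AWSIdentityCenterFullAccess",
    "AWSSSOMemberAccountAdministrator",
}
ATTACH_EVENTS = {"AttachUserPolicy", "AttachRolePolicy"}


def admin_policy_alert(event: dict):
    """Return an alert dict if an admin managed policy was attached, else None."""
    if event.get("eventName") not in ATTACH_EVENTS:
        return None
    policy_arn = (event.get("requestParameters") or {}).get("policyArn", "")
    policy_name = policy_arn.rsplit("/", 1)[-1]  # last ARN path segment
    if policy_name not in ADMIN_POLICIES:
        return None
    return {
        "severity": "CRITICAL",
        "title": "Admin Policy Attached",
        "message": f"{policy_name} attached via {event['eventName']}",
    }


def run_checks(event: dict, checks: list) -> list:
    """Run every check; a failing check is skipped so collection continues."""
    alerts = []
    for check in checks:
        try:
            alert = check(event)
        except Exception:
            continue  # non-blocking: alerting errors must not stop event storage
        if alert:
            alerts.append(alert)
    return alerts
```

Swallowing exceptions per check keeps one malformed event from blocking the remaining checks; a real implementation would presumably log the failure as well.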
- Tracker Schedule: Configurable (hourly/6h/12h/daily)
- Exporter Schedule: Configurable (6h/12h/daily/weekly)
- Cost: Free for rule creation and invocations
- Purpose: Long-term storage and complex analytics
- Structure:

      s3://{stack-name}-analytics-{account-id}/
        iam-events/
          year=2024/
            month=1/                              # No leading zeros (for Athena partition projection)
              day=15/                             # No leading zeros (for Athena partition projection)
                region=us-east-1/                 # Only if PARTITION_BY_REGION=true
                  events_20240115_143052.parquet  # Timestamp for uniqueness
- Format: Parquet with snappy compression
- Partitioning: By year/month/day/region for query optimization
- Lifecycle Policy:
- Standard → Infrequent Access after 30 days
- Infrequent Access → Glacier after 90 days
- Deep Archive after 365 days
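The partitioned layout described above can be reproduced with a small key builder. This is only a sketch: `build_export_key` and its parameters are hypothetical names, and the `iam-events` prefix and file name pattern are copied from the structure shown above.

```python
from datetime import date, datetime


def build_export_key(event_day: date, written_at: datetime,
                     region: str = None, prefix: str = "iam-events") -> str:
    """Build a Hive-style partitioned S3 key for one exported Parquet file."""
    parts = [
        prefix,
        f"year={event_day.year}",
        f"month={event_day.month}",  # no leading zeros, for partition projection
        f"day={event_day.day}",      # no leading zeros, for partition projection
    ]
    if region:  # region level is only present when region partitioning is enabled
        parts.append(f"region={region}")
    # The write timestamp keeps file names unique across repeated exports.
    parts.append(f"events_{written_at:%Y%m%d_%H%M%S}.parquet")
    return "/".join(parts)
```

Calling it with 15 January 2024, a write time of 14:30:52, and `us-east-1` yields the example path from the structure above.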
- Glue Database: Automatically managed schema
- Glue Crawler: Daily partition discovery
- Athena WorkGroup: Dedicated workspace with result location
- Pre-built Queries: 15 security and compliance queries
- Python CLI: `query_runner.py` for programmatic access
- Rich Terminal Output: Color-coded tables with formatting
- Export Capabilities: JSON output for integration
- Cost Tracking: Query execution metrics and cost estimation
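Cost estimation for a query can be approximated from the bytes Athena reports as scanned. The sketch below assumes the $5 per TB rate quoted in this document, a decimal terabyte, and no per-query minimum; `estimate_query_cost` and `summarize_execution` are hypothetical helpers. The `DataScannedInBytes` and `EngineExecutionTimeInMillis` keys are the ones Athena returns in a query execution's `Statistics`.

```python
PRICE_PER_TB_USD = 5.0     # rate quoted in this document
BYTES_PER_TB = 10 ** 12    # assumption: decimal terabytes


def estimate_query_cost(bytes_scanned: int) -> float:
    """Approximate the Athena charge for one query from bytes scanned."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


def summarize_execution(statistics: dict) -> dict:
    """Reduce an Athena Statistics dict to the metrics shown to the user."""
    scanned = statistics.get("DataScannedInBytes", 0)
    return {
        "bytes_scanned": scanned,
        "runtime_seconds": statistics.get("EngineExecutionTimeInMillis", 0) / 1000,
        "estimated_cost_usd": estimate_query_cost(scanned),
    }
```

Because Parquet is columnar, queries that select few columns scan fewer bytes, which is where most of the savings on this metric come from.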
EventBridge Timer → Tracker Lambda → [Parallel Processing]
├─ us-east-1 (IAM events)
├─ us-east-1 (Signin events)
├─ us-east-1/configured (SSO events)
├─ us-west-2 (STS events)
├─ eu-west-1 (STS events)
└─ ... (all active regions)
- User Name Extraction:
  - AssumeRole events: Extract role name from `requestParameters.roleArn`
  - ConsoleLogin events: Extract "root" or IAM username from `userIdentity`
  - Other events: Use existing logic for session context and ARNs
- Event Type Classification: iam/sts/signin based on event source
- Error Handling: Capture both `errorCode` and `errorMessage` fields
- JSON Serialization: Store complex parameters as JSON strings
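The user name extraction rules can be sketched as a single function over a raw CloudTrail record. The precedence in the fallback branch (session issuer, then `userName`, then the last ARN segment) is an assumption, since the source only refers to "existing logic", and `extract_user_name` is a hypothetical name.

```python
def extract_user_name(event: dict) -> str:
    """Pick the most useful principal name for an event (illustrative sketch)."""
    name = event.get("eventName", "")
    identity = event.get("userIdentity") or {}

    # AssumeRole variants: the role being assumed is the interesting name.
    if name.startswith("AssumeRole"):
        role_arn = (event.get("requestParameters") or {}).get("roleArn", "")
        if role_arn:
            return role_arn.rsplit("/", 1)[-1]

    # ConsoleLogin: distinguish the root account from IAM users.
    if name == "ConsoleLogin":
        if identity.get("type") == "Root":
            return "root"
        if identity.get("userName"):
            return identity["userName"]

    # Fallback (assumed order): session issuer, userName, then ARN suffix.
    issuer = (identity.get("sessionContext") or {}).get("sessionIssuer") or {}
    if issuer.get("userName"):
        return issuer["userName"]
    if identity.get("userName"):
        return identity["userName"]
    arn = identity.get("arn", "")
    return arn.rsplit("/", 1)[-1] if arn else "unknown"
```

Taking the last path segment of the ARN also handles roles created under an IAM path, such as `role/team/DeployRole`.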
Processed Events → DynamoDB (Real-time) → Security Alerts → SNS
↓
Daily Export → S3 (Parquet) → Athena (Analytics)
- Lambda queries last 90 days of CloudTrail events per source:
- us-east-1: IAM events (iam.amazonaws.com)
- us-east-1: Signin events (signin.amazonaws.com)
- All regions: STS events (sts.amazonaws.com)
- Stores all events in DynamoDB with source/region information
- Records latest EventTime as checkpoint per region/source
- Lambda reads last checkpoint from DynamoDB control table (per region/source)
- For each region/source combination:
  - Queries CloudTrail: `StartTime = checkpoint + 1 second`, `EndTime = now() - 5 minutes`
  - 5-minute buffer prevents missing in-flight events
- Processes and stores new events
- Updates checkpoint for each region/source separately
- Triggers security alert analysis for each event
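The initial-run and incremental-run windows described above can be computed by one helper. This is a sketch under the stated rules (90-day lookback on the first run, checkpoint plus one second afterwards, five-minute safety buffer); `checkpoint_key` and `query_window` are hypothetical names, and the returned values would be passed as `StartTime` and `EndTime` to CloudTrail's `LookupEvents`.

```python
from datetime import datetime, timedelta, timezone

INITIAL_LOOKBACK = timedelta(days=90)   # CloudTrail event history window
SAFETY_BUFFER = timedelta(minutes=5)    # avoid events still in flight


def checkpoint_key(region: str, event_source: str) -> str:
    """Control-table key, e.g. ('us-east-1', 'iam.amazonaws.com') -> 'us-east-1-iam'."""
    return f"{region}-{event_source.split('.')[0]}"


def query_window(checkpoint, now=None):
    """Return (start, end) for the next lookup, or None if nothing is due yet.

    `checkpoint` is the last processed event time, or None on the first run.
    """
    now = now or datetime.now(timezone.utc)
    end = now - SAFETY_BUFFER
    if checkpoint is None:
        start = now - INITIAL_LOOKBACK
    else:
        start = checkpoint + timedelta(seconds=1)  # skip the already-stored event
    if start >= end:
        return None  # checkpoint is inside the buffer; wait for the next run
    return start, end
```

Starting one second after the checkpoint avoids re-reading the last stored event, at the cost of potentially skipping other events that share that exact second.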
from concurrent.futures import ThreadPoolExecutor, as_completed

# PROCESS_*_EVENTS flags, active_regions and process_region_events
# are defined elsewhere in the tracker.
events_processed = 0
with ThreadPoolExecutor(max_workers=32) as executor:
    futures = []
    # IAM events (us-east-1 only)
    if PROCESS_IAM_EVENTS:
        futures.append(executor.submit(process_region_events,
                                       'us-east-1', 'iam.amazonaws.com'))
    # Signin events (us-east-1 only)
    if PROCESS_SIGNIN_EVENTS:
        futures.append(executor.submit(process_region_events,
                                       'us-east-1', 'signin.amazonaws.com'))
    # STS events (all regions)
    if PROCESS_STS_EVENTS:
        for region in active_regions:
            futures.append(executor.submit(process_region_events,
                                           region, 'sts.amazonaws.com'))
    # Gather results with timeout
    for future in as_completed(futures, timeout=240):
        events_processed += future.result()

From query_runner.py QUERY_DEFINITIONS:
- `user_lookup`: User activity patterns and identification
- `failed_auth`: Failed authentication attempts and brute force detection
- `root_usage`: Root account activity detection
- `off_hours`: After-hours access outside 6 AM - 10 PM
- `active_users`: Most active users with usage patterns and error rates
- `permission_changes`: IAM policy modifications tracking
- `role_assumptions`: Role usage patterns and frequency analysis
- `daily_summary`: Daily activity summaries for compliance reporting
- `hourly_activity`: Peak usage analysis for capacity planning
- `sso_permission_sets`: SSO permission set management tracking
- `sso_account_assignments`: SSO account assignment tracking
- `sso_admin_policies`: SSO admin policy attachment detection
- `sso_applications`: SSO application management tracking
- `sso_admin_users`: SSO administrative users identification
- `sso_activity_summary`: SSO usage patterns by event type
- `athena_utilities.py`: Core Athena operations (`execute_athena_query`, `create_iam_events_table`, `get_table_statistics`, `validate_s3_location`)
- `query_runner.py`: Main CLI tool with `QUERY_DEFINITIONS` dictionary containing all 15 pre-built queries, Rich terminal formatting support
- `analytics_queries.sql`: Reference SQL queries (if present)

- `deploy.sh`: SAM deployment with S3 bucket creation, automatic AWS CLI/SAM CLI installation in venv
- `destroy.sh`: CloudFormation stack deletion with confirmation prompt
- `status.sh`: Formatted stack status display with colored output
- `validate.sh`: Template validation
- `logs.sh`: Lambda log viewing
- `run-query.sh`: Query execution wrapper
- `setup-athena.sh`: Athena table initialization
- `test-alerts.sh`: SNS alert testing

- `query_runner.py`: Main CLI tool with 15 pre-built analytics queries
- `athena_utilities.py`: Athena query execution and table management
- `requirements.txt`: Python dependencies (boto3, rich for terminal formatting)
Subject: [CRITICAL] IAM Alert: Root Account Login
IAM Activity Alert: Root Account Login
Severity: CRITICAL
Time: 2025-08-10T18:26:07+00:00
Region: us-east-1
Root account logged in from IP: 90.7.221.136
Event Details:
- Event Name: ConsoleLogin
- User: root
- Source IP: 90.7.221.136
- Event ID: b5636e39-f66f-4ce1-b9f1-2368c72b7fc6
Action Required: Review this activity immediately in CloudTrail.
- DynamoDB: Free (25GB storage, 25 RCU/WCU)
- Lambda: Free (1M invocations, 400,000 GB-seconds)
- SNS: Free (1,000 notifications)
- EventBridge: Free
- CloudTrail: Free (90-day event history)
- S3: Free (5GB standard storage)
- Total: $0/month for most organizations
- DynamoDB: ~$0.25/GB/month + $0.00013/RCU + $0.00065/WCU
- Lambda: $0.20 per 1M requests + $0.0000166667/GB-second
- SNS: $0.50 per 1M notifications (beyond free tier)
- S3: $0.023/GB/month (standard) with lifecycle transitions
- Athena: $5 per TB scanned (only when running queries)
- Estimated: $5-20/month for very active organizations
- Automatic DynamoDB on-demand pricing (pay per request)
- S3 lifecycle policies for automatic archiving
- Configurable processing schedules
- Query result caching in Athena
- Parquet compression reduces storage costs by 75%
- Tracker Lambda Execution Role:
  - `cloudtrail:LookupEvents` in all regions
  - `dynamodb:PutItem`, `GetItem`, `UpdateItem`, `Query`, `BatchWriteItem` on events table
  - `dynamodb:PutItem`, `GetItem`, `UpdateItem` on control table
  - `dynamodb:PutItem`, `GetItem` on alerts table
  - `sns:Publish` on alerts topic (if enabled)
  - `ec2:DescribeRegions` for region enumeration
- Exporter Lambda Execution Role:
  - `dynamodb:Scan`, `Query` on events table
  - `s3:PutObject`, `GetObject`, `ListBucket` on analytics bucket
- Query User Permissions:
  - `athena:StartQueryExecution`, `GetQueryResults`
  - `s3:GetObject` on analytics bucket
  - `glue:GetDatabase`, `GetTable`, `GetPartitions`
- Encryption at Rest: DynamoDB and S3 use AWS managed keys
- Encryption in Transit: TLS 1.2+ for all API calls
- Access Control: Fine-grained IAM policies
- Network Security: Lambda functions run in AWS-managed VPCs
- Data Retention: Configurable TTL and lifecycle policies
IAM events:
- `CreateUser`, `DeleteUser`, `UpdateUser`
- `AttachUserPolicy`, `DetachUserPolicy`
- `CreateAccessKey`, `DeleteAccessKey`, `UpdateAccessKey`
- `CreateRole`, `DeleteRole`, `UpdateRole`
- `PutRolePolicy`, `DeleteRolePolicy`
- `AttachRolePolicy`, `DetachRolePolicy`
- `CreateLoginProfile`, `UpdateLoginProfile`

STS events:
- `AssumeRole` - Most critical for tracking role usage
- `AssumeRoleWithWebIdentity` - Federation tracking
- `AssumeRoleWithSAML` - Enterprise SSO tracking
- `GetSessionToken` - Temporary credential usage
- `GetFederationToken` - Federation token requests

Signin events:
- `ConsoleLogin` - AWS Console authentication (success/failure)
- `SwitchRole` - Role switching in console
- `ExitRole` - Role exit in console

SSO/Identity Center events:
- `CreatePermissionSet`, `UpdatePermissionSet` - Permission set management
- `AttachManagedPolicyToPermissionSet` - Policy attachments (critical for admin access)
- `CreateAccountAssignment`, `DeleteAccountAssignment` - Account access grants
- `CreateManagedApplicationInstance` - Third-party app integrations
- `Federate` - SSO authentication events
iam-activity-tracker/
├── functions/                          # Lambda function code
│   ├── tracker/                        # Real-time event collection
│   │   ├── handler.py                  # Main tracker Lambda entry point
│   │   ├── cloudtrail_processor.py     # CloudTrail API and event parsing
│   │   ├── dynamodb_operations.py      # DynamoDB operations and checkpoints
│   │   ├── security_alerts.py          # 14 alert functions and SNS notifications
│   │   └── requirements.txt
│   └── exporter/                       # S3 analytics export
│       ├── export_handler.py           # Export Lambda entry point
│       ├── parquet_processor.py        # Pandas/PyArrow Parquet conversion
│       ├── s3_operations.py            # S3 bucket operations and paths
│       ├── dynamodb_operations.py      # DynamoDB scan for export
│       └── requirements.txt
├── queries/                            # Analytics tools
│   ├── athena_utilities.py             # Athena query execution
│   ├── query_runner.py                 # CLI with 15 pre-built queries
│   ├── analytics_queries.sql           # Raw SQL for reference
│   ├── setup.sh                        # Python environment and venv setup
│   └── requirements.txt
├── scripts/                            # Operational scripts
│   ├── deploy.sh                       # SAM deployment
│   ├── destroy.sh                      # Stack cleanup
│   ├── status.sh                       # Stack status display
│   ├── validate.sh                     # Template validation
│   ├── logs.sh                         # Lambda log viewing
│   ├── run-query.sh                    # Query execution wrapper
│   ├── setup-athena.sh                 # Athena table setup
│   └── test-alerts.sh                  # Alert testing
├── template.yaml                       # SAM deployment template
└── README.md                           # User documentation
- Function Errors: Alert on any Lambda execution failures
- Function Duration: Alert on timeouts or high latency
- DynamoDB Throttling: Alert on capacity exceeded
- S3 Export Failures: Alert on export Lambda failures
- `/aws/lambda/{stack-name}-tracker`: Event collection logs
- `/aws/lambda/{stack-name}-exporter`: S3 export logs
- CloudTrail event details in structured JSON format
This architecture provides complete visibility into IAM activities while maintaining cost-effectiveness and operational simplicity through serverless design patterns.