Access Log Analytics
Tempesta FW uses an mmap()ed memory area to deliver client access logs to
user space in a zero-copy fashion. tfw_logger is a user-space daemon that
spawns one worker thread per CPU core and gathers access log records from
the mmap()ed area with no locks and no copying. The access log records
received from the kernel are grouped into batches and sent to the ClickHouse
database. Thanks to batching, ClickHouse can absorb millions of records
per second on a modest hardware node.
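The batching idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tfw_logger implementation: the `BatchBuffer` class and its `flush_fn` callback are hypothetical names, and the callback stands in for a bulk insert into ClickHouse.

```python
from typing import Callable, List


class BatchBuffer:
    """Collects records and hands them to `flush_fn` in batches (hypothetical helper)."""

    def __init__(self, max_events: int, flush_fn: Callable[[List[dict]], None]):
        if max_events <= 0:
            raise ValueError("max_events must be positive")
        self.max_events = max_events
        self.flush_fn = flush_fn
        self.pending: List[dict] = []

    def add(self, record: dict) -> None:
        # Queue the record; send the whole batch once it is full.
        self.pending.append(record)
        if len(self.pending) >= self.max_events:
            self.flush()

    def flush(self) -> None:
        # Send whatever is queued, then start a fresh batch.
        if self.pending:
            self.flush_fn(self.pending)
            self.pending = []


sent_batches: List[List[dict]] = []  # stands in for the database
buf = BatchBuffer(max_events=3, flush_fn=sent_batches.append)
for i in range(7):
    buf.add({"status": 200, "uri": f"/page/{i}"})
buf.flush()  # send the incomplete tail batch
print([len(b) for b in sent_batches])  # batch sizes: [3, 3, 1]
```

Seven records produce three inserts instead of seven, which is the effect that lets a column-oriented database keep up with a high event rate.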
ClickHouse is a powerful column-oriented DBMS for online analytical processing (OLAP). We have chosen ClickHouse because
- it can ingest an enormous number of batched records per second
- it provides a powerful SQL-based query language
- it's open source
Advanced analytical ClickHouse queries over the stored data facilitate web performance analytics and response to security incidents (e.g. application-layer DDoS attacks).
HTTP Request → Tempesta FW → mmap buffer → tfw_logger → ClickHouse
↓
/dev/tempesta_mmap_log
↓
[Worker Thread 1] [Worker Thread 2] ... [Worker Thread N]
↓
ClickHouse Database
Always keep your ClickHouse database separate from the edge servers that may come under a DDoS attack - this ensures you can analyze the incident in near real time.
For the current development version 0.9 add to your tempesta_fw.conf:
access_log mmap logger_config=/path/to/tfw_logger.json;
For the stable version 0.8 use:
access_log mmap mmap_host=localhost mmap_log=/var/log/tempesta_access.log;
Do not use dmesg for access_log! It can cause the kernel to hang under heavy load and should
be used for debugging purposes only!
When using tfw_logger, you can configure the size of the per-CPU memory-mapped buffers
used for storing access log events before they are transmitted to external storage:
mmap_log_buffer_size <SIZE>;
SIZE specifies the buffer size in bytes for each CPU. The value must be a power of 2 and
a multiple of 4KB (page size). The allowed range is from 4KB to 128MB.
Default: 1M.
Examples:
mmap_log_buffer_size 4M;
mmap_log_buffer_size 512K;
mmap_log_buffer_size 16M;
Larger buffer sizes can improve performance under high load by reducing the frequency of buffer flushes to external storage, but will consume more memory. The optimal size depends on your traffic patterns and available system memory.
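The constraints on SIZE (power of 2, multiple of the 4KB page size, between 4KB and 128MB) can be checked before editing the configuration. The sketch below is a minimal illustration, not Tempesta FW code: `parse_size` and `is_valid_buffer_size` are hypothetical helpers, and it assumes that the `K` and `M` suffixes mean 1024 and 1024² bytes.

```python
PAGE_SIZE = 4096
MIN_SIZE = 4 * 1024            # 4KB
MAX_SIZE = 128 * 1024 * 1024   # 128MB
SUFFIXES = {"K": 1024, "M": 1024 ** 2}  # assumed meaning of the suffixes


def parse_size(text: str) -> int:
    """Convert a value such as '4M', '512K' or '4096' to bytes."""
    text = text.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def is_valid_buffer_size(text: str) -> bool:
    """Apply the documented rules: power of 2, page multiple, 4KB..128MB."""
    size = parse_size(text)
    power_of_two = size > 0 and (size & (size - 1)) == 0
    page_multiple = size % PAGE_SIZE == 0
    return power_of_two and page_multiple and MIN_SIZE <= size <= MAX_SIZE


for value in ("4M", "512K", "16M", "3M", "256M"):
    print(value, is_valid_buffer_size(value))
```

Under these assumptions the three example values pass, while `3M` fails the power-of-2 rule and `256M` exceeds the documented upper bound.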
Create a separate JSON configuration file (e.g., /etc/tempesta/tfw_logger.json):
{
  "log_path": "/var/log/tempesta/tfw_logger.log",
  "access_log": {
    "plugin_path": "/opt/tempesta/access_log.so",
    "host": "localhost",
    "port": 9000,
    "user": "tempesta_user",
    "password": "secure_password",
    "db_name": "default",
    "table_name": "access_log",
    "max_events": 10000
  }
}
In 0.8, tfw_logger is configured through attributes of the access_log configuration
option in the main Tempesta FW configuration file, tempesta_fw.conf:
access_log mmap mmap_host=localhost mmap_user=tempesta_user mmap_password=secure_password mmap_log=/var/log/tempesta_access.log;
| Option | Description | Default | Example |
|---|---|---|---|
| log_path | Path to tfw_logger log file | /var/log/tempesta/tfw_logger.log | |
| plugin_path | Path to access logger plugin | - | /opt/tempesta/access_log.so |
| access_log.host | ClickHouse server hostname | localhost | clickhouse.example.com |
| access_log.port | ClickHouse native protocol port | 9000 | 9000 |
| access_log.db_name | ClickHouse database name | default | custom_default |
| access_log.table_name | ClickHouse table name | access_log | custom_access_log |
| access_log.user | ClickHouse username (optional) | default | tempesta_user |
| access_log.password | ClickHouse password (optional) | - | secure_password |
| access_log.max_events | Batch size for inserts | 1000 | 500 |
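To see how a partial configuration file combines with the defaults listed in the table, consider the following sketch. It is an illustration only and does not reproduce tfw_logger's own parser: `DEFAULTS` and `load_config` are hypothetical names, and the default values are copied from the table above.

```python
import json

# Defaults taken from the options table; options without a default are omitted.
DEFAULTS = {
    "log_path": "/var/log/tempesta/tfw_logger.log",
    "access_log": {
        "host": "localhost",
        "port": 9000,
        "db_name": "default",
        "table_name": "access_log",
        "user": "default",
        "max_events": 1000,
    },
}


def load_config(text: str) -> dict:
    """Parse JSON text and fill in missing options from DEFAULTS."""
    user_cfg = json.loads(text)  # strict JSON: trailing commas are rejected
    cfg = {"log_path": user_cfg.get("log_path", DEFAULTS["log_path"])}
    cfg["access_log"] = {**DEFAULTS["access_log"], **user_cfg.get("access_log", {})}
    return cfg


cfg = load_config('{"access_log": {"host": "clickhouse.example.com", "max_events": 500}}')
print(cfg["access_log"]["host"], cfg["access_log"]["port"], cfg["access_log"]["max_events"])
```

Running a configuration file through a strict JSON parser like this is also a quick way to catch syntax errors before starting the daemon.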
- Small deployments: 4MB (4194304)
- Medium traffic: 16MB (16777216)
- High traffic: 64MB+ (67108864)
- Enterprise: 256MB+ (268435456)
The buffer size must be a multiple of the page size and at least 4096 bytes.
tfw_logger creates the following ClickHouse table structure:
CREATE TABLE IF NOT EXISTS access_log.access_log (
    timestamp DateTime64(3, 'UTC'),
    address IPv6,
    method UInt8,
    version UInt8,
    status UInt16,
    response_content_length UInt64,
    response_time UInt32,
    vhost String,
    uri String,
    referer String,
    user_agent String,
    tft UInt64,
    tfh UInt64,
    dropped_events UInt64
) ENGINE = MergeTree()
ORDER BY timestamp;

| Field | Type | Description |
|---|---|---|
| timestamp | DateTime64(3) | Request timestamp with millisecond precision |
| address | IPv6 | Client IP address (IPv4 mapped to IPv6) |
| method | UInt8 | HTTP method as a numeric code (e.g. GET=3, POST=10; see the list below) |
| version | UInt8 | HTTP version as a numeric code (see the list below) |
| status | UInt16 | HTTP response status code |
| response_content_length | UInt64 | Response content length in bytes |
| response_time | UInt32 | Response time in milliseconds |
| vhost | String | Host header value |
| uri | String | Request URI path and query |
| referer | String | Referer header |
| user_agent | String | User-Agent header |
| tft | UInt64 | TF TLS hash |
| tfh | UInt64 | TF HTTP hash |
| dropped_events | UInt64 | Number of dropped events (monitoring) |
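Because IPv4 clients are stored as IPv4-mapped IPv6 addresses, reports often need to convert them back to the familiar dotted form. The following sketch uses Python's standard `ipaddress` module; `display_address` is a hypothetical helper, and the sample addresses come from documentation ranges.

```python
import ipaddress


def display_address(stored: str) -> str:
    """Return dotted IPv4 for IPv4-mapped addresses, otherwise the IPv6 text."""
    addr = ipaddress.IPv6Address(stored)
    mapped = addr.ipv4_mapped  # None unless the address is in ::ffff:0:0/96
    return str(mapped) if mapped is not None else str(addr)


print(display_address("::ffff:192.0.2.10"))  # 192.0.2.10
print(display_address("2001:db8::1"))        # 2001:db8::1
```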
Field method is a numerical value (see tfw_http_meth_t in
http.h):
1: COPY
2: DELETE
3: GET
4: HEAD
5: LOCK
6: MKCOL
7: MOVE
8: OPTIONS
9: PATCH
10: POST
11: PROPFIND
12: PROPPATCH
13: PUT
14: TRACE
15: UNLOCK
16: PURGE
17: UNKNOWN
Field version is also a numerical value:
0: INVALID
1: HTTP 0.9
2: HTTP 1.0
3: HTTP 1.1
4: HTTP 2
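When exporting or post-processing rows, the numeric codes can be translated back to names with a simple lookup. The tables below copy the values listed above; the `describe` helper is hypothetical and exists only for illustration.

```python
# Numeric codes as listed above for the method and version fields.
HTTP_METHODS = {
    1: "COPY", 2: "DELETE", 3: "GET", 4: "HEAD", 5: "LOCK", 6: "MKCOL",
    7: "MOVE", 8: "OPTIONS", 9: "PATCH", 10: "POST", 11: "PROPFIND",
    12: "PROPPATCH", 13: "PUT", 14: "TRACE", 15: "UNLOCK", 16: "PURGE",
    17: "UNKNOWN",
}
HTTP_VERSIONS = {0: "INVALID", 1: "HTTP 0.9", 2: "HTTP 1.0", 3: "HTTP 1.1", 4: "HTTP 2"}


def describe(method: int, version: int) -> str:
    """Render a (method, version) pair of codes in readable form."""
    name = HTTP_METHODS.get(method, f"method#{method}")
    proto = HTTP_VERSIONS.get(version, f"version#{version}")
    return f"{name} {proto}"


print(describe(3, 3))   # GET HTTP 1.1
print(describe(10, 4))  # POST HTTP 2
```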
cd tempesta/logger
make build

# Generate default configuration
./tfw_logger --generate --config /etc/tempesta/tfw_logger.json
# Edit configuration as needed
sudo nano /etc/tempesta/tfw_logger.json

-- Create database and user
CREATE DATABASE access_log;
CREATE USER tempesta_user IDENTIFIED BY 'secure_password';
GRANT ALL ON access_log.* TO tempesta_user;

Add to tempesta_fw.conf for 0.9:
access_log mmap logger_config=/etc/tempesta/tfw_logger.json;
or for 0.8:
access_log mmap mmap_host=localhost mmap_log=/var/log/tempesta_access.log;

# Start Tempesta FW (creates mmap device)
sudo ./scripts/tempesta.sh --start
# tfw_logger will be started automatically by tempesta.sh
# Or start manually:
sudo ./logger/tfw_logger --config /etc/tempesta/tfw_logger.json

# Show help
./tfw_logger --help
# Start with specific configuration
./tfw_logger --config /etc/tempesta/tfw_logger.json
# Override configuration options
./tfw_logger --config config.json --host clickhouse.example.com --port 9001 --table custom_access_log
# Test configuration
./tfw_logger --config config.json --help

| Option | Description |
|---|---|
| --help, -h | Show help message |
| --generate, -g | Generate default configuration file |
| --config, -c PATH | Path to JSON configuration file |
| --host, -H HOST | ClickHouse hostname (override) |
| --port, -P PORT | ClickHouse port (override) |
| --database, -d DATABASE | ClickHouse database name (override) |
| --table, -t TABLE | ClickHouse table name (override) |
| --user, -u USER | ClickHouse username (override) |
| --password, -p PASS | ClickHouse password (override) |
| --log-path, -l PATH | Log file path (override) |
tfw_logger automatically sets CPU affinity for worker threads:
- Each worker thread is bound to a specific CPU core
- Number of threads automatically matches available CPU cores (respects affinity/cgroups)
- Improves cache locality and reduces context switching
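The one-worker-per-core layout can be previewed from user space. The sketch below only shows how the set of usable CPUs might be discovered and mapped to workers; it is not how tfw_logger itself is written. `usable_cpus` and `plan_workers` are hypothetical helpers, and `os.sched_getaffinity` is available on Linux only, so a fallback is included.

```python
import os


def usable_cpus() -> set:
    """CPUs this process may run on, honoring the affinity mask where possible."""
    if hasattr(os, "sched_getaffinity"):
        return set(os.sched_getaffinity(0))
    return set(range(os.cpu_count() or 1))  # fallback for non-Linux systems


def plan_workers() -> dict:
    """Map worker index -> CPU id, one worker per usable CPU."""
    return {worker: cpu for worker, cpu in enumerate(sorted(usable_cpus()))}


plan = plan_workers()
print(f"{len(plan)} worker(s):", plan)
```

Running the script under `taskset -c 0,1` on Linux would, under these assumptions, report two workers, which mirrors the behavior described above for restricted CPU sets.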
Larger buffers reduce syscall overhead but increase memory usage:
| Traffic Level | Buffer Size | Memory Usage |
|---|---|---|
| Low (< 1K RPS) | 4MB | ~4MB per worker |
| Medium (1K-10K RPS) | 16MB | ~16MB per worker |
| High (10K-100K RPS) | 64MB | ~64MB per worker |
| Enterprise (100K+ RPS) | 256MB+ | ~256MB+ per worker |
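Since the buffers are allocated per CPU, the total memory reserved for logging grows with the core count. A quick estimate, assuming one buffer per worker as in the table above (`total_buffer_memory` is a hypothetical helper):

```python
MIB = 1024 * 1024


def total_buffer_memory(buffer_size_bytes: int, workers: int) -> int:
    """Total bytes reserved when each worker owns one buffer."""
    return buffer_size_bytes * workers


# 16MB buffers on an 8-core machine
print(total_buffer_memory(16 * MIB, 8) // MIB, "MB")  # 128 MB
```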
-- Optimize table for high-frequency inserts
ALTER TABLE access_log.access_log
    MODIFY SETTING merge_with_ttl_timeout = 3600;

-- Create materialized views for common queries.
-- SummingMergeTree sums numeric columns on merge, so store the total
-- response time and derive the average as total_response_time / requests.
CREATE MATERIALIZED VIEW access_log.hourly_stats
ENGINE = SummingMergeTree()
ORDER BY (hour, status)
AS SELECT
    toStartOfHour(timestamp) AS hour,
    status,
    count() AS requests,
    sum(response_time) AS total_response_time
FROM access_log.access_log
GROUP BY hour, status;

# Check if tfw_logger is running
ps aux | grep tfw_logger
# Check log output
tail -f /var/log/tempesta/tfw_logger.log
# Verify ClickHouse connectivity
clickhouse-client --query "SELECT count() FROM access_log.access_log"

Monitor these key metrics:
-- Request rate
SELECT
toStartOfMinute(timestamp) as minute,
count() as requests_per_minute
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute;
-- Error rates
SELECT
status,
count() as count,
count() * 100.0 / sum(count()) OVER () as percentage
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY status
ORDER BY count DESC;
-- Dropped events (buffer overruns)
SELECT max(dropped_events) as max_dropped
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 HOUR;

tfw_logger won't start
# Check if Tempesta FW is running
sudo ./scripts/tempesta.sh --status
# Verify mmap device exists
ls -la /dev/tempesta_mmap_log
# Check permissions
sudo chmod 666 /dev/tempesta_mmap_log # Temporary fix

Permission denied on mmap device
# Run tfw_logger with appropriate permissions
sudo ./tfw_logger --config config.json
# Or fix device permissions permanently
sudo chown tempesta:tempesta /dev/tempesta_mmap_log

ClickHouse connection failed
# Test ClickHouse connectivity
clickhouse-client --host localhost --port 9000 --query "SELECT 1"
# Check user permissions
clickhouse-client --query "SHOW GRANTS FOR tempesta_user"

High memory usage
Reduce the batch size in the tfw_logger JSON configuration (the per-CPU buffer size is set separately with mmap_log_buffer_size in tempesta_fw.conf):
{
  "access_log": {
    "max_events": 500
  }
}

Common log patterns:
# Successful startup
grep "Starting Tempesta FW Logger" /var/log/tempesta/tfw_logger.log
# Worker thread info
grep "worker threads started" /var/log/tempesta/tfw_logger.log
# ClickHouse connectivity
grep "ClickHouse" /var/log/tempesta/tfw_logger.log
# Error patterns
grep -i error /var/log/tempesta/tfw_logger.log

-- Top pages by requests
SELECT
uri,
count() as requests,
avg(response_time) as avg_response_time_ms
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY uri
ORDER BY requests DESC
LIMIT 10;
-- Status code distribution
SELECT
status,
count() as requests,
count() * 100.0 / sum(count()) OVER () as percentage
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY status;
-- Traffic by hour
SELECT
toStartOfHour(timestamp) as hour,
count() as requests,
uniq(address) as unique_visitors
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY hour
ORDER BY hour;

-- High error rate alert
SELECT count() as error_count
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 5 MINUTE
AND status >= 500;
-- Slow response time alert
SELECT count() as slow_requests
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 5 MINUTE
AND response_time > 1000; -- > 1 second (response_time is in milliseconds)

# Build and test
cd tempesta/logger
make build test