
Conversation

@d-Rickyy-b
Owner

This pull request introduces architectural improvements to the broadcast system, adds support for external stream processing (Kafka and NSQ), updates configuration and dependencies, and refines linter and build settings. The main focus is refactoring the certificate broadcast mechanism to be more modular and extensible, enabling integration with various client types and external systems.

Broadcast System Refactor and Extensibility

  • Refactored the broadcast system by introducing a new Dispatcher (formerly BroadcastManager) in internal/broadcast/broadcastmanager.go, which now manages clients via a CertProcessor interface, allowing for different client implementations (e.g., WebSocket, Kafka, NSQ). The dispatcher handles registration, unregistration, message dispatching, and client statistics.
  • Added a BaseClient struct that encapsulates common client logic (buffering, skipping, naming, etc.) and can be embedded in different client types; a rough sketch follows below.
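
As a rough illustration of how these pieces might fit together, here is a minimal sketch. The CertProcessor method set is an assumption made for the example (the PR may define it differently); the BaseClient fields mirror the KafkaClient constructor shown later in this conversation.

type CertProcessor interface {
    // Handle hands a certificate to the client; method names are assumed.
    Handle(cert []byte)
    Name() string
}

// BaseClient carries the shared state (buffer, stop signal, naming) and is
// embedded by each concrete client type (WebSocket, Kafka, NSQ).
type BaseClient struct {
    broadcastChan chan []byte
    stopChan      chan struct{}
    name          string
    subType       string
}

func (b *BaseClient) Name() string { return b.name }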

Configuration and External Integration

  • Extended config.yaml to support external stream processing tools (Kafka and NSQ), added options for real client IP handling, and improved buffer size configuration (the broadcastmanager buffer option was renamed to dispatcher).

Dependency and Build Updates

  • Standardized Docker Compose port mappings by quoting host:port strings for compatibility.
  • Changed Goreleaser and changelog settings for improved build consistency.

Linter and Code Quality Enhancements

  • Overhauled .golangci.yml to use the new v2 format, refined enabled linters, moved formatter settings, and improved exclusion rules for test files.

Bug Fixes and Minor Improvements

  • Fixed a context leak in cmd/certpicker/main.go by properly canceling the context after use.
  • Updated Go module dependencies.

Comment on lines +102 to +112
case message := <-c.broadcastChan:
    if !c.isConnected {
        continue
    }

    _ = c.conn.SetWriteDeadline(time.Now().Add(writeWait))

    c.conn.Broker()
    _, err := c.conn.WriteMessages(
        kafka.Message{Value: message},
    )
Contributor

We receive a lot of certs every second (3/second is not a bad guess). Would it perhaps be more efficient and less resource-intensive to bulk-write messages to Kafka?

example:

_, err = conn.WriteMessages(
    kafka.Message{Value: message1},
    kafka.Message{Value: message2},
    kafka.Message{Value: message3},
)

Owner Author

I thought about bulk submissions before. We could buffer the certs for, say, a second and then send out all received certs. With the CT logs from the Google loglist, Certstream itself processes around 250-300 certs per second in total. I have not tried this but there might be problems with this approach as submission of the ~300 or more certificates could take longer than our "buffering time" of 1s.

Another approach could be to always wait until we collected, say, 50 certificates before sending them over to kafka. The issue with that approach is the added delay in case a monitored CA does not release that many certificates. I'm interested in your thoughts though.

Contributor

  1. 300/sec means we should strongly consider buffering certs before pushing them to Kafka.
  2. I'm in favour of the second approach. We could buffer 50 certs, but we need not wait until all 50 have arrived; a timer/ticker could fire at certain intervals and push whatever is in the buffer to Kafka (basically a combination of the two approaches you outlined; see the sketch below).
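
For illustration, here is a rough sketch of that hybrid strategy, assuming the KafkaClient fields from this PR (broadcastChan, stopChan, conn) and the existing writeWait constant; the batch size of 50 and the one-second ticker are example values, not settled choices:

func (c *KafkaClient) runBatched() {
    const maxBatch = 50
    ticker := time.NewTicker(time.Second)
    defer ticker.Stop()

    batch := make([]kafka.Message, 0, maxBatch)

    flush := func() {
        if len(batch) == 0 {
            return
        }
        _ = c.conn.SetWriteDeadline(time.Now().Add(writeWait))
        if _, err := c.conn.WriteMessages(batch...); err != nil {
            log.Printf("kafka bulk write failed: %v", err)
        }
        batch = batch[:0]
    }

    for {
        select {
        case message := <-c.broadcastChan:
            batch = append(batch, kafka.Message{Value: message})
            if len(batch) >= maxBatch {
                flush()
            }
        case <-ticker.C:
            flush()
        case <-c.stopChan:
            flush()
            return
        }
    }
}

With this, a full batch goes out immediately under load, while a slow trickle of certs still gets flushed at least once per interval.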

Contributor

@messede-degod Sep 6, 2025

In any case (with or without buffering), do you think we might end up skipping a lot of certs due to the channel being full? (This might depend on how large certBufferSize is.)
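
For context, skipping typically results from a non-blocking send on the dispatcher side, roughly like the pattern below (an assumption about the implementation, not verbatim from the PR; the skip counter field is hypothetical):

select {
case client.broadcastChan <- certJSON:
    // delivered into the client's buffer (sized by certBufferSize)
default:
    // buffer full: the cert is dropped for this client
    client.skippedCerts++ // hypothetical counter
}

So whether many certs get skipped mostly depends on how fast the consumer drains broadcastChan relative to the incoming cert rate; certBufferSize only absorbs short bursts.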

Comment on lines +32 to +42
kc := &KafkaClient{
    conn:  conn,
    addr:  addr,
    topic: topic,
    BaseClient: BaseClient{
        broadcastChan: make(chan []byte, certBufferSize),
        stopChan:      make(chan struct{}),
        name:          name,
        subType:       subType,
    },
}
Contributor

We send a lot of JSON data to Kafka, which consumes a non-trivial amount of bandwidth. Should we perhaps consider turning on compression for Kafka messages (it could be an optional setting)?

Owner Author

should we perhaps consider turning on compression for kafka messages (could be a optional setting).

Sounds like a good idea. Should be easily achievable, too: https://pkg.go.dev/github.com/segmentio/kafka-go#readme-compression
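
For reference, a minimal sketch following that README section: compression is a field on kafka.Writer (the broker address, topic name, and certJSON payload here are placeholders). Since this PR currently writes through a raw kafka.Conn, it would either need to switch to a Writer or use the Conn's compressed-write variant instead.

w := &kafka.Writer{
    Addr:        kafka.TCP("localhost:9092"),
    Topic:       "certstream",
    Compression: kafka.Snappy, // could be made configurable (gzip, snappy, lz4, zstd)
}
defer w.Close()

err := w.WriteMessages(context.Background(),
    kafka.Message{Value: certJSON},
)
if err != nil {
    log.Printf("kafka write failed: %v", err)
}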

@github-actions

@d-Rickyy-b We couldn't find any modification to the CHANGELOG.md file. If your changes are not suitable for the changelog, that's fine. Otherwise please add them to the changelog!

@d-Rickyy-b
Owner Author

I rebased the commits onto master to have all the latest in-progress features in the kafka branch as well
