Set internal replication factors to match default and min.insync.replicas #140
Conversation
kafka/10broker-config.yml
```diff
 num.partitions=1
+default.replication.factor=3
+offsets.topic.replication.factor=3
```
This property is set to 1 below at line 117.
Ah, that explains why we didn't get the documented default.
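A minimal sketch of the gotcha, assuming the broker config is parsed as Java properties, where the last occurrence of a duplicate key wins:

```
# earlier in 10broker-config.yml: the intended defaults
default.replication.factor=3
offsets.topic.replication.factor=3

# ...further down (around line 117), a leftover line silently wins,
# because Java properties keep the last value seen for a key
offsets.topic.replication.factor=1
```

Grouping all of these properties in one place, as suggested in the next comment, would make such a duplicate easy to spot.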
@shrinandj Given that the documented default is 3, and that this error hasn't been reported before, could it be that in your case, if a consumer was running when Kafka started for the first time, the topic was created with 1 replica because there was only one broker? Maybe then the config change in e78f1c5 has no effect? Or will Kafka refuse to create the topic if there is an explicit value of 3? I must find time to test this. May be relevant to #116. Update: didn't see your review comment above. It explains why we get 1. I've pushed 321189a. Maybe the gotcha can be alleviated by grouping all of these properties.
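One way to check what we actually got, assuming kafka-topics.sh is available inside a broker pod and the zookeeper service address used in this setup:

```
# Describe the offsets topic to see the replica count per partition;
# a topic auto-created while only one broker was live would show 1 replica.
./bin/kafka-topics.sh --zookeeper zookeeper:2181 \
  --describe --topic __consumer_offsets
```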
I tried to add a repro case in https://github.com/Yolean/kubernetes-kafka/compare/fix-offset-topic-replication...test-consumergroup?expand=1 but I think it'll be too complex for end-to-end testing this way, while it should be rather trivial to verify that all replicas are considered members of the group.
#108 is why I had three replicas for __consumer_offsets in the QA cluster I tested on now. It was created with 1 replica there too. The error message there looks a bit different from https://stackoverflow.com/questions/48536347/kafka-consumer-get-marking-the-coordinator-dead-error-when-using-group-ids, but the cause could be the same.
I also noticed now that our Kafka Streams meta topics have 1 replica. I wonder which of the settings governs those; they should match the default.replication.factor and min.insync.replicas that we've changed now.
Here's what we had in our new cluster: … Anyway, defaults are … I think I've also found the reason why our Kafka Streams topics have 1 replica: it's the replication.factor property. It isn't listed as a broker config, so probably it can only be set on clients.
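Hedged sketch only: if replication.factor really is client-side, a Streams application would have to set it in its own config, something like the following (the application id and bootstrap address are made up for illustration):

```
# Streams application config, not server.properties
application.id=my-streams-app          # hypothetical
bootstrap.servers=bootstrap.kafka:9092 # assumed service name
replication.factor=3                   # controls the app's internal topics
```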
I had been quite focused on … I can't find a metric for the configured topic replication factor, neither from #125 nor from #128. The JMX MBean …
It could be argued that applications must check for this as part of QA, but for Kafka Streams this is non-trivial: you typically have lower replication on test clusters, and in production you must provision the client with a setting that overrides the default. Hence I added a readiness "test" in e784bca that could spot the problem before application downtime monitoring does.
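Not the exact script from e784bca, just a sketch of the idea: fail readiness if the offsets topic reports fewer replicas than expected (topic name and expected factor are assumptions here).

```
#!/bin/bash
# Extract the topic's configured replication factor from the describe output
REPLICAS=$(./bin/kafka-topics.sh --zookeeper zookeeper:2181 \
  --describe --topic __consumer_offsets \
  | grep -o 'ReplicationFactor:[0-9]*' | cut -d: -f2)
# Report not-ready if the factor is below 3 (or the topic is missing)
[ "${REPLICAS:-0}" -ge 3 ] || exit 1
```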
I wanted to try to fix existing topics using Kafka Manager, but it turns out it can add partitions but not increase the replication factor: yahoo/CMAK#224. Will use our job instead, as in #108 (comment). This won't affect this PR, as #95 was designed for manual topic name changes.
"…may be impacting the producer clients, losing messages or causing back-pressure in the application. This is most often a 'site down' type of problem and will need to be addressed immediately."
— Neha Narkhede, Gwen Shapira, and Todd Palino, "Kafka: The Definitive Guide"
We now export kafka_controller_kafkacontroller_value{name="OfflinePartitionsCount",} and friends.
See #140 for why.
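For example, assuming a standard Prometheus setup scraping this exporter, an alert expression could be as simple as:

```
# any offline partition is a "site down" class problem
kafka_controller_kafkacontroller_value{name="OfflinePartitionsCount"} > 0
```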
reverting #140 to avoid "does not meet the required replication factor '3' for the offsets topic"
I am new to Kafka. I upgraded my cluster from v3.0 to v3.1 and hit this issue. I had run the jobs under the maintenance folder, but I had also run … In my Kafka Manager I observed that all my cluster topics were replicated to 3, but only … I could not find any clue and I had to delete the cluster.
@cemo See https://github.com/Yolean/kubernetes-kafka/tree/master/maintenance#increase-a-topics-replication-factor Kafka maintenance terminology isn't intuitive, IMO. Let us know if you find better tooling. There are many ways to use kafka-reassign-partitions.sh, owing to the interesting fact that you basically need to craft the reassignment JSON yourself. Look at the …
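For reference, a hedged sketch of crafting that JSON by hand; the topic, partitions and broker ids below are examples, not taken from this thread:

```
cat > increase-replication.json <<'EOF'
{"version":1,"partitions":[
  {"topic":"__consumer_offsets","partition":0,"replicas":[0,1,2]},
  {"topic":"__consumer_offsets","partition":1,"replicas":[1,2,0]}
]}
EOF
./bin/kafka-reassign-partitions.sh --zookeeper zookeeper:2181 \
  --reassignment-json-file increase-replication.json --execute
```

Note that the first broker in each replicas list becomes the preferred leader, so it's worth rotating that position across brokers.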
* Scales to 2 brokers + 3 zookeeper instances
* Same as default.replication.factor/min.insync.replicas
* Minimizes the cluster for use with for example Minikube
* Configures internal topics for single broker, reverting Yolean#140 to avoid "does not meet the required replication factor '3' for the offsets topic"
* Ksql rc (#1)
* Burrow's master now handles api v3
* Container fails to start, I see no logs
* This log config looks better, but makes no difference regarding start
Based on @shrinandj's find in #139.
As explained in #116 (comment) we'd like to keep min.insync.replicas=2.
What's odd is that the default, according to the docs, is 3. Also, I guess this change won't affect a running Kafka cluster.
This property isn't mentioned in https://kafka.apache.org/documentation/#prodconfig.
There was a change in 0.11: "The offsets.topic.replication.factor broker config is now enforced upon auto topic creation."
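For context on keeping min.insync.replicas=2: it only takes effect for producers that request full acknowledgement, so the durability guarantee is a combination of broker and client settings. A sketch:

```
# broker side (server.properties)
default.replication.factor=3
min.insync.replicas=2

# producer side (client config); without acks=all the broker
# won't reject writes when in-sync replicas drop below 2
acks=all
```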
This PR quite possibly needs a similar fix for:
* transaction.state.log.replication.factor
* config.storage.replication.factor
* status.storage.replication.factor

A similar fix may also apply to replication.factor.
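A sketch of what those fixes could look like, with values assumed to mirror the 3-broker setup here; note that config.storage.replication.factor and status.storage.replication.factor are Kafka Connect worker configs, so they'd go in the Connect config rather than in server.properties:

```
# broker (server.properties)
transaction.state.log.replication.factor=3

# Kafka Connect distributed worker config (not server.properties)
config.storage.replication.factor=3
status.storage.replication.factor=3
```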