medium-post-resources/prioritizing_service_availability.md at main · tal-asulin/medium-post-resources

| Key consideration | Availability | Durability |
| --- | --- | --- |
| Service availability | Service availability is not affected. | Kafka producers route their messages to the remaining healthy partitions. In severe cases, such as a topic with very few partitions, producers might stop producing data entirely. Kafka consumers will most likely halt until the offline partitions come back online, which can increase consumer lag. |
| Data loss | Data at offsets that were not yet synced will most likely be lost. In the worst case, the entire partition log might be lost. | Durability is guaranteed as long as the partition logs live on persistent storage (such as network storage). Even so, producers might lose data while they are temporarily unable to write to Kafka, and their in-memory queues might overflow. |
| Data skew | No data skew. | Because the service keeps producing messages to the healthy partitions only, data skew is likely. |
| Consumer lag | Lag should not occur, except momentarily while the consumer group rebalances to reassign partitions. | Kafka allows consuming from healthy partitions while other partitions are offline, but some client libraries stop consuming until all partitions are online again. |
| Degraded throughput rate | No effect on throughput, as long as the cluster's resource capacity is not impacted. | Throughput mostly depends on the partitioner algorithm in use. The chance of impact is higher with a sticky partitioner. |
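
To make the "Data skew" row more concrete, here is a minimal sketch of how load concentrates on the healthy partitions when some partitions are offline. It is a toy model only: the function names `distribute` and `skew_ratio` are hypothetical, it assumes keyless messages spread round-robin over the partitions that are still available, and it does not reproduce the behaviour of any real Kafka client partitioner.

```python
from collections import Counter
from itertools import cycle


def distribute(num_messages, num_partitions, offline=frozenset()):
    """Spread messages round-robin over the partitions that are not offline.

    Hypothetical helper: a simplified stand-in for a producer that only
    writes to healthy partitions.
    """
    healthy = [p for p in range(num_partitions) if p not in offline]
    if not healthy:
        # Mirrors the "producers might stop producing entirely" case.
        raise RuntimeError("all partitions are offline; nothing can be produced")

    # Start every partition at zero so offline partitions show up in the result.
    counts = Counter({p: 0 for p in range(num_partitions)})
    targets = cycle(healthy)
    for _ in range(num_messages):
        counts[next(targets)] += 1
    return counts


def skew_ratio(counts):
    """Busiest partition's load divided by the mean load over all partitions."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean


if __name__ == "__main__":
    balanced = distribute(1200, 6)
    degraded = distribute(1200, 6, offline={4, 5})
    print(dict(balanced), skew_ratio(balanced))
    print(dict(degraded), skew_ratio(degraded))
```

With six partitions and two of them offline, the four healthy partitions each absorb 300 of the 1,200 messages instead of 200, so the busiest partition carries 1.5 times the average load. Consumers assigned to those partitions see the corresponding extra work, which is the skew the table refers to.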
