Hey there! Recently we began discussing improvements to our rolling-restart process for brokers, and we quickly turned to KafkaJS to explore its potential for monitoring under-replicated partitions during broker restarts.
Our approach focuses on checking two key conditions for each topic partition after we restart one of the brokers:
- If the number of current in-sync replicas (isr.length) for a partition is less than the configured minimum (min.insync.replicas), it indicates an under-replicated partition.
- If a partition has no leader (partition.leader < 0), it is also considered problematic.
Sharing a short snippet to give a bit of context. It's not the final code, but it helps get the idea across. I'm specifically referring to the areAllInSync function; the functions it uses are attached as well.
```typescript
async fetchTopicMetadata(): Promise<{ topics: KafkaJS.ITopicMetadata[] }> {
  return this.admin.fetchTopicMetadata();
}

configEntriesToMap(configEntries: KafkaJS.ConfigEntries[]): Map<string, string> {
  const configMap = new Map<string, string>();
  configEntries.forEach((config) => configMap.set(config.configName, config.configValue));
  return configMap;
}

async describeConfigs(topicMetadata: { topics: KafkaJS.ITopicMetadata[] }): Promise<Map<string, Map<string, string>>> {
  const topicConfigurationsByName = new Map<string, Map<string, string>>();
  const resources = topicMetadata.topics.map((topic: KafkaJS.ITopicMetadata) => ({
    type: Constants.Types.Topic,
    configName: [Constants.MinInSyncReplicas],
    name: topic.name,
  }));

  const rawConfigurations = await this.admin.describeConfigs({ resources, includeSynonyms: false });

  // Set the configurations by topic name for easier access
  rawConfigurations.resources.forEach((resource) =>
    topicConfigurationsByName.set(resource.resourceName, this.configEntriesToMap(resource.configEntries))
  );

  return topicConfigurationsByName;
}

async areAllInSync(): Promise<boolean> {
  const topicMetadata = await this.fetchTopicMetadata();
  const topicConfigurations = await this.describeConfigs(topicMetadata);

  // Flatten the replication metadata extracted from each partition of every topic into a single array
  const validationResults = topicMetadata.topics.flatMap((topic: KafkaJS.ITopicMetadata) =>
    topic.partitions.map((partition: PartitionMetadata) =>
      this.extractReplicationMetadata(topic.name, partition, topicConfigurations)
    )
  );

  const problematicPartitions = validationResults.filter((partition) => partition.isProblematic);
  ...
}
```
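The snippet above calls extractReplicationMetadata, which isn't shown here. To make the intent concrete, here is a minimal sketch of the check it is meant to perform, assuming it returns a small per-partition object with an isProblematic flag (the exact shape and the fallback default for min.insync.replicas are assumptions, not the final code):

```typescript
// Minimal sketch (assumed shape, not the final code): evaluates the two conditions
// described above for a single partition.
extractReplicationMetadata(
  topicName: string,
  partition: PartitionMetadata,
  topicConfigurations: Map<string, Map<string, string>>
): { topicName: string; partitionId: number; isProblematic: boolean } {
  // Assumption: fall back to "1" when min.insync.replicas is not present in the map
  const minISR = topicConfigurations.get(topicName)?.get(Constants.MinInSyncReplicas) ?? '1';

  return {
    topicName,
    partitionId: partition.partitionId,
    // Under-replicated (fewer ISRs than min.insync.replicas) or leaderless partition
    isProblematic: partition.isr.length < parseInt(minISR, 10) || partition.leader < 0,
  };
}
```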
I'd appreciate any feedback that could help validate whether our logic for identifying problematic partitions between broker restarts is correct. It currently relies on the condition partition.isr.length < parseInt(minISR) || partition.leader < 0.
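For context on how we intend to use this between restarts, the rough idea is to poll areAllInSync until the cluster looks healthy before moving on to the next broker. The helper name and retry parameters below are illustrative placeholders, not part of the actual code:

```typescript
// Illustrative sketch: poll areAllInSync() between broker restarts until there are
// no under-replicated or leaderless partitions, or give up after maxAttempts.
async function waitForAllInSync(
  monitor: { areAllInSync(): Promise<boolean> },
  maxAttempts = 30,
  delayMs = 10_000
): Promise<void> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await monitor.areAllInSync()) {
      return; // Safe to restart the next broker
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error('Partitions did not return to a fully in-sync state in time');
}
```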
@tulios @Nevon
Thanks in advance! 😃