MAX_POLL_RECORDS_CONFIG in Kafka

The default value of MAX_POLL_RECORDS_CONFIG in Apache Kafka is:

500


What is MAX_POLL_RECORDS_CONFIG?

The MAX_POLL_RECORDS_CONFIG constant (the max.poll.records property) sets the maximum number of records that a Kafka consumer returns from a single call to the poll() method.


Key Details:

  1. Default Behavior:

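    • If not configured, each poll() call returns at most 500 records.
    • This caps the batch handed to the application, which helps control how much data is processed at a time. It does not change how much data the consumer fetches from the brokers; fetched records are buffered and returned across successive poll() calls.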
  2. Adjusting the Value:

    • Smaller values (e.g., 10): Useful for low-latency processing or when you want to process fewer records at a time.
    • Larger values (e.g., 1000): Useful for batch processing or applications designed to handle high throughput.
  3. Use Case Scenarios:

    • High throughput: Increase the value to minimize the number of poll() calls.
    • Low latency or real-time processing: Decrease the value to process smaller chunks faster.
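
To make the batch limit concrete, here is a minimal sketch in plain Java that models how a cap on poll() hands buffered records to the application in slices. PollBatchSketch and its fetched() method are hypothetical stand-ins for illustration, not Kafka classes, and the model ignores partitions, offsets, and timeouts.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a consumer's record buffer; not part of the Kafka client.
final class PollBatchSketch {
    private final Deque<String> buffered = new ArrayDeque<>();
    private final int maxPollRecords;

    PollBatchSketch(int maxPollRecords) {
        if (maxPollRecords < 1) {
            throw new IllegalArgumentException("maxPollRecords must be >= 1");
        }
        this.maxPollRecords = maxPollRecords;
    }

    // Simulates records arriving from a fetch; the cap does not apply here.
    void fetched(List<String> records) {
        buffered.addAll(records);
    }

    // Returns at most maxPollRecords records; the rest stay buffered for the next call.
    List<String> poll() {
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxPollRecords && !buffered.isEmpty()) {
            batch.add(buffered.removeFirst());
        }
        return batch;
    }

    public static void main(String[] args) {
        PollBatchSketch consumer = new PollBatchSketch(500);
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            records.add("record-" + i);
        }
        consumer.fetched(records);
        System.out.println(consumer.poll().size()); // 500
        System.out.println(consumer.poll().size()); // 500
        System.out.println(consumer.poll().size()); // 200
    }
}
```

With 1200 buffered records and a cap of 500, the application sees three batches of 500, 500, and 200 records. A smaller cap gives more, shorter processing rounds; a larger cap gives fewer, longer ones.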

Example Configuration

Java Configuration:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

Properties props = new Properties();
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");

Spring Boot Configuration (application.yml):

spring:
  kafka:
    consumer:
      max-poll-records: 100

Impact of MAX_POLL_RECORDS_CONFIG

  1. Processing Time:

    • Higher values mean more records to process, potentially increasing processing time per poll.
    • If processing a batch takes longer than max.poll.interval.ms (300000 ms, i.e. 5 minutes, by default), the consumer is considered failed and removed from the group, which triggers a rebalance.
  2. Resource Utilization:

    • Larger batch sizes require more memory and CPU for processing.
    • Adjust the value based on your application's capacity.
  3. Throughput:

    • Increasing the value can improve throughput by reducing per-poll overhead, but records late in a large batch wait longer before they are processed.
    • Useful for batch workloads where latency isn’t critical.
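
The relationship between batch size and the max.poll.interval.ms timeout comes down to simple arithmetic. The sketch below estimates the largest batch whose processing time fits within a chosen fraction of the interval. PollBudgetSketch, its method name, and the safety fraction are assumptions made for illustration; the per-record processing time is something you would have to measure in your own application.

```java
// Hypothetical helper for back-of-the-envelope sizing; not part of the Kafka client.
final class PollBudgetSketch {

    // Largest batch size whose estimated processing time stays within
    // safetyFraction * maxPollIntervalMs. Never returns less than 1.
    static int maxRecordsWithinInterval(long maxPollIntervalMs,
                                        double perRecordMs,
                                        double safetyFraction) {
        if (maxPollIntervalMs <= 0 || perRecordMs <= 0
                || safetyFraction <= 0 || safetyFraction > 1) {
            throw new IllegalArgumentException("arguments out of range");
        }
        double budgetMs = maxPollIntervalMs * safetyFraction;
        long records = (long) Math.floor(budgetMs / perRecordMs);
        return (int) Math.max(1L, Math.min(records, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // Default interval of 300000 ms, 50 ms per record, half the interval as budget.
        System.out.println(maxRecordsWithinInterval(300_000L, 50.0, 0.5)); // 3000
    }
}
```

Under these assumed numbers the default of 500 leaves plenty of headroom. If each record took 1000 ms instead, the same budget would allow only 150 records per poll, so the default would risk a rebalance.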

Tuning Recommendations:

  • Start with the default value (500).
  • Adjust based on:
    • Application processing time.
    • Memory and CPU constraints.
    • Desired latency and throughput trade-offs.

