MAX_POLL_RECORDS_CONFIG in Kafka

The default value of MAX_POLL_RECORDS_CONFIG in Apache Kafka is:

500


What is MAX_POLL_RECORDS_CONFIG?

MAX_POLL_RECORDS_CONFIG is the constant in the Java client's ConsumerConfig class for the consumer property max.poll.records. It sets the maximum number of records returned by a single call to the poll() method.


Key Details:

  1. Default Behavior:

    • If not configured, a single poll() call returns at most 500 records.
    • This caps the batch handed to the application, not the fetch from the broker: the consumer buffers fetched records and returns them incrementally across poll() calls.
  2. Adjusting the Value:

    • Smaller values (e.g., 10): Useful for low-latency processing or when you want to process fewer records at a time.
    • Larger values (e.g., 1000): Useful for batch processing or applications designed to handle high throughput.
  3. Use Case Scenarios:

    • High throughput: Increase the value to reduce the number of poll() calls needed for the same volume of records.
    • Low latency or real-time processing: Decrease the value so each batch finishes sooner and the consumer returns to poll() more often.
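
To make the capping behavior concrete, here is a minimal sketch that imitates it with plain Java collections. BufferedPollSketch and its methods are hypothetical names invented for illustration; this is not the Kafka client and it does not talk to a broker. It only shows how a cap of 500 splits 1200 already-buffered records across successive poll() calls.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a consumer's internal record buffer; not part of the Kafka API.
final class BufferedPollSketch {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final int maxPollRecords;

    BufferedPollSketch(int maxPollRecords) {
        if (maxPollRecords < 1) {
            throw new IllegalArgumentException("maxPollRecords must be at least 1");
        }
        this.maxPollRecords = maxPollRecords;
    }

    // Simulates records arriving from a broker fetch.
    void fetched(List<String> records) {
        buffer.addAll(records);
    }

    // Returns at most maxPollRecords records; the rest stay buffered for the next call.
    List<String> poll() {
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxPollRecords && !buffer.isEmpty()) {
            batch.add(buffer.poll());
        }
        return batch;
    }

    public static void main(String[] args) {
        BufferedPollSketch consumer = new BufferedPollSketch(500);
        List<String> incoming = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            incoming.add("record-" + i);
        }
        consumer.fetched(incoming);
        // 1200 buffered records with a cap of 500 are drained in three calls.
        System.out.println(consumer.poll().size()); // 500
        System.out.println(consumer.poll().size()); // 500
        System.out.println(consumer.poll().size()); // 200
    }
}
```

With a cap of 10 instead of 500, the same 1200 records would take 120 calls, which is the trade-off between batch size and number of poll() calls described above.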

Example Configuration

Java Configuration:

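import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
Properties props = new Properties();
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100"); // sets max.poll.records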

Spring Boot Configuration (application.yml):

spring:
  kafka:
    consumer:
      max-poll-records: 100

Impact of MAX_POLL_RECORDS_CONFIG

  1. Processing Time:

    • Higher values mean more records to process, potentially increasing processing time per poll.
    • If processing a batch takes longer than max.poll.interval.ms (default 300000 ms, or 5 minutes), the consumer is considered failed and the group rebalances.
  2. Resource Utilization:

    • Larger batch sizes require more memory and CPU for processing.
    • Adjust the value based on your application's capacity.
  3. Throughput:

    • Increasing the value can improve throughput by spreading per-poll overhead over more records, but each record may wait longer behind the rest of its batch.
    • Useful for batch workloads where latency isn’t critical.
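
The processing-time risk can be estimated with simple arithmetic: a full batch takes roughly max.poll.records multiplied by the per-record processing time. The sketch below assumes records are processed one at a time at a constant cost, which real workloads only approximate. PollBudgetCheck and its method names are hypothetical helpers, not Kafka APIs.

```java
// Hypothetical helper for a back-of-the-envelope check; not part of the Kafka API.
final class PollBudgetCheck {

    // Worst-case time to process one full batch, in milliseconds.
    static long worstCaseBatchMillis(int maxPollRecords, long perRecordMillis) {
        return Math.multiplyExact((long) maxPollRecords, perRecordMillis);
    }

    // True when a full batch finishes before max.poll.interval.ms elapses.
    static boolean fitsInterval(int maxPollRecords, long perRecordMillis, long maxPollIntervalMillis) {
        return worstCaseBatchMillis(maxPollRecords, perRecordMillis) < maxPollIntervalMillis;
    }

    public static void main(String[] args) {
        long intervalMs = 300_000L; // Kafka's default max.poll.interval.ms (5 minutes)
        // 500 records * 100 ms = 50 s, well inside the interval.
        System.out.println(fitsInterval(500, 100, intervalMs));  // true
        // 5000 records * 100 ms = 500 s, which exceeds the interval and risks a rebalance.
        System.out.println(fitsInterval(5000, 100, intervalMs)); // false
    }
}
```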

Tuning Recommendations:

  • Start with the default value (500).
  • Adjust based on:
    • Application processing time.
    • Memory and CPU constraints.
    • Desired latency and throughput trade-offs.
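
One way to turn these factors into a starting number is to work backwards from the poll interval. The sketch below is a rough heuristic, not an official formula: it assumes you have measured an average per-record processing time and want a full batch to use only a fraction of max.poll.interval.ms. The class name, the method name, and the 0.5 safety fraction are illustrative assumptions.

```java
// Hypothetical tuning helper; the heuristic and names are assumptions, not Kafka APIs.
final class MaxPollRecordsEstimate {

    // Largest batch size whose estimated processing time uses at most
    // safetyFraction of max.poll.interval.ms. The result is never below 1.
    static int suggest(long maxPollIntervalMillis, long perRecordMillis, double safetyFraction) {
        if (maxPollIntervalMillis <= 0 || perRecordMillis <= 0) {
            throw new IllegalArgumentException("interval and per-record time must be positive");
        }
        if (safetyFraction <= 0 || safetyFraction > 1) {
            throw new IllegalArgumentException("safetyFraction must be in (0, 1]");
        }
        long budgetMillis = (long) (maxPollIntervalMillis * safetyFraction);
        long records = budgetMillis / perRecordMillis;
        return (int) Math.max(1L, Math.min(records, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // 300000 ms * 0.5 = 150000 ms budget; at 200 ms per record that allows 750 records.
        System.out.println(suggest(300_000L, 200L, 0.5));   // 750
        // At 2000 ms per record the same budget allows only 75 records.
        System.out.println(suggest(300_000L, 2_000L, 0.5)); // 75
    }
}
```

Treat the result as an upper bound to test against, then adjust it using observed memory use, CPU load, and latency.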

