MAX_POLL_RECORDS_CONFIG in Kafka

The default value of MAX_POLL_RECORDS_CONFIG in Apache Kafka is:

500


What is MAX_POLL_RECORDS_CONFIG?

MAX_POLL_RECORDS_CONFIG is the ConsumerConfig constant for the max.poll.records consumer property. It sets the maximum number of records returned by a single call to the poll() method.


Key Details:

  1. Default Behavior:

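    • If not configured, a single poll() call returns at most 500 records.
    • This caps the batch handed to your application per poll. It does not change how much data the consumer fetches from the broker, which is governed by settings such as fetch.max.bytes and max.partition.fetch.bytes.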
  2. Adjusting the Value:

    • Smaller values (e.g., 10): Useful for low-latency processing or when you want to process fewer records at a time.
    • Larger values (e.g., 1000): Useful for batch processing or applications designed to handle high throughput.
  3. Use Case Scenarios:

    • High throughput: Increase the value to minimize the number of poll() calls.
    • Low latency or real-time processing: Decrease the value to process smaller chunks faster.
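
To make the batch cap concrete, here is a minimal sketch of a toy model (not Kafka client code). It assumes a fixed number of records is already buffered on the consumer side and shows the batch sizes that successive poll() calls would hand back under a given cap. The class and method names (PollCapDemo, batchSizes) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PollCapDemo {

    /**
     * Toy model: splits `buffered` already-fetched records into the batch
     * sizes that successive poll() calls would return when each call is
     * capped at `maxPollRecords`.
     */
    static List<Integer> batchSizes(int buffered, int maxPollRecords) {
        if (buffered < 0 || maxPollRecords < 1) {
            throw new IllegalArgumentException("buffered must be >= 0 and maxPollRecords >= 1");
        }
        List<Integer> sizes = new ArrayList<>();
        int remaining = buffered;
        while (remaining > 0) {
            // Each simulated poll returns at most maxPollRecords records.
            int batch = Math.min(remaining, maxPollRecords);
            sizes.add(batch);
            remaining -= batch;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // 1200 buffered records with the default cap of 500.
        System.out.println(batchSizes(1200, 500)); // prints [500, 500, 200]
    }
}
```

With a cap of 10 the same 1200 records would take 120 polls, which is the trade-off the list above describes: smaller batches reach your code sooner, at the cost of more poll() calls.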

Example Configuration

Java Configuration:

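import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
Properties props = new Properties();
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100"); // property key: max.poll.records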

Spring Boot Configuration (application.yml):

spring:
  kafka:
    consumer:
      max-poll-records: 100

Impact of MAX_POLL_RECORDS_CONFIG

  1. Processing Time:

    • Higher values mean more records to process, potentially increasing processing time per poll.
    • If processing a batch takes longer than max.poll.interval.ms (default 300000 ms, i.e. 5 minutes), the consumer is considered failed and leaves the group, which triggers a rebalance.
  2. Resource Utilization:

    • Larger batches can increase the memory and CPU your application needs to process each poll.
    • Adjust the value based on your application's capacity.
  3. Throughput:

    • Increasing the value can improve throughput by reducing per-poll overhead, but records later in a large batch wait longer before they are processed.
    • Useful for batch workloads where latency isn’t critical.
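
The processing-time constraint above can be turned into a rough sizing rule: the time to process one full batch should stay well under max.poll.interval.ms. The sketch below is one way to estimate an upper bound for max.poll.records. The helper name (maxSafePollRecords), the average per-record time, and the safety factor are all assumptions for illustration; measure your own processing time before relying on any number it produces.

```java
public class PollBudget {

    /**
     * Estimates the largest max.poll.records value whose batch should finish
     * within a fraction (safetyFactor) of max.poll.interval.ms, given an
     * average per-record processing time. Always returns at least 1.
     */
    static int maxSafePollRecords(long maxPollIntervalMs, double avgRecordMs, double safetyFactor) {
        if (maxPollIntervalMs <= 0 || avgRecordMs <= 0 || safetyFactor <= 0 || safetyFactor > 1) {
            throw new IllegalArgumentException(
                "interval and record time must be positive; safetyFactor must be in (0, 1]");
        }
        // Time budget for one batch, leaving headroom for slow records and GC pauses.
        double budgetMs = maxPollIntervalMs * safetyFactor;
        long records = (long) Math.floor(budgetMs / avgRecordMs);
        // Clamp to a valid int and never go below one record per poll.
        return (int) Math.max(1, Math.min(records, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // 300000 ms is the default max.poll.interval.ms; 200 ms per record
        // and a 50% safety margin are hypothetical figures.
        System.out.println(maxSafePollRecords(300_000L, 200.0, 0.5)); // prints 750
    }
}
```

If the estimate comes out below the value you want, the alternatives are to speed up per-record processing or to raise max.poll.interval.ms, at the cost of slower detection of stalled consumers.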

Tuning Recommendations:

  • Start with the default value (500).
  • Adjust based on:
    • Application processing time.
    • Memory and CPU constraints.
    • Desired latency and throughput trade-offs.
