MAX_POLL_RECORDS_CONFIG in Kafka

The default value of MAX_POLL_RECORDS_CONFIG in Apache Kafka is:

500


What is MAX_POLL_RECORDS_CONFIG?

MAX_POLL_RECORDS_CONFIG is the ConsumerConfig constant for the consumer property max.poll.records. It sets the maximum number of records that a single call to the consumer's poll() method can return.


Key Details:

  1. Default Behavior:

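    • If not configured, each poll() call returns at most 500 records.
    • This caps the batch handed to your application per poll(). It does not change how much data the consumer fetches from the broker, which is governed by settings such as fetch.max.bytes and max.partition.fetch.bytes.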
  2. Adjusting the Value:

    • Smaller values (e.g., 10): Useful when each record is expensive to process or when you want the consumer to return to poll() more often.
    • Larger values (e.g., 1000): Useful for batch processing where per-record work is cheap and the overhead of each poll() call matters.
  3. Use Case Scenarios:

    • High throughput: Increase the value so the same number of records is consumed in fewer poll() calls.
    • Low latency or real-time processing: Decrease the value so each batch finishes sooner and no record waits behind a long batch.
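The relationship between the cap and the number of poll() calls can be sketched without a broker. The class below is a hypothetical stand-in for the consumer's internal buffer, not a Kafka API: it assumes the records have already been fetched and simply hands them out in chunks of at most maxPollRecords. A real poll() may return fewer records than the cap.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of how buffered records are handed out per poll(); not part of kafka-clients.
final class PollChunkingSketch {

    // Splits already-fetched records into the batches successive poll() calls
    // would return if each call returned as many records as the cap allows.
    static <T> List<List<T>> batches(List<T> buffered, int maxPollRecords) {
        requireValidCap(maxPollRecords);
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < buffered.size(); start += maxPollRecords) {
            int end = Math.min(start + maxPollRecords, buffered.size());
            result.add(new ArrayList<>(buffered.subList(start, end)));
        }
        return result;
    }

    // Minimum number of poll() calls needed to drain recordCount buffered records.
    static int pollCallsNeeded(int recordCount, int maxPollRecords) {
        requireValidCap(maxPollRecords);
        return (recordCount + maxPollRecords - 1) / maxPollRecords;
    }

    private static void requireValidCap(int maxPollRecords) {
        if (maxPollRecords < 1) {
            throw new IllegalArgumentException("maxPollRecords must be at least 1");
        }
    }
}
```

With 1,200 buffered records, a cap of 500 needs three calls (500, 500, and 200 records), while a cap of 100 needs twelve.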

Example Configuration

Java Configuration:

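// Imports needed: java.util.Properties, org.apache.kafka.clients.consumer.ConsumerConfig
Properties props = new Properties();
// The constant resolves to the property name "max.poll.records".
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");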

Spring Boot Configuration (application.yml):

spring:
  kafka:
    consumer:
      max-poll-records: 100
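Both snippets above end up setting the same consumer property, max.poll.records. As a minimal sketch that runs without kafka-clients on the classpath, the helper below uses the literal key and java.util.Properties only. The class and method names are hypothetical, and the lower bound of 1 reflects the validation Kafka applies to this setting.

```java
import java.util.Properties;

// Hypothetical helper; only the key name and the default of 500 come from Kafka itself.
final class MaxPollRecordsProps {
    static final String KEY = "max.poll.records";
    static final int KAFKA_DEFAULT = 500;

    // Builds a Properties object carrying only the max.poll.records setting.
    static Properties withMaxPollRecords(int value) {
        if (value < 1) {
            throw new IllegalArgumentException(KEY + " must be at least 1");
        }
        Properties props = new Properties();
        props.setProperty(KEY, Integer.toString(value));
        return props;
    }

    // Returns the configured value, or the Kafka default when the key is absent.
    static int effectiveMaxPollRecords(Properties props) {
        return Integer.parseInt(props.getProperty(KEY, Integer.toString(KAFKA_DEFAULT)));
    }
}
```

A real consumer would also need settings such as bootstrap.servers, group.id, and the deserializers before these properties could be passed to a KafkaConsumer.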

Impact of MAX_POLL_RECORDS_CONFIG

  1. Processing Time:

    • Higher values mean more records per batch, so the time between consecutive poll() calls grows.
    • If that time exceeds max.poll.interval.ms (default 300000 ms, i.e. 5 minutes), the consumer is considered failed and leaves the group, which triggers a rebalance.
  2. Resource Utilization:

    • Larger batches keep more records in application memory at once and need more CPU per processing cycle.
    • Adjust the value based on your application's capacity.
  3. Throughput:

    • Increasing the value can improve throughput by reducing per-poll overhead, but records near the end of a large batch wait longer before they are processed.
    • Useful for batch workloads where latency isn’t critical.
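The processing-time risk can be checked with simple arithmetic before changing the setting. The sketch below assumes a fixed, measured per-record processing cost, which real workloads rarely have, and its class and method names are hypothetical. It estimates the worst-case time for one full batch and compares it with max.poll.interval.ms.

```java
// Hypothetical estimator; assumes every record costs the same time to process.
final class PollBudgetSketch {

    // Worst-case time to process one full batch of maxPollRecords records.
    static long worstCaseBatchMs(int maxPollRecords, long perRecordMs) {
        return Math.multiplyExact((long) maxPollRecords, perRecordMs);
    }

    // True when a full batch finishes before max.poll.interval.ms elapses.
    static boolean fitsInterval(int maxPollRecords, long perRecordMs, long maxPollIntervalMs) {
        return worstCaseBatchMs(maxPollRecords, perRecordMs) < maxPollIntervalMs;
    }

    // Largest cap whose full batch uses at most safetyFraction of the interval (never below 1).
    static int largestSafeValue(long perRecordMs, long maxPollIntervalMs, double safetyFraction) {
        if (perRecordMs < 1 || maxPollIntervalMs < 1 || safetyFraction <= 0.0 || safetyFraction > 1.0) {
            throw new IllegalArgumentException("invalid timing inputs");
        }
        long budgetMs = (long) (maxPollIntervalMs * safetyFraction);
        long records = budgetMs / perRecordMs;
        return (int) Math.max(1L, Math.min(records, (long) Integer.MAX_VALUE));
    }
}
```

For example, at 20 ms per record the default of 500 records takes about 10 seconds per batch, well inside the 5-minute default interval. At 1 second per record the same batch would take 500 seconds and overrun it, and keeping to half the interval would mean a cap of 150.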

Tuning Recommendations:

  • Start with the default value (500).
  • Adjust based on:
    • Application processing time.
    • Memory and CPU constraints.
    • Desired latency and throughput trade-offs.

