
Flink kafka consumer stopped committing offsets


Hi,

We have a Flink streaming job that uses the Flink Kafka consumer. Normally it commits consumer offsets to Kafka.

However, this job ended up in a state where it is otherwise working just fine but no longer commits offsets to Kafka. It keeps writing correct aggregation results to the sink, though. At the time of writing, the job has been running for 14 hours without committing offsets.

Below is an extract from taskmanager.log. As you can see, after the startup lines from 2018-06-06 10:01 it didn't log anything until ~2018-06-07 22:08. That is also where the log ends; these are the last lines so far.

Could you help check whether this is a known bug, possibly already fixed, or something new?

I'm using a self-built Flink 1.5-SNAPSHOT package, built from Flink commit 8395508b0401353ed07375e22882e7581d46ac0e, which is not very old.

Cheers,
Juho

2018-06-06 10:01:33,498 INFO  org.apache.kafka.common.utils.AppInfoParser                   - Kafka version : 0.10.2.1
2018-06-06 10:01:33,498 INFO  org.apache.kafka.common.utils.AppInfoParser                   - Kafka commitId : e89bffd6b2eff799
2018-06-06 10:01:33,560 INFO  org.apache.kafka.clients.consumer.internals.AbstractCoordinator  - Discovered coordinator my-kafka-host-10-1-16-97.cloud-internal.mycompany.com:9092 (id: 2147483550 rack: null) for group aggregate-all_server_measurements_combined-20180606-1000.
2018-06-06 10:01:33,563 INFO  org.apache.kafka.clients.consumer.internals.AbstractCoordinator  - Discovered coordinator my-kafka-host-10-1-16-97.cloud-internal.mycompany.com:9092 (id: 2147483550 rack: null) for group aggregate-all_server_measurements_combined-20180606-1000.
2018-06-07 22:08:28,773 INFO  org.apache.kafka.clients.consumer.internals.AbstractCoordinator  - Marking the coordinator my-kafka-host-10-1-16-97.cloud-internal.mycompany.com:9092 (id: 2147483550 rack: null) dead for group aggregate-all_server_measurements_combined-20180606-1000
2018-06-07 22:08:28,776 WARN  org.apache.kafka.clients.consumer.internals.ConsumerCoordinator  - Auto-commit of offsets {topic1-2=OffsetAndMetadata{offset=12300395550, metadata=''}, topic1-18=OffsetAndMetadata{offset=12299210444, metadata=''}, topic3-0=OffsetAndMetadata{offset=5064277287, metadata=''}, topic4-6=OffsetAndMetadata{offset=5492398559, metadata=''}, topic2-1=OffsetAndMetadata{offset=89817267, metadata=''}, topic1-10=OffsetAndMetadata{offset=12299742352, metadata=''}} failed for group aggregate-all_server_measurements_combined-20180606-1000: Offset commit failed with a retriable exception. You should retry committing offsets.
2018-06-07 22:08:29,840 INFO  org.apache.kafka.clients.consumer.internals.AbstractCoordinator  - Marking the coordinator my-kafka-host-10-1-16-97.cloud-internal.mycompany.com:9092 (id: 2147483550 rack: null) dead for group aggregate-all_server_measurements_combined-20180606-1000
2018-06-07 22:08:29,841 WARN  org.apache.kafka.clients.consumer.internals.ConsumerCoordinator  - Auto-commit of offsets {topic1-6=OffsetAndMetadata{offset=12298347875, metadata=''}, topic4-2=OffsetAndMetadata{offset=5492779112, metadata=''}, topic1-14=OffsetAndMetadata{offset=12299972108, metadata=''}} failed for group aggregate-all_server_measurements_combined-20180606-1000: Offset commit failed with a retriable exception. You should retry committing offsets.
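
To compare the offsets in the warnings above against what the broker currently reports for the group, something like the following rough Python sketch could pull the partitions and offsets out of each failed auto-commit line. The function name and the regular expression are assumptions based only on the line format shown in this extract; they are not part of Kafka or Flink.

```python
import re

# Matches entries such as: topic1-2=OffsetAndMetadata{offset=12300395550, metadata=''}
# (pattern is an assumption derived from the log lines in this post)
ENTRY_RE = re.compile(r"([\w.\-]+)-(\d+)=OffsetAndMetadata\{offset=(\d+),")


def failed_autocommit_offsets(line):
    """Return {(topic, partition): offset} for one failed auto-commit warning.

    Lines that are not 'Auto-commit of offsets ... failed' warnings yield {}.
    """
    if "Auto-commit of offsets" not in line or "failed" not in line:
        return {}
    return {
        (topic, int(partition)): int(offset)
        for topic, partition, offset in ENTRY_RE.findall(line)
    }


if __name__ == "__main__":
    import sys

    # Usage: python parse_offsets.py < taskmanager.log
    for log_line in sys.stdin:
        for (topic, partition), offset in sorted(failed_autocommit_offsets(log_line).items()):
            print(f"{topic}-{partition}\t{offset}")
```

Running it over taskmanager.log prints one tab-separated `topic-partition` and offset pair per entry, which can then be checked by hand against the committed offsets of the consumer group.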