[jira] [Created] (FLINK-9450) Job hangs if S3 access is denied during checkpoints

Elias Levy created FLINK-9450:

             Summary: Job hangs if S3 access is denied during checkpoints
                 Key: FLINK-9450
                 URL: https://issues.apache.org/jira/browse/FLINK-9450
             Project: Flink
          Issue Type: Bug
          Components: State Backends, Checkpointing
    Affects Versions: 1.4.2
            Reporter: Elias Levy

We have a streaming job that consumes from and writes to Kafka.  The job is configured to checkpoint to S3.  If we deny access to S3 by using iptables on the TM host to block all outgoing connections to ports 80 and 443, the job soon stops publishing to Kafka.  This happens whether the rule uses DROP or REJECT, and, for REJECT, whether it uses {{--reject-with tcp-reset}} or {{--reject-with icmp-port-unreachable}}.
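
For anyone trying to reproduce this, the rules described above could look roughly like the sketch below. The exact commands we ran are not quoted here, so the use of the OUTPUT chain, the {{multiport}} match, and the helper names {{s3_rule}}, {{block_s3}} and {{unblock_s3}} are assumptions made for illustration. By default the script only prints the iptables commands; set RUN to an empty string and run as root to actually apply them.

```shell
#!/bin/sh
# Sketch: block outgoing connections to ports 80 and 443 on a TaskManager host.
# Dry run by default: commands are printed, not executed.
# Run with RUN="" (as root) to apply them for real.
RUN=${RUN-echo}

# s3_rule ACTION MODE
#   ACTION: -A to append the rule, -D to delete it
#   MODE:   drop | tcp-reset | icmp
# (hypothetical helper, not part of Flink or iptables)
s3_rule() {
    case "$2" in
        drop)      target="-j DROP" ;;
        tcp-reset) target="-j REJECT --reject-with tcp-reset" ;;
        icmp)      target="-j REJECT --reject-with icmp-port-unreachable" ;;
        *) echo "usage: s3_rule -A|-D drop|tcp-reset|icmp" >&2; return 1 ;;
    esac
    # $target is left unquoted on purpose so it splits into separate arguments.
    $RUN iptables "$1" OUTPUT -p tcp -m multiport --dports 80,443 $target
}

block_s3()   { s3_rule -A "$1"; }   # deny access, as in the failing scenario
unblock_s3() { s3_rule -D "$1"; }   # remove the same rule again
```

For example, {{block_s3 tcp-reset}} followed later by {{unblock_s3 tcp-reset}} covers one of the three variants. Note that these rules cut off all outgoing HTTP and HTTPS traffic from the host, not only traffic to S3.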

This happens regardless of whether the Kafka sources have {{setCommitOffsetsOnCheckpoints}} set to true or false.

The system is configured to use Presto for the S3 file system.  The job has a small amount of state, so it is configured to use {{FsStateBackend}} with asynchronous snapshots.

If the iptables rules are removed, the job resumes functioning.

I would expect the job to either fail or continue running if a checkpoint fails.
