Re: Writing stream to Hadoop


I think this comment from the BucketingSink Javadoc explains what you are seeing, thanks.
* <p>Part files can be in one of three states: {@code in-progress}, {@code pending} or {@code finished}.
* The reason for this is how the sink works together with the checkpointing mechanism to provide exactly-once
* semantics and fault-tolerance. The part file that is currently being written to is {@code in-progress}. Once
* a part file is closed for writing it becomes {@code pending}. When a checkpoint is successful the currently
* pending files will be moved to {@code finished}.
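
To make the three states concrete, here is a small self-contained Java sketch of the lifecycle that comment describes. It is only a toy model, not Flink's implementation: PartFileTracker, PartFile and the method names are made up for illustration, and it simply follows the wording of the comment above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the part-file lifecycle from the comment above; hypothetical, not Flink code. */
class PartFileTracker {
    enum State { IN_PROGRESS, PENDING, FINISHED }

    static final class PartFile {
        final String name;
        State state = State.IN_PROGRESS; // every new part file starts in-progress
        PartFile(String name) { this.name = name; }
    }

    private final List<PartFile> files = new ArrayList<>();

    /** Opens a new part file and tracks it. */
    PartFile open(String name) {
        PartFile f = new PartFile(name);
        files.add(f);
        return f;
    }

    /** Closing a file for writing only moves it to pending, not to finished. */
    void close(PartFile f) {
        if (f.state == State.IN_PROGRESS) {
            f.state = State.PENDING;
        }
    }

    /** A successful checkpoint is what promotes pending files to finished. */
    void onCheckpointComplete() {
        for (PartFile f : files) {
            if (f.state == State.PENDING) {
                f.state = State.FINISHED;
            }
        }
    }
}
```

In this model a closed file stays pending until onCheckpointComplete() is called, and a file that was never closed stays in-progress. In a real job, checkpoints only happen if checkpointing is enabled on the execution environment, for example env.enableCheckpointing(60000) (the interval here is just an example value), so that is worth checking first.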

2018-06-05 17:14 GMT+08:00 miki haiat <miko5054@xxxxxxxxx>:
I'm trying to write some data to Hadoop using this code:

The state backend is set without time    
StateBackend sb = new FsStateBackend("hdfs://***:9000/flink/my_city/checkpoints");
env.setStateBackend(sb);
BucketingSink<Tuple2<IntWritable, Text>> sink =
    new BucketingSink<>("hdfs://****:9000/mycity/raw");
sink.setBucketer(new DateTimeBucketer<>("yyyy-MM-dd--HHmm"));
// check for inactive buckets every 2 minutes
sink.setInactiveBucketCheckInterval(120000);
// close buckets that have received no data for 2 minutes
sink.setInactiveBucketThreshold(120000);
The result is that all the files are stuck in in-progress status and are never closed.
Is it related to the state backend configuration?

thanks,

Miki