OSDir


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Confusions About JDBCOutputFormat


Hi Hequn,

Establishing a connection for each batch write may also have idle connection problem, since we are not sure when the connection will be closed. We call flush() method when a batch is finished or  snapshot state, but what if the snapshot is not enabled and the batch size not reached before the connection is closed?

May be we could use a Timer to test the connection periodically and keep it alive. What do you think?

I will open a jira and try to work on that issue.

Best, 
wangsan



> On Jul 10, 2018, at 8:38 PM, Hequn Cheng <chenghequn@xxxxxxxxx> wrote:
> 
> Hi wangsan,
> 
> I agree with you. It would be kind of you to open a jira to check the problem.
> 
> For the first problem, I think we need to establish connection each time execute batch write. And, it is better to get the connection from a connection pool.
> For the second problem, to avoid multithread problem, I think we should synchronized the batch object in flush() method.
> 
> What do you think?
> 
> Best, Hequn
> 
> 
> 
> On Tue, Jul 10, 2018 at 2:36 PM, wangsan <wamgsam@xxxxxxx <mailto:wamgsam@xxxxxxx>> wrote:
> Hi all,
> 
> I'm going to use JDBCAppendTableSink and JDBCOutputFormat in my Flink application. But I am confused with the implementation of JDBCOutputFormat.
> 
> 1. The Connection was established when JDBCOutputFormat is opened, and will be used all the time. But if this connction lies idle for a long time, the database will force close the connetion, thus errors may occur.
> 2. The flush() method is called when batchCount exceeds the threshold, but it is also called while snapshotting state. So two threads may modify upload and batchCount, but without synchronization.
> 
> Please correct me if I am wrong.
> 
> ——
> wangsan
>