osdir.com


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Kinesis consumer e2e test


Hi,

yes, that is correct. The failure mapper is there to cause a failover event for which we can then check i) that exactly-once or at-least-once is not violated, depending on the expected semantics and ii) that the restore works at all ;-). You might be able to reuse org.apache.flink.streaming.tests.FailureMapper for this. For the future, it would surely also be nice to have a test that covers rescaling as well, but for now just having any real test is already a great improvement.

Best,
Stefan

> On 12. Nov 2018, at 05:23, Thomas Weise <thw@xxxxxxxxxx> wrote:
> 
> Hi Stefan,
> 
> Thanks for the info. So if I understand correctly, the pipeline you had in
> mind is:
> 
> Consumer -> Map -> Producer
> 
> What do you expect as outcome of the mapper failure? That no records are
> lost but some possibly duplicated in the sink?
> 
> Regarding the abstraction, I will see what I can do in that regard. From
> where I start it may make more sense to do some of that as follow-up when
> the Kafka test is ported.
> 
> Thanks,
> Thomas
> 
> 
> On Thu, Nov 8, 2018 at 10:20 AM Stefan Richter <s.richter@xxxxxxxxxxxxxxxxx>
> wrote:
> 
>> Hi,
>> 
>> I was also just planning to work on it before Stephan contacted Thomas to
>> ask about this test.
>> 
>> Thomas, you are right about the structure, the test should also go into
>> the `run-nightly-tests.sh`. What I was planning to do is a simple job that
>> consists of a Kinesis consumer, a mapper that fails once after n records,
>> and a kinesis producer. I was hoping that creation, filling, and validation
>> of the Kinesis topics can be done with the Java API, not by invoking
>> commands in a bash script. In general I would try to minimise the amount of
>> scripting and do as much in Java as possible. It would also be nice if the
>> test was generalised, e.g. that abstract Producer/Consumer are created from
>> a Supplier and also the validation is done over some abstraction that lets
>> us iterate over the produced output. Ideally, this would be a test that we
>> can reuse for all Consumer/Producer cases and we could also port the tests
>> for Kafka to that. What do you think?
>> 
>> Best,
>> Stefan
>> 
>>> On 8. Nov 2018, at 07:22, Tzu-Li (Gordon) Tai <tzulitai@xxxxxxxxxx>
>> wrote:
>>> 
>>> Hi Thomas,
>>> 
>>> I think Stefan Richter is also working on the Kinesis end-to-end test,
>> and
>>> seems to be planning to implement it against a real Kinesis service
>> instead
>>> of Kinesalite.
>>> Perhaps efforts should be synced here.
>>> 
>>> Cheers,
>>> Gordon
>>> 
>>> 
>>> On Thu, Nov 8, 2018 at 1:38 PM Thomas Weise <thw@xxxxxxxxxx> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I'm planning to add an end-to-end test for the Kinesis consumer. We have
>>>> done something similar at Lyft, using Kinesalite, which can be run as
>>>> Docker container.
>>>> 
>>>> I see that some tests already make use of Docker, so we can assume it
>> to be
>>>> present in the target environment(s)?
>>>> 
>>>> I also found the following ticket:
>>>> https://issues.apache.org/jira/browse/FLINK-9007
>>>> 
>>>> It suggest to also cover the producer, which may be a good way to create
>>>> the input data as well. The stream itself can be created with the
>> Kinesis
>>>> Java SDK.
>>>> 
>>>> Following the existing layout, there would be a new module
>>>> flink-end-to-end-tests/flink-kinesis-test
>>>> 
>>>> Are there any suggestions or comments regarding this?
>>>> 
>>>> Thanks,
>>>> Thomas
>>>> 
>> 
>>