Currently, it's impossible to process both a stream and a (dynamically updated) dataset in a single job. I'll offer some workarounds, all of which assume that the file of active test names is not too large.
(1) You may define your own stream source that is aware of file updates and emits the file's contents as a stream (Stream B, as you described). Special records can be inserted to mark the start and end of each update. Note that instead of using `keyBy()`, Stream B should be broadcast, while Stream A can be partitioned arbitrarily. With this method, you can clear and rebuild the state according to the start/end markers.
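To make the marker-based rebuild concrete, here is a minimal sketch in plain Python (deliberately not the Flink API, so it stays self-contained). The marker values and the `ActiveNamesState` class are hypothetical names for illustration; in an actual job this logic would live in a function processing the broadcast side.

```python
# Hypothetical sketch (plain Python, NOT Flink API): models how a broadcast
# control stream framed by start/end markers can atomically rebuild the set
# of active test names, while data records are checked against the current set.

UPDATE_START = "__UPDATE_START__"   # assumed marker record: update begins
UPDATE_END = "__UPDATE_END__"       # assumed marker record: update complete

class ActiveNamesState:
    """Holds the committed set of active names plus a pending set that is
    filled between an update's start and end markers."""
    def __init__(self):
        self.active = set()      # committed state used for lookups
        self.pending = None      # buffer while an update is in flight

    def on_control_record(self, record):
        # Stream B (broadcast): file contents framed by the special markers.
        if record == UPDATE_START:
            self.pending = set()           # begin a fresh rebuild
        elif record == UPDATE_END:
            self.active = self.pending     # swap in the new set atomically
            self.pending = None
        elif self.pending is not None:
            self.pending.add(record)       # a test name inside the update

    def on_data_record(self, record):
        # Stream A (arbitrarily partitioned): keep only active test names.
        return record if record in self.active else None
```

Because lookups always hit the committed `active` set, data records seen mid-update are still matched against the previous complete version rather than a half-built one.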
(2) Alternatively, you may treat the file of active test names as external state and register processing-time timers in a `ProcessFunction` to reload it periodically (e.g., at a 1-minute interval).
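The timer logic of this second workaround can be sketched as follows, again in plain Python rather than the Flink API (in a real job you would register a processing-time timer in `processElement()` and re-register it in `onTimer()`). The `PeriodicReloader` class, `load_fn`, and the interval constant are illustrative assumptions:

```python
# Hypothetical sketch (plain Python, NOT Flink API): periodic reload of an
# external file of active test names, modeled with explicit timestamps in
# place of Flink's processing-time timers.

RELOAD_INTERVAL_MS = 60_000  # e.g., reload the file every minute

class PeriodicReloader:
    def __init__(self, load_fn, now_ms):
        self.load_fn = load_fn        # assumed callable reading the file into a set
        self.active = load_fn()       # initial load of active test names
        self.next_fire = now_ms + RELOAD_INTERVAL_MS  # next "timer" firing time

    def on_element(self, record, now_ms):
        # Mimics a timer callback: reload once the registered time has
        # passed, then schedule the next reload.
        if now_ms >= self.next_fire:
            self.active = self.load_fn()
            self.next_fire = now_ms + RELOAD_INTERVAL_MS
        return record if record in self.active else None
```

Note that with this approach, changes to the file only become visible after the next timer fires, so results can lag the file by up to one interval.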
IMO, watermarks may not work as expected for your use case. Moreover, since the file is updated unpredictably, it's hard to guarantee the exactness of the results.
Hope that helps,