I think the first question that has to be answered here is: Is it possible *at all* to implement parallel reading of RFC 4180 CSV?
I.e., given a start byte offset, is it possible to reliably locate the first record boundary at or after that offset while scanning only a small amount of data?
If it is possible, then that's what the SDF (or BoundedSource, etc.) should do - split blindly into byte ranges, and use this algorithm to assign a consistent meaning to each range.
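For illustration, here is a minimal sketch (mine, not Beam's actual TextIO code) of the standard boundary-location convention for newline-delimited records. It assumes quoted fields never contain newlines - which is precisely the part of RFC 4180 that makes the general question hard:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;

class BoundaryScan {
  /**
   * Finds the first record boundary at or after {@code offset}, assuming
   * '\n' always terminates a record (i.e. quoted fields contain no newlines).
   */
  static long firstBoundaryAtOrAfter(SeekableByteChannel ch, long offset)
      throws IOException {
    if (offset == 0) {
      return 0; // the start of the file is always a boundary
    }
    // A record starts right after a '\n', so scan from one byte early:
    // if byte (offset - 1) is '\n', then offset itself is a boundary.
    ch.position(offset - 1);
    ByteBuffer buf = ByteBuffer.allocate(8192);
    long pos = offset - 1;
    while (ch.read(buf) > 0) {
      buf.flip();
      while (buf.hasRemaining()) {
        if (buf.get() == (byte) '\n') {
          return pos + 1;
        }
        pos++;
      }
      buf.clear();
    }
    return pos; // no newline before EOF: end-of-file is the boundary
  }
}
```

The property that matters is determinism: two workers given adjacent ranges [a, b) and [b, c) will independently agree on which range owns the record spanning offset b, so no record is lost or read twice.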
To answer your questions 2 and 3: think of it this way.
The SDF's ProcessElement takes an element and a restriction.
ProcessElement must make only one promise: that it will correctly perform exactly the work associated with this element and restriction.
The challenge is that the restriction can become smaller while ProcessElement runs - in which case, ProcessElement must also do less work. This can happen concurrently with the ProcessElement call, so really the guarantee should be rephrased as "By the time ProcessElement completes, it should have performed exactly the work associated with the element and tracker.currentRestriction() at the moment of completion".
This is all that is asked of ProcessElement. If Beam decides to ask the tracker to split itself into two ranges (making the current one - the "primary" - smaller, and producing an additional one - the "residual"), Beam of course takes responsibility for executing the residual restriction somewhere else: it won't be lost.
E.g., if ProcessElement was invoked with [a, b), but while it was running the restriction was split into [a, b-100) and [b-100, b), then the current ProcessElement call must process [a, b-100), and Beam guarantees that it will fire up another ProcessElement call for [b-100, b). (Of course, both of these calls may end up being recursively split further.)
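To make this concrete, here is a rough sketch of what a ProcessElement honoring this contract typically looks like in Java. The RecordReader helper here is hypothetical (not a Beam API), and a real SDF would also need @GetInitialRestriction / @SplitRestriction / @NewTracker, which I'm omitting:

```java
import java.io.Closeable;
import java.io.IOException;
import org.apache.beam.sdk.io.range.OffsetRange;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;

class ReadCsvRecordsFn extends DoFn<String, String> {
  /** Hypothetical record reader, assumed for this sketch (not a Beam API). */
  interface RecordReader extends Closeable {
    boolean advance() throws IOException; // move to next record; false at EOF
    long currentRecordOffset();           // byte offset where the record starts
    String currentRecord();
  }

  // Hypothetical factory: seeks to the first record boundary at or after
  // 'offset' (e.g. using the boundary-location algorithm above).
  static RecordReader openAt(String fileName, long offset) throws IOException {
    throw new UnsupportedOperationException("sketch only");
  }

  @ProcessElement
  public void processElement(
      @Element String fileName,
      RestrictionTracker<OffsetRange, Long> tracker,
      OutputReceiver<String> out)
      throws IOException {
    try (RecordReader reader =
        openAt(fileName, tracker.currentRestriction().getFrom())) {
      while (reader.advance()) {
        // tryClaim is where a concurrent split becomes visible: if the
        // restriction was trimmed from [a, b) to [a, b-100) and this record
        // starts at or after b-100, tryClaim returns false and we stop,
        // having processed exactly the primary range.
        if (!tracker.tryClaim(reader.currentRecordOffset())) {
          return;
        }
        out.output(reader.currentRecord());
      }
    }
  }
}
```

The tryClaim call is the synchronization point: the DoFn never needs to know whether a split happened - it simply stops as soon as a record's start offset falls outside the current restriction.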
I'm not quite sure what you mean by "recombining" - please let me know if the explanation above makes things clear enough or not.