Note that even if you use GroupByKey with a 1-second window, the same key K at times T1 and T2 could still be scheduled for processing in parallel, which means you would still need locking.
Apache Beam has no transform that lets you partition the data however you want without synchronization or locking. The exception is when your underlying storage engine lets you pass in user-specified version numbers: you could then use the windowing information to produce ever-increasing version numbers, so the storage engine knows which write to keep and which to discard.
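To make the version-number idea concrete, here is a minimal sketch in plain Python. It does not use Beam at all: `VersionedStore` and `version_for_window` are hypothetical names, the store is an in-memory stand-in for a real storage engine, and using the window's end timestamp in milliseconds as the version is just one possible assumption. A real store would also have to make the compare-and-write step atomic on its side.

```python
class VersionedStore:
    """Hypothetical in-memory stand-in for a store that accepts caller-supplied versions."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def put(self, key, value, version):
        """Apply the write only if its version is newer than the stored one."""
        current = self._data.get(key)
        if current is not None and current[0] >= version:
            return False  # stale or duplicate version: discard
        self._data[key] = (version, value)
        return True

    def get(self, key):
        entry = self._data.get(key)
        return None if entry is None else entry[1]


def version_for_window(window_end_millis):
    # Assumption: later windows end later, so the end timestamp grows over time.
    return window_end_millis


store = VersionedStore()
# The window for T2 (ends at 2000 ms) happens to be written before
# the window for T1 (ends at 1000 ms), as can occur with parallel processing.
applied_t2 = store.put("K", "value-from-T2", version_for_window(2000))
applied_t1 = store.put("K", "value-from-T1", version_for_window(1000))
```

With this scheme the late write from T1 is rejected by the store itself, so the pipeline does not need its own lock around writes for key K.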
Alternatively, if you know which runner you want to use, some execution properties of that runner may intrinsically give you what you need. You would then have a pipeline that doesn't follow strict Apache Beam semantics, and it may break if the runner's behavior changes.
Finally, if none of that works out, you'll want to use a stream processing engine that lets you specify that any given key range is only ever processed on a single machine at a time. This comes with its own problems if you hit a hot key, since one machine will be swamped with processing while the others sit relatively idle.
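The hot-key problem can be illustrated with a small simulation. This is only a sketch under assumptions: it uses hash-based routing instead of key ranges for simplicity, and `NUM_WORKERS`, `worker_for_key`, and the event list are all made up for the example rather than taken from any particular engine.

```python
import zlib
from collections import Counter

NUM_WORKERS = 4  # hypothetical cluster size


def worker_for_key(key):
    # A stable hash, so a given key always lands on the same worker.
    return zlib.crc32(key.encode("utf-8")) % NUM_WORKERS


# 97 events for one hot key and one event each for three other keys.
events = ["hot"] * 97 + ["a", "b", "c"]

# Count how many events each worker would have to process.
load = Counter(worker_for_key(key) for key in events)
hot_worker = worker_for_key("hot")
```

Because every event for `"hot"` must go to `hot_worker`, that worker handles at least 97 of the 100 events no matter how many other machines are available.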