Re: FileSystems should retrieve lastModified time

+1 for adding last modified time to MatchResult.Metadata. 

Sounds like a useful change that will enable additional use-cases.

- Cham

On Mon, Oct 29, 2018 at 11:08 AM Jeff Klukas <jklukas@xxxxxxxxxxx> wrote:
I just wrote up a JIRA issue proposing that FileSystem implementations retrieve the lastModified time of the files they list: https://issues.apache.org/jira/browse/BEAM-5910

Any immediate concerns? I'm not intimately familiar with HDFS, but I'm otherwise confident that GCS, S3, and local filesystems can all give us a suitable timestamp.

In the short term, this change would allow users to write their own polling logic on top of FileSystems to periodically check for updates to files. Currently, you would need to fall back to the APIs for each individual storage provider.
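As a rough illustration of that polling logic, here is a minimal sketch. It assumes each poll produces a map from file path to last-modified time in milliseconds. The class ModifiedFileTracker and its poll method are hypothetical names and not part of Beam; in practice the map would be built from the match results once Metadata carries a timestamp as proposed in the JIRA.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not a Beam class): remembers timestamps between polls. */
final class ModifiedFileTracker {
  // Newest last-modified time (millis) seen so far for each path.
  private final Map<String, Long> lastSeen = new HashMap<>();

  /**
   * Accepts one listing (path -> last-modified millis) and returns, in sorted
   * order, the paths that are new or carry a newer timestamp than before.
   */
  List<String> poll(Map<String, Long> listing) {
    List<String> changed = new ArrayList<>();
    for (Map.Entry<String, Long> entry : listing.entrySet()) {
      Long previous = lastSeen.get(entry.getKey());
      if (previous == null || entry.getValue() > previous) {
        changed.add(entry.getKey());
        // Only move the stored timestamp forward, never backward.
        lastSeen.put(entry.getKey(), entry.getValue());
      }
    }
    Collections.sort(changed);
    return changed;
  }
}
```

A caller would invoke poll on a timer with a fresh listing each time and reprocess only the returned paths. Files that disappear between polls are ignored in this sketch.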

Longer term, I'd love to see FileIO.match.continuously support an option for returning updated contents when files are updated.