
Re: Limit on number of files to read for Dataset

It causes more overhead (more processes, etc.), which might make the job slower. Furthermore, if the files are stored on HDFS, the bottleneck is the NameNode, which will have to answer millions of requests.
The latter point will change in future Hadoop versions with Ozone: http://ozone.hadoop.apache.org/
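
One common way to reduce both costs is to merge the small files into fewer, larger ones before the job reads them. Below is a minimal sketch of that idea, assuming line-oriented text files on a local filesystem. The names `compact_files` and `files_per_batch` are hypothetical and chosen for illustration; they are not part of Flink or Hadoop, and on HDFS you would need the equivalent filesystem calls.

```python
from pathlib import Path


def compact_files(src_dir, dst_dir, files_per_batch=1000):
    """Concatenate small files from src_dir into fewer, larger files in dst_dir.

    Hypothetical helper: assumes every input is a line-oriented text file
    and that the order of records across files does not matter.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    # Sort so that the batching is deterministic between runs.
    sources = sorted(p for p in src.iterdir() if p.is_file())

    outputs = []
    for start in range(0, len(sources), files_per_batch):
        out_path = dst / f"batch-{start // files_per_batch:06d}.txt"
        with open(out_path, "wb") as out:
            for path in sources[start:start + files_per_batch]:
                data = path.read_bytes()
                out.write(data)
                # Keep records separated if a file lacks a trailing newline.
                if data and not data.endswith(b"\n"):
                    out.write(b"\n")
        outputs.append(out_path)
    return outputs
```

With `files_per_batch=1000`, a million input files would become a thousand, so the job (and the NameNode, in the HDFS case) has far fewer files to open and look up.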

On 13. Aug 2018, at 21:01, Darshan Singh <darshan.meel@xxxxxxxxx> wrote:

Hi Guys,

Is there a limit on the number of files a Flink DataSet can read? My question is: will there be any issues if I have, say, millions of files to read to create a single DataSet?