I think Hadoop recommends solving such setups with a viewfs:// file system that spans both HDFS clusters, so the two clusters appear as different paths within one file system. This is similar to mounting different file systems into one directory tree in Unix.

On Tue, May 22, 2018 at 4:41 PM, Kien Truong <duckientruong@xxxxxxxxx> wrote:
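A viewfs mount table is declared in core-site.xml. A minimal sketch of the idea; the mount table name `ClusterX`, the nameservice names, and the namenode addresses below are all assumptions, not values from this thread:

```xml
<!-- core-site.xml: hypothetical viewfs mount table spanning two HDFS clusters -->
<configuration>
  <!-- Make viewfs the default file system; "ClusterX" is the mount table name -->
  <property>
    <name>fs.defaultFS</name>
    <value>viewfs://ClusterX</value>
  </property>
  <!-- /checkpoint resolves to cluster1, /data resolves to cluster2 -->
  <property>
    <name>fs.viewfs.mounttable.ClusterX.link./checkpoint</name>
    <value>hdfs://namenode1:8020/checkpoint</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.ClusterX.link./data</name>
    <value>hdfs://namenode2:8020/data</value>
  </property>
</configuration>
```

With this in place, an application writing to viewfs://ClusterX/checkpoint and viewfs://ClusterX/data transparently hits two different clusters.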
You only need to modify the core-site and hdfs-site read by Flink.
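Flink picks up core-site.xml and hdfs-site.xml from its Hadoop configuration directory. One way to point it there (the directory path is an example, not from this thread):

```yaml
# flink-conf.yaml: directory containing the modified
# core-site.xml and hdfs-site.xml (path is an example)
fs.hdfs.hadoopconf: /etc/flink/hadoop-conf
```

Setting the HADOOP_CONF_DIR environment variable before starting the cluster achieves the same thing.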
On 5/22/2018 9:07 PM, Deepak Sharma wrote:
Wouldn't two core-site and hdfs-site XMLs need to be provided in this case, then?
On Tue, May 22, 2018, 19:34 Raul Valdoleiros <raul.valdoleiros.oliveira@gma
Thanks for your reply.
My goal is to store the checkpoints in one HDFS cluster and the data in another HDFS cluster, so Flink should be able to connect to two different HDFS clusters.
2018-05-22 15:00 GMT+01:00 Kien Truong <duckientruong@xxxxxxxxx>:
If your clusters are not high-availability (HA) clusters, just use the full path to each cluster.
For example, to refer to the directory /checkpoint on cluster1, use hdfs://namenode1_ip:port/check
Likewise, /data on cluster2 will be hdfs://namenode2_ip:port/data
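With fully qualified paths, the checkpoint location can be set directly in Flink's configuration; a sketch, where the host name and port are placeholders:

```yaml
# flink-conf.yaml: checkpoints go to cluster1; the job itself keeps
# reading and writing data under hdfs://namenode2:8020/data on cluster2
state.backend: filesystem
state.checkpoints.dir: hdfs://namenode1:8020/checkpoint
```

Because the checkpoint directory and the data paths are fully qualified, no default file system needs to be shared between the two clusters.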
If your cluster is an HA cluster, then you need to modify the hdfs-site.xml as described in section 1 of this guide:
DPDocuments/HDP2/HDP-2.6.4/bk_administration/content/distcp_between_ha_clusters.html
Then use the full paths to the clusters: hdfs://cluster1ha/checkpoint and hdfs://cluster2ha/data
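The hdfs-site.xml change amounts to declaring both HA nameservices on the client side so that the logical names resolve without relying on fs.defaultFS. A minimal sketch; the nameservice names, namenode IDs, and host names are assumptions:

```xml
<!-- hdfs-site.xml: client-side definitions for two HA nameservices
     (cluster1ha is assumed to be defined already; cluster2ha is added) -->
<configuration>
  <!-- List both logical nameservices -->
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1ha,cluster2ha</value>
  </property>
  <!-- Namenodes of the second cluster -->
  <property>
    <name>dfs.ha.namenodes.cluster2ha</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster2ha.nn1</name>
    <value>namenode3.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster2ha.nn2</name>
    <value>namenode4.example.com:8020</value>
  </property>
  <!-- Tell the client how to fail over between cluster2ha's namenodes -->
  <property>
    <name>dfs.client.failover.proxy.provider.cluster2ha</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
```

After this, hdfs://cluster2ha/data resolves to whichever of the two namenodes is currently active.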
On 5/21/2018 9:19 PM, Raul Valdoleiros wrote:
I want to store my data in one HDFS cluster and the Flink checkpoints in another. I haven't found a way to do it; can anyone point me in the right direction?
Thanks in advance,