Issue while connecting to HDFS

Posted: Fri Sep 30, 2016 11:53 am
by just4u_sharath
Hi,

I am getting the errors below while trying to connect to HDFS through DataStage using any stage, such as BDFS, the HDFS File Connector, or the File Connector.

Error 1:
main_program: Fatal Error: Failed to read data from Application Master sending YARN resource information. Check the health of YARN/Hadoop. Socket error: Connection reset by peer. Look into YARN Client's logs for more information at /opt/IBM/InformationServer/Server/PXEngine/logs/yarn_logs/yarn_client.root.out*

Error 2:
Fatal Error: Failed to read data from Application Master on allocated containers. Try reducing APT_YARN_CONTAINER_SIZE if the YARN cluster is running low on resources. Also check that yarn.scheduler.minimum-allocation-mb setting in yarn-site.xml is set below APT_YARN_CONTAINER_SIZE. If yarn.nodemanager.vmem-check-enabled=true check if the virtual memory size for containers needs to be increased.. Socket returned end of file. Look into Application Master's logs for more information at mildatastage:/usr/local/hadoop-2.7.2/logs/userlogs/application_1475245442231_2781/container_1475245442231_2781_01_000001/../pxyarn_logs/oshjob.2781_0. These logs will be moved to a log aggregation directory after job completion if Hadoop log aggregation is enabled.

Error 3:
main_program: Waiting for YARN containers with the section leader processes to start running.
main_program: Accept timed out retries = 39
"
"
main_program: **** PX on Hadoop startup failed ****
main_program: The TCP port being used for startup is 10,002; the associated socket number is 9.
main_program: Unable to contact one or more Section Leaders.
Probable configuration problem; contact Orchestrate system administrator.

Can anyone please help me out with this?

Thanks in advance

Re: Issue while connecting to HDFS

Posted: Mon Oct 03, 2016 4:37 am
by just4u_sharath
Can anyone please help me out with this? It's urgent.

Posted: Mon Oct 03, 2016 7:02 am
by chulett
Then you should be involving your official support provider as there's probably not a ton of Big Data / HDFS / YARN users here yet. I'd also be curious if you've done any of the things the errors suggested you 'look into'.

Posted: Tue Oct 04, 2016 6:15 pm
by Timato
I'm almost certain that this is not an HDFS issue (I've encountered something similar in my own VMs).

It looks like you've set up InfoServer over Hadoop, and the problem is that YARN is running out of memory on your nodes when it allocates the containers to execute the job. If I recall correctly, DS needs a certain number of YARN containers to run, and you have to configure each container's minimum and maximum memory usage (I used Ambari to do this). It's timing out while waiting for the node manager to return a container.
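
As a rough illustration of the checks that Error 2 suggests, here is a minimal Python sketch that reads a yarn-site.xml and compares it against the container size. The function names are made up for this example, and it assumes APT_YARN_CONTAINER_SIZE is expressed in MB; the "set below" rule is taken straight from the error text, not from any official tool.

```python
import xml.etree.ElementTree as ET


def read_yarn_properties(xml_text):
    """Return {name: value} for every <property> element in yarn-site.xml text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def check_container_settings(props, apt_yarn_container_size_mb):
    """Return a list of warnings based on the hints in the Error 2 message.

    apt_yarn_container_size_mb is the value of APT_YARN_CONTAINER_SIZE,
    assumed here to be in MB.
    """
    warnings = []

    # The error text asks for minimum-allocation-mb to be set below the container size.
    min_alloc = props.get("yarn.scheduler.minimum-allocation-mb")
    if min_alloc is not None and int(min_alloc) >= apt_yarn_container_size_mb:
        warnings.append(
            "yarn.scheduler.minimum-allocation-mb (%s) is not below "
            "APT_YARN_CONTAINER_SIZE (%d)" % (min_alloc, apt_yarn_container_size_mb)
        )

    # The error text also points at the virtual memory check.
    if props.get("yarn.nodemanager.vmem-check-enabled", "").lower() == "true":
        warnings.append(
            "yarn.nodemanager.vmem-check-enabled is true; the virtual memory "
            "size for containers may need to be increased"
        )
    return warnings
```

To use it, read your own yarn-site.xml into a string, pass the result of `read_yarn_properties` to `check_container_settings` together with the APT_YARN_CONTAINER_SIZE value from your job environment, and print whatever warnings come back.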

What you can try is to run a simple job (row gen -> transform -> peek), making sure you're using the correct configuration file. Then go to the YARN UI, monitor the new/requested and killed applications, and get the logs from there for investigation.
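
If you'd rather script this than click through the UI, the same information is exposed by the ResourceManager REST API. Below is a minimal Python sketch; the function names are made up for this example, and the ResourceManager base URL (host and port, commonly 8088 for the web interface) is an assumption you'd replace with your own.

```python
import json
import urllib.request


def fetch_apps(rm_base_url, states=("FAILED", "KILLED")):
    """Query the ResourceManager Cluster Applications API for apps in the given states.

    rm_base_url is an assumption, e.g. "http://your-rm-host:8088".
    """
    url = "%s/ws/v1/cluster/apps?states=%s" % (rm_base_url.rstrip("/"), ",".join(states))
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def summarize_apps(payload):
    """Return (id, state, diagnostics) tuples from an apps API response.

    The API returns {"apps": null} when nothing matches, so handle that case.
    """
    apps = (payload.get("apps") or {}).get("app") or []
    return [(a.get("id"), a.get("state"), a.get("diagnostics", "")) for a in apps]
```

Calling `summarize_apps(fetch_apps("http://your-rm-host:8088"))` gives a list of the failed and killed applications. Once you have an application ID, `yarn logs -applicationId <id>` on the cluster pulls the aggregated container logs, provided log aggregation is enabled.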