Goal:
Hive on Tez : How to identify the reused YARN containersEnv:
Hive 2.1Tez 0.8
Solution:
Tez can reuse YARN containers to improve the performance, because it saves the time to allocate a new YARN container.Take a simple Hive on Tez query for example:
hive> set hive.execution.engine=tez; hive> select count(*) from passwords; Query ID = mapr_20170523131710_be8b1c97-6607-4c2e-8fb2-8e1c152047c3 Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1475192050844_0021) ---------------------------------------------------------------------------------------------- VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED ---------------------------------------------------------------------------------------------- Map 1 .......... container SUCCEEDED 1 1 0 0 0 0 Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0 ---------------------------------------------------------------------------------------------- VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 8.41 s ---------------------------------------------------------------------------------------------- OK 29 Time taken: 18.663 seconds, Fetched: 1 row(s)Quick question: How many YARN containers are spawned for above query?
You may answer "3" based on common understanding of a YARN job: one Application Master container, one Mapper container and one Reducer container.
But the actual answer in above example is 2.
Based on below ResourceManager log:
# tail -100f yarn-mapr-resourcemanager-s1.poc.com.log |grep -i assign 2017-05-23 13:17:14,964 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_e02_1475192050844_0021_01_000001 of capacity <memory:1024, vCores:1, disks:0.0> on host s4.poc.com:57626, which has 1 containers, <memory:1024, vCores:1, disks:0.0> used and <memory:16897, vCores:1, disks:1.33> available after allocation 2017-05-23 13:17:21,972 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_e02_1475192050844_0021_01_000002 of capacity <memory:1024, vCores:1, disks:0.0> on host s3.poc.com:44818, which has 1 containers, <memory:1024, vCores:1, disks:0.0> used and <memory:5383, vCores:1, disks:1.33> available after allocationOnly 2 YARN containers allocated for application application_1475192050844_0017:
- container_e02_1475192050844_0021_01_000001
- container_e02_1475192050844_0021_01_000002
1. container_e02_1475192050844_0021_01_000001
This is the DAG Application Master container.Evidence is in the container log "syslog" file:
[root@s4 container_e02_1475192050844_0021_01_000001]# grep "Creating DAGAppMaster" syslog 2017-05-23 13:17:17,638 [INFO] [main] |app.DAGAppMaster|: Creating DAGAppMaster for applicationId=application_1475192050844_0021, attemptNum=1, AMContainerId=container_e02_1475192050844_0021_01_000001, jvmPid=15496, userFromEnv=mapr, cliSessionOption=true, pwd=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1475192050844_0021/container_e02_1475192050844_0021_01_000001, localDirs=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1475192050844_0021, logDirs=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1475192050844_0021/container_e02_1475192050844_0021_01_000001
As we know, this DAG Application Master may run multiple Hive queries(DAGs), so for this Hive query, the DAG ID can be obtained from above "syslog" as well:
[root@s4 container_e02_1475192050844_0021_01_000001]# grep dagId syslog 2017-05-23 13:17:20,152 [INFO] [IPC Server handler 0 on 39921] |app.DAGAppMaster|: Generating DAG graphviz file, dagId=dag_1475192050844_0021_1, filePath=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1475192050844_0021/container_e02_1475192050844_0021_01_000001/dag_1475192050844_0021_1.dot
Then for that Hive query(dagId=dag_1475192050844_0021_1), the details are actually in its DAG log which is syslog_dag_1475192050844_0021_1.
For example, the other map and reduce containers:
[root@s4 container_e02_1475192050844_0021_01_000001]# grep -i assign syslog_dag_1475192050844_0021_1 |grep -i container 2017-05-23 13:17:22,775 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: Assigning container to task: containerId=container_e02_1475192050844_0021_01_000002, task=attempt_1475192050844_0021_1_00_000000_0, containerHost=s3.poc.com:44818, containerPriority= 5, containerResources=<memory:1024, vCores:1, disks:0.0>, localityMatchType=RackLocal, matchedLocation=/default-rack, honorLocalityFlags=false, reusedContainer=false, delayedContainers=0 2017-05-23 13:17:29,054 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: Assigning container to task: containerId=container_e02_1475192050844_0021_01_000002, task=attempt_1475192050844_0021_1_01_000000_0, containerHost=s3.poc.com:44818, containerPriority= 5, containerResources=<memory:1024, vCores:1, disks:0.0>, localityMatchType=NodeLocal, matchedLocation=s3.poc.com, honorLocalityFlags=true, reusedContainer=true, delayedContainers=1So we got to know that the Mapper and Reducer tasks are sharing the same container -- container_e02_1475192050844_0021_01_000002.
Here the keyword is "reusedContainer=true" which means the Reducer reused the container which was used for Mapper.
Simply saying, in the DAG Application Master container log directory, the "syslog" can tell you which DAG ID is for which query.
The corresponding "DAG syslog" can tell you what are other Mapper and Reducer containers.
2. container_e02_1475192050844_0021_01_000002
Actually the "syslog" does not have much information about the 2 tasks.Because the log files are redirected to below "attempt syslog" files:syslog_attempt_1475192050844_0021_1_00_000000_0 -- for Mapper
syslog_attempt_1475192050844_0021_1_01_000000_0 -- for Reducer
Key Takeaways:
The methodology is:- From ResourceManager log to find out which containers are spawned for this Hive on Tez job.
- From DAG Application Master "syslog" to find out what is the DAG ID for this Hive query.
- From DAG Application Master "DAG syslog" to find out what are other Mapper or Reducer containers and also which container is reused.
No comments:
Post a Comment