Goal:
Hue operates on files and directories in MapR-FS by sending RESTful API calls to httpfs. This article shows how to troubleshoot Hue issues by issuing the same RESTful API calls manually.
Solution:
1. Enable DEBUG logging for runcpserver.log.
Runcpserver is the web server that provides the core web functionality of Hue. To enable DEBUG-level logging, edit /opt/mapr/hue/hue-<version>/desktop/conf/log.conf:
[handler_logfile]
class=handlers.RotatingFileHandler
# Choices are DEBUG, INFO, WARNING, ERROR, CRITICAL
level=DEBUG
formatter=default
args=('%LOG_DIR%/%PROC_NAME%.log', 'a', 1000000, 3)
After that, restart Hue:
maprcli node services -name hue -action stop -nodes hostname
maprcli node services -name hue -action start -nodes hostname
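Before restarting, it can help to confirm that the edit landed in the right section of log.conf. The following is a minimal sketch, assuming the key is written as `level=...` inside `[handler_logfile]` as in the excerpt above; `handler_level` is a hypothetical helper name, not part of Hue.

```shell
# Print the "level" value from the [handler_logfile] section of a log.conf file.
#   $1: path to log.conf
# handler_level is a hypothetical helper name used only for illustration.
handler_level() {
    awk '
        # Any line starting with "[" opens a new section; remember whether it is ours.
        /^\[/ { in_section = ($0 == "[handler_logfile]") }
        # Inside our section, strip everything up to "=" and print the value.
        in_section && /^[ \t]*level[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")
            print
            exit
        }
    ' "$1"
}
```

Running `handler_level` against the log.conf edited in this step should print DEBUG once the change is saved.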
2. Make sure the httpfs process is running.
Use the command below to identify which node in the cluster is running httpfs:
maprcli node list -columns service
Go to that node and check that it is listening on the httpfs port (default is 14000). lsof shows the port by its service name, so 14000 appears as scotty-ft:
[root]# lsof -i:14000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python2.6 7048 mapr 13u IPv4 35689564 0t0 TCP mapr4-3:48061->mapr4-3:scotty-ft (CLOSE_WAIT)
java 7848 mapr 134u IPv6 35688826 0t0 TCP *:scotty-ft (LISTEN)
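In the output above, only the java line in the LISTEN state shows that httpfs is accepting connections; the python2.6 line is a client connection left in CLOSE_WAIT. A small sketch of that check, assuming lsof output in the format shown (`has_listener` is a hypothetical helper name):

```shell
# Read lsof output on stdin and succeed only if at least one socket is in
# the LISTEN state. Connections in other states (e.g. CLOSE_WAIT) are ignored.
# has_listener is a hypothetical helper name used only for illustration.
has_listener() {
    grep -q '(LISTEN)$'
}
```

For example, `lsof -i:14000 | has_listener && echo "httpfs is listening"` prints the message only when a listener exists on the port.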
3. Verify that hue.ini points to the correct httpfs host and port.
For example, in the "[[hdfs_clusters]]" section of hue.ini:
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://mapr4-3:14000/webhdfs/v1
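To compare the configured endpoint with the host and port found in step 2, the value can be pulled out of hue.ini directly. This is a rough sketch, assuming a single uncommented `webhdfs_url=` line of the form `http://host:port/path`; both function names are hypothetical.

```shell
# Print the first uncommented webhdfs_url value found in a hue.ini file.
#   $1: path to hue.ini
# Commented lines (starting with "#") do not match because the key must come first.
webhdfs_url_from_ini() {
    sed -n 's/^[[:space:]]*webhdfs_url[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Print "host port" for a URL of the form scheme://host:port/path.
#   $1: URL
host_port_of() {
    printf '%s\n' "$1" | sed -E 's#^[a-z]+://([^:/]+):([0-9]+).*#\1 \2#'
}
```

With the sample configuration above, `host_port_of "$(webhdfs_url_from_ini hue.ini)"` would print the host mapr4-3 and the port 14000, which should match the node and port checked with lsof.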
4. Troubleshoot Hue issues by monitoring runcpserver.log to capture the RESTful API calls to httpfs.
For example, if we copy the file /tmp/mapr/Master.csv to /tmp/mapr/Master.csv.2 using the Hue file browser, we can capture calls such as the following.
a. Get metadata of the source file.
From runcpserver.log:
GET /webhdfs/v1/tmp/mapr/Master.csv?op=GETFILESTATUS&user.name=mapr&doas=mapr HTTP/1.1
Then we can use the curl command below to check manually (note: "mapr4-3" is the hostname of the httpfs server):
# curl "http://mapr4-3:14000/webhdfs/v1/tmp/mapr/Master.csv?op=GETFILESTATUS&user.name=mapr"
{"FileStatus":{"pathSuffix":"","type":"FILE","length":6049426,"owner":"mapr","group":"mapr","permission":"755","accessTime":1419263547000,"modificationTime":1419263556835,"blockSize":268435456,"replication":3}}
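Turning a logged request line into a URL for curl can be scripted when many calls need to be replayed. A minimal sketch, assuming the request path in the log starts with /webhdfs/ as in the example above; `request_to_url` is a hypothetical helper name.

```shell
# Turn a request line captured in runcpserver.log into a full httpfs URL.
#   $1: logged request line, e.g. "GET /webhdfs/v1/tmp?op=LISTSTATUS HTTP/1.1"
#   $2: httpfs base URL, e.g. "http://mapr4-3:14000"
# request_to_url is a hypothetical helper name used only for illustration.
request_to_url() {
    printf '%s\n' "$1" | awk -v base="$2" '{
        # Pick the first whitespace-separated field that looks like a WebHDFS path.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^\/webhdfs\//) { print base $i; exit }
        }
    }'
}
```

The result can be passed straight to curl, for example `curl "$(request_to_url "$line" "http://mapr4-3:14000")"`, where `$line` holds the request line copied from the log.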
b. Open and read the source file.
From runcpserver.log:
GET /webhdfs/v1/tmp/mapr/Master.csv?length=67108864&op=OPEN&user.name=mapr&offset=0&doas=mapr HTTP/1.1
We can use the curl command below to verify the same:
curl -X GET -L "http://mapr4-3:14000/webhdfs/v1/tmp/mapr/Master.csv?length=67108864&op=OPEN&user.name=mapr&offset=0&doas=mapr"
Please refer to the WebHDFS REST API documentation for more details on syntax.
Note: "&user.name=mapr" is added to avoid the error below:
HTTP Status 403 - Anonymous requests are disallowed
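Because a missing user.name parameter is an easy mistake when typing URLs by hand, a helper can add it when absent. This is only a sketch; `with_user_name` is a hypothetical function name, and the user passed in is whichever user Hue would act as.

```shell
# Print the URL with user.name=<user> appended, unless it already has one.
#   $1: URL
#   $2: user name to send, e.g. mapr
# with_user_name is a hypothetical helper name used only for illustration.
with_user_name() {
    case "$1" in
        *"?user.name="* | *"&user.name="*)
            printf '%s\n' "$1" ;;                        # already present, leave as-is
        *"?"*)
            printf '%s&user.name=%s\n' "$1" "$2" ;;      # extend the existing query string
        *)
            printf '%s?user.name=%s\n' "$1" "$2" ;;      # start a new query string
    esac
}
```

To see only the HTTP status of a call, curl's `-s -o /dev/null -w '%{http_code}\n'` options can be combined with the helper, for example `curl -s -o /dev/null -w '%{http_code}\n' "$(with_user_name "$url" mapr)"`.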
If the manual RESTful API calls do not return the expected results, the issue may be on the httpfs side rather than in Hue.