| Command Part
1. Restart drilldrillbit.sh restart
maprcli node services -name drill-bits -action restart -nodes <node IP addresses separated by a space>Or
maprcli node services -name drill-bits -action restart -filter csvc=="drill-bits"2. Sqlline
sqlline -u jdbc:drill:zk=<zk1host>:<port>,<zk2host>:<port>,<zk3host>:<port>3. Check SQL IDs in jstack
$ clush -a "jstack \`jps -m|grep -i drill|cut -d' ' -f1\`|grep tid|grep frag|cut -d ':' -f1|sort|uniq -c"|sort node1: 3 "2939d224-6b1f-3fda-5a42-4b741e5283a6 node2: 3 "2939d224-6b1f-3fda-5a42-4b741e5283a6 node3: 4 "2939d224-6b1f-3fda-5a42-4b741e5283a6 ...4. Check memory consumption of drillbit
$ clush -aB "ps auwwxx|grep -i drill|grep -v grep|grep -v internal_start|awk '{print \$6}'" --------------- node1 --------------- 25761828 --------------- node2 --------------- 25417036 ---------------5. Take jstack on all drillbits and collect them
clush -a "jstack \`jps -m|grep -i drill|cut -d' ' -f1\` > /home/mapr/hao/jstack_drillbit.out" mkdir /home/mapr/hao/jstackall clush -a --rcopy /home/mapr/hao/jstack_drillbit.out --dest /home/mapr/hao/jstackall
| SQL Part
1. Verify drillbitsselect * from sys.drillbits;2. Schema information
select * from INFORMATION_SCHEMA.SCHEMATA;3. Explain plan
!set maxwidth 10000 explain plan for <query> ; explain plan without implementation for <query> ; explain plan including all attributes for <query>;4. System options
SELECT * FROM sys.options WHERE type='BOOT';5. Record output to a file
!record /tmp/abc.txt !record6. Enable verbose error mode at system/session level
alter system set `exec.errors.verbose`=true;7. Store as json file for CTAS.
alter session set `store.format`='json';8. Enable compiler debug.
ALTER SESSION SET `exec.java_compiler_debug` = true;9. Decrease store.parquet.vector_fill_check_threshold in case of too large parquet row.
alter session set `store.parquet.vector_fill_check_threshold` = 2;10. Get drill commit ID.
select commit_id from `sys`.`version`;11. UTF16 character set in query text.
select columns[2] from `chinese.csv` where columns[0]= _UTF16'北京';12. Enable decimal support.
alter session set `planner.enable_decimal_data_type`=true;13. Disable hash based on a single key.
alter session set `planner.enable_hash_single_key` = false ;14. Increase plan memory component.
alter session set `planner.memory_limit` = 1073741824;
| Function Part
1. Current unix time & Convert unix time to timestamp.Current unix time:
unix_timestamp()Convert unix time to timestamp:
select from_unixtime(unix_timestamp()) from `sys`.`version`; +----------------------+ | EXPR$0 | +----------------------+ | 2015-05-29 19:39:43 | +----------------------+ 1 row selected (0.082 seconds)Or:
select TO_TIMESTAMP(unix_timestamp()) from `sys`.`version`; +------------------------+ | EXPR$0 | +------------------------+ | 2015-05-29 20:30:41.0 | +------------------------+ 1 row selected (0.081 seconds)2. Flatten -- to generate each row for each element of one array.
$ cat flatten1.json {"fixed_column":"abc", "list_column":[{"id1":"1","name":"zhu"},{"id1":"2","name":"hao"}]} 0: jdbc:drill:zk=local> select t.`fixed_column` as fixed_column, flatten(t.`list_column`) from dfs.tmp.`drilltest/flatten1.json` as t; +--------------+------------+ | fixed_column | EXPR$1 | +--------------+------------+ | abc | {"id1":"1","name":"zhu"} | | abc | {"id1":"2","name":"hao"} | +--------------+------------+ 2 rows selected (0.165 seconds)3. kvgen -- to convert unions/maps to lists of key-value pairs.
test.json: {"data": {"id":1,"id2":2,"id3":3} } select kvgen(tab.data) from `test.json` tab limit 1; +------------+ | EXPR$0 | +------------+ | [{"key":"id","value":1},{"key":"id2","value":2},{"key":"id3","value":3}] | +------------+ 1 row selected (0.153 seconds) with tmp as ( select kvgen(tab.data)[0] as newdata from `test.json` tab limit 1 ) select tmp.newdata.`key`,tmp.newdata.`value` from tmp; +------------+------------+ | EXPR$0 | EXPR$1 | +------------+------------+ | id | 1 | +------------+------------+ 1 row selected (0.09 seconds)4. repeated_count and repeated_contains -- for array.
a.json: {"data": ["apple","orange","abc"]} select repeated_count(data) from dfs.tmp.`/drilltest/a.json`; +------------+ | EXPR$0 | +------------+ | 3 | +------------+ select repeated_contains(data,'apple') from dfs.tmp.`/drilltest/a.json`; +------------+ | EXPR$0 | +------------+ | true | +------------+ select repeated_contains(data,'nothing') from dfs.tmp.`/drilltest/a.json`; +------------+ | EXPR$0 | +------------+ | false | +------------+5. byte_substr
In hbase shell, create a hbase or maprdb table:
create '/testtable','cf' put '/testtable','ABCD_9223370655563575807','cf:c','abc'Then we can get the substring of rowkey in Drill:
select convert_from(byte_substr(row_key,1,4),'UTF8') from dfs.`/testtable`; +------------+ | EXPR$0 | +------------+ | ABCD | +------------+ 1 row selected (0.293 seconds) select convert_from(byte_substr(row_key,6,length(row_key)),'UTF8') from dfs.`/testtable`; +------------+ | EXPR$0 | +------------+ | 9223370655563575807 | +------------+ 1 row selected (0.263 seconds)6. left and right
> select `left`('blahblah',2) from sys.options limit 1; +------------+ | EXPR$0 | +------------+ | bl | +------------+ 1 row selected (0.248 seconds) > select `right`('blahblah',2) from sys.options limit 1; +------------+ | EXPR$0 | +------------+ | ah | +------------+ 1 row selected (0.201 seconds)
| Configuration Part
1. Memory Leak debugdrill-override.conf:
debug.error_on_leak: true2. PStore location
> select * from sys.options where name like '%store.provider%'; +------------+------------+------------+------------+------------+------------+------------+ | name | kind | type | num_val | string_val | bool_val | float_val | +------------+------------+------------+------------+------------+------------+------------+ | drill.exec.sys.store.provider.hbase.config."hbase.zookeeper.property.clientPort" | LONG | BOOT | 2181 | null | null | null | | drill.exec.sys.store.provider.mongo.connection | null | BOOT | null | "mongodb://localhost:27017/?connectTimeoutMS=60000&maxPoolSize=1000&safe=true" | null | null | | drill.exec.sys.store.provider.hbase.config."hbase.zookeeper.quorum" | null | BOOT | null | "localhost" | null | null | | drill.exec.sys.store.provider.local.write | BOOLEAN | BOOT | null | null | true | null | | drill.exec.sys.store.provider.hbase.table | null | BOOT | null | "drill_store" | null | null | | drill.exec.sys.store.provider.local.path | null | BOOT | null | "/tmp/drill" | null | null | | drill.exec.sys.store.provider.class | null | BOOT | null | "org.apache.drill.exec.store.sys.zk.ZkPStoreProvider" | null | null | +------------+------------+------------+------------+------------+------------+------------+Note: Starting from Drill 1.0, boot options are moved to sys.boot.
3. Sort and spilling.
> select * from sys.options where name like '%exec.sort%'; +------------+------------+------------+------------+------------+------------+------------+ | name | kind | type | num_val | string_val | bool_val | float_val | +------------+------------+------------+------------+------------+------------+------------+ | drill.exec.sort.external.spill.fs | null | BOOT | null | "file:///" | null | null | | drill.exec.sort.purge.threshold | LONG | BOOT | 100 | null | null | null | | drill.exec.sort.external.spill.threshold | LONG | BOOT | 200 | null | null | null | | drill.exec.sort.external.spill.batch.size | LONG | BOOT | 4000 | null | null | null | | drill.exec.sort.external.spill.group.size | LONG | BOOT | 100 | null | null | null | | drill.exec.sort.external.spill.directories | null | BOOT | null | [ # jar:file:/opt/mapr/drill/drill-0.6.0/jars/drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar!/drill-module.conf: 134 "/tmp/drill/spill" ] | null | null | | drill.exec.sort.external.batch.size | LONG | BOOT | 4000 | null | null | null | +------------+------------+------------+------------+------------+------------+------------+Note: Starting from Drill 1.0, boot options are moved to sys.boot.
4. Planner
> select * from sys.options where name like '%planner%'; +------------+------------+------------+------------+------------+------------+------------+ | name | kind | type | num_val | string_val | bool_val | float_val | +------------+------------+------------+------------+------------+------------+------------+ | planner.memory.max_query_memory_per_node | LONG | SYSTEM | 2048 | null | null | null | | planner.join.row_count_estimate_factor | DOUBLE | SYSTEM | null | null | null | 1.0 | | planner.affinity_factor | DOUBLE | SYSTEM | null | null | null | 1.2 | | planner.disable_exchanges | BOOLEAN | SYSTEM | null | null | false | null | | planner.enable_mergejoin | BOOLEAN | SYSTEM | null | null | true | null | | planner.enable_broadcast_join | BOOLEAN | SYSTEM | null | null | true | null | | planner.enable_hashagg | BOOLEAN | SYSTEM | null | null | true | null | | planner.memory.average_field_width | LONG | SYSTEM | 8 | null | null | null | | planner.broadcast_threshold | LONG | SYSTEM | 1000000 | null | null | null | | planner.add_producer_consumer | BOOLEAN | SYSTEM | null | null | false | null | | planner.producer_consumer_queue_size | LONG | SYSTEM | 10 | null | null | null | | planner.enable_hash_single_key | BOOLEAN | SYSTEM | null | null | true | null | | planner.memory.hash_agg_table_factor | DOUBLE | SYSTEM | null | null | null | 1.1 | | planner.slice_target | LONG | SYSTEM | 100000 | null | null | null | | planner.width.max_per_node | LONG | SYSTEM | 2 | null | null | null | | planner.memory.enable_memory_estimation | BOOLEAN | SYSTEM | null | null | false | null | | planner.enable_multiphase_agg | BOOLEAN | SYSTEM | null | null | true | null | | planner.memory.non_blocking_operators_memory | LONG | SYSTEM | 64 | null | null | null | | planner.enable_streamagg | BOOLEAN | SYSTEM | null | null | true | null | | planner.memory.hash_join_table_factor | DOUBLE | SYSTEM | null | null | null | 1.1 | | planner.width.max_per_query | LONG | SYSTEM | 1000 | null | null | null | | planner.enable_hashjoin | BOOLEAN | SYSTEM | null | null | true | null | | planner.producer_consumer_queue_size | LONG | SESSION | 100000000000000000 | null | null | null | | planner.add_producer_consumer | BOOLEAN | SESSION | null | null | false | null | +------------+------------+------------+------------+------------+------------+------------+ 24 rows selected (0.219 seconds)5. Query scheduling
> select * from sys.options where name like '%exec.queue%'; +------------+------------+------------+------------+------------+------------+------------+ | name | kind | type | num_val | string_val | bool_val | float_val | +------------+------------+------------+------------+------------+------------+------------+ | exec.queue.timeout_millis | LONG | SYSTEM | 300000 | null | null | null | | exec.queue.small | LONG | SYSTEM | 100 | null | null | null | | exec.queue.threshold | LONG | SYSTEM | 30000000 | null | null | null | | exec.queue.large | LONG | SYSTEM | 10 | null | null | null | | exec.queue.enable | BOOLEAN | SYSTEM | null | null | false | null | +------------+------------+------------+------------+------------+------------+------------+
==
No comments:
Post a Comment