Features
1. Parse SQL(explain or explain analyze) plan to well formatted form.
Input:QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------- Gather Motion 8:1 (slice2; segments: 8) (cost=0.00..10045.93 rows=25 width=205) Rows out: 386 rows at destination with 5.940 ms to first row, 13 ms to end, start offset by 0.660 ms. -> Seq Scan on test a (cost=0.00..10045.93 rows=25 width=205) Filter: (subplan) Rows out: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 4.103 ms to first row, 6.221 ms to end, start offset by -230160251 ms. SubPlan 1 -> Materialize for deadlock safety (cost=34.72..38.58 rows=49 width=64) Rows out: Avg 9357.4 rows x 8 workers. Max 16960 rows (seg6) with 6.153 ms to first row, 8.724 ms to end of 53 scans, start offset by -230169262 ms. -> Broadcast Motion 8:8 (slice1; segments: 8) (cost=16.68..34.33 rows=49 width=64) Rows out: Avg 386.0 rows x 8 workers at destination. Max 386 rows (seg0) with 3.201 ms to first row, 3.796 ms to end, start offset by -230160721 ms. -> Subquery Scan tmp (cost=16.68..34.33 rows=49 width=64) Rows out: Avg 5999 rows x 8 workers. Max 12000 rows (seg3) with 1.397 ms to first row, 1.802 ms to end, start offset by -230160251 ms. -> Hash Join (cost=16.68..34.33 rows=49 width=205) Hash Cond: d.relname = e.relname Rows out: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 1.396 ms to first row, 1.780 ms to end, start offset by -230160251 ms. Executor memory: 5K bytes avg, 6K bytes max (seg3). Work_mem used: 5K bytes avg, 6K bytes max (seg3). Workfile: (0 spilling, 0 reused) (seg3) Hash chain length 1.0 avg, 1 max, using 65 of 131111 buckets. -> Seq Scan on test d (cost=0.00..11.86 rows=49 width=205) Rows out: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 0.047 ms to first row, 0.062 ms to end, start offset by -230160251 ms. -> Hash (cost=11.86..11.86 rows=49 width=64) Rows in: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 0.072 ms to end, start offset by -230160249 ms. -> Seq Scan on test e (cost=0.00..11.86 rows=49 width=64) Rows out: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 0.010 ms to first row, 0.027 ms to end, start offset by -230160249 ms. Slice statistics: (slice0) Executor memory: 345K bytes. (slice1) Executor memory: 2434K bytes avg x 8 workers, 2448K bytes max (seg3). Work_mem: 6K bytes max. (slice2) Executor memory: 346K bytes avg x 8 workers, 350K bytes max (seg2). Statement statistics: Memory used: 128000K bytes Total runtime: 14.278 ms (31 rows)Output:
2. Search key words and only show specific plan node(s).
For example, search "hash":3. Calculate the cost/time for each plan node from accumulated values in SQL plan.
Simply saying, SQL plan checker will parse the children nodes recursively and deduct the level 1 children's costs from the total cost.For example, Look at below snippet of plan nodes:
Hash Join (cost=16.680..34.330 rows=49 width=205) Hash Cond: d.relname = e.relname Rows out: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 1.396 ms to first row, 1.780 ms to end, start offset by -230160251 ms. Executor memory: 5K bytes avg, 6K bytes max (seg3). Work_mem used: 5K bytes avg, 6K bytes max (seg3). Workfile: (0 spilling, 0 reused) (seg3) Hash chain length 1.0 avg, 1 max, using 65 of 131111 buckets. -> Seq Scan on test d (cost=0.000..11.860 rows=49 width=205) Rows out: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 0.047 ms to first row, 0.062 ms to end, start offset by -230160251 ms. -> Hash (cost=11.860..11.860 rows=49 width=64) Rows in: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 0.072 ms to end, start offset by -230160249 ms. -> Seq Scan on test e (cost=0.000..11.860 rows=49 width=64) Rows out: Avg 48.2 rows x 8 workers. Max 65 rows (seg3) with 0.010 ms to first row, 0.027 ms to end, start offset by -230160249 ms.Total cost of the whole hash join's cost is 34.330.
Level 1 child "Seq Scan on test d"'s cost is 11.860, another level 1 child "Hash"'s cost is also 11.860.
So if we only talk about the "Hash Join" plan node itself, the cost is :
34.330 - 11.860 -11.860 = 10.61
By doing this, we can get which plan node is the most cost-consuming step.
4. Sort
SQL plan checker supports sorting on each column.If you sort on "NodeCost", you can get the most cost-consuming plan node(Estimated).
If you sort on "Node(ms)", you can get the most time-consuming plan node(Actual).
5. Highlight warnings based on check items.
SQL plan checker can highlight below situations automatically:1: If no explain anlayze output and the estimated rows of seq table scan is 1, table may not be analyzed or it is empty table.
2: If the estimated rows of seq table scan is below 1 but avg/actual rows is larger than 1, table must be not analyzed.
3: If MaxRows is 50% of all rows,Or no AvgRows, it means data skew on Max Segment. (Threshold is MaxRows >= 10,000 rows)
4: Highlight each nested loop
5: If total slice number is above 100, then highlight it as a huge SQL.
6: Report spill files.
7: If "Broadcast Motion"+"Redistribute Motion" > 5 times, highlight it.
8: Highlight Loops of plan for SubPlan.
6. Supports "explain" and also "explain analyze" output.
Below statistics are supported.7. Graph plan tree
Eg:Term description:
* GeneralDetails: The whole SQL plan text for this node and its all children nodes.
ID: Primary key for each plan node starting from 1 in top-down order.
Slice: A slice is a portion of the plan that segments can work on independently.
Node: Plan node text
Type: Table scan or Indes scan.
Scanned Object: Table/partition name or index name which is scanned on.
----------------------
* Estimated
StartCost: Estimated start-up cost. This is the time expended before the output phase can begin.
TotalCost: Estimated total cost. This is stated on the assumption that the plan node is run to completion, i.e., all available rows are retrieved.
NodeCost: Estimated cost of this node. Planchecker uses its TotalCost to deduct sum of TotalCost of its all level 1 children nodes.
Rows: Estimated number of rows output by this plan node. (GPDB: for each segment)
Width: Estimated average width of rows output by this plan node (in bytes).
----------------------
* Actual(based on explain analyze output)
Rows: Output rows from one max segment.
AvgRows: Average rows for each segment.
Workers: How many segments doing work in this node
MaxRows: Max rows from one segment
Loops: "of ? scans" means how many times this sbuplan is executed
----------------------
* Time(based on explain analyze output)
First(ms): The start-up time before the first row can be returned.
End(ms): The total time to return all the rows.
Node(ms): Time taken by this node. Planchecker uses its End(ms) to deduct sum of End(ms) of its all level 1 children nodes.
----------------------
* Sort(based on explain analyze output)
Avg mem(KB): Avg executor memory for each segment.
Max mem(KB): Max executor memory for each segment.
spill file: How many segments spill files to disk.
spill reuse: How many segments reuse spill files.
No comments:
Post a Comment