Catalog

EXPLAIN As MySQL Performance analysis artifact , It is necessary to understand the results , However, I can't find a particularly complete interpretation on various search engines . It's all about the point , No details ( for example type The value of is not complete 、Extra Lack of complete introduction, etc ).

therefore , I've been liver for nearly a week , Sort it out . This should be the most comprehensive network 、 The most detailed EXPLAIN Read the article , Here is the full text .

The article is relatively long , Recommended collection .

TIPS

This article is based on MySQL 8.0 To write , theoretical support MySQL 5.0 And higher .

EXPLAIN Use

explain Can be used to analyze SQL Implementation plan of . The format is as follows :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
{EXPLAIN | DESCRIBE | DESC}
tbl_name [col_name | wild]

{EXPLAIN | DESCRIBE | DESC}
[explain_type]
{explainable_stmt | FOR CONNECTION connection_id}

{EXPLAIN | DESCRIBE | DESC} ANALYZE select_statement

explain_type: {
FORMAT = format_name
}

format_name: {
TRADITIONAL
| JSON
| TREE
}

explainable_stmt: {
SELECT statement
| TABLE statement
| DELETE statement
| INSERT statement
| REPLACE statement
| UPDATE statement
}

Example :

1
2
3
4
5
6
7
8
9
10
11
12
EXPLAIN format = TRADITIONAL json SELECT tt.TicketNumber, tt.TimeIn,
tt.ProjectReference, tt.EstimatedShipDate,
tt.ActualShipDate, tt.ClientID,
tt.ServiceCodes, tt.RepetitiveID,
tt.CurrentProcess, tt.CurrentDPPerson,
tt.RecordVolume, tt.DPPrinted, et.COUNTRY,
et_1.COUNTRY, do.CUSTNAME
FROM tt, et, et AS et_1, do
WHERE tt.SubmitTime IS NULL
AND tt.ActualPC = et.EMPLOYID
AND tt.AssignedPC = et_1.EMPLOYID
AND tt.ClientID = do.CUSTNMBR;

The result output shows :

Field format=json The name of the time meaning
id select_id The unique identifier of the statement
select_type nothing Query type
table table_name Table name
partitions partitions Matching partition
type access_type Connection type
possible_keys possible_keys Possible index selection
key key Index of actual selection
key_len key_length Length of index
ref ref Which column of the index is referenced
rows rows Estimate the row to scan
filtered filtered Represents the percentage of data that meets the query criteria
Extra No, Additional information

Interpretation of results

id

The unique identifier of the statement . If explain The results include multiple id value , The larger the number, the earlier the execution ; And for the same id The line of , From top to bottom .

select_type

Query type , There are several values as follows :

Query type effect
SIMPLE Simple query ( not used UNION Or subquery )
PRIMARY The outermost query
UNION stay UNION The second and the following SELECT Marked as UNION. If UNION By FROM The subquery in clause contains , So the first one of it SELECT Will be marked as DERIVED.
DEPENDENT UNION UNION The second or subsequent query in , Relying on external queries
UNION RESULT UNION Result
SUBQUERY First in subquery SELECT
DEPENDENT SUBQUERY First in subquery SELECT, Relying on external queries
DERIVED Used to represent inclusion in FROM Clause in the subquery SELECT,MySQL Will recursively execute and put the results in a temporary table .MySQL Internally, it is called Derived table( Derived tables ), Because the temporary table is derived from the subquery
DEPENDENT DERIVED Derived tables , Relying on other tables
MATERIALIZED Physicochemical subquery
UNCACHEABLE SUBQUERY Subquery , The result cannot be cached , Must be reevaluated for each row of the external query
UNCACHEABLE UNION UNION Belong to UNCACHEABLE SUBQUERY The second or subsequent query of

table

Indicates which table this row is accessing , If SQL It defines the alias , And show the alias of the table

partitions

The partition of the current query matching record . For tables that are not partitioned , return null

type

Connection type , There are several values as follows , Performance is sorted from good to bad as follows :

  • system: The table has only one row ( It's equivalent to a system table ),system yes const Special case of type

  • const: Equivalent query scan for primary key or unique index , Return at most one row of data . const Very fast query speed , Because it only reads once

  • eq_ref: When all the components of the index are used , And the index is PRIMARY KEY or UNIQUE NOT NULL This type will be used , Performance is second only to system And const.

    1
    2
    3
    4
    5
    6
    7
    8
    -- Multi table associated query , Single match 
    SELECT * FROM ref_table,other_table
    WHERE ref_table.key_column=other_table.column;

    -- Multi table associated query , Joint index , Multi-line matching
    SELECT * FROM ref_table,other_table
    WHERE ref_table.key_column_part1=other_table.column
    AND ref_table.key_column_part2=1;
  • ref: When the leftmost prefix rule of the index is satisfied , Or when the index is not the primary key or the only index . If the index used matches only a small number of rows , The performance is also good .

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    -- Index based ( Non primary key , Non unique index ), Match to multiple lines 
    SELECT * FROM ref_table WHERE key_column=expr;

    -- Multi table associated query , Single index , Multi-line matching
    SELECT * FROM ref_table,other_table
    WHERE ref_table.key_column=other_table.column;

    -- Multi table associated query , Joint index , Multi-line matching
    SELECT * FROM ref_table,other_table
    WHERE ref_table.key_column_part1=other_table.column
    AND ref_table.key_column_part2=1;

    TIPS

    Leftmost prefix principle , The index matches the index in the left most first way . For example, create a composite index (column1, column2, column3), that , If the query condition is :

    • WHERE column1 = 1、WHERE column1= 1 AND column2 = 2、WHERE column1= 1 AND column2 = 2 AND column3 = 3 You can use the index ;
    • WHERE column1 = 2、WHERE column1 = 1 AND column3 = 3 Can't match the index .
  • fulltext: Full-text index

  • ref_or_null: This type is similar to ref, however MySQL Which rows will be searched additionally for NULL. This type is common in parsing subqueries

    1
    2
    SELECT * FROM ref_table
    WHERE key_column=expr OR key_column IS NULL;
  • index_merge: This type represents the use of index merge optimization , Indicates that more than one index is used in a query

  • unique_subquery: This type and eq_ref similar , But it did IN Inquire about , And the subquery is the primary key or unique index . for example :

    1
    value IN (SELECT primary_key FROM single_table WHERE some_expr)
  • index_subquery: and unique_subquery similar , It's just that subqueries use non unique indexes

    1
    value IN (SELECT key_column FROM single_table WHERE some_expr)
  • range: Range scan , Indicates that a specified range of rows has been retrieved , It is mainly used for limited index scanning . A more common range scan is with BETWEEN Clause or WHERE There are >、>=、<、<=、IS NULL、<=>、BETWEEN、LIKE、IN() Wait for the operator .

    1
    2
    3
    4
    5
    SELECT * FROM tbl_name
    WHERE key_column BETWEEN 10 and 20;

    SELECT * FROM tbl_name
    WHERE key_column IN (10,20,30);
  • index: Full index scan , and ALL similar , It's just index It's a full scan of the index data . When a query uses only a portion of the columns in the index , You can use this type . There are two scenarios that trigger :

    • If the index is the overlay index of the query , And the index query data can meet all the data needed in the query , Scan only index trees . here ,explain Of Extra The result of the column is Using index.index Often than ALL fast , Because the index size is usually smaller than the table data .
    • Find rows of data in the order of the index , A full table scan was performed . here ,explain Of Extra The result of the column will not appear Uses index.
  • ALL: Full table scan , The worst performance .

possible_keys

Show which indexes can be used by the current query , This column of data was created early in the optimization process , Therefore, some indexes may not be useful for the subsequent optimization process .

key

Express MySQL Index of actual selection

key_len

Bytes used by index . Because of the storage format , When the field is allowed to be NULL when ,key_len It's bigger than when it's not allowed to be empty 1 byte .

key_len Calculation formula : https://www.cnblogs.com/gomysql/p/4004244.html

ref

Indicates which field or constant will be combined with key Compare the fields used by the column .

If ref It's a function , Then the value used is the result of the function . To see which function , Can be found in EXPLAIN The sentence is followed by a SHOW WARNING sentence .

rows

MySQL Estimate the number of lines that will be scanned , The smaller the value, the better .

filtered

Represents the percentage of data that meets the query criteria , Maximum 100. use rows × filtered You can get the number of rows connected to the next table . for example rows = 1000,filtered = 50%, Then the number of rows connected to the next table is 500.

TIPS

stay MySQL 5.7 Before , To display this field, use explain extended command ;

MySQL.5.7 And higher ,explain Default will show filtered

Extra

Show additional information about this query , The values are as follows :

  • Child of ‘table’ pushed join@1

    This value will only be in NDB Cluster Under a .

  • const row not found

    For example, query statements SELECT … FROM tbl_name, And the watch is empty

  • Deleting all rows

    about DELETE sentence , Some engines ( for example MyISAM) Support to delete all data in a simple and fast way , If this optimization is used , This value is displayed

  • Distinct

    lookup distinct value , When the first matching line is found , Will stop searching for more rows for the current row combination

  • FirstMatch(tbl_name)

    Currently using semi join FirstMatch Strategy , See https://mariadb.com/kb/en/firstmatch-strategy/ , translate https://www.cnblogs.com/abclife/p/10895624.html

  • Full scan on NULL key

    An optimization method in subquery , Cannot be accessed by index null Use when you value

  • Impossible HAVING

    HAVING Clause is always false, No line will be hit

  • Impossible WHERE

    WHERE Clause is always false, No line will be hit

  • Impossible WHERE noticed after reading const tables

    MySQL Has read all of const( or system) surface , And found WHERE Clause is always false

  • LooseScan(m..n)

    Currently using semi join LooseScan Strategy , See https://mariadb.com/kb/en/loosescan-strategy/ , translate http://www.javacoder.cn/?p=39

  • No matching min/max row

    Nothing can satisfy, for example SELECT MIN(…) FROM … WHERE condition Medium condition The line of

  • no matching row in const table

    For associated queries , There is an empty table , Or no row can satisfy the unique index condition

  • No matching rows after partition pruning

    about DELETE or UPDATE sentence , The optimizer is partition pruning( Area pruning ) after , Can't find it delete or update The content of

  • No tables used

    When this query does not have FROM Clause or possess FROM DUAL Clause appears when . for example :explain select 1

  • Not exists

    MySQL Be able to LEFT JOIN Optimize , In finding a match for LEFT JOIN After the line of , No more rows in this table will be checked for the previous row combination . for example :

    1
    2
    SELECT * FROM t1 LEFT JOIN t2 ON t1.id=t2.id
    WHERE t2.id IS NULL;

    hypothesis t2.id Define a NOT NULL , here ,MySQL Can scan t1, And use t1.id Value lookup for t2 The lines in the . If MySQL stay t2 Found a matching line in , It will know t2.id Never for NULL, And it doesn't scan t2 Have the same id The rest of the value . in other words , about t1 Each line in ,MySQL Only need t2 Only one lookup is performed in , Instead of thinking about t2 The number of rows actually matched in .

    stay MySQL 8.0.17 And later , If this prompt appears , It can also be expressed in the form of NOT IN (subquery) or NOT EXISTS (subquery) Of WHERE The condition has been internally converted to an anti join . This will delete the subquery and put its table in the top-level query plan , This improves the query overhead . By combining semi join and anti join , The optimizer is more free to reorder the tables in the execution plan , In some cases , It can speed up the query . You can use the EXPLAIN The statement followed by a SHOW WARNING sentence , And analyze the results of Message Column , To see when the join transformation was performed on the query .

    Note

    Two table association only returns the data of the main table , And only return the data that is not associated with the primary table and the child table , This kind of connection is called anti connection

  • Plan isn’t ready yet

    Used EXPLAIN FOR CONNECTION, When the optimizer has not finished creating an execution plan for a statement executed in the specified connection , This value will appear .

  • Range checked for each record (index map: N)

    MySQL No proper index was found to use , But check to see if it can be used range or index_merge To retrieve the line , This prompt will appear .index map N The number of the index is from 1 Start , According to the table SHOW INDEX The same order is shown . Index map values N Indicates which bit mask values are candidates . for example 0x19( Binary system 11001) The value of means that the index will be considered 1、4 and 5.

    Example : In the following example ,name yes varchar type , But the condition gives the integer type , It involves implicit transformation .
    In the figure t2 And index is not used , It's because I will t2 in name The field collation is changed to utf8_bin The resulting link field collation does not match .

    1
    2
    3
    explain select a.* from t1 a left join t2 b
    on t1.name = t2.name
    where t2.name = 2;

    result :

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE t2 NULL ALL idx_name NULL NULL NULL 9 11.11 Using where
1 SIMPLE t1 NULL ALL idx_name NULL NULL NULL 5 11.11 Range checked for each record (index map: 0x8)
  • Recursive

    Recursive query appears . See “WITH (Common Table Expressions)”

  • Rematerialize

    Seldom used , Use similar to the following SQL when , Will show Rematerialize

    1
    2
    3
    4
    5
    6
    SELECT
    ...
    FROM
    t,
    LATERAL (derived table that refers to t) AS dt
    ...
  • Scanned N databases

    It means that we are dealing with INFORMATION_SCHEMA When querying a table , Scanned several directories ,N The value of can be 0,1 perhaps all. See “Optimizing INFORMATION_SCHEMA Queries”

  • Select tables optimized away

    The optimizer determines :① Back at most 1 That's ok ;② To generate data for that row , To read a certain set of lines , This prompt will appear at . Generally, when some aggregate functions are used to access a field with an index , The optimizer will directly locate the required data row through the index at one time and display the whole query , Like this one down here SQL.

    1
    2
    3
    explain
    select min(id)
    from t1;
  • Skip_open_table, Open_frm_only, Open_full_table

    These values indicate that they apply to INFORMATION_SCHEMA Table query file open optimization ;

    • Skip_open_table: There is no need to open the table file , Information has been obtained by scanning the data dictionary
    • Open_frm_only: Just read the data dictionary to get the table information
    • Open_full_table: Information search without optimization . Table information must be read from the data dictionary and table file
  • Start temporary, End temporary

    Indicates that temporary tables are used by Duplicate Weedout Strategy , See https://mariadb.com/kb/en/duplicateweedout-strategy/ , translate https://www.cnblogs.com/abclife/p/10895531.html

  • unique row not found

    As for the shape SELECT … FROM tbl_name Query for , But no row can satisfy the condition of a unique index or primary key query

  • Using filesort

    When Query Contained in the ORDER BY operation , And when you can't use the index to complete the sorting operation ,MySQL Query Optimizer Have to choose the corresponding sorting algorithm to achieve . Sort from memory when data is small , Otherwise sort from disk .Explain It will not show the client which sort to use . Official explanation :“MySQL Need an extra pass , To find out how to retrieve rows in sort order . By browsing all rows and matching for all WHERE Clause holds the sort key and the row pointer to complete the sort . Then the keywords are sorted , And retrieve rows in sort order ”

  • Using index

    Only use the information in the index tree to retrieve column information from the table , Instead of having to do other lookups to read the actual line . When a query uses only columns that belong to a single index , You can use this policy . for example :

    1
    explain SELECT id FROM t
  • Using index condition

    Indicates that the index is filtered by condition first , After filtering the index, find all the data rows that meet the index criteria , Then use WHERE Clause to filter these rows of data . In this way , Unless it is necessary , Otherwise, the index information can be delayed “ Push down ” Read the entire row of data . See “Index Condition Pushdown Optimization” . for example :

    TIPS

    • MySQL Divided into Server Layer and engine layer , Push down refers to leaving the request to the engine layer for processing .

    • Understand this function , Can be created so INDEX (zipcode, lastname, firstname), And use the following instructions respectively ,

      1
      2
      3
      > SET optimizer_switch = 'index_condition_pushdown=off'; 
      > SET optimizer_switch = 'index_condition_pushdown=on';
      >

    Push down with index on or off , And compare :

    1
    2
    3
    4
    5
    > explain SELECT * FROM people
    > WHERE zipcode='95054'
    > AND lastname LIKE '%etrunia%'
    > AND address LIKE '%Main Street%';
    >

    The results of the implementation of .

    • index condition pushdown from MySQL 5.6 Start supporting , yes MySQL Optimization mechanism for specific scenarios , You can see what you are interested in https://blog.51cto.com/lee90/2060449
  • Using index for group-by

    Data access and Using index equally , All you need to do is read the index , When Query Use in GROUP BY or DISTINCT When clause , If the grouping field is also in the index ,Extra The message in will be Using index for group-by. See “GROUP BY Optimization”

    1
    2
    -- name The field has an index 
    explain SELECT name FROM t1 group by name
  • Using index for skip scan

    Indicates that Skip Scan. See Skip Scan Range Access Method

  • Using join buffer (Block Nested Loop), Using join buffer (Batched Key Access)

    Use Block Nested Loop or Batched Key Access Algorithm improvement join Performance of . See https://www.cnblogs.com/chenpingzhao/p/6720531.html

  • Using MRR

    Used Multi-Range Read Optimization strategy . See “Multi-Range Read Optimization”

  • Using sort_union(…), Using union(…), Using intersect(…)

    These indicate how index scans are merged into index_merge Connection type . See “Index Merge Optimization” .

  • Using temporary

    To solve the query ,MySQL You need to create a temporary table to save the results . If the query contains GROUP BY and ORDER BY Clause , This usually happens .

    1
    2
    -- name No index 
    explain SELECT name FROM t1 group by name
  • Using where

    If we don't read all the data in the table , Or not just indexing to get all the data you need , It will appear using where Information

    1
    explain SELECT * FROM t1 where id > 5
  • Using where with pushed condition

    Only used for NDB

  • Zero limit

    The query has a limit 0 Clause , You can't choose any line

    1
    explain SELECT name FROM resource_template limit 0

Extended EXPLAIN

EXPLAIN Additional extended information can be generated , May pass through EXPLAIN The statement follows a SHOW WARNING Statement to view extended information .

TIPS

  • stay MySQL 8.0.12 And higher , Extended information can be used for SELECT、DELETE、INSERT、REPLACE、UPDATE sentence ; stay MySQL 8.0.12 Before , Extended information only applies to SELECT sentence ;
  • stay MySQL 5.6 And earlier , Need to use EXPLAIN EXTENDED xxx sentence ; And from MySQL 5.7 Start , No need to add EXTENDED key word .

Examples of use :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
mysql> EXPLAIN
SELECT t1.a, t1.a IN (SELECT t2.a FROM t2) FROM t1\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: t1
type: index
possible_keys: NULL
key: PRIMARY
key_len: 4
ref: NULL
rows: 4
filtered: 100.00
Extra: Using index
*************************** 2. row ***************************
id: 2
select_type: SUBQUERY
table: t2
type: index
possible_keys: a
key: a
key_len: 5
ref: NULL
rows: 3
filtered: 100.00
Extra: Using index
2 rows in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `test`.`t1`.`a` AS `a`,
<in_optimizer>(`test`.`t1`.`a`,`test`.`t1`.`a` in
( <materialize> (/* select#2 */ select `test`.`t2`.`a`
from `test`.`t2` where 1 having 1 ),
<primary_index_lookup>(`test`.`t1`.`a` in
<temporary table> on <auto_key>
where ((`test`.`t1`.`a` = `materialized-subquery`.`a`))))) AS `t1.a
IN (SELECT t2.a FROM t2)` from `test`.`t1`
1 row in set (0.00 sec)

because SHOW WARNING The result is not necessarily an effective SQL, It doesn't have to be able to execute ( Because it contains a lot of special marks ). The special tag values are as follows :

  • <auto_key>

    Auto generated temporary table key

  • <cache>(expr)

    expression ( For example, scalar subqueries ) Executed once , And the value is saved in memory for future use . For results that include multiple values , Temporary tables may be created , You will see <temporary table> The words...

  • <exists>(query fragment)

    The subquery is converted to EXISTS

  • <in_optimizer>(query fragment)

    This is an internal optimizer object , It doesn't make any sense to the user

  • <index_lookup>(query fragment)

    Using index lookup to process query fragments , To find the right line

  • <if>(condition, expr1, expr2)

    If the condition is true, Then take expr1, Otherwise take expr2

  • <is_not_null_test>(expr)

    Verify expression is not NULL Test of

  • <materialize>(query fragment)

    Implement with subquery

  • materialized-subquery.col_name

    In the internal materialization temporary table col_name References to , To save the results of the subquery

  • <primary_index_lookup>(query fragment)

    Using primary keys to process query fragments , To find the right line

  • <ref_null_helper>(expr)

    This is an internal optimizer object , It doesn't make any sense to the user

  • /* select#N */ select_stmt

    SELECT And non extended EXPLAIN Output in progress id=N It's related to

  • outer_tables semi join (inner_tables)

    Semi join operation .inner_tables Show the table that hasn't been pulled out . See “Optimizing Subqueries, Derived Tables, and View References with Semijoin Transformations”

  • <temporary table>

    Indicates that an internal temporary table is created and the intermediate results are cached

When some tables are const or system Type , The expressions involved in the columns in these tables will be evaluated by the optimizer as early as possible , And is not part of the displayed statement . however , When using FORMAT=JSON when , some const Access to the table will be shown as ref.

Estimate query performance

Most of the time , You can estimate query performance by counting the number of disk searches . For smaller tables , You can usually find lines in a disk search ( Because the index may have been cached ), And for bigger tables , You can use B-tree Index to estimate : How many searches do you need to find a line :log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1

stay MySQL in ,index_block_length Usually 1024 byte , Data pointers are generally 4 byte . For example , There is one 500,000 Table of ,key yes 3 byte , So according to the formula log(500,000)/log(1024/3*2/(3+4)) + 1 = 4 Search .

The index will need 500,000 7 3/2 = 5.2MB Storage space ( Suppose that the fill rate of a typical index cache is 2/3), So you can store more indexes in memory , Maybe it takes only one or two calls to find the desired line .

however , For write operations , You need four search requests to find out where to place the new index value , And then you usually need to 2 Search times to update the index and write in rows .

The previous discussion does not mean that your application performance will be affected by log N But it decreased slowly . As long as the content is OS or MySQL Server cache , As the watch gets bigger , It's just going to slow down a little bit . After the amount of data becomes too large to cache , It's going to slow down a lot , Until your application is constrained by disk search ( according to log N growth ). To avoid that , It can be increased according to the growth of data key Of . about MyISAM surface ,key The cache size of is determined by the name key_buffer_size Control of system variables , See Section 5.1.1, “Configuring the Server”

Reference documents

Related articles