I used to spend long hours staring at screen and running load test again and again to reduce the latency of that problematic API. Then, doing some small changes here and there and relying more on God than myself for miraculous improvement. Can you connect to this?
Voila! I learnt EXPLAIN.
I am going to share how I used EXPLAIN command to understand my queries better and eventually optimise them.
EXPLAIN is the one of the most powerful tool for understanding and optimising mysql queries. The EXPLAIN key tell you how a particular query is executed. Pick up any query and write explain in front of that query, the output will be different. It will return the processes that take place in the query, which indexes are being utilised and how the table is being scanned.
Output of explain is as following :
Id — Query ID Select_type — Type of statement. Possible outputs — (SIMPLE — Means it is a simple select query without any subqueries. PRIMARY — Select is outermost query in join. DERIVED — It is a subquery within a from clause) Table — Table referenced Type — Join type. This is an important field, it can indicate missing indexes. Frequent possible values - Const — The table has only one matching row which is indexed. This is the fastest type of join because the table has to be read only once. Eq_ref — all parts of an index are used by the join and the index is primary/ unique not null. This is the next best possible join type. This JOIN is very fast because for each row scanned in table A there can be only ONE row in table B which satisfies the JOIN condition. ref — It is like eq_ref but the key used need not be unique so there can be more than one row satisfying the condition. All — the entire table is scanned to find matching rows for the join. This is the worst join type and usually indicates the lack of appropriate indexes on the table. Possible_keys — The keys which could have been used, though they might not have been used. This column helps in optimisation as well because if this value is null, it indicated no possible indexes could be used. Key — index that was used Key_len — length of used key Ref — columns compared to index (the one in the key column) Rows — number of rows scanned/searched Extra — additional information
You can also use EXTENDED after the keyword EXPLAIN, it gives additional information on the execution of query.
So, let’s narrow down a little on what to look for in the explain output if you want to optimise the query. To understand better, let’s take a sample explain output :
mysql> EXPLAIN SELECT * FROM breakup WHERE x_id = 2 id: 1 select_type: SIMPLE Table: breakup type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 6301 Extra: Using where
Whatever that meant! Right? Let’s try to dig a bit deeper!
- All the rows of the table are scanned, 6301, which is the size of the table.
- The key field is null, which means we are not using where on indexed field. To optimise a read query, the most basic way is to put index. Put index on the field on which you are joining the table or the one in where clause and see the difference.
Well, all that random babble! Does it do any good? Let’s try finding the impact of adding index on the key in “where”,i.e., x_id.
mysql> EXPLAIN SELECT * FROM transaction WHERE x_id = 2 id: 1 select_type: SIMPLE table: breakup type: ref possible_keys: x_id key: x_id key_len: 4 ref: const rows: 2 Extra: Using index condition; Using filesort
Let’s analyse this output, compare the fields with the explanations above and see how we made the query better. Use this command on your join queries. Effect of absence of index is more significant in a join query. The number of rows scanned can increase exponentially with the size of database if the table is not properly indexed.
mysql> EXPLAIN SELECT transaction.id, transaction.status, payment.y_id FROM transaction INNER JOIN `payment` ON transaction.payment_type = payment.type; id: 1 select_type: SIMPLE Table: transaction type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 4542 Extra: null id: 1 select_type: SIMPLE Table: payment type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 7748 Extra: Using where; Using join buffer (Block Nested Loop)
When we put index on the field on which we are implementing the join, i.e., “payment_type”, let’s analyse how the result changes –
id: 1 select_type: SIMPLE Table: transaction type: index possible_keys: payment_type key: payment_type key_len: 203 ref: NULL rows: 4542 Extra: Using where; Using index id: 1 select_type: SIMPLE Table: payment type: ref possible_keys: type key: type key_len: 82 ref: transaction.payment_type rows: 553 Extra: Using where; Using index The number of rows in payment scanned is nearly one-fourteenth of the number scanned without index. The time taken by the query reduced exponentially. The field “type” i.e., join type is ref instead of ALL.