7.2.2. Estimating Query Performance
In most cases, you can estimate query performance by counting
disk seeks. For small tables, you can usually find a row in one
disk seek (because the index is probably cached). For bigger
tables, you can estimate that, using B-tree indexes, you need
this many seeks to find a row:
log(row_count
) /
log(index_block_length
/ 3 × 2
/ (index_length
+
data_pointer_length
)) + 1
.
In MySQL, an index block is usually 1,024 bytes and the data
pointer is usually four bytes. For a 500,000-row table with an
index length of three bytes (the size of
MEDIUMINT
), the formula indicates
log(500,000)/log(1024/3*2/(3+4)) + 1
=
4
seeks.
This index would require storage of about 500,000 × 7
× 3/2 = 5.2MB (assuming a typical index buffer fill ratio
of 2/3), so you probably have much of the index in memory and so
need only one or two calls to read data to find the row.
For writes, however, you need four seek requests to find where
to place a new index value and normally two seeks to update the
index and write the row.
Note that the preceding discussion does not mean that your
application performance slowly degenerates by log
N
. As long as everything is cached by
the OS or the MySQL server, things become only marginally slower
as the table gets bigger. After the data gets too big to be
cached, things start to go much slower until your applications
are bound only by disk seeks (which increase by log
N
). To avoid this, increase the key
cache size as the data grows. For MyISAM
tables, the key cache size is controlled by the
key_buffer_size
system variable. See
Section 7.5.2, “Tuning Server Parameters”.