With some extra work, it is possible to make LOAD
DATA INFILE
run even faster for a
MyISAM
table when the table has many
indexes. Use the following procedure:
Optionally create the table with CREATE
TABLE
.
Execute a FLUSH TABLES
statement or a
mysqladmin flush-tables command.
Use myisamchk --keys-used=0 -rq
/path/to/db/tbl_name
.
This removes all use of indexes for the table.
Insert data into the table with LOAD DATA
INFILE
. This does not update any indexes and
therefore is very fast.
If you intend only to read from the table in the future,
use myisampack to compress it. See
Section 14.1.3.3, “Compressed Table Characteristics”.
Re-create the indexes with myisamchk -rq
/path/to/db/tbl_name
.
This creates the index tree in memory before writing it
to disk, which is much faster that updating the index
during LOAD DATA INFILE
because it
avoids lots of disk seeks. The resulting index tree is
also perfectly balanced.
Execute a FLUSH TABLES
statement or a
mysqladmin flush-tables command.
Note that LOAD DATA INFILE
performs the
preceding optimization automatically if the
MyISAM
table into which you insert data
is empty. The main difference is that you can let
myisamchk allocate much more temporary
memory for the index creation than you might want the server
to allocate for index re-creation when it executes the
LOAD DATA INFILE
statement.
You can also disable or enable the indexes for a
MyISAM
table by using the following
statements rather than myisamchk. If you
use these statements, you can skip the FLUSH
TABLE
operations:
ALTER TABLE tbl_name
DISABLE KEYS;
ALTER TABLE tbl_name
ENABLE KEYS;
To speed up INSERT
operations that are
performed with multiple statements, lock your tables:
LOCK TABLES a WRITE;
INSERT INTO a VALUES (1,23),(2,34),(4,33);
INSERT INTO a VALUES (8,26),(6,29);
UNLOCK TABLES;
This benefits performance because the index buffer is
flushed to disk only once, after all
INSERT
statements have completed.
Normally, there would be as many index buffer flushes as
there are INSERT
statements. Explicit
locking statements are not needed if you can insert all rows
with a single INSERT
.
For transactional tables, you should use START
TRANSACTION
and COMMIT
instead
of LOCK TABLES
to obtain faster
insertions.
Locking also lowers the total time for multiple-connection
tests, although the maximum wait time for individual
connections might go up because they wait for locks. For
example:
Connection 1 does 1000 inserts
Connections 2, 3, and 4 do 1 insert
Connection 5 does 1000 inserts
If you do not use locking, connections 2, 3, and 4 finish
before 1 and 5. If you use locking, connections 2, 3, and 4
probably do not finish before 1 or 5, but the total time
should be about 40% faster.
INSERT
, UPDATE
, and
DELETE
operations are very fast in MySQL,
but you can obtain better overall performance by adding
locks around everything that does more than about five
inserts or updates in a row. If you do very many inserts in
a row, you could do a LOCK TABLES
followed by an UNLOCK TABLES
once in a
while (each 1,000 rows or so) to allow other threads access
to the table. This would still result in a nice performance
gain.
INSERT
is still much slower for loading
data than LOAD DATA INFILE
, even when
using the strategies just outlined.