I thought I knew a fair bit about MySQL before I started working on Wesabe, but I was wrong. Once your tables start to get really massive, all sorts of fun issues come out of the woodwork. There are a thousand articles out there on the basics of database optimization (use indices, denormalize when necessary, etc.), so I'm not going to cover those again. Here are some tips--most specific to MySQL--that I've learned on the job:
"dolphins" by talkrhubarb
"dolphins" by talkrhubarb
- MySQL only uses one index per table when doing a query, and the query optimizer doesn't always pick the best column, so indexing everything is not necessarily a great idea. If you're always selecting on a certain set of columns, create a single multi-column index
- related to the above, don't put indices on columns with a low cardinality. You can check the cardinality of a column by doing
SHOW INDEX FROM table_name
- if any text or blob fields are part of your SELECT query, MySQL will write any temp tables to disk rather than memory. So where possible, use large VARCHAR fields (on 5.0.3 or higher) instead. See: http://railspikes.com/2007/9/14/one-reason-why-MySQL-sucks
- if you need to do case-sensitive matching, declare your column as BINARY; don't use LIKE BINARY in your queries to cast a non-binary column. If you do, MySQL won't use any indexes on that column.
SELECT ... LIMIT offset, limit, where offset is a large number == bad, at least with InnoDB tables, as MySQL actually counts out every record until it gets to the offset. Ouch. Instead, assuming you have an auto-incremented id field,
SELECT * FROM foo WHERE id >= offset AND id < offset+limit(props to Coda for that one).
- If you have a query in which you're looking for the most recent n items of something—e.g.
SELECT * FROM foo WHERE bar = ? ORDER BY created_at DESC LIMIT 10)—your query will be significantly faster if you create a multi-column index on whichever field(s) you're selecting on plus the timestamp field that you're ordering by (in this case,
CREATE INDEX idx_foo_bar_created_at ON foo (bar,created_at)). This allows MySQL to look up the records directly from the index, instead of pulling all the records where
bar = ?, sorting them (probably first writing them to a tempfile), and then taking the last 10.
EXPLAINis your friend. If you're seeing "Using temporary table, Using filesort" in the "extra" section of EXPLAIN, that's bad.
I'll add more as we come across them. If you have any other non-obvious tips, leave them in the comments or email me (brad@ this domain) and I'll add them too.