Kuro5hin.org: technology and culture, from the trenches
create account | help/FAQ | contact | links | search | IRC | site news
[ Everything | Diaries | Technology | Science | Culture | Politics | Media | News | Internet | Op-Ed | Fiction | Meta | MLP ]
We need your support: buy an ad | premium membership

[P]
Tweaking MySQL (4.x) Primer

By kpaul in Technology
Sun Jan 18, 2004 at 11:38:04 PM EST
Tags: Software (all tags)
Software

Your database driven website was doing well when only Uncle Ted and the guy down the street were viewing it, but now thanks to a brilliant piece of viral marketing your website is being pounded. Before shelling out money for a more powerful server, it would benefit you to see if you could optimize to increase performance.

In this article I'll share a few tips I've picked up over the last week or so dealing with optimizing your MySQL 4.x database server. Some are nifty little apps that help you understand what's going on, while others are ways to improve your SQL for better performance. I'm running FreeBSD 4.x personally, so YMMV for some of this if you're using a different OS.


Some Tools Needed

  • vi (or some other inferior text editor) - This will be used to change configuration files.
  • top - This monitors your server in realtime. It will help you see what effects your changes have on your server.
  • mytop - This nifty little app is like the UNIX top command, but it shows MySQL server stats in realtime. This is handy to see how often you're being hit per second, how many slow queries you have, etc.

MySQL Recon

In order to improve your performance, you must gather information about your MySQL server. The following will give you important information that will help you see where you can get a little performance boost. First, SSH to your server and log into the MySQL client.

MYSQL> show variables;
+---------------------------------+---------+-
| Variable_name | Value |
+---------------------------------+---------+-
| back_log | 50 |

... etc. etc. ...

| wait_timeout | 28800 |
+---------------------------------+---------+-

Some of the important variables for optimizing are; key_buffer, max_allowed_packet, table_cache, sort_buffer_size, read_buffer_size, myisam_sort_buffer_size, thread_cache, and query_cache_size.

Also useful is the show status command. This will tell you things like how many of each SQL command has been run, how long the server has been up, how your indexes are being used, and other useful information. If you find you don't like scrolling back in your terminal to see all the data, you can also append a like 'Variable_name'.

Both of the previous commands can be run from your shell of choice with the mysqladmin extended-status and mysqladmin variables.

Another way to see what's going on with your MySQL server is to use mytop, which allows you to see information about your database server's performance in realtime. You may have to install this on your system, but it's definitely worth it if you're trying to get the best performance out of your MySQL server.

An example of the output:

prompt> mytop -u user -p password DB
MySQL on localhost (4.0.17-log) up 0+01:19:39 [02:16:43]
Queries: 3.4k qps: 1 Slow: 1.0 Se/In/Up/De(%): 47/00/02/00
qps now: 0 Slow qps: 0.0 Threads: 1 ( 1/ 0) 00/00/00/00
Key Efficiency: 99.4% Bps in/out: 34.3/678.2

Id User Host/IP DB Time Cmd Query or State
-- ---- ------- -- ---- --- ----------
239 root localhost DB 0 Query show full processlist

Play around and get to know your server. If you're unsure of something, MySQL has a very indepth manual available online that will explain what all of the variables do.

How-to Tweak MySQL Variables:

There are three ways to set the variables in MySQL 4.0.17. First, you can add them to your my.cnf file and restart MySQL. Second, you can shutdown the MySQL server and restart it with command line options. New in the 4.x series of MySQL is the ability to use SET to change variables while the database server is still running.

MY.CNF SETTINGS

If you look in one of the mysql directories that were made when MySQL was compiled (mine are in /usr/local/share/mysql/) you will find four sample configuration files names my-huge.cnf, my-large.cnf, my-medium.cnf and my-small.cnf. Now, before you go off firing up MySQL with my-huge.cnf, thinking bigger is always better, please note that if your hardware is too limited, starting MySQL with this conf file can bring your website to its knees very quickly.

Linux Magazine has a really nice breakdown of the four cnf files that are distributed with MySQL 4.x.

  • my-huge.cnf: For systems with 1 GB or more RAM where MySQL is the primary service running on the machine.
  • my-large.cnf: For any systems with 512 MB of RAM where MySQL is the primary service running on the machine.
  • my-medium.cnf: For systems with less memory (32 MB 128 MB) where MySQL is sharing resources with other services.
  • my-small.cnf: For systems with less than 64 MB of memory where MySQL is sharing resources with other services and will not be utilized much.

Once you find one that broadly matches your circumstances, copy it to /etc/my.cnf. (Again note, this may be different on your particular system.)

COMMAND LINE OPTIONS:

If you already have a my.cnf file picked out, you can further tweak settigs by shutting down the server and restarting with command line options. Note, though, that once you get a working setup, you'll want to put it into a my.cnf file so that the variables are set if the machine is ever rebooted.

An example would be: prompt> mysqld_safe --skip-bdb --log-slow-queries & This would start the MySQL server with no cache for BDB tables (just make sure you're not using this table type ;) and turns the log on that will write slow queries (ones that take longer than the value of long_query_time.)

SET COMMAND

New to MySQL 4.x is the ability to change variables while the MySQL server is still running. To do this, you use the SET command. The following will set the long_query_time to 10.

MYSQL> set long_query_time=10;

Optimizing SQL

Explain is your friend. Doesn't make sense to you? Let me explain. Running the following will help you find out if queries are using the table's index or not and whether the query might need some tweaking:

MYSQL> EXPLAIN SELECT * FROM table WHERE non_indexed_field = 'value';

table type possible_keys key key_len ref rows Extra
table ALL NULL NULL NULL NULL 4829 Using where

This is not a very good query. One of the first things to look at is the 'type' column. The ALL in this case shows us that MySQL is looking in all the rows for data. The other possible values for type (in order from best to worst) are - system, const, eq_ref, ref, range, and index.

Possible_keys shows what indexes could be used, with key telling us which ones, if any. Rows is the value of the number of rows MySQL must use to get the data. The lower this number is, the better.

Let's see what happens when we make a change to our SQL statement:

MYSQL> EXPLAIN SELECT * FROM table WHERE indexed_field = 'value';

table type possible_keys key key_len ref rows Extra
table const PRIMARY PRIMARY 3 const 1

Note that the above was a best case scenario. The trick, though, is to check all your SELECT statements with explain to ensure that it's using keys correctly and is as optimized as it can be.

Granted, that was a simple SQL query. What about JOINS? Taking an example from the MySQL manual:

SELECT * FROM a,b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d (d.key=a.key) WHERE b.key=d.key

From the MySQL manual:

The table read order forced by LEFT JOIN and STRAIGHT JOIN will help the join optimizer (which calculates in which order tables should be joined) to do its work much more quickly, as there are fewer table permutations to check.

In the above example, MySQL must do a full scan of b because the LEFT JOIN causes that table to be read before d. A simple change will help speed things up:

SELECT * FROM b,a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d (d.key=a.key) WHERE b.key=d.key

Another thing to check is that you're using the correct field type in your tables. To check this, use the analyze procedure on your tables. The following example shows how to form the SQL statement to get a recommendation on what field types you should use for your table:

SELECT * FROM table_name PROCEDURE ANALYSE() This will give you information on all your fields, including the Optimal_fieldtype. Run ANALYSE on your tables and make any suggested changes.

Finally, indexes are also a way you can increase performance on MySQL applications. Be warned, though, that too many indexes (especially on the wrong fields) can hurt performance. If you have an index that isn't used, you can suffer a performance hit because the index has to be updated every time the table is updated.

Again, using EXPLAIN on your SQL SELECT statements will help you determine which fields need to be indexed. In general, though, indexes are used if the fields are in your WHERE clause.

Closing Thoughts

  • Use OPTIMIZE tablename on tables that are updated a lot.
  • If possible, don't use TEXT, BLOB, or VARCHAR as field types because they're variable length fields. If you have to use even one of these field types, though, you lose any benefit you would have gained by by not using any.
  • If you are using VARCHAR or TEXT as field types, run OPTIMIZE tablename occasionally.
  • After trying the things above, other optimizations include (in order of importance: RAM, the Disks, and finally CPU power.)

As the title indicates, this is a very rough primer to get you started on optimizing your MySQL server and the applications that use it. This, along with Google groups, should give you a good start on making sure your site is as zippy as it can be.

Sponsors

Voxel dot net
o Managed Hosting
o VoxCAST Content Delivery
o Raw Infrastructure

Login

Poll
Favorite database
o MySQL 27%
o mSQL 0%
o SQL Server 2000 3%
o Postgres 42%
o FileMaker Pro 0%
o Oracle 3%
o Sybase 0%
o DB2 Universal Database 2%
o I don't need no steenkin database. 18%
o other (see below) 1%

Votes: 77
Results | Other Polls

Related Links
o Google
o optimizing
o MySQL 4.x
o vi
o top
o mytop
o important variables
o show status
o breakdown of the four cnf files
o JOINS
o indexes
o Also by kpaul


Display: Sort:
Tweaking MySQL (4.x) Primer | 39 comments (29 topical, 10 editorial, 2 hidden)
Isn't tweaking MySQL pointless? (1.76 / 13) (#11)
by cbraga on Sun Jan 18, 2004 at 04:27:26 PM EST

Any application that demands performance should be running either on PostgreSQL, the best O.S. database IMO, or Firebird, the O.S. descendant of Borland's Interbase; or even SQL Server or Oracle. Period. Mysql is obsolete. Heck, it isn't even a real database it's just a crude SQL layer on top of your choice of a table storage library. The very fact that there's more than one table to choose from in MySQL is a testament to it's poor design.

Anyone who wants any sort of performance from a database should get one of those I mentioned.

ESC[78;89;13p ESC[110;121;13p

On the contrary... (2.50 / 6) (#14)
by leviramsey on Sun Jan 18, 2004 at 08:02:49 PM EST

The very fact that there's more than one table to choose from in MySQL is a testament to it's poor design.

Please explain.

The way I see it, that's an excellent design decision. Different applications require different levels of performance vs. features. There are applications where transactions are superfluous and the overhead added by the same is a detriment to performance (applications where there's a large number of reads relative to a write). This can even be extended to different tables within a database app.



[ Parent ]
Look at it this way (2.16 / 6) (#18)
by cbraga on Sun Jan 18, 2004 at 08:42:33 PM EST

A database is a system to which you give data and it's its job to store the data and retrieve it the best way possible. Robust databases need robust storage mechanisms which must be made specially to suit a databases's needs. Until innodb came about mysql severely lacked any resemblance of a decent storage mechanism since the other table storage options sucked so much. It lacked even something as basic as row-level locking.

Therefore, mysql's offering of multiple types of tables was just an attempt to shift the problem to the user. Right about now, they should just drop aupport for all table types except innodb.

ESC[78;89;13p ESC[110;121;13p
[ Parent ]

Not convincing (2.83 / 6) (#19)
by leviramsey on Sun Jan 18, 2004 at 09:03:56 PM EST

Perhaps MySQL should make InnoDB the default table type. However, the fact that other table types, in certain types of applications (those with large numbers of reads per write), perform better than InnoDB, argues against removing the alternative table types.

As an example, I run my business (see .sig) on MySQL (not the website, but as a stat database that I use to make my selections). It has an infinite ratio of reads to writes; all writing is bunched up long before I perform the analysis. I've run it many times under both InnoDB and MyISAM (mysqldump is wonderful for this sort of thing), and found MyISAM to be on average, 3 to 4 times faster, depending on the size of the dataset.

Now since my application takes around 4 minutes right now to analyze the current NBA season (I'm anticipating around 10 minutes by the time the NBA Finals roll around), and since certain values need to be reset after each run (things like the desired standard deviation and so forth), it takes me around 15 minutes or so to handicap a day's worth of NBA games. Double that for later in the playoffs. If I didn't have the MyISAM option, I'm looking at 120 minutes, during which lines are guaranteed to move.



[ Parent ]
Not convincing, either (2.37 / 8) (#20)
by cbraga on Sun Jan 18, 2004 at 09:48:05 PM EST

Since your application is read-only InnoDB obviously incurs penalties due to it's transaction support you were very happy to do away with — you're making a tradeoff of unneeded safety for speed and a point can be made for having multiple table types in that it allows those tradeoffs to be made.

However, I must raise two issues. Issue one is that PosgreSQL (that I'm very familiar with) will give great performance, comparable to MyISAM if only a bit slower (I'd say 20% slower, never as much as 80% as InnoDB on your case) while at the same time offering full transaction integrity, which brings us back to the point I made earlier, that mysql table's are still inadequate.

Issue number two is that a read-only application is a poor representation for database performance since databases are designed to handle multiple concurrent read/write requestes. Still, since a database obviously suits your needs better than a flat table file would (or does it?) you could make a point for the suitability of databases to those applications, but I'd bring you back to issue number 1 which would make that point irrelevant.

ESC[78;89;13p ESC[110;121;13p
[ Parent ]

Mmmmm (2.55 / 9) (#23)
by robertitico on Mon Jan 19, 2004 at 03:28:34 AM EST

First of all, I agree (kind of). Postgres is the best among free (and perhaps non-free) databases. But c'mon, what you said has not much sense, coming from a tech guy (you seem like one).

If this guy leviramsey needs a read only database, and he relies in Isam/MySQL tables to accomplish speed, why he should switch to Postgres and get 20% slower? For some of us, our bussiness is based on the efficiency (just talking about speed) of our applications. Yes, transactions are great, but for a read only database have not use at all. And personally, I don't think a read-only database is a poor designed one, it is just a solution for some specific problem. A very good one indeed (for this case).

read-only application is a poor representation for database performance since databases are designed to handle multiple concurrent read/write requestes

Well, the fact that a car is designed to transport people from point A to point B doesn't mean that you cannot use it for some other mean (listening to music, making babies, whatever...) if the situation requires it. Of course, you can still listen to music on a high definition stereo, with double buffered sound system bla bla. Or rent a room in the most fancy hotel. But for those guys that are in the middle of nowhere in their cars, horney with their mates and their new "Massive Attack" record, cars with stereo and nice seats are the solution. And this is about soulutions, isn't it?

I hope that you get the point with my poor example. By the way, as you propose to use a flat tables file for this Read-only scenario... I tell you that it is not a good choice, even if the flat table is smaller than the default database files size and you are a awk/sed/cut/grep master.

And please... SQL Server? That is not even a serious database.

[ Parent ]
Alternative to flat file or RDBMS (none / 2) (#38)
by bluehound on Thu Jan 22, 2004 at 07:54:55 PM EST

as you propose to use a flat tables file for this Read-only scenario... I tell you that it is not a good choice, even if the flat table is smaller than the default database files size and you are a awk/sed/cut/grep master.
I recently played around with the BerkeleyDB and Benchmark Perl modules. It's great... just "tie" a hash to a BTree or Hash implementation and there you go. Script available...
-shane
[ Parent ]
uh-huh (2.42 / 7) (#15)
by atarola on Sun Jan 18, 2004 at 08:03:37 PM EST

Anyone who wants any sort of performance from a database should get one of those I mentioned.
Two characters: /.


"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live"
-- John F. Woods
[ Parent ]
HA! (2.71 / 7) (#17)
by cbraga on Sun Jan 18, 2004 at 08:37:37 PM EST

/. stuggles with mysql because it only does (or did until very recently) table locking and slashcode is full of workarounds to avoid locking whole tables as that killed the site. How dows that skeap well for mysql?

ESC[78;89;13p ESC[110;121;13p
[ Parent ]
Nope (3.00 / 5) (#22)
by Betcour on Mon Jan 19, 2004 at 03:01:06 AM EST

As far as I know, they switched to Innodb table handlers which have row-level locking (and hence scales very nicely as the load goes up).

[ Parent ]
After how much time? InnoDB only came later. (none / 3) (#25)
by cbraga on Mon Jan 19, 2004 at 06:30:29 AM EST

/. was struggling with db performance for a very long time by then.

ESC[78;89;13p ESC[110;121;13p
[ Parent ]
firebird? (1.25 / 4) (#32)
by anmo on Mon Jan 19, 2004 at 12:07:43 PM EST

I thought that was a web browser :-)

[ Parent ]
+1, Technology (1.29 / 17) (#13)
by polish surprise on Sun Jan 18, 2004 at 05:05:25 PM EST

Even if it's shitty and substandard technology.

--
Controversy is my middle name.

+1 FP. to get the goat thingy out of my view (nt) (1.07 / 13) (#21)
by mami on Sun Jan 18, 2004 at 11:21:05 PM EST



fav db: (none / 3) (#24)
by dimaq on Mon Jan 19, 2004 at 05:51:44 AM EST

CVS

Favorite database (none / 3) (#26)
by Cameleon on Mon Jan 19, 2004 at 06:42:28 AM EST

Ozone

Ozone is the shizzle (none / 2) (#27)
by bg on Mon Jan 19, 2004 at 07:44:09 AM EST

I'm sick of object-relational mapping. I found Ozone and was saved.

SQL is ... eww... With Ozone you just have persistant objects. No data type whackiness.

Now if only I could easily port my old MySQL applications.



- In heaven, all the interesting people are missing.
[ Parent ]
the future (none / 1) (#36)
by phred on Tue Jan 20, 2004 at 10:08:28 AM EST

is to persist ram and be done with all this disk / ram duality.

[ Parent ]
all of ram (none / 1) (#39)
by DEBEDb on Sat Jan 24, 2004 at 12:46:22 AM EST

including the instruction stack? including the part of memory where the contents of currently received network packet are? :)) that could do for nice effects...

[ Parent ]
Nice (none / 2) (#29)
by l3nz on Mon Jan 19, 2004 at 08:21:05 AM EST

But can you search objects according to a property or is it just an object store?

Popk ToDo lists - yet another web-based ToDo list manager. 100% AJAX free :-)
[ Parent ]

Yes and no (none / 2) (#35)
by Cameleon on Tue Jan 20, 2004 at 05:25:38 AM EST

The answer is yes and no. The database doesn't take care of the searching for you, but the storage is transparent. This means you write your application as normal, just defining one or a few root objects as stored in the database. They, and everything they contain, are then persistent. You find objects the same way you would in any other application, and don't have to worry about them being in a database.

The only minor thing is that to get one of the root objects from the database, you have to know its name, and there is no way to get all object names from the database, so you have to store them separately. But since it is most efficient to only have a few root objects (since this minimizes the number of transactions) this isn't a big problem. I believe retrieving all database objects might also be supported once version 1.2 comes out, but I'm not sure.

[ Parent ]

Interesting read (2.80 / 5) (#28)
by l3nz on Mon Jan 19, 2004 at 08:16:32 AM EST

I always found myself rather embarassed when optimizing SQL for mySQL because we all know what is best to what, but we don't know how much better something is. For instance: I know building a full index is good, but maybe a partial index will do and use a fraction of the disk space. I was in need for a tool that would let me evaluate how much time it took to repeatedely execute a query with no mySQL caching but disk access and measure an average out of it. So I wrote a simple PHP page that will test mySQL queries and show me average response times; I found it very easy to see if the gain in an optimization is 5% or 50%. If anybody's interested, comment it here or on my diary.

Popk ToDo lists - yet another web-based ToDo list manager. 100% AJAX free :-)

i'd love to see it... (none / 1) (#30)
by kpaul on Mon Jan 19, 2004 at 10:42:22 AM EST

post a link?

-kpaul


2014 Halloween Costumes
[ Parent ]

yes please show (nt) (none / 1) (#33)
by Run4YourLives on Mon Jan 19, 2004 at 01:06:49 PM EST



It's slightly Japanese, but without all of that fanatical devotion to the workplace. - CheeseburgerBrown
[ Parent ]
link please (none / 0) (#40)
by jozza on Fri Oct 21, 2005 at 04:11:39 AM EST

I'd love a link to this email

[ Parent ]
I strongly dislike MySQL (2.22 / 9) (#31)
by MyrddinE on Mon Jan 19, 2004 at 11:28:07 AM EST

It does not behave like a relational database. Using MySQL reminds me of back when I used to use GWBasic to program in. It's just a bad DB system, like BASIC is just a bad language. Missing features, sloppy type checking, silent 'fixing' of your data. Ugh.

If you don't believe me, take a look at this page, that lists many of the problems with MySQL. Things that are usually properly documented... but it's like documenting a bug. Just because you put it in writing doesn't make the behaviour RIGHT.

outdated and idiosyncratic (2.50 / 4) (#34)
by aminorex on Mon Jan 19, 2004 at 11:15:33 PM EST

Those criticisms of MySQL are largely outdated
and/or idiosyncratic.  Subselects are in 4,
stored procedures are in 5, to provide examples
of the outdated.  Truncating string overflow
and saturating finite fields are perfectly
reasonable behaviours that are suited to the
preferences of some users but not others -- you
can please some of the people....  There is a
third category of criticism represented in those
links as well:  The invalid, and misconcieved.
For example, complaining about connection timeout.
That's a socket issue.  Set SO_KEEPALIVE, silly.
There's really precious little meat here -- but
it does show that MySQL development is right on
track:  Most of the valid complaints are resolved
in the development release.


[ Parent ]
overly optimistic (none / 3) (#37)
by esquilax on Tue Jan 20, 2004 at 02:49:06 PM EST

Subselects are in 4, stored procedures are in 5, to provide examples of the outdated.

subselects are in 4.1, which is still beta. version 5 is just the dev tree. there is not a production-quality version of MySql that has these features. plus the fail-silently "do the right thing" features of MySQL are a major impediment to proper ACID compliance and the development of production-quality software.

[ Parent ]

Tweaking MySQL (4.x) Primer | 39 comments (29 topical, 10 editorial, 2 hidden)
Display: Sort:

kuro5hin.org

[XML]
All trademarks and copyrights on this page are owned by their respective companies. The Rest 2000 - Present Kuro5hin.org Inc.
See our legalese page for copyright policies. Please also read our Privacy Policy.
Kuro5hin.org is powered by Free Software, including Apache, Perl, and Linux, The Scoop Engine that runs this site is freely available, under the terms of the GPL.
Need some help? Email help@kuro5hin.org.
My heart's the long stairs.

Powered by Scoop create account | help/FAQ | mission | links | search | IRC | YOU choose the stories!