You might try stopping by the MySQL meetup that happens at MIT each
month (free soda/pizza):
http://mysql.meetup.com/137
There's a pretty regular crowd of serious users there, a couple of
them are working with huge amounts of data. One individual that comes
somewhat regularly has over a terabyte of data in InnoDB that has a
wealth of knowledge of how to tune it appropriately.
That doesn't answer your questions immediately, but might in the long
term.
On May 9, 2006, at 5:52 PM, Steve Revilak wrote:
>> Date: Mon, 8 May 2006 17:24:17 -0400
>> From: Alex Brelsfoard
>> To: boston-pm-PqP1ghmmPMdAfugRpC6u6w@xxxxxxxxxxxxxxxx
>> Subject: [Boston.pm] LARGE MySQL Database
>
>> Basically I am dealing with using, storing, and sorting a LOT of
>> data in a
>> mysql database.
>> With all the data in the table it makes for 404.8 Million rows.
>> In a backup
>> sql file that makes just under 80GB.
>>
>> I an using the InnoDB engine.
>>
>> I was just wondering if anyone else has had experience working with
>> databases this large, and using MySQL. I've run into some smaller
>> problems along the way due to the immensity of this table. In the
>> end, to do what we want I will be creating a smaller table, with a
>> subset of entries from the original. But the original needs to
>> exist as well.
>>
>> I'm looking for heads up warning for things I should watch out for
>> due to
>> the size of this thing, or any suggestions on speedier sorting and
>> querying.
>
> I've dealt with large volumes of data with innodb and MySQL. Not
> quite as large as yours -- maybe 200 million rows with 12 GB dump
> file.
>
> If you haven't normalized the pants out of the table, that's probably
> a good place to start. Anything you can do to trim a few bytes off
> the row size will make a big difference in the size of the overall
> table. (Is this table part of a data warehouse?). Smaller data will
> always be faster than bigger data.
>
> Of course, any structural change on a table that big won't be a quick
> one. For an internal system, that might not be a big deal. For a
> production system, it will probably be painful.
>
> For general server tuning, a few things to try
>
> - Make innodb_buffer_pool_size as big as you can, without causing
> the machine to swap.
>
> - if you're doing lots of writes (or your big table->table copy),
> set innodb_log_file_size to be a sizable percentage of
> innodb_buffer_pool_size; 30% or so. That will help to speed up
> normal operations (innodb will have to spend less time flushing
> pages out do disk). However, if you have an unclean shutdown,
> recovery will take longer.
>
> On my `big' innodb database, we're using two 200M innodb logs.
> That sped up bulk inserts a *lot*.
>
> - if some of the data is very rarely used, you might consider
> putting that portion of the data into archive tables. The
> downside of archive tables is that they don't support indexing.
> But they're very compact (just a blob of zlib data, more or
> less).
>
> Finally, if you haven't run across these
>
> http://dev.mysql.com/doc/refman/4.1/en/innodb-tuning.html
> http://dev.mysql.com/doc/refman/4.1/en/innodb-configuration.html
>
> Of course, some of this depends on what the data is, and what kind of
> queries you're running against it. Measuring performance before and
> after is an important step in the process.
>
> hth
>
> Steve
>
> _______________________________________________
> Boston-pm mailing list
> Boston-pm-PqP1ghmmPMdAfugRpC6u6w@xxxxxxxxxxxxxxxx
> http://mail.pm.org/mailman/listinfo/boston-pm
|