The alternative on Linux is to use O_DIRECT for log files. I don't think O_SYNC would be a good choice in this case because it still allows the file to be stored in the OS buffer cache, which again dedicates too much RAM to log files.
There is another potential problem when the log file can be cached in the OS buffer cache. The first write to an uncached page requires the page to be read into the buffer cache when the write covers only part of the page. This is not a good use of disk IO. We think that this read-modify-write will not occur when O_DIRECT is used because the writes bypass the buffer cache. To test this I added a new option for the innodb_flush_method configuration variable. When it is set to all_direct, both database and log files are opened with O_DIRECT. I had to change a few places in the log code to use 512-byte aligned buffers when reading and writing.
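The alignment requirement is easier to see in code. What follows is a minimal sketch in Python, not the InnoDB change itself: the helper names (`round_up`, `aligned_buffer`, `write_direct`) are made up for illustration, and it assumes Linux, a filesystem that supports O_DIRECT, and that 512 bytes is a sufficient alignment for the device. It pads the data to a multiple of 512 bytes and places it in a page-aligned buffer before writing.

```python
import mmap
import os

BLOCK = 512  # alignment mentioned in the post; some devices may need a larger value


def round_up(n, block=BLOCK):
    """Round n up to the next multiple of block."""
    return (n + block - 1) // block * block


def aligned_buffer(data, block=BLOCK):
    """Copy data into a zero-padded buffer whose length is a multiple of block.

    An anonymous mmap starts on a page boundary, so the buffer address is
    also aligned to 512 bytes.
    """
    size = max(round_up(len(data), block), block)
    buf = mmap.mmap(-1, size)
    buf[:len(data)] = data
    return buf


def write_direct(path, data, offset=0):
    """Write data at a block-aligned offset, bypassing the OS buffer cache.

    Hypothetical helper: os.O_DIRECT only exists on Linux, and the call
    fails with EINVAL on filesystems that do not support it.
    """
    if offset % BLOCK:
        raise ValueError("offset must be a multiple of BLOCK")
    buf = aligned_buffer(data)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        return os.pwrite(fd, buf, offset)
    finally:
        os.close(fd)
        buf.close()
```

Calling `write_direct("/some/path/logfile", b"record")` would write one 512-byte block with the record followed by zero padding. The buffer address, the length and the file offset all have to be aligned, which is why the log code needed changes in more than one place.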
The results from testing so far are not conclusive. I need to ask my peers to look at the data from vmstat and /proc/meminfo and we need more results.
I first tested this with a simple benchmark. I used sysbench to update 1 row per transaction using a cached 1M-row table. The test was run for 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 and 1024 concurrent connections using MySQL 5.1.52 with the Facebook patch.
This lists the throughput in commits/second for innodb_flush_method=all_direct and O_DIRECT. Results below are for 1 to 128 connections to avoid line-wrap. Results for all_direct are better at low concurrency (1 and 2 connections), the two are about equal at 4 connections, and O_DIRECT is faster for 8 or more concurrent connections.
    1     2     4     8    16    32    64   128   connections
 1952  2777  3530  3755  3829  3741  3760  3803   all_direct
 1608  2479  3507  4541  4550  4644  4698  4581   O_DIRECT
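To make the crossover easier to see, the throughput numbers above can be turned into a percent difference per connection count. This is a small sketch using only the values from the table; the function names are made up for illustration.

```python
connections = [1, 2, 4, 8, 16, 32, 64, 128]
all_direct = [1952, 2777, 3530, 3755, 3829, 3741, 3760, 3803]
o_direct = [1608, 2479, 3507, 4541, 4550, 4644, 4698, 4581]


def relative_change(new, base):
    """Percent difference of new relative to base."""
    return 100.0 * (new - base) / base


def compare(conns, candidate, baseline):
    """Return (connections, percent change of candidate vs baseline) pairs."""
    return [
        (c, round(relative_change(a, b), 1))
        for c, a, b in zip(conns, candidate, baseline)
    ]


results = compare(connections, all_direct, o_direct)
for conns, pct in results:
    print(f"{conns:>4} connections: all_direct is {pct:+.1f}% vs O_DIRECT")
```

A positive value means all_direct had more commits/second than O_DIRECT at that concurrency. The sign flips between 4 and 8 connections.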
But throughput on an artificial benchmark isn't the only way to determine whether the change is useful. I have also been running this on a test server that gets a mirror of the production workload. Response time for write transactions is unchanged, which is good. I also want to know whether this increases the read IO rate for the transaction log, but I need to put the log on a separate filesystem to collect that data and that is work for the future.
The final data I have is the output from vmstat -sa. This is from two test servers that get a mirror of the production workload. Each server uses two 2G transaction log files. With O_DIRECT, which applies only to the database files, the 4G of log files might be in the OS buffer cache, which can consume 4G of RAM for no good reason. The value for active memory with O_DIRECT is about 1.7G larger than for all_direct. If that difference comes from the log files no longer being cached, it is a benefit of all_direct and might allow me to make the InnoDB buffer pool larger when all_direct is used.
Output for all_direct
74178160 total memory
73788576 used memory
66792708 active memory
5973644 inactive memory
Output for O_DIRECT
74178688 total memory
73782800 used memory
68543584 active memory
4138808 inactive memory
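The "about 1.7G" figure can be checked from the two outputs above. This sketch assumes the values are in KiB, which is the default unit for vmstat -s; the variable and function names are made up for illustration.

```python
KIB_PER_GIB = 1024 * 1024

# Values copied from the vmstat -sa output for each server (assumed to be KiB).
all_direct = {
    "total memory": 74178160,
    "used memory": 73788576,
    "active memory": 66792708,
    "inactive memory": 5973644,
}
o_direct = {
    "total memory": 74178688,
    "used memory": 73782800,
    "active memory": 68543584,
    "inactive memory": 4138808,
}


def to_gib(kib):
    """Convert KiB to GiB."""
    return kib / KIB_PER_GIB


# Difference per field: O_DIRECT minus all_direct, in KiB.
delta = {name: o_direct[name] - all_direct[name] for name in all_direct}

for name, kib in delta.items():
    print(f"{name:<16} {kib:>10} KiB  {to_gib(kib):+.2f} GiB")
```

Active memory is 1750876 KiB (about 1.67 GiB) higher with O_DIRECT, while inactive memory is lower by a similar amount and used memory is nearly identical on the two servers.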