Friday, November 7, 2008

PHP Profiling

For the past couple of days I've been troubleshooting performance problems with a Mediawiki installation on a server we maintain. It's run me through a lot of hoops trying to pinpoint the performance bottleneck. I didn't see anything obvious in the Apache or MySQL logs (although I did learn about the mysqldumpslow command, which makes digging through the slow query logs a lot easier).

Next I went to the Xdebug PHP extension. I've used it for debugging PHP problems before and I knew that it had a profiling component as well. Installation is a breeze if you have pecl installed:

pecl install xdebug

Edit php.ini and put in the following to enable profiling:

zend_extension=/usr/lib/php/modules/xdebug.so
[xdebug]
xdebug.default_enable=Off
xdebug.profiler_enable=On

I disabled the default stack traces since it was a production environment. I only left the profiler enabled for enough time for me to run my tests, since it generates a lot of information for each process. Of course, I guess I could have just enabled it for the Mediawiki directory if that had been a big issue, but it was a pretty low-traffic time, so I only had 15 files to sort through from my testing period. I copied over the /tmp/cachegrind.out.* files to my desktop and fired up KCachegrind to analyze the output.

It was much more enjoyable to use the color-coded GUI than looking through a text file. After finding the correct output file, I traced the bottleneck down to the MessageCache->lock() method. This Mediawiki installation uses memcached to cache db objects and it was just spinning there for insane amounts of time trying to lock objects in memcached.

Next I went to the Mediawiki debug log. As a side note, Mediawiki has its own profiling setup that you can use as well, although I prefer the GUI that Xdebug uses. I saw messages like this:

MessageCache::load(): cache is empty

Then 10 seconds or so later:

MessageCache::load(): unable to load cache, disabled

After a couple of false starts, including playing around with the APC cache as a replacement, it turns out that the memcached process Mediawiki was trying to use didn't exist. Apparently the startup commands had been put in /etc/rc.local and hadn't included the full path, so after the last reboot of the system, memcached hadn't restarted. A proper init script and a couple of chkconfig commands later, all was well.

No comments: