Friday, January 9, 2009

ApacheCon 2008: Java Monitoring and Troubleshooting

I'm finally getting around to watching the presentations from ApacheCon 2008. I plan to blog about them and highlight some of the best things about each session. Access to the presentations costs $99, but it is more than worth it if you use Apache, Tomcat, Java or any Apache project in a production environment.

The first one that I watched was Java Monitoring and Troubleshooting, since I've spent a lot of time in the past year chasing down bugs in Java apps. Almost every time, I've thought "there's got to be an easier way to do this." Now, thanks to Bill Au's presentation, I know some of the easier ways. The materials from his presentation are available here.

The entire presentation is great, but here are some of the highlights for me:

* I've used thread dumps generated by sending "kill -3" to Java processes before, but although they contain useful information, I usually spend a lot of time digging for it. He highlights a couple of tools that he uses to analyze thread dumps more efficiently. His methodology is to take three consecutive thread dumps, 5 seconds apart, and then analyze the three files with the samurai.jar tool in conjunction with his own script, overview.pl. The overview.pl script gives a good big-picture look at how many threads are in each state (runnable, locked, etc.). The samurai tool takes multiple thread dumps and shows how each thread changes state from dump to dump, which gives a historical perspective. His techniques are especially useful if you're looking for deadlocks or hung threads. Both of these tools are available along with the presentation material.

* Another highlight was his demonstration of how to compare the performance of different garbage collection methods for your application. He starts by getting a log of garbage collection activity for your application over a reasonably long period (an hour or so) when it is under at least moderate use. To do this, use the -Xloggc:<file> option on your java command line. Get a log file for each GC method that you want to compare. Then download the HPjmeter tool released by Hewlett Packard. You can open the logs in HPjmeter and get all sorts of statistics, including how much overhead garbage collection adds, the number of full garbage collection events, and the average time for each garbage collection. And to make things even better, you can open multiple log files from different GC methods and graph them against each other to get an easy-to-read picture of the pros and cons of each.

* Another neat thing was his demo of jhat, the heap analysis tool released by Sun to view the objects in the heap. It ships with Java 6, but can also process heap dumps from 1.4 and 1.5. Once started, it provides a web interface to browse the heap and view objects. The interface makes it really easy to track down objects that are taking up more than their fair share of space, but just in case you need more flexibility, there's Object Query Language (OQL). As the name suggests, it's a SQL-like interface to the heap. So you can issue statements like this one, which finds all finalizable objects and the heap size used by each:

select { obj: f.referent, size: sum(map(reachables(f.referent), "sizeof(it)")) }
from java.lang.ref.Finalizer f
where f.referent != null
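
Going back to the thread dump highlight above: to give a feel for the kind of per-state overview described there, here is a minimal Java sketch. It is not Bill Au's overview.pl, and the class and method names are made up for illustration. It assumes Java 8 or later and a HotSpot-style dump in which each thread's stack is preceded by a "java.lang.Thread.State: ..." line (Java 6 dumps have these; older ones may not).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the presentation materials.
final class ThreadStateOverview {
    // Assumed line shape: "   java.lang.Thread.State: TIMED_WAITING (sleeping)"
    private static final Pattern STATE_LINE =
            Pattern.compile("^\\s*java\\.lang\\.Thread\\.State: (\\w+)");

    /** Counts how many threads in one dump are in each state. */
    public static Map<String, Integer> countStates(String dumpText) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : dumpText.split("\\R")) {
            Matcher m = STATE_LINE.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Prints one summary line per dump file named on the command line. */
    public static void main(String[] args) throws IOException {
        for (String file : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(file));
            String text = new String(bytes, StandardCharsets.UTF_8);
            System.out.println(file + ": " + countStates(text));
        }
    }
}
```

Running it against the three dump files in order prints one line of counts per file, so a growing BLOCKED or WAITING count between dumps stands out quickly.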

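For the garbage collection comparison above, a couple of the statistics mentioned (number of full collections, average time per collection) can be roughed out by hand. The sketch below is not how HPjmeter works; it is a small illustration that assumes Java 8 or later to run, and the simple log line shape that older HotSpot JVMs write with -Xloggc, such as "3.581: [GC 16768K->2283K(63232K), 0.0102870 secs]". Newer JVMs use different log formats, and the class name is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class GcLogSummary {
    // Assumed line shapes:
    //   "3.581: [GC 16768K->2283K(63232K), 0.0102870 secs]"
    //   "9.120: [Full GC 20000K->1500K(63232K), 0.0931200 secs]"
    private static final Pattern GC_LINE =
            Pattern.compile("\\[(Full GC|GC)\\b.*, (\\d+\\.\\d+) secs\\]\\s*$");

    int minorCollections;
    int fullCollections;
    double totalPauseSecs;

    /** Tallies collection counts and total pause time from the log text. */
    public static GcLogSummary parse(String logText) {
        GcLogSummary summary = new GcLogSummary();
        for (String line : logText.split("\\R")) {
            Matcher m = GC_LINE.matcher(line);
            if (!m.find()) {
                continue; // skip lines that do not match the assumed shape
            }
            if (m.group(1).equals("Full GC")) {
                summary.fullCollections++;
            } else {
                summary.minorCollections++;
            }
            summary.totalPauseSecs += Double.parseDouble(m.group(2));
        }
        return summary;
    }

    /** Average pause across all collections, or 0 if none were seen. */
    public double averagePauseSecs() {
        int total = minorCollections + fullCollections;
        return total == 0 ? 0.0 : totalPauseSecs / total;
    }
}
```

Parsing one log per GC method and printing the three numbers side by side gives a crude version of the comparison; the graphs in HPjmeter are still the easier way to see the trade-offs.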

Neat stuff! Again, I encourage you to view the entire thing or at least read all of the presentation slides, because all of the information is really good.
