Thursday, 21 February, 2002

Priority Queue

A couple of weeks ago (see February 8), I mentioned that we were doing some multi-user debugging on the latest version of our software.  The project started when I was doing some stress testing for a potential client—they wanted to know how many transactions we could handle per second.  I tried Microsoft's Web application stress tool to gather the metrics, but it is too generalized to be of much use.  Instead, I wrote a program that would simulate multiple users hitting our site and taking surveys.  The first version of the program just hit the server as hard and as fast as it could, with as many threads as possible.  That was good for maximum throughput testing, but didn't simulate "typical" use, which is what we really wanted.  I'll be the first to admit that we should have done this long ago.  Ideally, we should have developed the test application in parallel with the product.  But we're all on limited budgets these days, and getting something out is often more important than delivering the highest quality software.  (Much to my dismay, I might add, but that's aseparate rant.)

Since the development team is busy building the system and it's my client who wanted the performance metrics, it fell to me to implement a testing tool.  Not that I mind—I get few opportunities these days to do real programming work.  I've spent the better part of two weeks on the project, and have assisted the development team in tracing a number of potentially embarrassing bugs, and some serious performance bottlenecks.

Along the way (and here's the point of today's ramblings), I had to implement a priority queue in which to store events.  My first pass just implemented it as a sorted list with a sequential search to locate the insertion point for new items.  That's all well and good for a small queue, but becomes terribly inefficient when the queue size grows beyond a couple hundred items.  Inserting an item into a sorted list takes on the order of N operations.  Removing an item takes a similar amount of time.  When the average queue size is 100 items, that's not much of a problem.  When N reaches 10,000, though, you start to notice the difference.  A binary heap implementation of priority queues takes an average of log2 N operations to insert or delete an item.  In a priority queue containing 10,000 items, inserting into a sorted list will take 10,000 operations (finding the insertion spot and then inserting the item).  A binary heap implementation will take about 14 operations.

Not having written a priority queue for at least a decade, I was at a loss when it came to implement that improvement.  Fortunately, you can find anything on the Internet.  A good place to start looking for priority queue implementations is this page, which has links to many different implementations and a good primer on priority queues in general.  I based my Delphi implementation on the ANSI C Reference Implementation found here.  That page also has a good description of the data structure and a good explanation of the implementation.