I handed over my $1 bill to pay for my remarkably cheap fast food coffee (I had too much blood in my caffeine stream, judge me later). In response, I got a receipt and an hourglass timer. My first reaction was, “WTF is this?” My second reaction was to ask aloud, “What is this?” (thankfully I had enough caffeine to filter my language that early in the morning).

It turns out that, at least in my neighborhood, McDonald’s has instituted a “60 seconds or it’s free” policy. If the hourglass empties between the time the first cashier takes my money and the time I receive my coffee (or whatever I ordered), then apparently I get a free sandwich.

This probably seems like a good idea at first, but not to me. I’ve had far too much experience with queues (both the human and the computer kind) to think this is going to be anything other than a total train wreck. I ease up to the food window and, lo and behold, my gimmicky little plastic hourglass is empty.

I exchange the empty hourglass for my coffee, and then the cashier goes about her business. I explain that my hourglass was empty, and it takes them 30 seconds to decide what to do with me. After figuring out that my presence has basically caused an exception to normal processing, they tell me to pull forward. I then wait four full minutes to receive my little paper business card telling me that I am owed a free sandwich.

Those of you who have dealt with optimizing back-end systems for throughput, especially systems with message passing and queueing, will immediately recognize huge problems with the design of this promotion. First and foremost, they have enforced a maximum latency requirement of 60 seconds (plus the amount of time it took me to get to the first cashier) on my order. If the customer-perceived latency (remember, the internal clock started ticking before the customer’s hourglass did) exceeds 60 seconds, then McDonald’s loses between 40 cents and a buck or so, depending on which sandwich the customer claims on their next visit.

Unfortunately, that’s not the real problem, and this is where most people horribly misunderstand queueing theory, both in the real world and in pipeline processing on computers. The real problem is that exceptions in the processing pipeline have a direct impact on the processing of every item behind them in the backlog. In other words, the fact that it took them 30 seconds to figure out what to do with me likely emptied the hourglass of the customer behind me, and the customer behind them is in a good position to get free goodies as well.
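
To make that knock-on effect concrete, here’s a minimal sketch of a single-worker FIFO queue in Python. The numbers are entirely made up (a 60-second budget, 40 seconds of normal handling, a 30-second exception detour), not McDonald’s actual timings, but they show how one slow item blows the budget for every order queued behind it.

```python
# A toy simulation (all numbers invented) of head-of-line blocking:
# one slow "exception" order in a FIFO delays every order behind it.
from collections import deque

SLA_SECONDS = 60        # the hourglass budget
NORMAL_SERVICE = 40     # assumed normal handling time per order
EXCEPTION_PENALTY = 30  # assumed extra time to sort out one odd order

# (arrival time at window 1, needs exception handling?)
orders = deque([(0, False), (40, True), (80, False), (120, False)])

clock = 0
while orders:
    arrival, is_exception = orders.popleft()
    start = max(clock, arrival)  # single worker, strict FIFO: no overtaking
    service = NORMAL_SERVICE + (EXCEPTION_PENALTY if is_exception else 0)
    clock = start + service
    latency = clock - arrival
    verdict = "FREE SANDWICH" if latency > SLA_SECONDS else "ok"
    print(f"order arriving at t={arrival:>3}s  latency={latency}s -> {verdict}")
```

Run it and you’ll see that only the first order makes the cut: the single 30-second exception keeps every later order over budget, because the worker never gets a chance to catch back up.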

Next, they decide to push the queue exceptions to an overflow queue (the “please pull ahead and wait for us at the curb” routine). This is another spot where lots of people screw up queue implementations. In the case of McDonald’s, the size of that overflow queue is 1, possibly 2 if you’re at a big place. Otherwise, you get dumped into the parking lot, which also has a finite number of spaces.

In server software, if the overflow queue overflows, the best-case scenario is that all work grinds to a halt and the main feeder queue builds and builds and builds, all waiting for a single item to be processed at the head of the queue. In the real world, you end up with a pile of angry customers waiting in line.
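
Here’s an equally minimal sketch of that failure mode, assuming a bounded overflow queue of size 1 as a stand-in for the single “pull ahead” spot (the item names are invented too): shunting cloggers aside works right up until the overflow queue itself fills, and then the main queue stalls behind the very item you were trying to route around.

```python
# A toy sketch of a bounded overflow queue: maxsize=1 is the lone
# "pull ahead and wait at the curb" spot.
import queue

main_q = queue.Queue()
overflow_q = queue.Queue(maxsize=1)

for item in ["normal-1", "SLOW-A", "normal-2", "SLOW-B", "normal-3"]:
    main_q.put(item)

while not main_q.empty():
    item = main_q.get()
    if item.startswith("SLOW"):
        try:
            # Shunt the clogger aside so the main queue keeps draining.
            overflow_q.put_nowait(item)
            print(f"{item}: moved to the overflow queue")
        except queue.Full:
            # Overflow queue is full: the main pipeline stalls right here,
            # with everything behind this item stuck waiting.
            print(f"{item}: overflow full, main queue blocked with "
                  f"{main_q.qsize()} item(s) stuck behind it")
            break
    else:
        print(f"{item}: processed normally")
```

At that point your only honest options are backpressure (stop pulling from the main queue) or a bigger overflow, and the parking lot is there to remind you that the overflow is never infinite.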

Now, someone at McDonald’s was thinking, because the “free stuff” timer starts at window 1 and ends at window 2 when you get your food. This means that, at any time, regardless of the size of the queue, there are at most 2 people (given the average distance between drive-thru windows) with running hourglass timers who could potentially take advantage of the free sandwich offer. So no matter how crappy that line gets, McDonald’s financial exposure will be limited to a small subset of the people in the queue. Even more interesting is that, statistically, every time someone uses a “free thing” coupon, they spend more than they otherwise would have. So even if McDonald’s gives out a metric crap-ton of free sandwich coupons, they will likely come out ahead anyway, dollar-wise.

But that does nothing for the customers stuck in the queue, most of whom are stuck in line precisely because this giveaway is increasing queue processing time, and whose delay will never be rewarded, because the food prep actually gets ahead when the drive-thru line backs up and slows down.

So, now that you’ve actually made it through the entire blog post, reading the verbal equivalent of me bloviating about something that seems to have no relevance to computer programming… what was the point?

For developers building backend processing pipelines:

  • Optimize your exception processing plan. When everything is working properly, you might be able to process a hojillion messages per second, but when a single item clogs up your queue and tanks your throughput, that’s when managers come storming into the office with flamethrowers.
  • Spend time planning for poison pills and queue cloggers. Sometimes, processing a particular item can crash the processor. You should be able to quickly recover from this, identify the poison pill, and isolate it so you don’t process it again (at least not in your main pipeline); there’s a minimal sketch of that pattern right after this list. Similarly, if throughput is crucial (and it usually is), then if an item is going to take an inordinately long time to process, put it in another queue to allow the main one to drain … and remember that if your overflow queue overflows, you’re just as screwed as you were when you started.
  • Perceived performance is just as important as clocked performance. McDonald’s is doing a little sleight of hand with an hourglass gimmick to get you to focus on the smallest time interval in the entire process, even though the food preparation started much earlier. If you can’t get your single pipeline to go any faster, maybe splitting into multiple parallel pipelines will offer the customer (or client, client app, etc.) better perceived performance.
  • Identify and mitigate choke points. If you’re doing throughput and latency analysis on your processing pipeline (you are, aren’t you?), then you should be able to easily identify the point at which the pipeline takes the longest. This is an ideal spot to fan out and perform that task in parallel, use map/reduce, or potentially decide to fork the pipeline; see the fan-out sketch after this list. For McDonald’s, the slowest activity is actually taking your order, so many places fork that pipeline and take 2 orders in parallel.
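
Here’s the poison-pill sketch promised above. The message shapes, the retry limit, and the dead-letter queue are my own stand-ins rather than anything from a particular framework, but the pattern is the standard one: retry a failing item a bounded number of times, then quarantine it so the rest of the queue keeps draining.

```python
# A toy poison-pill handler: retry a failing message a few times, then
# quarantine it in a dead-letter queue so the main queue keeps moving.
import queue

MAX_ATTEMPTS = 3
main_q = queue.Queue()
dead_letter_q = queue.Queue()  # quarantine for items we give up on

def handle(msg: dict) -> None:
    """Pretend processor that chokes on a malformed message."""
    if "payload" not in msg:
        raise ValueError("malformed message")
    print(f"processed message {msg['id']}")

for msg in [{"id": 1, "payload": "ok"}, {"id": 2}, {"id": 3, "payload": "ok"}]:
    main_q.put({"msg": msg, "attempts": 0})

while not main_q.empty():
    work = main_q.get()
    try:
        handle(work["msg"])
    except Exception as exc:
        work["attempts"] += 1
        if work["attempts"] >= MAX_ATTEMPTS:
            dead_letter_q.put(work)  # poison pill: stop retrying, isolate it
            print(f"quarantined message {work['msg']['id']}: {exc}")
        else:
            main_q.put(work)  # re-queue at the back; everything else flows

print(f"dead letters: {dead_letter_q.qsize()}")
```

In the drive-thru version, the dead-letter queue is the curb and the little paper coupon is the compensating transaction.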

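And here’s the fan-out sketch for the choke-point bullet. The stages and timings are invented (order-taking as the slow stage, food prep as the fast one), but they show how parallelizing only the slowest stage improves both throughput and the latency the customer actually perceives.

```python
# A toy pipeline where the slow stage (taking the order) is fanned out
# across several workers, while the fast stage stays serial.
import time
from concurrent.futures import ThreadPoolExecutor

def take_order(order_id: int) -> int:
    time.sleep(0.5)  # pretend this is the choke point
    return order_id

def prep_food(order_id: int) -> str:
    time.sleep(0.1)  # the quick part
    return f"order {order_id} ready"

orders = list(range(8))

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # Four "order takers" in parallel; results come back in input order.
    for order_id in pool.map(take_order, orders):
        print(prep_food(order_id))
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.1f}s (a single fully serial lane would take "
      f"about {len(orders) * 0.6:.1f}s)")
```

In drive-thru terms, that’s the second order-taking lane: no single order gets faster, you just stop the slow stage from serializing everything behind it.
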
Congratulations if you made it all the way to the end. Now you can see how even stupid things like waiting for my damn coffee can get me fired up about queueing theory and optimizing back-end server processing pipelines.