Capacity Management Twice a Year? No. Just, no.

One of my colleagues heard a cry for help from a client struggling to manage their infrastructure capacity.  One month they’re flush with resources, the next they’re dying for disk, CPU and, in some cases, network bandwidth.  This memo was nothing if not typical:

The client asked that we do an assessment once or twice a year.  The outcome of each assessment would be a document that shows what capacity is needed (presumably until the next assessment) so that no one is asking for “new disk”, “new servers/blades”, “new network bandwidth” or “new application development”.  All requests will be pulled through this planning exercise and denied otherwise, until the next review.

Simple enough to understand, but the approach is broken.  Another colleague stepped in and suggested that instead of a once- or twice-yearly “snapshot”, we…

…help them build their capacity management capability. Sort of like helping them build a fishing pole rather than giving them fish.

NOW we have some sense talking!  The short-term approach is not going to solve any problems beyond the next month’s provisioning requests.  Why not teach the client how to execute capacity management effectively, giving them some long-lasting capabilities to improve their lot in life? Well…

…because of this response from someone closest to the client:

They want a health check.  They want fish.  And they are sensitive about getting proposals that do not reflect exactly what they ask for.  They may explore the “teach us to fish” approach in the future, but right now they want an independent audit of their infrastructure’s performance and upcoming needs.

Sometimes, you cannot help those who won’t be helped.

Capacity Management is not like the spring and fall cleaning you do twice a year, after which you pat yourself on the back and retire to the comfort of your couch, safe in the knowledge that the dust bunnies are under control.  Capacity management is an ongoing operation, a process like managing your car’s gas tank.  You intuitively check the tank before you head out for the beach or to work.  Sometimes you have to be more careful (will I get stuck in traffic for 2 hours?) and sometimes you can take calculated risks (I am on “E” but if I fill up now, I will miss the meeting; I can do it tonight and I’ll just do a lot of coasting downhill).  Other situations call for a longer-term view (I’m leaving for the mountains in a week but the gas stations will be closed for a few days; better fill up now).  Every single time you move to consume some of that gas you check the needle, unconsciously or otherwise.  THAT is Capacity Management.
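For the code-minded, here is a rough sketch of what "checking the needle" before every trip might look like for a single IT resource.  Everything in it is made up for illustration: the Needle class, the needs_refill helper, the disk numbers and the 45-day procurement lead time are all assumptions, and real consumption rarely grows in a straight line.

```python
from dataclasses import dataclass


@dataclass
class Needle:
    """One gauge to watch (hypothetical model, not a real tool's API)."""
    name: str
    capacity: float      # total units available
    used: float          # units consumed so far
    daily_growth: float  # extra units consumed per day (assumed linear)

    def days_until_empty(self) -> float:
        """How long until the tank runs dry at the current burn rate."""
        if self.daily_growth <= 0:
            return float("inf")
        remaining = max(self.capacity - self.used, 0.0)
        return remaining / self.daily_growth


def needs_refill(needle: Needle, request: float, lead_time_days: float) -> bool:
    """True if granting the request leaves less runway than it takes to buy more."""
    after = Needle(needle.name, needle.capacity,
                   needle.used + request, needle.daily_growth)
    return after.days_until_empty() < lead_time_days


# Illustrative numbers only: 100 TB of disk, 70 TB used, growing 0.5 TB a day.
disk = Needle("disk_tb", capacity=100.0, used=70.0, daily_growth=0.5)
print(disk.days_until_empty())                              # 60.0
print(needs_refill(disk, request=10.0, lead_time_days=45))  # True
print(needs_refill(disk, request=5.0, lead_time_days=45))   # False
```

The point of the sketch is the timing, not the arithmetic: the check runs on every request, not twice a year.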

There’s nothing mysterious about IT Capacity Management in this regard; there are just more needles to manage, which means you can’t “check the tank” as easily.  There are also more variables, like unexpected projects, so you need some way to gain visibility into those variables.  We won’t be successful if we bash our prospective customer and we probably won’t be able to change their minds, either.  So let’s try to understand WHY they think this way.

The why is complicated.  Hiring someone to straighten out Capacity Management can be expensive.  The executive sees this as a simple problem: show her how much you have and how much you need, buy that and call her next year at budget time.  The business doesn’t see it that way.  The business makes plans but then takes a left turn because a new opportunity cannot be ignored.  Finance tends to assign accounting for IT capacity to individual projects, applications and/or business units, which are famous for not sharing.  Every asset, application, project and user contributes to Capacity and its consumption.  At the micro level, these are simple problems and the gas tank probably is checked in near real time.

At a macro level, all of these variables depend on one another.  Our simple problem just became complex.  Solving it requires time and some consensus building, two things in short supply wherever you look.  Without some structure, the questions (what shall we measure? what are the thresholds? how do we know what constitutes “empty”? what do we fill the tank with?) all combine in a mishmash that ends with our executive, above, saying:

TL;DR, just fix it by looking at it as a point-in-time problem

The failure is ours.  We have not explained how getting Capacity Management under control can save time and money, reduce risk and simplify IT operations.  More to the point: we have not done so in language that the executive understands.  We need to communicate in ways that allow her to see the call to action and understand the benefit of doing this right.

When you are bone dry, stranded on I-5 and calling for AAA, the cause and effect are obvious.  You have plenty of time to reflect on your stupidity while you wait for a tow or a gallon of gas.  It is not so clear when random outages and once- or twice-a-year scrambles for procurement are common.  Resolving the effect does not allow time for identifying the cause.  We in IT need to do a better job explaining that.
