The Always Imperfect TCO

posted Jul 30, 2017, 11:18 AM by Marc Kagan   [ updated Jul 30, 2017, 11:20 AM ]

Total Cost of Ownership (TCO) analysis has been a favorite tool of IT departments and especially vendors for many years.  The concept is deceptively simple at first glance.  Wikipedia defines TCO as:

Total cost of ownership (TCO) is a financial estimate intended to help buyers and owners determine the direct and indirect costs of a product or system. It is a management accounting concept that can be used in full cost accounting or even ecological economics where it includes social costs.

For manufacturing, as TCO is typically compared with doing business overseas, it goes beyond the initial manufacturing cycle time and cost to make parts. TCO includes a variety of cost of doing business items, for example, ship and re-ship, and opportunity costs, while it also considers incentives developed for an alternative approach. Incentives and other variables include tax credits, common language, expedited delivery, and customer-oriented supplier visits.

Many organizations find themselves attracted to the idea of Cloud and look for solid financial justification to support the business case for their journey.  At first glance the TCO analysis seems like a perfect fit: run the TCO on your existing data center, compare it to your projected Cloud deployment, and look at all the money saved…  For a mature IT organization with existing data center(s) and workgroup (office closet) solutions, the analysis is often not so straightforward.

As in most things, the devil is in the details.  The gotcha in the definition above is the word “indirect.”  Typically, it is fairly easy to collect the direct costs (Physical Servers, Virtual Servers, Storage, Networking, Labor) for your existing data center(s) but the indirect costs are often elusive.  As an example, items like the cost of racking and stacking servers can easily be overlooked.  These activities may be a direct cost if you have a vendor handle them, or they may be indirect as part of your general labor.  Do you know precisely what it costs you to rack and stack 1 server, 1 switch, 10 servers, a Firewall?

Other possible indirect costs may be even more challenging because they are open to interpretation.  As mentioned in my previous post, Minding your Technical Debt, how do you account for the cost of fully depreciated hardware?  If that hardware is out of support, do you attribute a cost to the risk associated with not having a support contract?  What is the cost of running an unsupported OS or Application that no longer receives security patches?  These scenarios certainly have a cost when it comes time for audits, or worse, when you have the misfortune of your systems being penetrated due to unpatched vulnerabilities.

Additionally, if you are building a business case for one workload, and not the entire data center, the TCO analysis can be even more complicated.  How do you parse and allocate the costs attributed to only the workload in question, and more importantly, how do you calculate the costs associated with shared services (Active Directory, Networking, SAN Storage, Virtualization Host(s) - especially if intentionally over-subscribed)?
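One common way to approach the shared services question is to pick a usage driver for each service (accounts, ports, terabytes, vCPUs) and allocate the annual cost in proportion to the workload's share.  The sketch below illustrates that idea only.  Every figure, service name, and share in it is a hypothetical placeholder, and the function name is my own invention, so substitute your own numbers and drivers.

```python
def allocate_shared_costs(shared_costs, workload_share):
    """Return the portion of each shared service cost attributed to one workload.

    shared_costs:   annual cost per shared service (hypothetical figures)
    workload_share: fraction (0 to 1) of each service the workload consumes
    """
    allocation = {}
    for service, annual_cost in shared_costs.items():
        # A service with no stated share is treated as unused by the workload.
        share = workload_share.get(service, 0.0)
        if not 0.0 <= share <= 1.0:
            raise ValueError(f"share for {service!r} must be between 0 and 1")
        allocation[service] = annual_cost * share
    return allocation


# Hypothetical annual costs of shared services, in dollars.
shared_costs = {
    "active_directory": 12_000.0,
    "networking": 30_000.0,
    "san_storage": 50_000.0,
    "virtualization_hosts": 40_000.0,
}

# Hypothetical usage shares for the workload in question.
workload_share = {
    "active_directory": 0.05,      # assumed: share of directory accounts
    "networking": 0.10,            # assumed: share of switch ports
    "san_storage": 0.20,           # assumed: share of provisioned capacity
    "virtualization_hosts": 0.25,  # assumed: share of physical host capacity
}

allocated = allocate_shared_costs(shared_costs, workload_share)
total_allocated = sum(allocated.values())

for service, cost in allocated.items():
    print(f"{service}: ${cost:,.2f}")
print(f"total: ${total_allocated:,.2f}")
```

For over-subscribed virtualization hosts, the choice of driver matters: allocating by physical capacity and allocating by assigned vCPUs can give very different answers, so state which one you used.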

If you don’t accurately capture all costs (direct and indirect), the TCO analysis may not fully support your business case, or it may be easily picked apart.  The reality is that it’s nearly impossible to have a perfect TCO calculation.  It is easier to calculate TCO for Cloud workloads due to billing granularity, but it can still pose challenges.  When you are comparing TCO between non-Cloud and Cloud workloads, you may find yourself comparing apples and watermelons if you aren’t careful.  Cloud costs, while granular, are often fully loaded in that they carry the costs for physical security, climate control, redundant power and many more services already embedded in the hourly rate.  If you compare the cost of an EC2 host with the cost of your fully depreciated server (zero cost) in your data center, and you don’t account for physical security, climate control, redundant power, etc., your comparison and any conclusions drawn from it will be deeply flawed.
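To make the "zero cost" server point concrete, here is a minimal sketch that spreads annual facility and support costs across servers and hours to get a fully loaded hourly figure.  All of the dollar amounts, the server count, and the Cloud rate are made-up placeholders chosen to keep the arithmetic easy to follow; they are not real prices.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def fully_loaded_hourly_cost(annual_costs, server_count):
    """Spread total annual costs evenly across servers and hours of the year."""
    if server_count <= 0:
        raise ValueError("server_count must be positive")
    return sum(annual_costs.values()) / server_count / HOURS_PER_YEAR


# Hypothetical annual data center costs, in dollars.
on_prem_annual_costs = {
    "hardware_depreciation": 0.0,   # fully depreciated, so "free" on paper
    "physical_security": 20_000.0,
    "climate_control": 15_000.0,
    "redundant_power": 25_000.0,
    "support_and_labor": 27_600.0,
}

on_prem_hourly = fully_loaded_hourly_cost(on_prem_annual_costs, server_count=100)
cloud_hourly = 0.10  # hypothetical hourly rate, not an actual Cloud price

print(f"on-prem fully loaded: ${on_prem_hourly:.4f} per server-hour")
print(f"cloud hourly rate:    ${cloud_hourly:.4f} per instance-hour")
```

Even with hardware at zero, the per-hour figure is not zero once the facility costs are included, which is the number that belongs next to the Cloud hourly rate.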

So where does this leave us?  How do we create a meaningful business case for a Cloud Journey?  Despite all the challenges mentioned above, I believe it is possible to complete a meaningful TCO comparison; you just need to be careful and methodical.  Make sure you are comparing apples to apples.  If necessary, employ the help of professionals who have experience in this area, or even engage directly with your Cloud vendor for guidance.  If you aren’t sure where to start, AWS has a good Excel model for creating a TCO analysis, and for larger engagements they have a whole cloud economics team.

Most importantly, I believe you MUST place a dollar value on items that your organization may not generally quantify in economic terms.  If your Cloud journey is going to result in a substantial improvement to your security posture, you need to calculate a dollar value for that and include it in the comparison.  It could be as simple as estimating the cost of a lost customer, downtime, a security breach, etc.  If your Cloud journey will shrink the time it takes you to deploy applications, calculate a dollar value for the savings.  When you include these items in your analysis, make them easy to change (for example, a single cell in a spreadsheet) in case others challenge your assumptions.  The most important part is getting others to acknowledge there is a dollar value for these benefits.
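The same "easy to change" idea works outside a spreadsheet too.  As a rough sketch, each assumption below is a named value that can be overridden on its own, so a challenged number changes in one place and the total recalculates.  The field names, the benefit formula, and every default figure are hypothetical illustrations, not recommended values.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Assumptions:
    """Each field plays the role of one editable spreadsheet cell (all hypothetical)."""
    breach_cost: float = 500_000.0              # assumed cost of one security breach
    breach_probability_reduction: float = 0.02  # assumed drop in annual breach likelihood
    downtime_cost_per_hour: float = 5_000.0     # assumed cost of one hour of downtime
    downtime_hours_avoided: float = 10.0        # assumed hours of downtime avoided per year
    deployments_per_year: int = 50              # assumed number of deployments per year
    hours_saved_per_deployment: float = 8.0     # assumed labor hours saved per deployment
    labor_rate_per_hour: float = 75.0           # assumed loaded labor rate


def annual_benefit(a):
    """Sum the dollar value of security, availability, and agility benefits."""
    security = a.breach_cost * a.breach_probability_reduction
    availability = a.downtime_cost_per_hour * a.downtime_hours_avoided
    agility = a.deployments_per_year * a.hours_saved_per_deployment * a.labor_rate_per_hour
    return security + availability + agility


baseline = Assumptions()
# A reviewer doubts the security figure: change that one value and recalculate.
skeptical = replace(baseline, breach_probability_reduction=0.01)

print(f"baseline:  ${annual_benefit(baseline):,.2f}")
print(f"skeptical: ${annual_benefit(skeptical):,.2f}")
```

Keeping the assumptions separate from the formula also makes the conversation easier: people can argue about an input without disputing that the benefit has a dollar value at all.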

Ultimately, there are many good reasons for organizations to consider a Cloud journey.  The purely financial argument supported by a TCO analysis is only one dimension.  Other areas like agility may be as beneficial, or more beneficial, to your organization than direct cost savings.  When creating a business case, I encourage you to have a multi-dimensional story and to be careful not to rely exclusively on the always imperfect TCO analysis.

As always – please provide feedback or ask questions.
