Anyone migrating to the cloud needs to do at least some basic cost calculations. Besides the obvious strategic benefits of moving to the cloud, such as business agility and increased productivity, the cold hard truth is that in the end the financial benefits will outweigh the others. And even if they don’t, dedicated and committed organisations need to know the initial and intermittent costs, and eventually need a calculation or simulation of the potential cost savings. In this blog post, we are going to put some stakes in the ground for a solid calculation of the total cost of ownership (TCO) of a migration to the cloud.
What will it cost? When will it amortise?
These are the immediate questions everybody faces when considering the move to the cloud. There are no easy answers, except that it will cost you some money, and chances are pretty good that the investment will amortise very soon. You may have heard all the fuss about cloud sprawl, taking control of your cloud, and promises of savings potential of up to 40%. These are real claims backed by evidence, but in a different problem space: they address companies already operating in the cloud, i.e. those who have completed their migration or are in its late stages. Your problem is different: you are considering whether to migrate to the cloud (you should), how to do it (in a series of paced, well-defined small steps), and how much it will cost (not much more if planned well and embedded into day-to-day DevOps).
Below, we outline the most important parts of a TCO calculation for migrating to the cloud. The outline is nowhere near complete, but it will point you in the right direction. Broadly speaking, calculating the TCO of migrating to the cloud must cover at least the following categories:
- IT assets
- Resource utilisation, consumption patterns, and IT architecture
- Personnel & skillsets
- Cost transparency and awareness.
The last aspect, cost transparency and awareness, is less a technical aspect and more a matter of company culture. It cuts across and impacts all other aspects of your TCO analysis and, more importantly, determines the overall success of your cloud migration. Typically, a TCO analysis results in a series of cost simulations that compare scenarios differing in key parameters such as expected resource usage patterns, migration budget, time constraints, and other factors that influence the resulting TCO and the budget allocated for the migration. Let’s look a bit deeper into each of these aspects.
1. IT assets
This is the most prominent and most discussed benefit of migrating to the cloud: the switch from CAPEX (capital expenditure) to OPEX (operational expenditure), which is largely driven by a fundamental shift in how IT resources are procured. Costs related to IT assets are largely independent of the other factors, as they concern only tangible, real-world hardware and facilities. These costs will change no matter the make-up of your IT workloads (see below). Cloud computing is and always will be an outsourcing endeavour, albeit one with significantly different service delivery parameters and options.
The most important aspects of an IT asset cost analysis are:
- Server Hardware
- Network Hardware
- Hardware Maintenance
- Power and Cooling
- Data Centre Space
It is important to remember that these costs do not evaporate. Rather, they are borne by your cloud provider and absorbed into the pricing of the resources you rent as part of your cloud deployment and operations. The economics of leasing IT resources from cloud providers have grown very similar to those of other consumption-based verticals, most strikingly the energy market. Historically, cloud computing exclusively used a ‘pay as you go’ (PAYG) model in which resources were leased on demand. Anyone who knows the energy market, or the mobile data connectivity market, knows that PAYG is the most expensive type of resource consumption: your maximised freedom in resource consumption comes at the price of the provisioning costs the cloud provider carries for its hardware and assets. Conversely, the further ahead cloud providers can calculate and predict resource amortisation costs, the less risk is reflected in their pricing. Say hello to reserved instances. The longer the timespan you reserve cloud resources for, the greater your savings, and the closer your cloud TCO analysis gets to a traditional TCO analysis of on-premise IT resources.
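To make the PAYG versus reserved trade-off tangible, here is a minimal Python sketch. The hourly rate and the 30% reservation discount are made-up illustration values, not real provider prices, and the model assumes that reserved capacity is billed for every hour of the year whether it is used or not.

```python
HOURS_PER_YEAR = 8760


def annual_cost_on_demand(hourly_rate: float, utilisation: float) -> float:
    """PAYG: you only pay for the fraction of the year the resource runs."""
    return hourly_rate * HOURS_PER_YEAR * utilisation


def annual_cost_reserved(hourly_rate: float, discount: float) -> float:
    """Reserved: discounted rate, but billed for all hours (assumption)."""
    return hourly_rate * (1.0 - discount) * HOURS_PER_YEAR


def break_even_utilisation(discount: float) -> float:
    """Utilisation above which reserving is cheaper than PAYG."""
    return 1.0 - discount


# Hypothetical figures for illustration only.
rate, discount = 0.10, 0.30
for utilisation in (0.5, 0.9):
    payg = annual_cost_on_demand(rate, utilisation)
    reserved = annual_cost_reserved(rate, discount)
    cheaper = "reserved" if reserved < payg else "PAYG"
    print(f"utilisation {utilisation:.0%}: PAYG {payg:.2f}, "
          f"reserved {reserved:.2f} -> {cheaper}")
```

Under these assumptions, a resource that runs less than 70% of the time is cheaper on PAYG, while anything busier is cheaper when reserved.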
Leaving the complex area of dynamic resource allocation aside for a moment (see below), the biggest value proposition of outsourcing to the cloud is its ability to scale resource provisioning at incredibly competitive prices. This is possible simply because of economies of scale (particularly for hyper-converging cloud service providers such as Amazon, Microsoft, and Google) and aggressive automation of recurring datacentre and IT resource tasks. The most profound paradigm shift in calculating the TCO, and eventually in budgeting for a cloud-based IT infrastructure, is the switch from a largely static, fixed-cost model to an iterative, dynamic cost model. It is this paradigm shift that causes most cloud service users headaches, often resulting in wasted resources and frequently in higher spending than with on-premise computing. The cost-saving potential is greatly influenced by actual resource utilisation rates, resource consumption patterns, and workload architecture. Since these are so impactful, we will look at them next.
2. Resource utilisation, consumption patterns, and IT architecture
In a classic IT asset provisioning paradigm, the dynamic between the provisioning (and budget) department and the operations and development departments can often be characterised as a self-fulfilling prophecy. IT resources are procured based on budget lines, usually in regular fiscal intervals (e.g. annually, or in 3-year intervals) that are largely controlled by fiscal and accounting rules concerning asset depreciation and the like. Budget reviews often include an analysis of resource utilisation, all too often with a view to budget ‘optimisation’ (read: cuts) if budgets are not exhausted or resources are underutilised. This disregard of technical and technological aspects in favour of purely budgetary and financial ones inevitably leads the affected departments to spend their budgets in time as a preventative measure against budget cuts. As a result, resource consumption becomes inefficient and wasteful, and the monitoring and instrumentation of actual resource consumption is skewed for understandable reasons: fear of budget cuts creates a culture of shadow resource utilisation, a systemic misrepresentation of the resources IT workloads actually use and consume. To calculate the impact on the overall TCO of your cloud migration, you need not only a cultural change in your organisation, but also an honest review of the following:
- IT asset utilisation
- IT workload resource consumption patterns
- Current and desired workload architecture
Arguably, IT asset utilisation belongs just as much with the IT asset ownership calculations (see above). In our experience, however, asset utilisation is not a function of hardware management, but a result of workload architecture, operations policies, and organisational processes. Put succinctly, the lower your resource utilisation, even in sustained workload scenarios, the larger your cost-saving potential when migrating to the cloud: on-premise, you continue to depreciate underutilised IT assets, while in a public cloud the capacity you do not use is allocated and leased to another customer. While this is a cost factor across your entire current IT asset landscape, it is greatly affected by the resource consumption models and patterns of the individual IT workloads. Unless you have gathered accurate resource consumption metrics for your existing workloads that tell you otherwise, you can assume that most applications’ usage patterns fall into one of the following three categories:
- Sustained, steady-state resource consumption
- Continuous baseload with predictable usage spikes
- Uncertain and unpredictable usage without clear patterns
Clearly these are approximations, but they are useful for a first TCO analysis. You can further refine and simulate your TCO with a broader set of consumption patterns, for example by assuming a higher base load and fewer spikes (or the converse), by incorporating more refined base and stretch targets for future resource usage predictions, and much more. For each of these consumption models, select the most appropriate match of compute instances, storage services, network, and other IaaS components available. If you have a shortlist of candidate cloud providers, you need to do this analysis for each of them and approximate the corresponding resources as closely as possible, so that the TCO results are comparable.
For each scenario, approximate the use of reserved instances (terminology may vary across providers), on-demand/PAYG instances, and storage needs. Typically, cover the two edge cases first, using exclusively reserved resources and using no reserved resources at all, to get a quick ballpark figure of the minimum and maximum cost-saving potential. Then drill deeper into more refined resource usage scenarios. However, be advised that a strategy of optimising resource consumption can lead to increased cost: discounts on the various reserved instance options often outweigh the cost of a small amount of wasted consumption or reservation. Hence, we suggest not working with overly detailed resource consumption patterns, but with a select shortlist of reasonably accurate approximations.
As a next step, review your IT workload architectures and work out how you can realise even more cost-saving potential by transitioning entire workloads (or components thereof) from a monolithic architecture, which is equivalent to VM-based cloud computing instances, to a more loosely coupled (micro)services architecture that is better suited to using managed cloud services at the PaaS or even SaaS level.
Arguably, this architectural aspect of a TCO calculation is equally applicable to the last category (cost transparency and awareness), and it is frequently deferred until after the decision to migrate to the cloud has been made, as part of a campaign to transition to a well-architected cloud workload (see the first blog post in this series).
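The scenario comparison described in this section, with the two edge cases of no reserved and exclusively reserved resources plus the mixes in between, can be sketched in a few lines of Python. Everything here is an assumption for illustration: the two 24-hour demand profiles, the rates, and the simplification that a reserved instance is billed every hour while demand above the reservation is served on demand.

```python
def scenario_cost(demand, reserved, on_demand_rate, reserved_rate):
    """Cost of one demand profile for a given number of reserved instances.

    demand: instances needed per hour (one list entry per hour).
    Reserved instances are billed every hour; demand above the
    reservation is billed at the on-demand rate.
    """
    reserved_cost = reserved * reserved_rate * len(demand)
    overflow_hours = sum(max(0, needed - reserved) for needed in demand)
    return reserved_cost + overflow_hours * on_demand_rate


def best_reservation(demand, on_demand_rate, reserved_rate):
    """Reservation size with the lowest cost for this profile."""
    candidates = range(0, max(demand) + 1)
    return min(candidates, key=lambda r: scenario_cost(
        demand, r, on_demand_rate, reserved_rate))


# Hypothetical daily profiles (instances needed per hour).
steady = [4] * 24                 # sustained, steady-state consumption
spiky = [2] * 18 + [8] * 6        # base load with a predictable spike

ON_DEMAND, RESERVED = 0.10, 0.06  # made-up hourly rates

for name, profile in (("steady", steady), ("spiky", spiky)):
    none_reserved = scenario_cost(profile, 0, ON_DEMAND, RESERVED)
    all_reserved = scenario_cost(profile, max(profile), ON_DEMAND, RESERVED)
    best = best_reservation(profile, ON_DEMAND, RESERVED)
    print(f"{name}: no reservation {none_reserved:.2f}, "
          f"fully reserved {all_reserved:.2f}, best reservation size {best}")
```

With these made-up numbers, the steady profile is cheapest when fully reserved, while the spiky profile is cheapest when only the base load is reserved and the spike is served on demand.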
3. Personnel & Skill Sets
A migration to the cloud invariably also requires analysing the cost of personnel and their continuing professional development (CPD). The extent to which personnel costs can be reduced or avoided largely depends on your IT workload: the more your workload can utilise and integrate PaaS and SaaS offerings, the more your personnel structure and skillsets will change. We find it important to note that while there is a clear and intuitive saving to be realised at the IaaS/data centre level of a cloud migration, in our experience this is not true for software engineers, architects, etc. Instead, their skillsets will have to change, and that is a cost-increasing factor, unless you offset the cost of learning the skills needed for the cloud migration against the CPD costs you no longer incur for skills and services now provided by your cloud service provider.
4. Cost transparency and awareness
Traditionally, the TCO of on-premise infrastructure and maintenance was poorly documented, if it was calculated at all. Most, if not all, hardware procurement was controlled and enforced by purchasing departments that were frequently disconnected from the consuming departments. As a result, the actual cost of the hardware was ‘lost in the books’, leaving development and operations in the dark about the true financial impact of their activities. While it is difficult to attach a price tag to individual components of a traditional IT workload, it is becoming increasingly easy to do so in a cloud computing environment. Cloud service providers continually improve their cost reporting capabilities, allowing ever more fine-grained reporting and accounting to identify ‘the culprits’. Use these capabilities to raise awareness and to democratise decisions about the structure and cost of your IT estate across all its stakeholders. Arguably, this is frequently a matter of post-migration calculation and optimisation, and it is often closely linked with ongoing IT architecture decisions about replacing components with managed services available from your cloud service provider. However, we argue that a strong and proactive cost culture that is not restrictive but fact-based (monitoring!) is key to a successful cloud migration strategy.
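As a small illustration of what fine-grained cost reporting enables, the following Python sketch attributes billing line items to the teams that caused them. The record layout, the `team` tag, and all figures are hypothetical; real billing exports differ per provider and need their own parsing.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag="team"):
    """Sum the cost of billing line items per value of the given tag.

    Items without the tag are collected under 'untagged' so that
    unattributed spend stays visible instead of disappearing.
    """
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)


# Hypothetical monthly billing export.
line_items = [
    {"service": "compute", "cost": 120.0, "tags": {"team": "web"}},
    {"service": "storage", "cost": 80.0, "tags": {"team": "web"}},
    {"service": "compute", "cost": 45.5, "tags": {"team": "analytics"}},
    {"service": "network", "cost": 30.0, "tags": {}},
]

for owner, total in sorted(cost_by_tag(line_items).items()):
    print(f"{owner}: {total:.2f}")
```

Keeping an explicit `untagged` bucket is a deliberate choice in this sketch: a growing unattributed share is itself a useful signal for the cost culture discussed above.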
At first, calculating the TCO of a cloud migration sounds all too easy, and figures are often over-optimistically guesstimated rather than simulated or calculated from existing facts. This is even more true when you factor in that a migration often induces higher costs in the intermediate phase, until the cost savings of a completed migration start to take effect. This bump in real cost is often the reason a cloud migration fails, stops, or rests in limbo, even more so if these migration costs have not been communicated as something to be expected. Considering that software engineers frequently misjudge and underestimate the effort required for a given task or project (at times the actual effort was up to 4x higher than estimated!), the real cost of a cloud migration is almost always higher than initially calculated or budgeted. As a rule of thumb, the TCO of a cloud migration, or more accurately its savings potential, can be approximated by:
Savings potential = (1) Asset CAPEX costs + (3) DC personnel cost – (2) cloud workload resource costs (OPEX) – (3) personnel upskilling cost – (4) change in business process around IT costs
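The rule of thumb above translates directly into a small Python function. The parameter names and all figures in the example call are made up for illustration. The optional `correction_factor` is the dampening factor between 0 and 1 that is applied to this figure to reflect the increased cost during the migration phase; it defaults to 1.0, which leaves the raw savings potential unchanged.

```python
def savings_potential(asset_capex, dc_personnel, cloud_opex,
                      upskilling, process_change, correction_factor=1.0):
    """Rule-of-thumb savings potential of a cloud migration.

    asset_capex, dc_personnel: costs avoided by migrating.
    cloud_opex, upskilling, process_change: costs newly incurred.
    correction_factor: dampening factor in [0, 1].
    """
    if not 0.0 <= correction_factor <= 1.0:
        raise ValueError("correction_factor must be between 0 and 1")
    raw = (asset_capex + dc_personnel
           - cloud_opex - upskilling - process_change)
    return raw * correction_factor


# Hypothetical annual figures in a currency of your choice.
figures = dict(asset_capex=500_000, dc_personnel=200_000,
               cloud_opex=350_000, upskilling=80_000,
               process_change=20_000)

print("raw savings potential:", savings_potential(**figures))
print("with correction factor 0.6:",
      savings_potential(**figures, correction_factor=0.6))
```

Running the calculation with several correction factors is a cheap way to see how sensitive the business case is to an over-optimistic estimate.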
The final figure can then be approximated by multiplying the savings potential by a ‘correction factor’ (some say it is more accurately described as a ‘reality check factor’) between 0 and 1, which acts as a dampening factor representing the increased cost during the migration phase. The biggest contributors to the TCO, or more accurately to the savings potential, are undoubtedly the current cost of IT assets and the real (optimisable) costs of a cloud workload estate. These two, together with the correction factor, are the most frequently miscalculated figures in a TCO analysis.
We have only scratched the surface of how to calculate the TCO of your infrastructure migration, and we understand that planning a migration is a full project of its own. Digital Craftsmen is skilled and experienced in supporting you in calculating your TCO, up to full turnkey TCO project delivery. We are happy to support you in reviewing:
- Cloud cost and sprawl
- Cloud estate security
- Your cloud performance or
- A full digital transformation strategy including TCO calculations.
For more information, join us at /services/cloud/professional-services/. Call the craftsmen team on 020 3745 7706 or email us on [email protected] to find out more about our services, verified by ISO27001 and Cyber Essentials Plus accreditations.