In which cloud operating model is a customer responsible for maintaining the operating system?

The Cloud Operating Model is a high-level representation of how an organization will deliver on its Cloud Strategy. It is a blueprint for organizing effectively to deliver the capabilities and outcomes required to deliver value through cloud services. Once the Cloud Strategy has been defined, ratified, and communicated, the Cloud Operating Model defines the operational processes required for an organization to execute on the strategy.

Cloud services (either created or brokered) cannot be supported effectively by traditional Enterprise IT organizations. The siloed approach to operations – network, operating system, storage, middleware, and security teams – is less effective when supporting cross-functional, cross-domain services. These specialized teams remain critical to providing the subject matter expertise in their respective domains, but a dedicated set of resources, processes, and technologies is required for the effective creation and management of cloud services.

The Cloud Strategy

In my previous post “The Importance of a Cloud Strategy” I discussed why a good cloud strategy is imperative for a successful IT transformation initiative. The Cloud Strategy aligns business outcomes with technical outcomes and defines a high-level framework for governance and management of resources in a multi-cloud world. Where the Cloud Strategy defined the “what” and “why” of cloud service adoption, the Cloud Operating Model defines the “how” and “who” for the ongoing governance, management and operationalization of cloud services.

The Cloud Platform

“Cloud Platform” describes the technology platform that supports the delivery and operations of IT services that adhere to the principles of cloud computing (self-service, shared resource pools, metered chargeback, etc). The Cloud Platform is a critical component of the Cloud Operating Model. It supports the creation and management of cloud services and includes the technical capabilities required to adopt some of the operational best practices used by the hyper-scale cloud providers to increase operational efficiency in enterprise organizations, such as self-driving operations and programmable provisioning.

The Cloud Platform provides the technical capabilities needed to support the following types of cloud services:

  • Private Cloud Services – Built internally, typically using on-prem infrastructure, on the cloud platform.
  • Brokered Services – Public cloud services brokered and managed via the cloud platform.
  • Public Cloud Services – Services built and maintained outside of the cloud platform with the potential for the platform to discover and bring under management.

The VMware Cloud Management Portfolio provides all of the capabilities needed for a comprehensive cloud platform.

An Operating Model by any other name…

Like everything related to “cloud” in the industry, there is no agreed-upon definition for “Cloud Operating Model”. Similar concepts are described as “Cloud Adoption Framework”, “Cloud Operating Architecture”, “Service-Oriented Model” or any number of other combinations. I define the “Cloud Operating Model” as the operational processes required to maximize the benefit from the adoption or creation of cloud services. In other words, which operating concepts will the organization adopt in order to operate more like a cloud service provider: to rapidly scale services and resources, onboard or create new services efficiently, and provide a secure, trusted cloud platform for internal and external consumers?

Like all operating models, the Cloud Operating Model consists of the people, processes, and technology required to execute successfully.

People

Despite the relentless focus on automation to increase operational efficiency, people are still the most critical component of the operating model. What changes are needed to the organizational structure to operate effectively? How will we organize to prioritize communication and collaboration? Are there new roles to be defined? Do we need to recruit new skillsets, or can we train existing resources? How do we ensure there are clear lines of accountability and responsibility? Who are our consumers and stakeholders?

Consumer Personas

It is important to understand the different personas that will be the consumers of your services. In the past, application teams (either developers or business units buying packaged software) would interact with IT via a project manager and/or ticketing system to request a resource from a standard offering of VMs and operating systems. IT Operations was usually responsible up to the operating system (and standard packages) and beyond that responsibility was handed to the application team. In the Cloud Operating Model, if you do not understand the use-cases and requirements of the consumer you will not be able to meet their needs.

When IT did not know which applications or data were on systems they often defaulted to a position of “err on the side of caution”. This meant architecting every environment with the highest level of data protection, resiliency, and security. This is an expensive, inflexible position to take and does not scale. The Cloud Operating Model requires that you understand the applications running in the environment, that data classification processes are in place, and that you can automate the provisioning requests to place resources in the environments that meet their needs. Want to deploy a development machine for a week of testing? Great, choose the resource from the self-service catalog and allow the automated logic to place the system on a cheaper tier of infrastructure with a built-in lease period where the resource will be decommissioned when it expires.
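The placement logic described above can be sketched in a few lines. This is a minimal illustration, not a VMware API: the tier names, the classification labels, the `place` function, and the seven-day default lease are all assumptions made up for the example, and a real policy would come from your own data classification process and service catalog.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical labels; substitute the output of your own data classification process.
KNOWN_CLASSIFICATIONS = {"public", "internal", "restricted"}
# Assumed default lease for development requests that do not specify one.
DEFAULT_DEV_LEASE_DAYS = 7


@dataclass(frozen=True)
class Placement:
    tier: str                      # infrastructure tier the resource lands on
    lease_expires: Optional[date]  # None means the resource has no lease


def place(environment: str, classification: str, requested_on: date,
          lease_days: Optional[int] = None) -> Placement:
    """Choose a tier and lease from what is known about the workload."""
    if classification not in KNOWN_CLASSIFICATIONS:
        raise ValueError(f"unknown data classification: {classification}")

    if classification == "restricted":
        tier = "protected"   # sensitive data always gets the protected tier
    elif environment == "development":
        tier = "economy"     # short-lived test systems go to cheaper capacity
    else:
        tier = "standard"

    # Development resources always carry a lease so they are decommissioned.
    if environment == "development" and lease_days is None:
        lease_days = DEFAULT_DEV_LEASE_DAYS

    expires = requested_on + timedelta(days=lease_days) if lease_days is not None else None
    return Placement(tier, expires)


if __name__ == "__main__":
    print(place("development", "internal", date(2024, 1, 1)))
```

The point of the sketch is that the decision depends on knowing the environment and the data classification up front; without that input, the only safe default is the most expensive tier for everything.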

For each service delivered using the cloud service model, the persona should be well understood – what they need, when they need it, why they need it, how long they will need it for, and how they want to consume the services. A high-level analysis of consumer personas should be included in the cloud strategy, but the onboarding or creation of each service should provide more detail on the target and future consumers of each service.

Stakeholder Personas

Your stakeholders can be your biggest supporters or your biggest detractors. Stakeholders are defined in the Cloud Strategy, and the Cloud Operating Model must provide a feedback loop to make sure that the success of an initiative is communicated to all the right people. IT teams are historically not great at self-promotion but need to break the perception of the expensive, slow, inflexible gatekeeper of resources.

KPIs for each service should be created for each stakeholder and shared regularly, preferably via an automated report or dashboard capability. Sharing success is essential to maintaining support from the critical stakeholders who have multiple competing initiatives to prioritize and fund.

KPIs will differ between stakeholders for the same service. For example, a KPI for the CISO might be “CVE patches are applied within x days”, whereas a KPI for the VP of IT might be “% of server builds that used the IaaS service”.
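As a rough sketch of how the two example KPIs could be computed for an automated report, assuming build and patch records are available as simple dictionaries (the field names `source`, `published`, and `applied` are hypothetical, as are the function names):

```python
from datetime import date


def pct_builds_via_iaas(builds):
    """Percentage of server builds that came through the IaaS service."""
    if not builds:
        return 0.0
    via_iaas = sum(1 for b in builds if b["source"] == "iaas")
    return 100.0 * via_iaas / len(builds)


def pct_patched_within(patches, max_days):
    """Percentage of CVE patches applied within max_days of publication."""
    if not patches:
        return 0.0
    on_time = sum(
        1 for p in patches
        if (p["applied"] - p["published"]).days <= max_days
    )
    return 100.0 * on_time / len(patches)


if __name__ == "__main__":
    builds = [{"source": "iaas"}, {"source": "iaas"},
              {"source": "manual"}, {"source": "iaas"}]
    patches = [
        {"published": date(2024, 1, 1), "applied": date(2024, 1, 5)},
        {"published": date(2024, 1, 1), "applied": date(2024, 2, 15)},
    ]
    print(pct_builds_via_iaas(builds))      # VP of IT view
    print(pct_patched_within(patches, 30))  # CISO view, with x = 30 days
```

Both numbers come from the same service data; only the question being asked changes with the stakeholder.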

Operational Teams

We are no longer delivering infrastructure components; we are delivering solutions consumed as cloud services and supporting them requires that the operational teams are organized around the services. Depending on your approach to transformation (evolutionary or revolutionary), this may be one big reorganization effort or a phased approach. A good first step is the creation of a cross-functional team with experts from the critical IT areas represented. Often this matures into a “Cloud Center of Excellence” that is responsible for setting standards, overseeing governance and making decisions about the onboarding or creation of new services. As the organization matures further this team will either be superseded or augmented by service delivery teams that are dedicated to – and organized around – the services delivered by the Cloud Service Model.

The example shown here is not a reporting structure, but rather a hierarchy of governance, best practices, and shared services.

The reporting structure is important, however. Candidly, I have encountered too many enterprise organizations where the VP stakeholders – infrastructure, applications, architecture, and strategy – are not aligned (or worse, competing). This makes it really hard to transform successfully. The earlier all stakeholders are included in the requirements gathering and planning stages for the new delivery model, the more support you will get from those critical leadership roles. To identify the critical stakeholder in your organization, continue moving up the organizational chart until you reach the person who can make the final decision on all aspects of the plan. This is usually the CIO but could be a VP if you have a combined organizational structure. Stakeholder alignment should have been achieved as part of the cloud strategy ratification. If you find yourself encountering resistance at the operating model layer, you may have more work to do on the creation and communication of the cloud strategy.

Process

For generations IT has operated environments that place reliability and stability above all else – after all, if users cannot access the solution then IT is considered to be failing. This led to a host of frameworks, processes, and protections that, while created in good faith, led to environments paralyzed by processes. It was often easier to simply say no – to change, to new requests, to customizations – than to navigate the warren of process and policy. Think ITIL, Change Management Boards, and Architecture Review Boards.

Automation can help in some areas, but it is not enough. Fundamental change is needed. Consider a traditional VM provisioning process. In many environments, a change ticket is required (48 hours advance notice) with representation at the Change Review Board (held twice a week) just to bring a VM online. Nobody asks questions about the change. Human intervention, in this case, adds zero value. A pre-approved change should be automated as part of the server build process. Now there is a record of the activity in the system for tracking purposes (which is really all that was needed).
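To make the pre-approved change idea concrete, here is a minimal sketch in which the build step writes its own change record instead of waiting for a review board. The `ChangeLog` class, the `build_vm` function, and the `vm_build_standard` change type are hypothetical names for illustration; a real implementation would call your ITSM tool and provisioning platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical catalog of change types that were approved once, in advance.
PRE_APPROVED_CHANGES = {"vm_build_standard"}


@dataclass
class ChangeLog:
    """Stand-in for the system of record (for example, an ITSM tool)."""
    records: list = field(default_factory=list)

    def record(self, change_type, target, requested_by):
        entry = {
            "change_type": change_type,
            "target": target,
            "requested_by": requested_by,
            "status": "pre-approved",
            "recorded_at": datetime.now(timezone.utc),
        }
        self.records.append(entry)
        return entry


def build_vm(name, requested_by, log, change_type="vm_build_standard"):
    """Provision a VM and log the change as part of the same step."""
    if change_type not in PRE_APPROVED_CHANGES:
        # Anything outside the catalog still goes through human review.
        raise PermissionError(f"{change_type} requires manual review")
    # The real provisioning call would go here; it is omitted in this sketch.
    return log.record(change_type, name, requested_by)


if __name__ == "__main__":
    log = ChangeLog()
    print(build_vm("dev-01", "alice", log)["status"])
```

The record still exists for tracking and audit, but nobody has to attend a meeting to create it.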

There are hundreds of similar processes involved in the day-to-day operations of an enterprise IT organization. Some can be automated; some aren’t actually needed anymore (but nobody can remember why they were created in the first place, so teams are hesitant to change them). For 30+ years, IT operations teams have been groomed to operate in a reactive, CYA mode that is the opposite of “move fast and break things”. Enterprise IT cannot continue to survive in this mode.

The Cloud Operating Model requires a complete review of existing operational standards, policies, and processes to determine which can be ported from the legacy way of operating and which should be updated or retired.

There may be a requirement to create new processes and rewrite policies. Common operating processes need to be reviewed, streamlined or replaced to provide an efficient, consistent experience across public and private cloud environments. Examples are listed below (this is not an exhaustive list):

  • Data classification and data protection
  • Compliance and security requirements and processes
  • Workload placement policy and process
  • Access control
  • Service/resource change management
  • Service/resource lifecycle management
  • Financial model – chargeback or funding of shared, short-term resources, budgeting, reporting
  • Integration process with backend systems and shared system requirements
  • Governance processes and policy enforcement processes
  • Responsibility and Accountability
  • Monitoring, SLAs, reporting, alerting, logging, incident response, service monitoring
  • Resource rightsizing, dynamic scaling
  • License management

DevOps and Agile

Agile is a software development methodology that embraces the change inherent to project requirements over time. It prioritizes iterative change and cross-functional collaboration (unlike traditional methods like waterfall software development that are more rigid). The success of Agile for software development has led to the adoption of these concepts across the larger IT organization with varying degrees of success. What is certain is that the adoption of Agile for software development requires an IT organization to be able to respond more quickly to resource requests, which McKinsey & Company calls “Agile Infrastructure”. The recommendations that McKinsey & Company makes for agile infrastructure are very much aligned with the Cloud Operating Model concepts described in this post.

DevOps defines a culture where development and operational teams are embedded to own all aspects of a service or application lifecycle, to deliver software faster and more reliably.

“DevOps”, “Agile” and other processes can be considered components of the Cloud Operating Model. There is enough overlap in organizational structure and processes that they do not conflict. My recommendation is to adopt the pieces of each that make the most sense for your organization.

Technology

Technology provides the foundation for the Cloud Operating Model, and the choices made here can make it either much easier or much harder to be successful. For this reason, VMware is quite prescriptive regarding our technology recommendations to support the Cloud Operating Model, starting with the multi-cloud capabilities that we consider essential to deliver and manage cloud services. We believe the tools used to provide supporting functionality must be able to support the heterogeneous resources required to build the solutions. We believe that there is an essential set of capabilities required to support a Cloud Operating Model and that they are consistent across your private, hybrid and multi-cloud environments.

We layer these capabilities from a solid, consistent infrastructure foundation (inclusive of private and public resources), up through the stack to application delivery and observability. I introduced these capabilities at the start of this blog post; let’s review them again here.

Our VMware Cloud Foundation solution along with our cloud management portfolio (vRealize Suite and CloudHealth by VMware) offers all of the capabilities that we consider critical to a private, hybrid and multi-cloud environment. I will go into more detail regarding the capabilities essential to support a cloud platform in an upcoming blog post.

Next Steps

The change does not have to happen all at once, although there are certainly customers who are implementing these changes more aggressively than others. The reward is certainly worth the effort, but many organizations have no resource capacity to do anything beyond keeping the lights on. This is why we recommend starting with the adoption of technologies to improve the operational efficiency layer of the operating model; not only will this free up hardware resources for reallocation, but it will free up human resources so that they can focus on building the service delivery capabilities that are critical to the Cloud Operating Model.

Final Thoughts

People remain the most critical component of the Cloud Operating Model, and making sure teams have the right support in place to make this transition is essential – from retraining existing resources to hiring new ones; from updating hiring practices and compensation models to attract and retain talent; to updating on-call and flex-time policies. VMware has made this transition internally and is guiding thousands of customers through various stages of the transition. Reach out to your VMware team to hear more about how we can help.

The Importance of a Cloud Strategy – Keep your cloud strategy simple, flexible and effective with this easy to use guide.

To Cloud or Not To Cloud – Factors to consider when evaluating brokering public cloud services versus creating private or hybrid cloud services.

Evolve Your Automation Journey – Automate like a Cloud Service Provider Part I. Take the next step in the journey from IT Operations to IT Service Provider with automation and orchestration.

Mature the Delivery of IT Services – Automate like a Cloud Service Provider Part II. How to build IT Services.

References

McKinsey & Co – Agile Infrastructure

Forrester – 2018 the Year of Enterprise DevOps

CIO – Can Infrastructure be Agile

Google – Site Reliability Engineer

In which cloud services is the customer not responsible for managing the operating system?

Platform as a Service (PaaS): The consumer does not manage or control the underlying cloud infrastructure, including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment. The same is true of Software as a Service (SaaS), where the provider manages the application as well.

Who is responsible for maintaining software when using a cloud service?

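It depends on the service model. With Infrastructure as a Service (IaaS), the cloud service customer is responsible for securing and managing the virtual network, virtual machines, operating systems, middleware, applications, interfaces, and data. With PaaS and SaaS, the provider takes over progressively more of that software stack.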

In which cloud service model is the customer only responsible for the data & user management?

Software as a Service (SaaS): With the cloud service provider (CSP) managing the entire infrastructure as well as the applications, customers are only responsible for managing data, as well as user access/identity permissions.

What cloud model gives you complete control of the operating system?

Infrastructure as a Service (IaaS) gives IT professionals control over low-level computing resources like storage, memory, CPUs, and lets them choose and control the OS that runs on cloud servers.
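The answers above can be condensed into a small lookup. This is a simplified sketch of the commonly described split between provider and customer; the layer names and the `responsible_party` function are assumptions for illustration, and individual providers and services draw the line differently.

```python
# Stack layers from bottom to top (simplified, hypothetical names).
LAYERS = [
    "hardware",
    "virtualization",
    "operating_system",
    "runtime",
    "application",
    "data_and_access",
]

# The lowest layer the customer manages in each service model.
# Everything below that layer is managed by the provider.
FIRST_CUSTOMER_LAYER = {
    "iaas": "operating_system",
    "paas": "application",
    "saas": "data_and_access",
}


def responsible_party(model: str, layer: str) -> str:
    """Return 'customer' or 'provider' for a layer under a service model."""
    if model not in FIRST_CUSTOMER_LAYER:
        raise ValueError(f"unknown service model: {model}")
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    boundary = LAYERS.index(FIRST_CUSTOMER_LAYER[model])
    return "customer" if LAYERS.index(layer) >= boundary else "provider"


def models_where_customer_manages(layer: str) -> list:
    """List the service models in which the customer manages the layer."""
    return [m for m in FIRST_CUSTOMER_LAYER
            if responsible_party(m, layer) == "customer"]


if __name__ == "__main__":
    print(models_where_customer_manages("operating_system"))
```

Asking the lookup about the operating system layer returns only IaaS, which matches the answers given above.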