Designing the Cloud Target Operating Model

The business case for cloud migration in financial services typically rests on three promises: lower infrastructure cost, greater operational agility, and improved resilience. After a decade of widespread cloud adoption in the sector, the evidence on whether these promises are delivered is mixed. Institutions that have achieved them share a common characteristic — they redesigned their operating model alongside their infrastructure. Institutions that have not share a different one: they treated cloud migration as a technology initiative, moved workloads, and then found that their cost base had not materially reduced, their delivery speed had not meaningfully improved, and their resilience architecture reflected the same fundamental limitations as before, now expressed in cloud infrastructure rather than on-premise hardware.

The difference is the Cloud Target Operating Model (TOM). This article examines what a Cloud TOM actually comprises, why it is so frequently deferred until after migration when it should precede it, and what the key design decisions are that leadership teams need to resolve before cloud programmes scale. It draws on experience designing and implementing cloud operating models in a BaFin-regulated commercial real estate finance bank, leading a managed services operating model transition at a major European bank's personal finance division, and restructuring IT operations at a regulated FOREX bank — three very different contexts, but with consistent lessons about where cloud programmes succeed and fail.

What a Cloud TOM Is — and Is Not

A Cloud Target Operating Model is not a technical architecture. It is not a cloud platform design, a landing zone specification, or an infrastructure diagram. Those are outputs of cloud architecture work, and they matter — but they describe how the technology is built, not how the organisation operates it.

A Cloud TOM addresses five distinct dimensions of how an organisation runs its cloud environment. Each dimension involves design decisions that are independent of the underlying technology choices — decisions about accountability, governance, resource organisation and financial management that must be made explicitly, because if they are not, they will be made implicitly through default behaviours that typically replicate the problems of the on-premise operating model in a cloud context.

Dimension 01

Organisational Structure

How cloud capability is organised — centralised Cloud Centre of Excellence, federated within product or domain teams, or a hybrid model. Who owns the cloud platform itself versus who consumes it. How the boundary between platform engineering and application teams is drawn.

Dimension 02

Governance & Accountability

Who is accountable for cloud architecture standards, security posture, compliance, and cost. How decisions about cloud usage are made and by whom. How the shared responsibility model with the cloud provider maps onto internal accountability. How the regulator's oversight requirements are met.

Dimension 03

Financial Management (FinOps)

How cloud costs are tracked, allocated, reported and optimised. Who is accountable for cloud spend at team or domain level. Whether the model is showback (visibility without chargeback) or chargeback (cost allocated to consuming teams). How the variability and immediacy of cloud costs are managed against traditional IT budgeting cycles.

Dimension 04

People, Skills & Culture

What cloud skills the organisation needs versus what it has. Whether the strategy is to retrain existing staff, hire cloud-native talent, or outsource cloud operations. How the cultural shift from traditional IT operating norms — change advisory boards, lengthy release cycles, infrastructure-as-tickets — to cloud-native working is managed and led.

Dimension 05

Service Management Adaptation

How service management processes based on ITIL (the Information Technology Infrastructure Library, a widely adopted framework for IT service management best practice), such as incident management, change management, problem management and capacity planning, are adapted for a cloud environment where infrastructure is ephemeral, changes are frequent and automated, and capacity is elastic. How SLAs with business units and regulatory commitments are maintained across a hybrid cloud/on-premise estate.

The reason these five dimensions are so often under-addressed is that each of them involves organisational decisions that are politically more difficult than technology decisions. Who is accountable for cloud cost is a question about resource allocation and team incentives. How the Cloud Centre of Excellence relates to product teams is a question about power and autonomy. Whether to retrain or replace existing infrastructure staff is a question that affects real people's careers. These are not decisions that technology teams can make unilaterally, and the discomfort of making them is a strong incentive to defer them — which is exactly what most cloud programmes do.

"The cloud operating model cannot be designed after the migration is complete. By that point, the default behaviours are established, the team structures have formed around the new technology, and the cost of changing them has increased substantially. The TOM must be designed — at least in outline — before the migration scales."

The Key Design Decisions

Within each of the five dimensions, there are specific structural decisions that must be made. The following are the ones that, in practice, have the most significant downstream consequences — and that are most often left unresolved until their absence becomes a problem.

Decision: Cloud platform ownership model
Option A: Central platform team owns landing zones, security baselines, networking and shared services. Application teams consume.
Option B (and implications): Federated: each domain builds its own cloud environment. Faster initially, creates sprawl, inconsistency and compliance risk at scale. Rarely the right choice in a regulated institution.

Decision: Cloud operations: build vs. outsource
Option A: Build internal capability: invest in cloud engineers, retain operational knowledge, slower ramp-up.
Option B (and implications): Managed service provider for cloud operations: faster to market, ongoing dependency, cost visibility challenges, requires robust MSP governance. The managed services transition at BNP Paribas demonstrated that the governance overhead of an MSP model is consistently underestimated. Covering infrastructure, networks and security while retaining OS and application management locally created a boundary that required precise contractual and operational definition to function.

Decision: FinOps model
Option A: Showback: cloud costs are visible to consuming teams but not formally allocated. Lower friction, weaker incentive to optimise.
Option B (and implications): Chargeback: cloud costs are allocated to consuming teams' budgets. Stronger optimisation incentive, requires robust tagging strategy and cost allocation governance. In a regulated institution, chargeback also creates the internal transfer pricing clarity that auditors and regulators increasingly expect.

Decision: Regulatory accountability owner
Option A: CTO / Head of Cloud as single accountable party for cloud compliance with BaFin, DORA and BAIT requirements.
Option B (and implications): Distributed: compliance accountability shared across platform, security and risk functions. Creates gaps in practice; regulators expect to be able to identify a single responsible party for cloud governance.

Decision: Cloud skills strategy
Option A: Retrain existing staff: builds on institutional knowledge, slower, higher success rate for staff with aptitude and motivation.
Option B (and implications): Hire cloud-native talent: faster capability acquisition, cultural integration challenges, higher cost, institutional knowledge gap. In practice, most regulated institutions need both: retrain selectively and hire for specific skill gaps rather than making a binary choice.

The FinOps Imperative: Why Cloud Cost Management Cannot Be an Afterthought

Of all the dimensions of a Cloud TOM, FinOps — the discipline of managing cloud financial operations — is the one that most consistently surprises organisations that have not addressed it explicitly. Traditional IT cost management is built around capital expenditure cycles: hardware is purchased, depreciated, and replaced on predictable timelines, with costs known and budgeted in advance. Cloud costs are fundamentally different: they are variable, immediate, and generated by the behaviour of development teams who may have no visibility into the cost implications of their technical choices.

The practical consequence of treating FinOps as an afterthought is that cloud cost significantly exceeds budget. The patterns are consistent: development environments left running over weekends; oversized virtual machines selected for peak load and never downsized; expensive managed services adopted for convenience and never evaluated against cheaper alternatives; data egress costs from moving data between cloud regions or to on-premise systems that nobody modelled at architecture time. Each individual instance is modest; at the scale of a bank's cloud estate, the cumulative effect is substantial.

Establishing FinOps capability requires three things that must be designed into the TOM from the outset. First, a tagging strategy — every cloud resource must be tagged with the team, workload, environment and cost centre responsible for it. Without consistent tagging, cost allocation is impossible and accountability cannot be established. Second, a cost visibility tool — either native cloud provider tooling (Azure Cost Management, AWS Cost Explorer) or a third-party FinOps platform — that makes cloud spend visible at team level in near-real time. Third, a cost accountability mechanism — whether showback or chargeback — that creates incentives for teams to make cost-conscious architectural and operational decisions. The tagging strategy must be defined before migration begins, not retrospectively applied to a cloud estate that has grown without consistent metadata.
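The tagging and showback mechanics described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the tag keys (team, workload, environment, cost_centre), the record layout and the sample figures are hypothetical and do not correspond to any cloud provider's API or billing export format.

```python
from collections import defaultdict

# Assumed tag keys, taken from the four attributes named in the text.
REQUIRED_TAGS = ("team", "workload", "environment", "cost_centre")


def missing_tags(resource):
    """Return the required tag keys that are absent or empty on a resource."""
    tags = resource.get("tags", {})
    return [key for key in REQUIRED_TAGS if not tags.get(key)]


def showback_by_team(resources):
    """Sum monthly cost per team; spend without a team tag is 'UNALLOCATED'."""
    totals = defaultdict(float)
    for resource in resources:
        team = resource.get("tags", {}).get("team") or "UNALLOCATED"
        totals[team] += resource["monthly_cost"]
    return dict(totals)


# Hypothetical inventory records (illustrative values only).
inventory = [
    {"id": "vm-001", "monthly_cost": 420.0,
     "tags": {"team": "payments", "workload": "ledger",
              "environment": "prod", "cost_centre": "CC-100"}},
    {"id": "vm-002", "monthly_cost": 180.0,
     "tags": {"team": "payments", "environment": "dev"}},
    {"id": "db-003", "monthly_cost": 300.0, "tags": {}},
]

# Resources that would fail a tagging policy, with the keys they lack.
non_compliant = {r["id"]: missing_tags(r) for r in inventory if missing_tags(r)}

# Team-level showback view of the same inventory.
report = showback_by_team(inventory)
```

In this sketch the UNALLOCATED bucket makes the cost of missing tags visible: spend that carries no team tag cannot be attributed to anyone, which is why the tagging strategy has to be enforced before the estate grows.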

Designing for the Regulated Environment

In a BaFin-regulated institution, the Cloud TOM has several requirements that do not arise in unregulated industries. Each of them should be designed into the operating model from the outset, because retrofitting them after the operating model has been established is expensive and disruptive.

Single Accountable Party for Cloud Governance

BaFin's supervisory expectations, and DORA's ICT governance requirements, presuppose that there is a single identifiable party within the institution who is accountable for the governance of each material cloud arrangement — including accountability for the third-party risk management of the hyperscaler relationship. In practice, this means that the Cloud TOM must designate a named role — typically at CTO or equivalent level — with documented accountability for cloud governance, and that this designation must be reflected in the institution's DORA governance documentation.

Change Management in a Cloud Environment

Traditional change advisory board processes — weekly meetings, multi-week lead times, manual approval workflows — are incompatible with cloud operating velocity. DORA's change management requirements, however, do not disappear because the infrastructure has moved to the cloud. The Cloud TOM must define a change management process that satisfies regulatory requirements for change control and auditability while enabling the deployment frequency that cloud infrastructure supports. In practice, this means policy-as-code enforcement of change controls rather than manual approval gates, with automated audit trail generation that can be produced for regulatory review.
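As a rough illustration of what policy-as-code change control with an automated audit trail might look like, the Python sketch below evaluates a change record against a set of rules and emits a serialisable audit entry. The rule names, the change record fields and the rules themselves are hypothetical; a real policy set would be derived from the institution's own change management standard and regulatory interpretation.

```python
import json
from datetime import datetime, timezone

# Hypothetical change-control rules, each a predicate over a change record.
POLICIES = {
    "has_linked_ticket": lambda c: bool(c.get("ticket_id")),
    # Reviewer must exist and must not be the author (four-eyes principle).
    "peer_reviewed": lambda c: c.get("reviewer") not in (None, c.get("author")),
    "tests_passed": lambda c: c.get("tests_passed") is True,
    # Production changes additionally need a rollback plan.
    "prod_has_rollback_plan": lambda c: (
        c.get("environment") != "prod" or bool(c.get("rollback_plan"))
    ),
}


def evaluate_change(change, now=None):
    """Check a change against every policy and return an audit record."""
    results = {name: bool(rule(change)) for name, rule in POLICIES.items()}
    return {
        "change_id": change.get("change_id"),
        "evaluated_at": (now or datetime.now(timezone.utc)).isoformat(),
        "results": results,
        "approved": all(results.values()),
    }


# Hypothetical change record, as a deployment pipeline might supply it.
change = {
    "change_id": "CHG-1",
    "ticket_id": "TCK-9",
    "author": "alice",
    "reviewer": "bob",
    "tests_passed": True,
    "environment": "prod",
    "rollback_plan": "redeploy previous release",
}

record = evaluate_change(change)
# One JSON line per decision can be appended to an audit log.
audit_line = json.dumps(record, sort_keys=True)
```

The design point is that the approval decision and the evidence for it are produced by the same automated step, so the audit trail exists for every change without a manual approval gate.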

Outsourcing Oversight for Managed Services

Where the Cloud TOM includes a managed service provider for cloud operations, as was the case in the BNP Paribas datacentre consolidation, where a new managed services model was established alongside the infrastructure migration, the oversight requirements of DORA and of the Outsourcing Guidelines issued by the EBA (the European Banking Authority, the EU body responsible for prudential regulation and supervision standards across the European banking sector) must be built into the operating model. This means designating an internal oversight function for each material MSP relationship, establishing the performance reporting and escalation processes required for ongoing oversight, and ensuring that the cloud operational knowledge that the MSP holds does not become an undocumented institutional dependency that cannot be transferred if the MSP relationship needs to change.

From Practice: The Managed Services Boundary Problem

In leading the BNP Paribas datacentre consolidation — which involved transitioning all services to a new managed services operating model covering infrastructure, networks and security, while retaining local management of operating system and application layers — the most consistently challenging operational problem was not the migration itself but the definition and management of the boundary between managed and self-managed layers. Every incident crossed the boundary. Every change involved both sides. Every performance problem required investigation on both sides before responsibility could be determined. The lesson is that in a split managed/self-managed model, the boundary must be defined with engineering precision in the contractual and operational documentation — not described in general terms that leave operational ambiguity to be resolved case by case under pressure. The cost of imprecision at the design stage is paid continuously in operational friction throughout the contract.
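One way to make such a boundary explicit is to hold it as data rather than prose. The Python sketch below is illustrative only: the layer split follows the description above (provider runs infrastructure, networks and security; the bank retains operating system and applications), while the layer names, owner labels and routing function are assumptions made for the example.

```python
# Layer ownership as described in the text; the labels are hypothetical.
LAYER_OWNER = {
    "infrastructure": "msp",
    "network": "msp",
    "security": "msp",
    "operating_system": "internal",
    "application": "internal",
}


def route_incident(affected_layers):
    """Return who must act on an incident, given the layers it touches."""
    if not affected_layers:
        raise ValueError("an incident must name at least one affected layer")
    unknown = [layer for layer in affected_layers if layer not in LAYER_OWNER]
    if unknown:
        # An unmapped layer is exactly the ambiguity the contract must remove.
        raise ValueError(f"no owner defined for layers: {unknown}")
    owners = sorted({LAYER_OWNER[layer] for layer in affected_layers})
    return {"owners": owners, "joint_investigation": len(owners) > 1}


single_side = route_incident(["network"])
cross_boundary = route_incident(["network", "application"])
```

Any layer that raises an error here corresponds to a gap in the contractual and operational documentation that would otherwise be resolved case by case under pressure.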

Operating Model Maturity: A Phased Approach

The full Cloud TOM does not need to be operational before the first workload migrates. What needs to be in place before migration scales is the foundational layer — the decisions and structures on which everything else depends. A phased maturity model, where different elements of the TOM are established at different points in the cloud journey, is both realistic and appropriate.

Foundation Phase
Before migration scales

Non-negotiable prerequisites: cloud governance policy approved; single accountable party designated; tagging strategy defined and enforced; landing zone architecture including network topology and security baseline; data classification framework applied to cloud eligibility; DORA third-party risk documentation for the hyperscaler; and basic cost visibility tooling in place. Without these, migration creates technical and compliance debt faster than it can be repaid.
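The prerequisites above lend themselves to a simple readiness gate. The following Python sketch is a hypothetical illustration: the identifier names are shorthand invented for this example, and how each item is evidenced would be defined by the institution's own governance process.

```python
# Shorthand identifiers for the seven prerequisites listed in the text.
FOUNDATION_PREREQUISITES = (
    "cloud_governance_policy_approved",
    "single_accountable_party_designated",
    "tagging_strategy_defined_and_enforced",
    "landing_zone_architecture_defined",
    "data_classification_applied",
    "dora_third_party_risk_documented",
    "cost_visibility_tooling_in_place",
)


def foundation_gaps(status):
    """List prerequisites that are not yet confirmed as complete."""
    return [item for item in FOUNDATION_PREREQUISITES
            if not status.get(item, False)]


def ready_to_scale(status):
    """Migration may scale only when no foundation prerequisite is open."""
    return not foundation_gaps(status)


# Hypothetical programme status: everything done except tagging enforcement.
status = {item: True for item in FOUNDATION_PREREQUISITES}
status["tagging_strategy_defined_and_enforced"] = False

gaps = foundation_gaps(status)
go_decision = ready_to_scale(status)
```

Treating a missing entry as incomplete means that an item nobody has reported on blocks scaling, rather than passing silently.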

Operational Phase
During active migration

Establish as workloads land: FinOps showback reporting operational; cloud change management process integrated with existing ITSM tooling; platform team and application team RACI documented and communicated; cloud skills programme underway; initial cloud operating costs baselined and compared to on-premise equivalents. The comparison is important — cloud costs are only lower than on-premise if workloads are right-sized, unused capacity is eliminated, and reserved pricing is used for stable workloads.
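The conditions under which cloud is actually cheaper can be made concrete with simple arithmetic. The Python sketch below uses entirely hypothetical figures (hourly rates, a 30% reserved-pricing discount and an on-premise equivalent cost) chosen only to show the shape of the comparison; real values would come from the institution's own baseline.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


def monthly_cloud_cost(hourly_rate, hours_running=HOURS_PER_MONTH,
                       reserved_discount=0.0):
    """Monthly cost of one instance given rate, uptime and any discount."""
    if not 0.0 <= reserved_discount < 1.0:
        raise ValueError("reserved_discount must be in the range [0, 1)")
    return hourly_rate * hours_running * (1.0 - reserved_discount)


# Hypothetical figures, for illustration only.
on_premise_equivalent = 350.0

# Oversized instance moved as-is and left running all month.
lift_and_shift = monthly_cloud_cost(hourly_rate=0.80)

# Same workload on an instance half the size.
right_sized = monthly_cloud_cost(hourly_rate=0.40)

# Right-sized and on reserved pricing, assuming a 30% discount.
right_sized_reserved = monthly_cloud_cost(hourly_rate=0.40,
                                          reserved_discount=0.30)
```

With these assumed inputs the unoptimised workload costs more than its on-premise equivalent, and only the right-sized variants cost less, which is the pattern the baseline comparison is meant to expose.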

Optimisation Phase
12–24 months post-migration

Mature capabilities: FinOps chargeback if appropriate to the organisation's model; cloud cost optimisation programme (Reserved Instances, Savings Plans, architectural rightsizing); platform engineering capability building reusable components and self-service capabilities for application teams; cloud operating model performance metrics established and reported to governance bodies; DORA operational resilience testing extended to cover cloud-hosted workloads.

The Transition Period: Where Most Problems Occur

The transition period — when the organisation is running both the old on-premise operating model and the new cloud operating model simultaneously — is when the most significant operational problems occur. Both models require resources. Both require management attention. Both have regulatory obligations. And the two models have fundamentally different operating rhythms, governance requirements, and skill profiles.

The most common failure during transition is allowing the old model to persist longer than necessary because removing it is more disruptive than maintaining it. Each month that legacy infrastructure continues to operate alongside cloud infrastructure is a month of double running costs, double governance overhead, and divided organisational attention. The Cloud TOM should include an explicit legacy decommissioning plan — with target dates and accountability for each system's migration and retirement — not as a theoretical endpoint but as a managed programme of work with the same governance rigour as the migration itself.

The experience of the MIG Bank IT restructuring is instructive here. When IT Operations needed to be fundamentally reshaped, with a 50% cost reduction target achieved within six months, the speed of the transformation was only possible because the scope of what needed to change was defined clearly upfront, the old model was wound down decisively rather than preserved alongside the new, and the new operating model was designed before the transition began rather than evolved through trial and error. A Cloud TOM transition that lacks this decisiveness will take twice as long and cost significantly more than one that has it.

Questions for Leadership

Has your organisation made the five TOM design decisions explicitly — or are they being decided implicitly through default behaviours and accumulated technical choices?
Is there a single named, accountable party for cloud governance and the hyperscaler third-party risk management relationship, one that can be identified to BaFin or in a DORA compliance audit?
Does your tagging strategy cover every cloud resource, and is it enforced automatically rather than relying on team compliance?
Do your development and application teams have real-time visibility into the cloud costs they are generating — and do they have incentives to optimise them?
Does your cloud change management process satisfy DORA's auditability requirements without imposing on-premise CAB timelines on cloud deployment velocity?
If you have a managed services provider for cloud operations, is the boundary between managed and self-managed responsibilities defined with sufficient precision that every incident, change, and performance issue can be unambiguously allocated?
Does your TOM include an explicit legacy decommissioning plan with dates and accountability — or is the old infrastructure being preserved indefinitely alongside the new?
Has your organisation baselined the actual cost of running workloads on cloud versus on-premise — and has that comparison been reviewed by someone with the authority to act on the findings?

Conclusion: The Operating Model Is the Programme

The reason cloud programmes so frequently fail to deliver their promised benefits is not technical. The technology works. Azure, AWS and Google Cloud are mature, capable platforms that can host banking workloads reliably, securely and cost-effectively. The failure is organisational: the operating model does not change, so the technology sits beneath the same governance structures, the same accountability frameworks, the same cost management approaches, and the same skill profiles as the on-premise environment it replaced. The cloud's potential for agility, cost optimisation and resilience remains unrealised because the organisation has not redesigned itself to exploit it.

Designing the Cloud TOM is not a preparatory step before the real work begins. It is the real work. The infrastructure migration is the execution phase of a programme whose design phase is the operating model. Institutions that understand this — and invest accordingly in the governance, accountability, financial management, skills and service management decisions that a cloud operating model requires — consistently outperform those that treat operating model design as something that can follow migration once the technology is in place.

In a regulated environment, the operating model has an additional dimension: it must not only enable effective cloud operations but must satisfy regulators that those operations are governed, audited and accountable to the standards that the regulatory framework requires. These two requirements — operational effectiveness and regulatory accountability — are not in conflict. A well-designed Cloud TOM satisfies both simultaneously. A cloud programme that addresses one without the other will, sooner or later, encounter the cost of the omission.

Written by Peter Pitkin · Senior IT Consultant & Enterprise Architect · Germany
Views reflect practical experience of cloud operating model design and implementation across regulated German financial institutions.
