Date: 2024-10-08. By Patrick Barch for Capital One Software. 5 minute read.

The data that businesses have access to is growing exponentially. Just 10 years ago, global data creation and consumption sat at 15.5 zettabytes. This is expected to grow to 181 zettabytes in 2025, a nearly twelvefold increase (roughly 1,070% growth) in a decade, and businesses are now keen to leverage this data to take advantage of artificial intelligence (AI).

However, the growth of data combined with the cost of AI, and the fact that the low unit costs of managing data in the cloud can make it feel like there’s an unlimited budget, raises the stakes on how companies manage and justify increased data spend. This is why the concept of financial operations (FinOps) is under the spotlight today.



FinOps is a concept that emerged to help companies maximize the business value of the cloud and, as a result, manage their spend accordingly. But as more of a company’s cloud spend is incurred on data, shared data platforms, and AI and machine learning, it’s worth examining whether tailored techniques should be applied to those specific use cases.



Here’s how we think about extending three pillars of our FinOps strategy to our data spend, drawing on the principles shared by the FinOps Foundation.

1. Visualize and allocate your spend.

You can’t ensure that your spend is valuable if you can’t see where it’s going. That’s why the first step in applying FinOps to data spend should be visualization: mapping budgets to users, teams, projects, business units, or infrastructure. Given the increasing complexity of any organization’s data architecture and costs, this is essential.

Traditionally, FinOps cost allocation has worked by tagging specific instances of cloud services like EC2, DynamoDB, and Lambda to specific business entities. But now, an outsize portion of spend is going toward platforms shared between multiple business units. That means you need a strategy for gaining granular visibility, at the user or job level, into how much each business unit is using a shared platform. Only then can you accurately allocate costs.
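To make usage-based allocation concrete, here is a minimal sketch that splits one shared platform bill across teams in proportion to what their jobs consumed. The function name, the record shape, and the sample figures are all hypothetical; a real setup would read usage from the platform's own metering data.

```python
from collections import defaultdict


def allocate_shared_cost(total_cost, usage_records):
    """Split a shared platform bill across teams in proportion to usage.

    usage_records is an iterable of (team, credits_used) pairs, one per job
    or query. Records without a team tag are grouped under "untagged" so
    that gaps in tagging stay visible instead of being silently dropped.
    """
    credits_by_team = defaultdict(float)
    for team, credits in usage_records:
        credits_by_team[team or "untagged"] += credits

    total_credits = sum(credits_by_team.values())
    if total_credits == 0:
        return {}

    return {
        team: total_cost * credits / total_credits
        for team, credits in credits_by_team.items()
    }


# Hypothetical job-level usage on one shared warehouse.
records = [
    ("marketing", 30.0),
    ("risk", 50.0),
    ("marketing", 10.0),
    (None, 10.0),  # a job that ran without a team tag
]
allocation = allocate_shared_cost(1000.0, records)
```

Keeping an explicit "untagged" bucket is a design choice: it turns missing tags into a line item someone has to own, which tends to improve tagging over time.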

Let’s take Snowflake, for example. Capital One uses Snowflake at scale across the business, so we need visibility into what’s happening in Snowflake to ensure we’re seeing the value of our data spend. Here we might look at warehouse utilization, query volumes, queued and concurrent queries, spillage, and more. These data points show how the system is performing, and from there we can get a sense of waste.

This might sound like a lot of manual work, and it is if you don’t have the right toolkit. To work efficiently at scale, teams must be equipped with platforms that help them assess data with ease. We use Capital One Slingshot to track our Snowflake spend at a granular level. Slingshot pulls all of our performance and cost metrics into a single place, with custom tagging enabling us to split those into specific categories. What’s more, Slingshot shares recommendations to help us with the next part of applying FinOps principles to data spend: optimization.

2. Optimize your deployments.

With a clear view of your spend, you can start making informed decisions on optimization. It might be that this becomes a belt-tightening exercise as you spot high spend in specific areas—for example, reporting and analytics. Or, it could be that you make smaller adjustments to right-size your infrastructure—finding places where you’re overprovisioned and can cut back.
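As a rough illustration of right-sizing, the sketch below turns two of the signals mentioned earlier (utilization and queued queries) into a simple hint per warehouse. The class, the function, and especially the thresholds are assumptions chosen for the example, not recommended values or any tool's actual logic.

```python
from dataclasses import dataclass


@dataclass
class WarehouseStats:
    name: str
    avg_utilization: float     # fraction of provisioned capacity in use, 0.0 to 1.0
    avg_queued_queries: float  # average number of queries waiting to run


def rightsizing_hint(stats, low_util=0.3, high_queue=1.0):
    """Return a coarse right-sizing hint based on illustrative thresholds."""
    # Queries regularly waiting suggests the warehouse is too small.
    if stats.avg_queued_queries >= high_queue:
        return "consider scaling up"
    # Low utilization with no queueing suggests overprovisioning.
    if stats.avg_utilization < low_util:
        return "consider scaling down"
    return "leave as is"


# Hypothetical warehouses and measurements.
fleet = [
    WarehouseStats("reporting", 0.15, 0.0),
    WarehouseStats("etl", 0.85, 3.2),
    WarehouseStats("adhoc", 0.60, 0.2),
]
hints = {w.name: rightsizing_hint(w) for w in fleet}
```

A hint like this is a starting point for a conversation with the owning team rather than an automatic action, since averages can hide short peaks that matter to the business.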

In another scenario, you might discover that different business units are running on separate warehouses, none of which are at full utilization. In this case, you could be paying for two warehouses, each at 50% utilization, when the obvious preference would be to pay for one at 100%. Tackling this is known as bin packing: consolidating workloads into fewer warehouses or containers.

Optimizations like these can come with their own complexities, but by following the right governance best practices (for example, following a comprehensive tagging strategy and maintaining a clear database of dependencies and metadata) they don’t need to be painful. And, given the significant efficiencies on the table, the benefits quickly add up.

3. Implement changes continuously.

The final stage in applying a FinOps strategy to your data spend is implementation. It’s here that you make changes to your workloads, infrastructure, and operations based on what you learned through visualization and optimization. This isn’t only about taking technical actions (for example, migrating workloads between warehouses). It’s also about providing the teams involved with the right direction and resources.
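To make the migration example concrete, here is a minimal sketch of planning the bin packing described earlier. It uses first-fit decreasing, a common heuristic picked here for illustration, and treats each workload as a fixed fraction of one warehouse's capacity. The workload names and loads are hypothetical.

```python
def pack_workloads(workloads, capacity=1.0):
    """Plan a consolidation with the first-fit decreasing heuristic.

    workloads maps a workload name to the fraction of one warehouse it
    uses. Returns a list of bins, each with its total load and the
    workloads assigned to it.
    """
    bins = []
    # Place the largest workloads first; they are the hardest to fit.
    ordered = sorted(workloads.items(), key=lambda kv: kv[1], reverse=True)
    for name, load in ordered:
        if load > capacity:
            raise ValueError(f"{name} does not fit in a single warehouse")
        for b in bins:
            # Small tolerance so that 0.5 + 0.5 style sums are not
            # rejected because of floating-point rounding.
            if b["load"] + load <= capacity + 1e-9:
                b["load"] += load
                b["workloads"].append(name)
                break
        else:
            # No existing warehouse has room, so open a new one.
            bins.append({"load": load, "workloads": [name]})
    return bins


# Two business units at 50% each, plus a smaller third workload.
plan = pack_workloads({"bu_a": 0.5, "bu_b": 0.5, "bu_c": 0.3})
```

In this example the two half-utilized workloads share one warehouse and the third gets its own, so three warehouses become two. Packing on average utilization alone is a simplification; a real plan would also account for peak concurrency and the dependencies between workloads.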


At Capital One, a core element of implementation is providing responsible teams with clear and continuously updated guidance on policy, tooling, ownership, and more. A central team is dedicated to this. It is responsible for centralizing all best practices and defining implementation rules and guardrails, for example, by setting aggressive timelines for shutting down idle infrastructure.

While guidance is essential, implementation best practices aren’t about restraining your users. Just like drivers of a car, most users aren’t experts on how the engine is running, but that doesn’t mean they don’t want to operate it safely and responsibly.

This is why implementation requires close collaboration and communication. While central teams may have insight into spend patterns and optimization opportunities, they need buy-in from the business teams who actually manage workloads to deliver on them. It’s important that central teams can demonstrate that scaling down compute won’t negatively impact specific business units. Ensuring those needs are met is a key part of the optimization journey.
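A guardrail such as an idle-infrastructure timeline can be expressed as a small, reviewable rule. The sketch below lists resources whose last recorded activity is older than a cutoff, so that owners can be notified before anything is shut down. The function, the resource names, and the seven-day window are all hypothetical.

```python
from datetime import datetime, timedelta


def find_idle_resources(last_activity, now, max_idle=timedelta(days=7)):
    """Return the names of resources idle for longer than max_idle.

    last_activity maps a resource name to the datetime of its most
    recent recorded use.
    """
    return sorted(
        name for name, seen in last_activity.items() if now - seen > max_idle
    )


# Hypothetical activity log, evaluated at a fixed point in time.
now = datetime(2024, 10, 8)
last_activity = {
    "dev_wh": datetime(2024, 9, 1),
    "prod_wh": datetime(2024, 10, 7),
    "sandbox_db": datetime(2024, 9, 30),
}
idle = find_idle_resources(last_activity, now)
```

Reporting candidates rather than deleting them keeps the business team in the loop, which matches the collaborative approach described above.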