The G-word
A practical and nimble alternative to stewards and councils.
I have to admit that the term “data governance” gives me bad vibes. My mind immediately goes to these massive enterprise IT programs with Data Owners, Data Stewards, Data Custodians, Data Librarians¹, and a quarterly Data Council meeting where six people argue for two days about whether “Active User” should include someone who merely logged in or not.
I have never been in those environments where the downside risk of “bad data” is existential. Banks and insurance companies can afford these layers of governance overhead because “oops” is measured in regulator fines. Those costs have been embedded in the business model forever, so it is easy to pass them down to customers. If Netflix were suddenly forced to implement something like that before recommending you a movie, it could not just double the subscription price overnight.
However, even in a SaaS business, some level of data governance is needed for operational clarity. The way I see it, bare-bones data governance is about three levels of ownership:
Definition Ownership: this is about semantics, things like “what counts as an Active User?” or “what is Net Revenue?”, and who can sign off on a definition or change it over time.
Implementation Ownership: this is about engineering, quality and operations: how these entities and metrics are actually computed and persisted, reliably and at scale. Pipelines, tests, backfills, observability, on-call, SLAs, lineage, schema evolution … all that stuff.
Accountability Ownership: this is about setting targets and being on the hook for meeting them.
Those three types of ownership are often held by different groups:
Definition belongs with the business domain (Product, Finance, Sales, Support, etc.).
Implementation belongs to the Data org, if you want a real single source of truth: one canonical place to get the high-quality stuff.
Accountability belongs with the business domain too, because only they can pull the levers.
The Data org must be involved in all three, but it must not own all three. In “Definition”, the Data org’s job is to facilitate the design, preventing badly designed or easily gamed metrics. In “Accountability”, its job is to advise on target setting to avoid sandbagging, and to help explain variance between targets and actuals.
A good example of this working well is how we standardized “engagement” across a SaaS product with many features. Before we did this, engagement was the classic data horror story: product teams were using raw telemetry data directly, metric definitions were queries buried inside dashboards, and there was no observability, no transparency, and no single place to improve anything.
For Definition, we made Product Operations responsible for defining the feature taxonomy: a map of what features exist and how they are named. Then we enriched the taxonomy with additional groupings to support different business functions. Finance wanted to group features one way, Support another, Sales another. We did not allow competing taxonomies; we kept one canonical taxonomy, with multiple groupings layered on top as overlays.
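A minimal sketch of that shape, with made-up feature names, since the point is the structure rather than the contents:

```yaml
# Canonical taxonomy: the only place where features are defined.
# Feature ids and names here are hypothetical.
features:
  - id: doc_editor
    name: "Document Editor"
  - id: doc_comments
    name: "Document Comments"
  - id: billing_portal
    name: "Billing Portal"

# Overlays: each business function groups the same canonical ids
# its own way, without ever redefining the features themselves.
groupings:
  finance:
    monetizable: [doc_editor, billing_portal]
    free: [doc_comments]
  support:
    self_serve: [billing_portal]
    collaboration: [doc_editor, doc_comments]
```

Because the overlays reference canonical ids, Finance and Support can disagree about grouping all they want without ever forking the definition of a feature.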
Product teams care about engagement because it reflects how their feature is actually used. We wanted them accountable for “what counts as intentional use” of their feature. We managed that with a rather simple config file per feature: a YAML description of the telemetry events that constitute active, intentional engagement.
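The original format is not public, so this is only a sketch of the shape, with hypothetical event names:

```yaml
# engagement/doc_comments.yaml
# Owned by the product team: which telemetry events count as
# intentional engagement with this feature.
feature: doc_comments
owner: team-collaboration
events:
  - name: comment_created
    intentional: true
  - name: comment_resolved
    intentional: true
  - name: comment_panel_opened
    # Opening the panel alone is passive; it does not count.
    intentional: false
```

Listing non-counting events explicitly, rather than omitting them, matters: an event missing from the file entirely means nobody has reviewed it yet, which is exactly what the checks described next look for.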
The Data org owned the system that turned that config into reality. We transformed the YAML into SQL that recorded engagement activity and produced metrics like DAU/WAU. We had all sorts of checks to make sure the config stayed fresh. For example, if we spotted new events coming through telemetry that were not captured by the config, we generated alerts and checked with the product team whether the config needed to change.
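To make this concrete, here is the kind of SQL such a config could compile to, plus the coverage check behind those alerts. Table, column, and event names are assumptions, and dialects will differ on the date arithmetic:

```sql
-- Hypothetical output of the YAML-to-SQL step: daily active users
-- per feature, counting only events flagged as intentional.
SELECT
    event_date,
    'doc_comments'          AS feature,
    COUNT(DISTINCT user_id) AS dau
FROM telemetry.events
WHERE feature_id = 'doc_comments'
  AND event_name IN ('comment_created', 'comment_resolved')  -- from the config
GROUP BY event_date;

-- Coverage check: recent telemetry events for the feature that the
-- config does not mention at all, neither as intentional nor as
-- excluded. Any row here triggers an alert to the owning team.
SELECT DISTINCT event_name
FROM telemetry.events
WHERE feature_id = 'doc_comments'
  AND event_date >= CURRENT_DATE - INTERVAL '7' DAY
  AND event_name NOT IN (
      'comment_created', 'comment_resolved', 'comment_panel_opened'
  );
```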
It was a reasonably good level of governance, standardization and auditability, achieved without much overhead and without committees.
The key to this success was that all the parties involved belonged to the same path in the org chart. In fact, the standardization of ARR did not go as well. The sales org had a small team of analysts whose original goal was sensible: build reporting for sales reps. Then something common happened: promo-driven development kicked in unchecked, their scope expanded, ambitions grew, and suddenly that team was building core metrics infrastructure because it was visible and rewarded.
They (implicitly) claimed they had to drive everything “in house”: Definition, Implementation, and Accountability of ARR. I did not disagree with two thirds of that. Definition and Accountability? Fine: they have domain knowledge, and they work closely with the sales reps who are on the hook for ARR outcomes. Maybe too closely, but let’s give them the benefit of the doubt and assume they would not trade integrity for making friends. But Implementation? Please, no. They were improvised data engineers; the operational discipline was simply not there.
But we lost that battle. Due to org chart dynamics and a lack of coordination at the C-level, we landed in a situation where an extremely important metric like ARR was not part of the single source of truth, the “metrics cube.” It lived in another physical location and was generated through a different compute path made of spaghetti SQL buried in an unobservable managed service. Their source of “truth” changed all the time: tables kept being replaced with v2 and v3 suffixes, without any notice. There were multiple tables per SKU inside a data mart that looked like a Turkish bazaar; good luck knowing which one was the right one.
Even if, from a company perspective, the result was far from perfect (quite far, actually), we were happy with the level of rigour in the data governance piece we could control. The next step would have been a semantic layer, but that is maybe for another post.
Do you like this post? Of course you do. Share it on Twitter/X, LinkedIn and HackerNews
¹ I made this up, but maybe it really exists.


