Technical note

Facts, Dimensions, and the Moment a Metric Means Something

  • Analytics Engineering
  • Data Modeling
  • SQL
  • KPI Definitions

A metric does not become meaningful because someone gives it a name.

Revenue. Margin. Active Customer. Repeat Buyer. CAC. Stock Cover.

The name is only the surface.

A metric becomes meaningful when the team agrees the event being measured, the grain of the data, the dimensions that describe it, and the timing rule that places it into a reporting period.

That is where facts and dimensions matter.

Not as textbook terminology.

As trust controls.

The symptom: everyone agrees the metric name, but not the metric

At Brightside Bikes, “Active Customer” sounds simple.

But different teams may mean different things:

  • A customer who placed any order in the last 90 days.
  • A customer who bought a bike in the last 12 months.
  • A customer who bought an accessory after buying a bike.
  • A customer with an open service plan.
  • A customer with a valid email and no refund-only history.

The metric name is shared.

The metric logic is not.

That creates reporting friction because the argument is hidden behind familiar words.

Facts and dimensions help surface the real decision.

Facts record business events

A fact table records something that happened.

For Brightside Bikes, facts might include:

  • An order was placed.
  • An order line was sold.
  • A payment was captured.
  • A refund was issued.
  • A product was shipped.
  • A stock snapshot was received.
  • A customer clicked an advert.

The fact is usually measurable. It has amounts, counts, timestamps, and foreign keys that connect it to context.

Example:

select
  order_line_id,
  order_id,
  customer_id,
  sku_id,
  order_created_at,
  quantity,
  gross_revenue_amount,
  discount_amount,
  refund_amount,
  net_revenue_amount
from fct_order_lines;

This table says one row represents one order line.

That is the grain.

Dimensions describe the event

A dimension table adds context.

For Brightside Bikes, dimensions might include:

  • Customer attributes.
  • Product and SKU attributes.
  • Channel and campaign attributes.
  • Calendar attributes.
  • Warehouse or location attributes.

Example:

select
  sku_id,
  sku_code,
  product_name,
  product_category,
  bike_family,
  is_accessory,
  standard_unit_cost
from dim_sku;

The dimension does not usually record the sale itself.

It explains the sale.

That context is what lets the business ask better questions:

  • Which product category drove revenue?
  • Did bike buyers later buy accessories?
  • Which customer segment is profitable?
  • Which SKU has margin risk?

The grain is the contract

Grain is the most important sentence in a model’s documentation.

One row represents what?

If the grain is unclear, users can accidentally count the same event twice, join at the wrong level, or compare metrics that are not measuring the same business process.

For example, this may look reasonable:

select
  date_trunc('week', o.order_created_at) as reporting_week,
  sum(o.net_revenue_amount) as net_revenue,
  count(distinct c.customer_id) as customers
from fct_orders o
left join dim_customer c
  on o.customer_id = c.customer_id
group by 1;

If fct_orders has one row per order, that may be fine.

If the query later joins to fct_order_lines, revenue can be duplicated unless the model handles the grain change deliberately.

That is why grain is not a documentation nicety.

It is a reporting control.

Time rules decide when a metric happened

Many reporting disputes are really time disputes.

For Brightside Bikes:

  • Shopify may record order creation time.
  • Stripe may record payment capture time.
  • Finance may recognise revenue later.
  • Refunds may occur weeks after the original order.
  • Inventory snapshots may arrive at a daily or weekly cadence.

Each timing rule can be valid for a different decision.

The issue is not that one time field is always correct.

The issue is whether the model says which time rule it uses.

Weekly trading revenue may use order date. Finance revenue may use recognition date. Cash reporting may use settlement date.

Those should be different definitions, not accidental inconsistencies.

Dimensions can change the meaning of the same fact

Suppose Brightside Bikes wants revenue by product category.

The order line fact records the sale. The SKU dimension assigns each SKU to a category.

select
  date_trunc('week', ol.order_created_at) as reporting_week,
  s.product_category,
  sum(ol.net_revenue_amount) as net_revenue
from fct_order_lines ol
left join dim_sku s
  on ol.sku_id = s.sku_id
group by 1, 2;

This query depends on the quality of dim_sku.

If product categories are missing, duplicated, or historically overwritten, the metric changes.

That is why dimensions need ownership too.

They are not passive labels.

They shape the business interpretation.

Facts and dimensions help separate questions

One benefit of dimensional modeling is that it separates the event from the context.

That helps teams avoid mixing business questions.

For example:

  • fct_orders answers what was bought and when.
  • dim_customer explains who bought it.
  • dim_sku explains what kind of product it was.
  • dim_calendar explains the reporting period.
  • mart_weekly_operating_report presents the approved weekly view.

This separation makes the reporting path easier to inspect.

If revenue changed, the team can ask whether the source fact changed, a dimension changed, or the final mart logic changed.

That is much faster than debugging one giant dashboard calculation.

A metric needs all four parts

For a metric to be trusted, define:

ComponentQuestion
FactWhat event or state are we measuring?
GrainWhat does one row represent?
DimensionWhich context describes the event?
Time ruleWhen does the event count?

For Brightside Bikes, “Active Bike Owner” might mean:

  • Fact: customer has at least one completed bike order.
  • Grain: one row per customer.
  • Dimension: customer lifecycle status and product category.
  • Time rule: active if bike order date is within the last 12 months.

That can become explicit SQL:

select
  customer_id,
  max(order_created_at) as latest_bike_order_at,
  case
    when max(order_created_at) >= current_date - interval '12 months'
      then true
    else false
  end as is_active_bike_owner
from fct_order_lines ol
left join dim_sku s
  on ol.sku_id = s.sku_id
where s.product_category = 'Bike'
  and ol.is_cancelled = false
group by 1;

The code is only trustworthy if the business agrees the definition.

The model makes that definition inspectable.

What to inspect first

When a metric is disputed, inspect:

  • What fact is being measured?
  • What is the grain of the model?
  • Which dimensions are joined in?
  • Are dimension keys unique?
  • Which time field places the metric into a period?
  • Are late-arriving events handled?
  • Are facts being aggregated before or after joins?
  • Does the dashboard change the grain again?

Most metric disputes become clearer once these questions are answered.

What good looks like

Good reporting does not require every user to understand dimensional modeling.

It requires the data team to use modeling discipline so users do not have to guess.

For Brightside Bikes, a leadership dashboard should not leave executives wondering whether revenue is order-level, order-line-level, refund-adjusted, recognised, settled, or campaign-attributed.

The model should make that meaning clear before the number reaches the meeting.

Facts record what happened.

Dimensions explain it.

Grain controls how it can be counted.

Time rules decide when it belongs.

Together, those choices are the moment a metric starts to mean something.

Technical note

Use this as supporting context

This article is part of a low-profile technical series behind the Reporting Trust framework. For the executive-level problem, start with the business-facing articles on mismatched numbers and source-to-report lineage.

Start with the business problem