Technical note
Facts, Dimensions, and the Moment a Metric Means Something
A metric does not become meaningful because someone gives it a name.
Revenue. Margin. Active Customer. Repeat Buyer. CAC. Stock Cover.
The name is only the surface.
A metric becomes meaningful when the team agrees the event being measured, the grain of the data, the dimensions that describe it, and the timing rule that places it into a reporting period.
That is where facts and dimensions matter.
Not as textbook terminology.
As trust controls.
The symptom: everyone agrees the metric name, but not the metric
At Brightside Bikes, “Active Customer” sounds simple.
But different teams may mean different things:
- A customer who placed any order in the last 90 days.
- A customer who bought a bike in the last 12 months.
- A customer who bought an accessory after buying a bike.
- A customer with an open service plan.
- A customer with a valid email and no refund-only history.
The metric name is shared.
The metric logic is not.
That creates reporting friction because the argument is hidden behind familiar words.
Facts and dimensions help surface the real decision.
Facts record business events
A fact table records something that happened.
For Brightside Bikes, facts might include:
- An order was placed.
- An order line was sold.
- A payment was captured.
- A refund was issued.
- A product was shipped.
- A stock snapshot was received.
- A customer clicked an advert.
The fact is usually measurable. It has amounts, counts, timestamps, and foreign keys that connect it to context.
Example:
select
order_line_id,
order_id,
customer_id,
sku_id,
order_created_at,
quantity,
gross_revenue_amount,
discount_amount,
refund_amount,
net_revenue_amount
from fct_order_lines;
This table says one row represents one order line.
That is the grain.
Dimensions describe the event
A dimension table adds context.
For Brightside Bikes, dimensions might include:
- Customer attributes.
- Product and SKU attributes.
- Channel and campaign attributes.
- Calendar attributes.
- Warehouse or location attributes.
Example:
select
sku_id,
sku_code,
product_name,
product_category,
bike_family,
is_accessory,
standard_unit_cost
from dim_sku;
The dimension does not usually record the sale itself.
It explains the sale.
That context is what lets the business ask better questions:
- Which product category drove revenue?
- Did bike buyers later buy accessories?
- Which customer segment is profitable?
- Which SKU has margin risk?
The grain is the contract
Grain is the most important sentence in a model’s documentation.
One row represents what?
If the grain is unclear, users can accidentally count the same event twice, join at the wrong level, or compare metrics that are not measuring the same business process.
For example, this may look reasonable:
select
date_trunc('week', o.order_created_at) as reporting_week,
sum(o.net_revenue_amount) as net_revenue,
count(distinct c.customer_id) as customers
from fct_orders o
left join dim_customer c
on o.customer_id = c.customer_id
group by 1;
If fct_orders has one row per order, that may be fine.
If the query later joins to fct_order_lines, revenue can be duplicated unless the model handles the grain change deliberately.
That is why grain is not a documentation nicety.
It is a reporting control.
Time rules decide when a metric happened
Many reporting disputes are really time disputes.
For Brightside Bikes:
- Shopify may record order creation time.
- Stripe may record payment capture time.
- Finance may recognise revenue later.
- Refunds may occur weeks after the original order.
- Inventory snapshots may arrive at a daily or weekly cadence.
Each timing rule can be valid for a different decision.
The issue is not that one time field is always correct.
The issue is whether the model says which time rule it uses.
Weekly trading revenue may use order date. Finance revenue may use recognition date. Cash reporting may use settlement date.
Those should be different definitions, not accidental inconsistencies.
Dimensions can change the meaning of the same fact
Suppose Brightside Bikes wants revenue by product category.
The order line fact records the sale. The SKU dimension assigns each SKU to a category.
select
date_trunc('week', ol.order_created_at) as reporting_week,
s.product_category,
sum(ol.net_revenue_amount) as net_revenue
from fct_order_lines ol
left join dim_sku s
on ol.sku_id = s.sku_id
group by 1, 2;
This query depends on the quality of dim_sku.
If product categories are missing, duplicated, or historically overwritten, the metric changes.
That is why dimensions need ownership too.
They are not passive labels.
They shape the business interpretation.
Facts and dimensions help separate questions
One benefit of dimensional modeling is that it separates the event from the context.
That helps teams avoid mixing business questions.
For example:
fct_ordersanswers what was bought and when.dim_customerexplains who bought it.dim_skuexplains what kind of product it was.dim_calendarexplains the reporting period.mart_weekly_operating_reportpresents the approved weekly view.
This separation makes the reporting path easier to inspect.
If revenue changed, the team can ask whether the source fact changed, a dimension changed, or the final mart logic changed.
That is much faster than debugging one giant dashboard calculation.
A metric needs all four parts
For a metric to be trusted, define:
| Component | Question |
|---|---|
| Fact | What event or state are we measuring? |
| Grain | What does one row represent? |
| Dimension | Which context describes the event? |
| Time rule | When does the event count? |
For Brightside Bikes, “Active Bike Owner” might mean:
- Fact: customer has at least one completed bike order.
- Grain: one row per customer.
- Dimension: customer lifecycle status and product category.
- Time rule: active if bike order date is within the last 12 months.
That can become explicit SQL:
select
customer_id,
max(order_created_at) as latest_bike_order_at,
case
when max(order_created_at) >= current_date - interval '12 months'
then true
else false
end as is_active_bike_owner
from fct_order_lines ol
left join dim_sku s
on ol.sku_id = s.sku_id
where s.product_category = 'Bike'
and ol.is_cancelled = false
group by 1;
The code is only trustworthy if the business agrees the definition.
The model makes that definition inspectable.
What to inspect first
When a metric is disputed, inspect:
- What fact is being measured?
- What is the grain of the model?
- Which dimensions are joined in?
- Are dimension keys unique?
- Which time field places the metric into a period?
- Are late-arriving events handled?
- Are facts being aggregated before or after joins?
- Does the dashboard change the grain again?
Most metric disputes become clearer once these questions are answered.
What good looks like
Good reporting does not require every user to understand dimensional modeling.
It requires the data team to use modeling discipline so users do not have to guess.
For Brightside Bikes, a leadership dashboard should not leave executives wondering whether revenue is order-level, order-line-level, refund-adjusted, recognised, settled, or campaign-attributed.
The model should make that meaning clear before the number reaches the meeting.
Facts record what happened.
Dimensions explain it.
Grain controls how it can be counted.
Time rules decide when it belongs.
Together, those choices are the moment a metric starts to mean something.