Technical note

Star Schema, One Big Table, Data Vault: How to Choose Without Dogma

By Gavin Somers · 9 min read · Published 18 June 2026

Data modelling debates can become strangely religious.

Star schema. One big table. Data Vault. Normalised models. Wide marts. Semantic layers.

Each pattern has serious practitioners and real use cases.

But the useful question is not, “Which pattern is best?”

The useful question is, “Which pattern makes this reporting path more trustworthy for this business decision?”

For Reporting Trust, modelling patterns are not ideology.

They are tradeoffs.

The symptom: the architecture debate replaces the reporting problem

At Brightside Bikes, the leadership team does not care whether the weekly operating report is powered by a star schema, one big table, or a vault-inspired integration layer.

They care whether the numbers are trusted.

They want to know:

Why Shopify Net Revenue differs from Finance Revenue.
Which product categories are driving margin.
Whether paid media is acquiring profitable customers.
Whether inventory constraints are suppressing sales.
Whether the board pack can be reconciled to operational reporting.

If the modelling debate does not improve those answers, it is a distraction.

The architecture should serve the reporting contract.

Star schema: strong for repeated business questions

A star schema separates facts from dimensions.

The fact table records the event. Dimension tables describe the event.

For Brightside Bikes, a simple star might include:

fct_order_lines
dim_customer
dim_sku
dim_calendar
dim_channel

This pattern works well when the business repeatedly asks similar questions across stable dimensions.

For example:

select
  c.reporting_week,
  s.product_category,
  ch.channel_group,
  sum(ol.net_revenue_amount) as net_revenue,
  sum(ol.quantity) as units_sold
from fct_order_lines ol
left join dim_calendar c
  on ol.order_date = c.calendar_date
left join dim_sku s
  on ol.sku_id = s.sku_id
left join dim_channel ch
  on ol.channel_id = ch.channel_id
group by 1, 2, 3;

The benefit is clarity.

Users can understand the event and the context separately. Tests can protect keys and grain. Dimensions can be reused across reports.

The risk is that star schemas require discipline.

If dimensions are poorly maintained, keys are duplicated, or the fact grain is unclear, the star does not protect trust by itself.

One big table: strong for usability, risky for ambiguity

One big table means creating a wide model that contains many of the fields a dashboard or analyst needs in a single place.

For Brightside Bikes, that might be:

One row per order line.
Revenue, cost, margin, customer segment, product category, channel, and week all included.

This can be useful.

It reduces join mistakes for downstream users. It can make BI dashboards faster. It gives analysts one obvious place to start.

Example:

select
  order_line_id,
  order_id,
  customer_id,
  order_week,
  product_category,
  channel_group,
  customer_lifecycle_stage,
  net_revenue_amount,
  standard_unit_cost,
  contribution_margin_amount
from mart_order_line_commercials;

The benefit is accessibility.

The risk is hidden complexity.

If the wide table mixes grains, includes stale dimension attributes, or silently bakes in multiple definitions, users may trust it because it is convenient rather than because it is well governed.

One big table can be a good consumption layer.

It should not become a dumping ground for unresolved modelling decisions.

Data Vault: strong for auditability and change, heavy for direct reporting

Data Vault patterns separate business keys, relationships, and descriptive history.

The approach can be useful when source systems change often, auditability is critical, or the business needs to preserve historical relationships carefully.

For Brightside Bikes, a vault-inspired integration layer might track:

Customer business keys across Shopify, HubSpot, and support systems.
Product keys across Shopify and inventory files.
Order relationships across Shopify, Stripe, and fulfilment.
Historical changes to customer or SKU attributes.

The benefit is traceability.

The risk is complexity.

Data Vault structures are usually not friendly for direct executive reporting. They often need downstream business marts, star schemas, or wide tables before users can consume them confidently.

That does not make the pattern wrong.

It means the pattern solves a different problem.

The decision is about trust risk

The right modelling choice depends on what can go wrong.

Pattern	Useful when	Watch out for
Star schema	Repeated reporting across stable facts and dimensions	Poorly defined grain or duplicate dimension keys
One big table	Dashboard usability and fewer downstream joins matter	Hidden business rules and mixed grains
Data Vault	Auditability, source change, and historical traceability matter	Too much complexity for direct reporting users

The best architecture may use more than one pattern.

Brightside Bikes might use staging models for raw source cleanup, a vault-like integration layer for customer identity history, star schemas for core order reporting, and wide marts for specific dashboards.

That is not inconsistency.

It is architecture serving different jobs.

Model layers can have different audiences

One mistake is expecting one model shape to satisfy every user.

Engineers may need traceability.

Analysts may need flexible facts and dimensions.

Executives may need stable marts.

Finance may need reconciliation and audit trails.

The same source data can support multiple layers:

Raw and staging models preserve source meaning.
Integration models resolve keys and relationships.
Business models define facts, dimensions, and approved logic.
Marts present decision-ready views.

The important thing is that the path between layers is clear.

Without lineage, multiple model shapes can become another source of confusion.

Brightside Bikes: choosing by reporting question

For weekly trading performance, Brightside Bikes may use a star schema.

The business needs to slice order lines by product, channel, week, and customer segment. Facts and dimensions fit that job well.

For the executive operating dashboard, the team may publish a wide mart.

The dashboard should not have to join six models every time someone opens it. It needs approved metrics at the right grain.

For customer identity resolution, the team may need a more historical integration pattern.

Shopify, HubSpot, and support records may identify the same customer differently. A simple dashboard table may hide those changes.

The correct answer is not one pattern everywhere.

The correct answer is controlled translation between patterns.

SQL should reveal the modelling intent

Good model SQL should make the intended pattern obvious.

For a star-style mart:

select
  ol.order_line_id,
  ol.order_id,
  ol.customer_id,
  ol.sku_id,
  ol.channel_id,
  ol.order_date,
  ol.net_revenue_amount,
  ol.quantity
from fct_order_lines ol;

For a dashboard-ready wide mart:

select
  ol.order_week,
  s.product_category,
  ch.channel_group,
  count(distinct ol.order_id) as orders,
  sum(ol.net_revenue_amount) as net_revenue,
  sum(ol.contribution_margin_amount) as contribution_margin
from fct_order_lines ol
left join dim_sku s
  on ol.sku_id = s.sku_id
left join dim_channel ch
  on ol.channel_id = ch.channel_id
group by 1, 2, 3;

Both can be valid.

They serve different users.

What to inspect first

Before choosing a modelling pattern, inspect:

Who will use the model?
What decision does it support?
How stable are the source systems?
How often do definitions change?
Does the business need audit history?
Is the main risk usability, traceability, performance, or correctness?
Where should joins happen?
What model grain does the dashboard need?
How will tests protect keys and relationships?

The answer should come from the reporting risk, not personal preference.

What good looks like

Good architecture does not mean one pattern wins every argument.

It means each layer has a clear job.

For Brightside Bikes:

Source-aligned models preserve operational facts.
Integration models resolve identities and relationships.
Star schemas make core business processes reusable.
Wide marts make priority dashboards fast and clear.
Reconciliation models explain differences between legitimate definitions.

That is how modelling patterns reduce the Invisible Data Tax.

They stop teams from rebuilding meaning in every dashboard and meeting.

The goal is not a perfect schema.

The goal is a reporting path the business can inspect, trust, and maintain.

Star Schema, One Big Table, Data Vault: How to Choose Without Dogma

The symptom: the architecture debate replaces the reporting problem

Star schema: strong for repeated business questions

One big table: strong for usability, risky for ambiguity

Data Vault: strong for auditability and change, heavy for direct reporting

The decision is about trust risk

Model layers can have different audiences

Brightside Bikes: choosing by reporting question

SQL should reveal the modelling intent

What to inspect first

What good looks like

Useful terms for this article

Data Value Chain

Semantic Layer Gate

Reporting Contract

Keep reading

How to Model JSON Event Data With SQL

How to Use AI in Analytics Engineering Without Damaging the Repo

Use this as supporting context