Showing posts with label built. Show all posts
Showing posts with label built. Show all posts

Friday, March 30, 2012

Merge Star Schema

Hi all,
We have two star schemas built into our warehouse: one called
vehicle data and one called parts data. For some reason the warehouse
was built with these quite separate 'data marts' (not sure if that's
the right term)? Even though the parts will always relate to the
vehicle.
How should these be merged? Do we connect via the dimension
tables? Or do we need a some other new type of tables to carry the
keys?
Would another strategy be to place all the parts facts
into the vehicle fact table and then merge the dimension tables also?
Many thanks in advance for any pointers.
Sam
Hello SGPG,
I hope you haven't given up on this thread. It's hard to get an
idea of your problem with out a bit more information. If you could
post a brief description of each fact and dimension table that would be
a great help.
Here are a couple of points to consider in the mean time
Star Schemas can be very hard to consolidate. You see it's all a
question of grain...
Grain is the lowest unique level of the fact table. In most star
schemas the grain defines the grouping of the metrics. Grain is defined
by the dimensions and the primary key of the fact table.
You can consolidate fact tables if the metrics make sense and the fact
table grains are the same.
For example
You have two star schemas each maintain a different set of metrics.
Below are Fact Invoices (what has been sold to the customer) and Fact
Orders (what the customer has ordered
Fact Invoices
InvoiceDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
InvoiceUnitPrice
InvoiceQty
InvoiceLineTotal
Fact Orders
OrderDate
DueDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
OrderUnitPrice
OrderQty
OrderLineTotal
Both fact tables have a common set of dimensions and grain. We know
that Invoice Number and Invoice line number are the same. Also we have
the common dimension of Customer and product. So the lowest level that
the metrics make sense is (CustomerNumber, ProductCode, InvoiceNumber,
InvoiceLineNumber)
In this example the order dates and the invoices do not affect the
grain of the common fact table. So the consolidate fact table would
look something this.
Fact Sales
InvoiceDate
OrderDate
DueDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
OrderUnitPrice
OrderQty
OrderLineTotal
InvoiceUnitPrice
InvoiceQty
InvoiceLineTotal
Although this example assumes that every order has an invoice. It is
possible to have orders that do not have invoices in this star schema.
The invoice metrics would be represented as 0 with no invoice date.
Hope this Helps
Myles Matheson
Data Warehouse Architect
|||Thanks Miles. I do not have the detail of the example I posed as it's
a while since I saw it. My question stemmed from, what I saw, as
something of a design flaw as a vehicle fact table was built with
dimension tables around it. Later a parts fact was built which seemed
to be the beginning of a data mart approach to data warehousing. Your
solution might not have worked in this case as parts were usually
aggregated up (i.e. a vehicle might have 10 or more parts installed).
I wondered how this should have been designed to incorporate both parts
and vehicles.
sql

Merge Star Schema

Hi all,
We have two star schemas built into our warehouse: one called
vehicle data and one called parts data. For some reason the warehouse
was built with these quite separate 'data marts' (not sure if that's
the right term)? Even though the parts will always relate to the
vehicle.
How should these be merged? Do we connect via the dimension
tables? Or do we need a some other new type of tables to carry the
keys?
Would another strategy be to place all the parts facts
into the vehicle fact table and then merge the dimension tables also?
Many thanks in advance for any pointers.
SamHello SGPG,
I hope you haven't given up on this thread. It's hard to get an
idea of your problem with out a bit more information. If you could
post a brief description of each fact and dimension table that would be
a great help.
Here are a couple of points to consider in the mean time
Star Schemas can be very hard to consolidate. You see it's all a
question of grain...
Grain is the lowest unique level of the fact table. In most star
schemas the grain defines the grouping of the metrics. Grain is defined
by the dimensions and the primary key of the fact table.
You can consolidate fact tables if the metrics make sense and the fact
table grains are the same.
For example
You have two star schemas each maintain a different set of metrics.
Below are Fact Invoices (what has been sold to the customer) and Fact
Orders (what the customer has ordered
Fact Invoices
InvoiceDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
InvoiceUnitPrice
InvoiceQty
InvoiceLineTotal
Fact Orders
OrderDate
DueDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
OrderUnitPrice
OrderQty
OrderLineTotal
Both fact tables have a common set of dimensions and grain. We know
that Invoice Number and Invoice line number are the same. Also we have
the common dimension of Customer and product. So the lowest level that
the metrics make sense is (CustomerNumber, ProductCode, InvoiceNumber,
InvoiceLineNumber)
In this example the order dates and the invoices do not affect the
grain of the common fact table. So the consolidate fact table would
look something this.
Fact Sales
InvoiceDate
OrderDate
DueDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
OrderUnitPrice
OrderQty
OrderLineTotal
InvoiceUnitPrice
InvoiceQty
InvoiceLineTotal
Although this example assumes that every order has an invoice. It is
possible to have orders that do not have invoices in this star schema.
The invoice metrics would be represented as 0 with no invoice date.
Hope this Helps
Myles Matheson
Data Warehouse Architect|||Thanks Miles. I do not have the detail of the example I posed as it's
a while since I saw it. My question stemmed from, what I saw, as
something of a design flaw as a vehicle fact table was built with
dimension tables around it. Later a parts fact was built which seemed
to be the beginning of a data mart approach to data warehousing. Your
solution might not have worked in this case as parts were usually
aggregated up (i.e. a vehicle might have 10 or more parts installed).
I wondered how this should have been designed to incorporate both parts
and vehicles.