Friday, March 30, 2012
Merge Star Schema
We have two star schemas built into our warehouse: one called
vehicle data and one called parts data. For some reason the warehouse
was built with these quite separate 'data marts' (not sure if that's
the right term)? Even though the parts will always relate to the
vehicle.
How should these be merged? Do we connect via the dimension
tables? Or do we need a some other new type of tables to carry the
keys?
Would another strategy be to place all the parts facts
into the vehicle fact table and then merge the dimension tables also?
Many thanks in advance for any pointers.
Sam
Hello SGPG,
I hope you haven't given up on this thread. It's hard to get an
idea of your problem with out a bit more information. If you could
post a brief description of each fact and dimension table that would be
a great help.
Here are a couple of points to consider in the mean time
Star Schemas can be very hard to consolidate. You see it's all a
question of grain...
Grain is the lowest unique level of the fact table. In most star
schemas the grain defines the grouping of the metrics. Grain is defined
by the dimensions and the primary key of the fact table.
You can consolidate fact tables if the metrics make sense and the fact
table grains are the same.
For example
You have two star schemas each maintain a different set of metrics.
Below are Fact Invoices (what has been sold to the customer) and Fact
Orders (what the customer has ordered
Fact Invoices
InvoiceDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
InvoiceUnitPrice
InvoiceQty
InvoiceLineTotal
Fact Orders
OrderDate
DueDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
OrderUnitPrice
OrderQty
OrderLineTotal
Both fact tables have a common set of dimensions and grain. We know
that Invoice Number and Invoice line number are the same. Also we have
the common dimension of Customer and product. So the lowest level that
the metrics make sense is (CustomerNumber, ProductCode, InvoiceNumber,
InvoiceLineNumber)
In this example the order dates and the invoices do not affect the
grain of the common fact table. So the consolidate fact table would
look something this.
Fact Sales
InvoiceDate
OrderDate
DueDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
OrderUnitPrice
OrderQty
OrderLineTotal
InvoiceUnitPrice
InvoiceQty
InvoiceLineTotal
Although this example assumes that every order has an invoice. It is
possible to have orders that do not have invoices in this star schema.
The invoice metrics would be represented as 0 with no invoice date.
Hope this Helps
Myles Matheson
Data Warehouse Architect
|||Thanks Miles. I do not have the detail of the example I posed as it's
a while since I saw it. My question stemmed from, what I saw, as
something of a design flaw as a vehicle fact table was built with
dimension tables around it. Later a parts fact was built which seemed
to be the beginning of a data mart approach to data warehousing. Your
solution might not have worked in this case as parts were usually
aggregated up (i.e. a vehicle might have 10 or more parts installed).
I wondered how this should have been designed to incorporate both parts
and vehicles.
sql
Merge Star Schema
We have two star schemas built into our warehouse: one called
vehicle data and one called parts data. For some reason the warehouse
was built with these quite separate 'data marts' (not sure if that's
the right term)? Even though the parts will always relate to the
vehicle.
How should these be merged? Do we connect via the dimension
tables? Or do we need a some other new type of tables to carry the
keys?
Would another strategy be to place all the parts facts
into the vehicle fact table and then merge the dimension tables also?
Many thanks in advance for any pointers.
SamHello SGPG,
I hope you haven't given up on this thread. It's hard to get an
idea of your problem with out a bit more information. If you could
post a brief description of each fact and dimension table that would be
a great help.
Here are a couple of points to consider in the mean time
Star Schemas can be very hard to consolidate. You see it's all a
question of grain...
Grain is the lowest unique level of the fact table. In most star
schemas the grain defines the grouping of the metrics. Grain is defined
by the dimensions and the primary key of the fact table.
You can consolidate fact tables if the metrics make sense and the fact
table grains are the same.
For example
You have two star schemas each maintain a different set of metrics.
Below are Fact Invoices (what has been sold to the customer) and Fact
Orders (what the customer has ordered
Fact Invoices
InvoiceDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
InvoiceUnitPrice
InvoiceQty
InvoiceLineTotal
Fact Orders
OrderDate
DueDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
OrderUnitPrice
OrderQty
OrderLineTotal
Both fact tables have a common set of dimensions and grain. We know
that Invoice Number and Invoice line number are the same. Also we have
the common dimension of Customer and product. So the lowest level that
the metrics make sense is (CustomerNumber, ProductCode, InvoiceNumber,
InvoiceLineNumber)
In this example the order dates and the invoices do not affect the
grain of the common fact table. So the consolidate fact table would
look something this.
Fact Sales
InvoiceDate
OrderDate
DueDate
CustomerNumber
ProductCode
InvoiceNumber
InvoiceLineNumber
OrderUnitPrice
OrderQty
OrderLineTotal
InvoiceUnitPrice
InvoiceQty
InvoiceLineTotal
Although this example assumes that every order has an invoice. It is
possible to have orders that do not have invoices in this star schema.
The invoice metrics would be represented as 0 with no invoice date.
Hope this Helps
Myles Matheson
Data Warehouse Architect|||Thanks Miles. I do not have the detail of the example I posed as it's
a while since I saw it. My question stemmed from, what I saw, as
something of a design flaw as a vehicle fact table was built with
dimension tables around it. Later a parts fact was built which seemed
to be the beginning of a data mart approach to data warehousing. Your
solution might not have worked in this case as parts were usually
aggregated up (i.e. a vehicle might have 10 or more parts installed).
I wondered how this should have been designed to incorporate both parts
and vehicles.
Merge SQL 7 to 2000 problem
a SQL 7 publisher/distributor and a SQL 2000 subscriber
(push). After the subscription is initialized, the agent
fails with the error "The process could not query row
metadata at the Subscriber." The error details
says "Could not find stored procedure ''." A clip from
the log is below. This same merge replication works fine
from SQL 7 to SQL 7, but fails to SQL 2000. I have tried
2 different SQL 2k boxes with the same error.
The replication job logs in using SQL Server
authentication. Account is system administrator and dbo
on subscriber. Distributor runs under sa. Both Servers
are at latest SP levels. KB search has turned up no help.
Thanks for any ideas!
~~~~~~~~~~~~~~~~~~snip~~~~~~~~~~~~~~~~~~~
Percent Complete: 55
Processing article 'RequestStatusHistory'
Repl Agent Status: 3
chrs4.ITWorkRequest: {call sp_MSenumcolumns (?,?)}
chrs4.ITWorkRequest: {call sp_MSenumchanges(?,?,?,?,?)}
CHHIST.ITWorkRequest: {call sp_MSgetrowmetadata
(?,?,?,?,?,?,?)}{call sp_MSgetrowmetadata(?,?,?,?,?,?,?)}
{call sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}
Percent Complete: 0
The process could not query row metadata at the
Subscriber.
Repl Agent Status: 6
Percent Complete: 0
Category:COMMAND
Source: Failed Command
Number:
Message: {call sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call
sp_MSgetrowmetadata(?,?,?,?,?,?,?)}{call sp_M
Repl Agent Status: 3
Percent Complete: 0
Category:SQLSERVER
Source: CHHIST
Number: 2812
Message: Could not find stored procedure ''.
Repl Agent Status: 3
Could not find stored procedure ''.
Disconnecting from Publisher 'chrs4'
Disconnecting from Subscriber 'CHHIST'
Disconnecting from Publisher 'chrs4'
Disconnecting from Distributor 'chrs4'
George,
this is not a supported configuration. For merge replication, a SQL 7.0
publisher can only publish to a SQL 7.0 Subscriber.
Rgds,
Paul Ibison
(recommended sql server 2000 replication book:
http://www.nwsu.com/0974973602p.html)
|||Paul,
Thanks for the reply. I read the "...Different Versions"
article in books online, but obviously scanned too
quickly. I see the limitation now for merge, but not for
snap and trans. You might help out "Nick Horrocks" with
an answer to his thread "Unable to create Merge
subscription".
George
>--Original Message--
>George,
>this is not a supported configuration. For merge
replication, a SQL 7.0
>publisher can only publish to a SQL 7.0 Subscriber.
>Rgds,
>Paul Ibison
>(recommended sql server 2000 replication book:
>http://www.nwsu.com/0974973602p.html)
>
>.
>
|||Thanks for the prompt - have posted to Nick as well.
Rgds,
Paul Ibison
(recommended sql server 2000 replication book:
http://www.nwsu.com/0974973602p.html)
>--Original Message--
>Paul,
>Thanks for the reply. I read the "...Different Versions"
>article in books online, but obviously scanned too
>quickly. I see the limitation now for merge, but not for
>snap and trans. You might help out "Nick Horrocks" with
>an answer to his thread "Unable to create Merge
>subscription".
>George
>
>replication, a SQL 7.0
>.
>
Merge snapshot problem with views and ArticleType property with SQ
tables, views, procs, and UDFs using SQLDMO. I have no problems creating the
publication with tables only. When I try to add procs using the following
code I get the error
The @.schema_option parameter for a procedure or function schema article can
include only the options 0x0000000000000001 or 0x0000000000002000.
I assume I am not setting up the objMrgArt object properly. Any help is
appreciated.
For i = 1 To objRepSps.Count
Set objRepSp = objRepSps(i)
Set objMrgArt = New SQLDMO.MergeArticle2
objMrgArt.Name = objRepSp.Name
objMrgArt.SourceObjectName = objRepSp.Name
objMrgArt.SourceObjectOwner = objRepSp.Owner
objMrgArt.ArticleType = SQLDMORep_ProcSchemaOnly
objMrgPub.MergeArticles.Add objMrgArt
Next i
Since different article types accept different set of schema options (or in
SQLDMO terminology, MergeArticle.CreationScriptOptions) and the (arguably
semi-buggy) SQLDMO behavior of having a set of default CreationScriptOptions
that is not compatible with all article types, you would need to explicitly
specify a compatible set of CreationScriptOptions when creating a non-table
article. Extending the example that you have given below, you can write:
For i = 1 To objRepSps.Count
Set objRepSp = objRepSps(i)
Set objMrgArt = New SQLDMO.MergeArticle2
objMrgArt.Name = objRepSp.Name
objMrgArt.SourceObjectName = objRepSp.Name
objMrgArt.SourceObjectOwner = objRepSp.Owner
objMrgArt.ArticleType = SQLDMORep_ProcSchemaOnly
objMrgArt.CreationScriptOptions =
SQLDMO_CREATIONSCRIPT_TYPE.PrimaryObject Or
SQLDMO_CREATIONSCRIPT_TYPE.ExtendedProperties
objMrgPub.MergeArticles.Add objMrgArt
Next i
To ensure that the base stored procedure and the associated extended
properties are replicated with the initial snapshot.
HTH.
-Raymond
This posting is provided "as is" with no warranties and confers no rights.
"mspradley" <mspradley@.discussions.microsoft.com> wrote in message
news:57624519-68F3-4073-BF90-D2DD80EBCEF3@.microsoft.com...
> I am trying to create a merge publication with a snapshot that includes
all
> tables, views, procs, and UDFs using SQLDMO. I have no problems creating
the
> publication with tables only. When I try to add procs using the following
> code I get the error
> The @.schema_option parameter for a procedure or function schema article
can
> include only the options 0x0000000000000001 or 0x0000000000002000.
> I assume I am not setting up the objMrgArt object properly. Any help is
> appreciated.
>
> For i = 1 To objRepSps.Count
> Set objRepSp = objRepSps(i)
> Set objMrgArt = New SQLDMO.MergeArticle2
> objMrgArt.Name = objRepSp.Name
> objMrgArt.SourceObjectName = objRepSp.Name
> objMrgArt.SourceObjectOwner = objRepSp.Owner
> objMrgArt.ArticleType = SQLDMORep_ProcSchemaOnly
> objMrgPub.MergeArticles.Add objMrgArt
> Next i
>
Merge snapshot frequency
1.merge replication with 3000 subscribers
2.I have dynamic publication (NOT dynamic snapshot)
3.we observed that synchronization is slow whenever we do a data import at
the publisher
4.because of this we ran the snapshot agent immediately after the data import
5.after the snapshot run, we find the subscriber syncs faster
I would like to know what are the implications of running snapshot agent.
Will it remove the delta changes which are pending for sync? Or is it okay if
I run snapshot as much as I like?
Expect slowness whenever you do a data import as there is always the impact
of the dataload and then there is the added data you have to merge.
Are you saying if you run the snapshot you get faster sync's? This only
makes sense if you re-initialize your subscribers after regenerating the
snapshot.
If you have anonymous subscribers the snapshot is always generated each time
you run it. If you have named it is only regenerated if subscribers expire
or require reinitialization.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Ravi Lobo" <RaviLobo@.discussions.microsoft.com> wrote in message
news:45E03380-6DE7-4AB1-B20D-A97C18B04717@.microsoft.com...
>I have the following scenario,
> 1.merge replication with 3000 subscribers
> 2.I have dynamic publication (NOT dynamic snapshot)
> 3.we observed that synchronization is slow whenever we do a data import at
> the publisher
> 4.because of this we ran the snapshot agent immediately after the data
> import
> 5.after the snapshot run, we find the subscriber syncs faster
>
> I would like to know what are the implications of running snapshot agent.
> Will it remove the delta changes which are pending for sync? Or is it okay
> if
> I run snapshot as much as I like?
>
>
|||Thank you Hilary for you time. I have some more clarifications,
1.I have sql server ce subscribers
2.Hence I need to use anonymous subscription
I have the following questions here,
a)Can I use pre-generated snapshot in my case? (Anonymous + sql ce
subscribers)
b)I also have dynamic filters on the publisher. What impact I will have by
re-running the snapshot second time, on the subscriber?
"Hilary Cotter" wrote:
> Expect slowness whenever you do a data import as there is always the impact
> of the dataload and then there is the added data you have to merge.
> Are you saying if you run the snapshot you get faster sync's? This only
> makes sense if you re-initialize your subscribers after regenerating the
> snapshot.
> If you have anonymous subscribers the snapshot is always generated each time
> you run it. If you have named it is only regenerated if subscribers expire
> or require reinitialization.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> "Ravi Lobo" <RaviLobo@.discussions.microsoft.com> wrote in message
> news:45E03380-6DE7-4AB1-B20D-A97C18B04717@.microsoft.com...
>
>
Merge Snapshot execution does not create any BCP files for data tables.
After running Merge Snapshot Subscription Job No BCP files are created
for the data tables:
here is my code for the Merge Articles:
Dim lbResult As Boolean
Dim lobjMArt As MergeArticle2
Dim lbHasIdentity As Boolean
Set lobjMArt = New MergeArticle2
lobjMArt.Name = asTableName
lobjMArt.SourceObjectName = asTableName
lobjMArt.SourceObjectOwner = "dbo"
UpdateLog "CreationScriptOptions Default Value is: " &
lobjMArt.CreationScriptOptions
If abHasIdentity Then
lobjMArt.AutoIdentityRange = True
lobjMArt.PublisherIdentityRangeSize = 1000000
lobjMArt.SubscriberIdentityRangeSize = 1000000
lobjMArt.IdentityRangeThreshold = 85
End If
pobjSQLDMOMerge.MergeArticles.Add lobjMArt
Set lobjMArt = Nothing
lbResult = True
The Job History Shows:
bulk copying snapshot data for article [TableName] (0 rows).
This is shown for all tables. 90 % of my tables contain records. For
Example the Patient Table has 675 records.
What am i Missing?
Thanks In Advance.
Brian
The agent just ran and created the bcp files.
The agent is scheduled to run @. 12:00 am. Why did it work through
scheduled execution and not through code.
Am I missing something from this code:
Dim lbResult As Boolean
Dim lsErrSource As String
Dim lsJobID As String
Dim liX As Long
Dim lobjDis As SQLDMO.Distributor2
Dim lobjDisPubs As SQLDMO.DistributionPublishers
Dim lobjDisPub As SQLDMO.DistributionPublication2
Dim lobjJobs As SQLDMO.Jobs
Dim lobjJob As SQLDMO.Job
Dim loJobHis As SQLDMO.JobHistoryFilter
Dim loQR As SQLDMO.QueryResults2
Dim liY As Integer 'Rows
UpdateLog "Refresh Snapshot Function"
'===Load an Default Vars
lsErrSource = "clsPublisher.RefreshSnapShot"
'===Parse the Connection String:
If ParseConnectionString(asConnect) Then
If ConnectToServer() Then
'pobjSQLServer now equals the Sever object.
'We need to get a copy of the Publication DB.
Set lobjDis = pobjSQLServer.Replication.Distributor
Set lobjDisPubs = lobjDis.DistributionPublishers
'Set lobjDisPub =
lobjDis.DistributionPublishers(psPubServer).Distri butionPublications.Item("Promise:Promise")
Set lobjDisPub =
lobjDisPubs(psPubServer).DistributionPublications. Item(1)
UpdateLog ("DisPub Name: " &
lobjDisPubs(psPubServer).DistributionPublications. Item(1).Name)
lsJobID = lobjDisPub.SnapshotJobID
psJobName = lobjDisPub.SnapshotAgent
Set lobjDisPub = Nothing
Set lobjDisPubs = Nothing
Set lobjDis = Nothing
Set lobjJobs = pobjSQLServer.JobServer.Jobs
'Get the Job to Start it.
UpdateLog ("Job Info: ID:" & lsJobID & " Name: " &
psJobName)
Call SaveSetting("Horizon Healthware Inc", "Promise",
"ReplicationJobName", psJobName)
Set lobjJob = lobjJobs.Item(psJobName)
UpdateLog ("Job: Starting Job.")
lobjJob.Invoke
liX = 0
lobjJob.Refresh
Do While lobjJob.CurrentRunStatus <>
SQLDMOJobExecution_Idle
liX = liX + 1
If liX = 200000 Then
lobjJob.Refresh
DoEvents
liX = 0
End If
Loop
' loJobHis.JobName = asJobName
' loJobHis.OldestFirst = False
'
' Set loQR =
pobjSQLServer.JobServer.EnumJobHistory(loJobHis)
'
' liX = 1
'
'
' For liX = 1 To loQR.Columns
' 'UpdateLog ("Job History ColumnName :" &
loQR.ColumnName(liX) & " Value:" & loQR.GetColumnString(1, liX))
'
' Next
Set loQR = Nothing
Set loJobHis = Nothing
Set lobjJob = Nothing
Set lobjJobs = Nothing
Call DisconnectFromServer
lbResult = True
Else
Err.Raise hhwErrorNum.lErrUnableToConnectToServer,
lsErrSource, ptErrorDesc.sErrParsingConnectionString
End If
Else
lbResult = False
Err.Raise hhwErrorNum.lErrParsingConnectionString, lsErrSource,
ptErrorDesc.sErrUnableToConnectToServer
End If
Thanks
Brian
sql
Merge sinchronization problem
Hi there.
I have big problem with SQL2005 merge sinchronization from single subscriber. This problem reveals itself on one database publications (it worked till today fine) but not others. Sincronization job writes to histrory :
The merge process was unable to create a new generation at the 'Subscriber'. Troubleshoot by restarting the synchronization with verbose history logging and specify an output file to which to write. (Source: MSSQL_REPL, Error number: MSSQL_REPL-2147200994).
This message can be found in sincronization status after 30 minutes idle with status
Upload changes to publisher
During this time the replication agent job locks another one. I think subscriber can not upload changes from subscriber, which have accumulated them since sinchronization failure start.
How can I log verbose history to file?
Will be appreciate for help.
Here is the info how to logging...
How to enable replication agents for logging to output files in SQL Server
http://support.microsoft.com/kb/312292
|||Did you ever find a solution to the problem. I have had the same error on a couple of different subscribers and the verbose logging does not show where the problem is!?
Thanks
Stuart
|||
I am also experiencing similar problems.
In my case, we are using pull subscriptions from 24 different subscribers all to one published database. Distributor is same as publisher.
No answers yet, but we're getting MS help on it -- sort of. Very difficult to track down without being able to get useful data, and getting useful data is proving to be difficult.
