Wednesday, March 28, 2012

merge replication with large amounts of updates at publisher

I have merge replication set up and it has been running with no problems... until a
scheduled task that runs once every night and updates a very large number of
records was introduced. Merge replication is scheduled to run every hour. Now the
merge attempt directly following the nightly task that updates that large number
of records fails, and continues to fail every hour. It eventually does
succeed sometime in the evening. The scheduled task and the initial replication failure
occur around 4 AM Eastern time. I would think that would be a somewhat low-traffic
time on the net... maybe not. It usually succeeds sometime between 6
and 8 PM Eastern time.
1) When a large number of updates occurs, is every record sent to the subscriber
in its entirety? Or does the merge replication system automatically package
things into more efficient jobs of some sort?
2) When replication does finally succeed it reports taking over 1 hour. Since my
schedule is every hour, could that be part of, or THE, problem? Meaning a
merge job is scheduled to start but the last one is still running? Could
that be why it's failing?
3) Could any concurrency/locking type issues occur at the subscriber while
the merge agent is operating? Especially since it is continually failing and
retrying and failing and retrying?
4) Is 50,000 records just way too many to expect to be able to merge (over
the internet)?
Any info is greatly appreciated. Thanks.
1) Replication metadata is compared to determine which row(s) need
updating/deleting/inserting. Basically the way it works is that you have a
rowguid which identifies each row. If a row is modified, triggers write the
rowguid and some tracking information to the MSmerge_contents table.
The merge agent will read all rows in the MSmerge_contents table that have been
added since the last time it ran. It will then compare the version
(generation) of each row with the version of the row on the
publisher/subscriber. The publisher/subscriber with the latest generation is
determined to be the winner; the source table on the publisher or
subscriber is then consulted, the data values are extracted, and an
insert/update/delete statement is fired on the subscriber/publisher. (There is
a rough query sketch after this list if you want to look at that metadata yourself.)
So initially it's just the rowguid and version information that travels
across the wire, and then it could be the row itself.
2) I think you would benefit from setting QueryTimeOut to something large, e.g.
600. You can set this by right-clicking on your merge agent, selecting agent
properties, clicking Steps, clicking the Run agent step, and then in the command
section clicking Edit; look for -QueryTimeOut and set it to 600. If it's
not there, add it (see the example command line after this list). The merge agent
will run for a predetermined time and then may fail. Sometimes it will take 3 or 4
runs before it completely processes all transactions. Many DBAs wrap their merge
agents in infinite loops. The job scheduler for SQL Server will not run two
simultaneous instances of the same merge agent.
3) It's possible. Normally you will get a deadlock message in your merge
agent history or in msrepl_errors. Normally you run into locking/concurrency problems
when you have a large number of merge agents running simultaneously. There
is a setting to limit the number of concurrent merge agents: right-click on
your publication, select publication properties, and then click the
Subscriber tab. Many DBAs will stagger their merge agents so they don't run
simultaneously, for instance by scheduling them to run every 17 minutes
or so (see the scheduling example after this list). There is nothing magic about
17; it's just a prime number, which keeps the staggered schedules from repeatedly
lining up.
4) Not at all, but this depends on the hardware of the publisher and subscriber
and the bandwidth available. You might benefit from the slow link profile if
you have low bandwidth.
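
To make 1) a little more concrete, here is a rough sketch of how you can peek at
the change-tracking metadata in the published database. This is just for poking
around; the table and column names below are the SQL 2000 ones and can differ
between versions.

-- run in the published database: the most recently tracked row changes
SELECT TOP 100 c.tablenick, c.rowguid, c.generation
FROM MSmerge_contents c
ORDER BY c.generation DESC

-- the generation history the merge agent compares against
SELECT TOP 100 generation, art_nick, coldate
FROM MSmerge_genhistory
ORDER BY coldate DESC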
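
For 2), once -QueryTimeOut is added, the Run agent step command ends up looking
roughly like this. The server, database, and publication names here are made up;
keep whatever parameters your agent already has and just append the switch.

-Publisher [MYPUBLISHER] -PublisherDB [MyPubDB] -Publication [MyPublication]
-Subscriber [MYSUBSCRIBER] -SubscriberDB [MySubDB] -Distributor [MYPUBLISHER]
-QueryTimeOut 600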
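
For 3), if you would rather stagger the agents from T-SQL than through the UI,
you can attach a schedule that fires every 17 minutes to a merge agent job with
something like the following. The job name is made up; substitute the actual
name of your merge agent job.

EXEC msdb.dbo.sp_add_jobschedule
    @job_name = N'MYPUBLISHER-MyPubDB-MyPublication-MYSUBSCRIBER-1', -- your merge agent job
    @name = N'Every 17 minutes',
    @freq_type = 4,              -- daily
    @freq_interval = 1,          -- every day
    @freq_subday_type = 4,       -- subday units = minutes
    @freq_subday_interval = 17,  -- fire every 17 minutes
    @active_start_time = 0       -- starting at midnight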
"djc" <noone@.nowhere.com> wrote in message
news:OV931hVrEHA.332@.TK2MSFTNGP14.phx.gbl...
>I have merge repl setup and it has been running with no problem... until a
> scheduled task that runs once every night and updates a very large amount
> of
> records was introduced. Merge repl is scheduled to run every hour. Now the
> merge attempt directly following the nightly task that updates a large
> amount of udates fails, and continues to fail every hour. It eventually
> does
> succeed sometime in the evening. The scheduled task and initial repl
> failure
> occur around 4am eastern time. I would think that would be a somewhat low
> traffic time on the net... maybe not. It usually succeeds sometime between
> 6
> and 8 pm eastern time.
> 1) when a large amount of updates occur, is every record in its entirety
> sent to the subscriber? or does the merge repl system automatically
> package
> things into more effecient jobs of some sort?
> 2) when repl does finally succeed it reports taking over 1 hour. Since my
> schedule is every hour could that be part of, or THE problem? meaning a
> merge job is scheduled to start but the last one is still running? could
> that be why its failing?
> 3) could any concurrency/locking type issues occur at the subscriber while
> the merge agent is operating? especially since it is continually failing
> and
> retrying and failing and retrying?
> 4) is 50,000 records just way to many to expect to be able to merge? (over
> the internet)
> any info is greatly appreciated. Thanks.
>
Thanks Hilary.
1) You suggested adding the QueryTimeOut setting directly to the command in
the agent. I just want to verify that the dialog box you get by right-clicking
on the merge agent (from the Replication Monitor) and choosing Agent Profile
lets you do the same thing. The dialog has several template profiles, like the
Slow Link one you mentioned, and others. I have played with the QueryTimeOut
value from there. I just want to make sure that's OK as well, since you
suggested adding this option to the command itself... any difference?
2) "The merge agent will run for a predetermined time and then may fail.
Sometimes it will take 3 or 4 times before it completely processes all
transactions. Many DBA wrap their merge agents in infinite loops. The job
scheduler for SQL Server will not run two simultaneous instances of the same
merge agent."
Well, wrapping the agents in an infinite loop is beyond my current skill
level although if it comes to that then I will have to learn. When you said
that the agent will run for a predetermined time and then fail. Sometimes 3
or 4 times before it completely processes all transactions... after failing
does it pick up where it left off? or does it start all over again? If it
picks up where it left off does that mean that its possible that out of a
50,000 record merge job that takes 6 times to complete that the records
updated during each of the failed attempts are committed before the final
completion of the whole 50,000 job?
3) In Books Online I read about MaxUploadChanges and MaxDownloadChanges.
It says you can set a limit on the number of changes processed per merge agent
session, but it doesn't say what happens if you hit that limit. Let's say you
set a limit of 2,500 records but you have 30,000 records that need to be
merged. After it stops at 2,500, will the remaining changes still be processed
2,500 at a time during subsequent merge agent runs? If so, this may allow me
to split my one large update into smaller chunks, since it only happens
once, overnight. (I'm assuming those switches go on the agent command line;
a rough example follows below.)
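
In case it matters, here is how I'm assuming those options would be added to the
merge agent's command line, the same way as -QueryTimeOut (the names below are
made up):

-Publisher [MYPUBLISHER] -PublisherDB [MyPubDB] -Publication [MyPublication]
-Subscriber [MYSUBSCRIBER] -SubscriberDB [MySubDB]
-MaxUploadChanges 2500 -MaxDownloadChanges 2500 -QueryTimeOut 600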
thanks again for all your help. It is very much appreciated.
-djc
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:Ov2WLwXrEHA.1204@.TK2MSFTNGP12.phx.gbl...
> 1) replication metadata is compared to determine which row(s) need
> updating/deleting/inserting. Basically the way it works is that you have
a
> rowguid which identifies each row. If a row is modified triggers write the
> rowguid and some tracking information to the msmerge_contents table.
> The merge agent will read all rows in the msmerge_contents table have been
> added since the last time it ran. It will then compare the version
> (generation) of this row with the version of the row on the
> publisher/subscriber. The publisher/subscriber with the latest generation
is
> determined to be the winner, and then the source table on the publisher or
> subscriber is consulted and the data values are extracted, and a
> insert/update/delete statement is fired on the subscriber/publisher.
> So initially its just the rowguid and version information which travels
> across the wire, and then it could be the row itself.
> 2) I think you would benefit by setting QueryTimeOut to something large,
it
> 600. You can set this by right clicking on your merge agent, selecting
agent
> properties, clicking steps, click run agent, and then in the commands
> section, click edit, and look for -QueryTimeOut and set it to 600. It its
> not there add it. The merge agent will run for a predetermined time and
then
> may fail. Sometimes it will take 3 or 4 times before it completely
processes
> all transactions. Many DBA wrap their merge agents in infinite loops. The
> job scheduler for SQL Server will not run two simultaneous instances of
the
> same merge agent.
> 3) Its possible. Normally you will get a deadlock message in your merge
> agent or msrepl_errors. Normally you run into locking/concurrency problems
> when you have a large number of merge agents running simultaneously. There
> is a setting to limit the number of concurrent merge agents. Right click
on
> your publication, select publication properties, and then click the
> subscriber tab. Many DBAs will stagger their merge agents so they don't
run
> simultaneously. For instance they will schedule them to run every 17
minutes
> or so. There is nothing magic about 17, its just a larger prime number.
> 4) Not at all, but this depends on hardware of the publisher and
subscriber
> and the bandwidth available. You might benefit from the slow link profile
if[vbcol=seagreen]
> you have low bandwidth.
>
> "djc" <noone@.nowhere.com> wrote in message
> news:OV931hVrEHA.332@.TK2MSFTNGP14.phx.gbl...
a[vbcol=seagreen]
amount[vbcol=seagreen]
the[vbcol=seagreen]
low[vbcol=seagreen]
between[vbcol=seagreen]
my[vbcol=seagreen]
while[vbcol=seagreen]
(over
>
sql
