Monday, December 22, 2014

Distributed Replay

Recently, I had the privilege of doing some load testing for a client. I investigated different tools that would do the job, and looked at HammerDB and Distributed Replay based on suggestions I got from #SQLHelp. My client has a very old, very specialized application. The fact that Distributed Replay (DR) is a Microsoft tool, and that I could capture a workload from production and run it on a QA system, made DR an easy decision. I really needed to test the client’s specific workload exactly.
This was a new tool for me, so I had to learn it. For a DBA that is nothing new; we all do this almost daily. It is also likely why most of us do this job: we are always learning something new. When I started a test run of my process, I found I needed more than Books Online (BOL). This is also not new, which is why I love that so many of us in this industry blog. I know that someone else will someday need more information than BOL has, simply because I found a few “juicy” tidbits of information that, to my knowledge, were not documented. So my next few posts will be about Distributed Replay. I blog when I have something I think others need to know about. I found out a few things about Distributed Replay that I want to share, and I hope that sharing them will make the process easier for someone else.
Let’s start with an introduction
Microsoft documents Distributed Replay as a feature that helps you assess the impact of future SQL Server upgrades, the impact of hardware or operating system upgrades, and SQL Server tuning. SQL Server tuning is a large category that is mentioned at the end of the paragraph and not expanded on. The best way to look at it is that Distributed Replay is similar to SQL Server Profiler, but it is not limited to replaying the workload from a single computer. When replaying an intensive OLTP application workload that has many concurrent connections, Profiler can become a resource nightmare. This is where Distributed Replay comes in; it is excellent for situations where the concurrency in the captured trace is so high that a single replay client cannot sufficiently simulate it.
I want to note that although Distributed Replay was introduced in SQL Server 2012, it can take trace data captured on a SQL Server 2008 or 2008 R2 system and replay it on a SQL Server 2012 server.

In my next few posts I will explain how to configure Distributed Replay, how to use it, and what I did to collect useful information.
