Monitor SQL Server CDC with Customer Data Collector

I’ve posted several times over the years here @MSSQLDUDE on the benefits of SQL Server Management Data Warehouse (MDW) for baselining your SQL Server performance and sending those measurements to a central server for monitoring. MDW makes use of the SQL Server Data Collector feature and is a feature you get out of the box. Here are some of my writings on MDW here @MSSQLDUDE: http://en.search.wordpress.com/?q=mdw&site=mssqldude.wordpress.com.

There is a debate among SQL Server DBAs about the worthiness of MDW and Data Collectors because they can appear to be a “heavy burden” on your server and provide overhead that you may not wish to occur on your production server. That being said, the way things work in our world and jobs in databases is a give and take with performance, monitoring, scalability and providing functionality to our business users & customers to allow them to do their jobs. Any monitoring and baselining product incur a small performance cost to your servers. There are a number of options out there in the market and many DBAs write their own scripts to store DMV data into tables and files to provide similar functionality.

What I wanted to point you all to is something that I think proves the value of Data Collectors (and MDW) through extending the functionality with custom data collectors.

I’ve been working on a CDC project recently in a large BI/DW scenario and one thing that you notice when working on CDC instead of longer-run SQL Server features like Replication, is that there is not much support in SSMS or other built-in SQL Server tools for CDC.

But what is really nice is this custom data collector for CDC that is included on MSDN from this terrific article onMonitoring the Change Data Capture Process. Scroll down to the Data Collector portion and the code to paste into SSMS is right there for you to create and start your data collector. This is a great way to monitor your CDC log processes.

BTW, you have to make an update to the XML in code from MSDN … Look for the SELECT @parameters code and replace the entire SELECT … XML statement with this:

SELECT @paramters = CONVERT(xml, 
    N'<ns:TSQLQueryCollector xmlns:ns="DataCollectorType">
        <Query>
          <Value>SELECT * FROM UserDB.sys.dm_cdc_log_scan_sessions</Value>
          <OutputTable>cdc_log_scan_data</OutputTable>
        </Query>
      </ns:TSQLQueryCollector>');

Enjoy! Best, Mark

Let’s Start with Change Data Capture

Change Data Capture (CDC) is a capability that SQL Server added starting with SQL Server 2008. It is a capability that other databases such as Oracle and 3rd party providers of general CDC behavior. In the past, many database administrators would use SQL Server’s transactional or snapshot replication to achieve a similar data refresh and data capture functionality in data warehouse systems.

CDC is very powerful and with the capabilities of CDC in SQL Server 2008, it is much easier to keep applications such as data warehouse and business intelligence systems refreshed with recent transactional system changes. If you were to use CDC in an SSIS pipeline, for example, you could query for the changed rows and perform transformations on that updated dataset.

I also started the blog off with CDC because I previously wrote an article for SearchSQLServer that I can just point you to for more insights here!

But there is also a much more lengthy, and very good, article in MS SQL Tips here that shows the process in its entirety as part of an SSIS package. Check it out.