Monday, 15 February 2016

Transactional Replication and Change Data Capture: The Log Reader Agent Conflict

Behind the closed doors of SQL Server, the following issue can arise when Transactional Replication and Change Data Capture (CDC) are deployed and running together on the same database and the CDC jobs have been managed incorrectly. We know that two SQL Server Agent jobs are created when CDC is enabled: 'cdc.MyDB_capture' and 'cdc.MyDB_cleanup'.
Looking into the first one, the 'cdc.MyDB_capture' job executes the 'sys.sp_MScdc_capture_job' system stored procedure, which invokes 'sp_cdc_scan' to read the Transaction Log and capture the changes made in the database. It does so through the Log Reader Agent mechanism (originally created for Transactional Replication). In other words, the 'cdc.MyDB_capture' job is the CDC agent that reads the Transaction Log by using the Log Reader Agent. Therefore, the CDC capture job and the replication Log Reader Agent cannot both read the log of the same database at the same time. If the capture job runs while Transactional Replication is enabled, we will get this error:

The capture job cannot be used by Change Data Capture to extract changes from the log when transactional replication is also enabled on the same database. When Change Data Capture and transactional replication are both enabled on a database, use the logreader agent to extract the log changes.

The error message is really clear. Put differently, two log readers cannot run against your database at the same time. When Transactional Replication is configured, the cdc.MyDB_capture job is (or should have been) dropped automatically and, if you remove Transactional Replication, the cdc.MyDB_capture job is created again. This behaviour exists because Transactional Replication has the highest priority to use the Log Reader Agent. So, if you have Transactional Replication running for your database and the cdc.MyDB_capture job is still enabled (and running), you will have to disable or drop it manually, since it will keep failing and raising the error above. Thanks for reading.
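
As a rough sketch of that manual clean-up, the following T-SQL checks whether the database is both published and CDC-enabled, lists the CDC jobs, and then disables the capture job. The database name MyDB and the job name cdc.MyDB_capture simply follow the naming used in this post, so adjust them to your environment.

```sql
-- Is the database both published for replication and CDC-enabled?
SELECT name, is_published, is_cdc_enabled
FROM sys.databases
WHERE name = N'MyDB';          -- assumed database name

-- List the CDC jobs that SQL Server Agent knows about.
SELECT name, enabled
FROM msdb.dbo.sysjobs
WHERE name LIKE N'cdc.%';

-- Option 1: disable the capture job but keep its definition.
EXEC msdb.dbo.sp_update_job
     @job_name = N'cdc.MyDB_capture',   -- assumed job name
     @enabled  = 0;

-- Option 2 (alternative): drop the capture job from within the CDC database.
-- USE MyDB;
-- EXEC sys.sp_cdc_drop_job @job_type = N'capture';
```

Disabling a job does not stop an execution that is already in progress; msdb.dbo.sp_stop_job can be used for that.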

Saturday, 13 February 2016

Looking into some concepts about the well-known tempdb database

Today's post looks into some concepts about the well-known tempdb database. As DBAs, we have to keep an eye on it not only right after installation, but also as part of our daily monitoring tasks. To do so, we really need to understand how it works and what enhancements have arrived from SQL Server 2005 onward. To begin with, tempdb is the database shared by the whole SQL Server instance, and it is used to store and manage temporary objects. It has changed in a number of ways since SQL Server 2005: there are new tempdb usages and internal optimizations. Nevertheless, the tempdb architecture is mostly unchanged since SQL Server 2000.
In a user database, transactions have the ACID properties: atomicity, consistency, isolation, and durability. In tempdb, transactions lose the durability property, which basically means that nothing in tempdb persists after SQL Server shuts down. Most internal operations on tempdb do not generate log records because there is no need to roll them back, so these operations are faster. Furthermore, some database options cannot be modified for tempdb and others are limited or restricted. Here they are:
  • Auto Shrink is not allowed for tempdb.
  • Database Shrink and File Shrink capabilities are also limited.
  • The database CHECKSUM option cannot be enabled in SQL Server 2005; from SQL Server 2008 onward, PAGE_VERIFY CHECKSUM can be set on tempdb.
  • A database snapshot cannot be created on tempdb. 
  • DBCC CHECKALLOC and DBCC CHECKCATALOG are not supported. 
  • Only offline checking for DBCC CHECKTABLE is performed.
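
To see how some of these options are currently set on your own instance, a minimal query against sys.databases is enough. This is only an illustrative check, not an exhaustive list of the restricted options.

```sql
-- Current values of a few tempdb options mentioned above.
SELECT name,
       is_auto_shrink_on,
       page_verify_option_desc,
       recovery_model_desc,
       snapshot_isolation_state_desc
FROM sys.databases
WHERE name = N'tempdb';
```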

Tempdb Space Usage: The following types of objects can occupy tempdb space: Internal Objects, Version Stores, and User Objects.
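
A quick way to see how tempdb space is split among these three categories is the sys.dm_db_file_space_usage dynamic management view. The sketch below converts page counts to megabytes, assuming the standard 8 KB page size.

```sql
USE tempdb;

-- Reserved space per object category, summed across all tempdb data files.
SELECT SUM(user_object_reserved_page_count)     * 8 / 1024.0 AS user_objects_mb,
       SUM(internal_object_reserved_page_count) * 8 / 1024.0 AS internal_objects_mb,
       SUM(version_store_reserved_page_count)   * 8 / 1024.0 AS version_store_mb,
       SUM(unallocated_extent_page_count)       * 8 / 1024.0 AS free_space_mb
FROM sys.dm_db_file_space_usage;
```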

Internal Objects: Internal Objects metadata is stored in memory (that is, the metadata is hidden in tempdb and does not appear in catalog views such as 'sys.all_objects'), and each object occupies at least nine pages (one IAM page and eight data pages) in tempdb. Page allocations and update operations on them do not generate log records. Internal Objects are used
  • to store intermediate runs for sort
  • to store intermediate results for hash joins and hash aggregates
  • to store XML variables or other large object (LOB) data type variables (text, image, ntext, varchar(max), varbinary(max), and all others)
  • by queries that need a spool to store intermediate results
  • by keyset cursors to store the keys
  • by static cursors to store a query result
  • by Service Broker (Query Notification and Event Notification) to store messages in transit
  • by INSTEAD OF triggers to store data for internal processing
  • by DBCC CHECKDB (it internally uses a query that may need to spool intermediate results)
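
Since Internal Objects are not visible in the catalog views, one way to find out which sessions are creating them is to look at the page counters in sys.dm_db_session_space_usage. The following is a sketch only; the counters cover tasks that have already completed, while running tasks are reported by sys.dm_db_task_space_usage.

```sql
-- Sessions that have allocated the most internal-object pages in tempdb.
SELECT TOP (10)
       session_id,
       internal_objects_alloc_page_count,
       internal_objects_dealloc_page_count,
       user_objects_alloc_page_count,
       user_objects_dealloc_page_count
FROM sys.dm_db_session_space_usage
ORDER BY internal_objects_alloc_page_count DESC;
```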
Version Stores: They do not appear in catalog views such as 'sys.all_objects'. Here are more concepts:
  • Version stores are used to store row versions generated by transactions for features such as snapshot isolation, triggers, MARS (multiple active result sets), and online index build.
  • The online index build version store is for row versions from tables that have online index build operations on them.
  • The common version store is for row versions from all other tables in all databases.
  • The version store consists of append-only store units which are highly optimized for sequential inserts and random look up. Inserts into the version store do not generate log records.
  • Each unit can store many row versions. If there are versions to be stored, a new store unit is created about every minute.
Row versions are generated for
  • SNAPSHOT isolation and read committed snapshot isolation (the versions are generated by DML operations in the database when the respective database options are enabled)
  • AFTER triggers (the versions are generated for all the update operations by the transaction that fired the trigger during the INSERT, DELETE, or UPDATE statement in any database, independent of database options. INSTEAD OF triggers do not generate versions)
  • MARS (the versions are generated by the UPDATE or DELETE statement when there is a pending SELECT on the same transaction and the same connection)
  • building an ONLINE index
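
To relate this to a live instance, the sketch below first lists which databases have the row versioning options enabled and then summarizes what the version store currently holds per database. Querying sys.dm_tran_version_store can be expensive when the version store is large, so treat the second query as a diagnostic rather than something to schedule.

```sql
-- Which databases have row versioning options turned on?
SELECT name,
       snapshot_isolation_state_desc,
       is_read_committed_snapshot_on
FROM sys.databases;

-- Row versions currently held in tempdb, grouped by source database.
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*)             AS version_rows,
       SUM(record_length_first_part_in_bytes
         + record_length_second_part_in_bytes) AS total_bytes
FROM sys.dm_tran_version_store
GROUP BY database_id;
```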
User Objects: They do appear in catalog views such as 'sys.all_objects'. Here are some details about them:
  • The 'sp_spaceused' system stored procedure can show the size occupied by these objects.
  • User Objects include both user-defined tables and indexes, and system catalog tables and indexes.
  • Operations on User Objects in tempdb are mostly logged. Bulk copy program (BCP), bulk insert, SELECT INTO, and index rebuild operations are bulk logged.
  • User-defined tables include the global temporary tables such as ##t, and local temporary tables such as #t.
  • Local temporary tables also include table variables such as @t and the mapping index for online clustered index build with the SORT_IN_TEMPDB option.
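
As a small illustration of the points above, the following creates a local temporary table, finds it in the tempdb catalog, and reports its size with 'sp_spaceused'. The table name #t and its columns are made up for the example.

```sql
-- Hypothetical temporary table used only for this demonstration.
CREATE TABLE #t (id int PRIMARY KEY, payload varchar(100));
INSERT INTO #t (id, payload) VALUES (1, 'a'), (2, 'b');

-- The table is visible in tempdb's catalog views
-- (its stored name carries a system-generated suffix).
SELECT name, type_desc, create_date
FROM tempdb.sys.objects
WHERE object_id = OBJECT_ID(N'tempdb..#t');

-- sp_spaceused has to be called in the tempdb context to resolve #t.
EXEC tempdb..sp_spaceused N'#t';

DROP TABLE #t;
```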
Performance Enhancements in SQL Server 2005 or later
  1. The tempdb logging optimization avoids logging the “after value” in certain log records in tempdb.
  2. Instant data file initialization works by not zeroing out the NTFS file when the file is created or when the size of the file is increased.
  3. There is less use of the UP type page latch when allocating pages and extents in tempdb.
  4. Proportional fill has been optimized to reduce UP latch contention.
  5. There is now deferred drop in tempdb.
  6. Worktable caching is improved.
  7. SQL Server 2005 or later caches the temporary table that is created by using a CREATE TABLE or SELECT INTO statement.
Recommendations for managing tempdb
  1. Configure the tempdb files with an initial size and auto-growth based on your workloads. Do not leave them at the default sizes.
  2. Place the tempdb files on RAID0 (for better performance, though without redundancy), RAID1 (if you have more writes than reads), or RAID5 (if you have more reads than writes). RAID10 is the best option, but not all companies can justify it.
  3. Place the tempdb files on separate disks to avoid contention and improve performance.
  4. Create tempdb with one data file per logical processor if there are eight or fewer CPUs. If there are more than eight CPUs, start with eight data files; if you then see PAGELATCH contention on the allocation bitmap pages (PFS, GAM, and SGAM pages), add four more files, and keep adding files in groups of four until the contention is resolved.
  5. Do not forget that the tempdb data files and log file should not grow much in a short time. If this happens, something is wrong, so analyze the workload and find the query that needs optimizing.
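
The recommendations above can be sketched in T-SQL as follows. The sizes, the second file name tempdev2, and the T:\TempDB path are placeholders chosen for illustration; tempdev is the default logical name of the first tempdb data file, but verify it on your instance with the first query before changing anything.

```sql
-- Current tempdb files, sizes (size is stored in 8 KB pages) and growth settings.
SELECT name, physical_name, type_desc,
       size * 8 / 1024 AS size_mb,
       growth, is_percent_growth
FROM tempdb.sys.database_files;

-- Presize the first data file and use a fixed growth increment (placeholder values).
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 4096MB, FILEGROWTH = 512MB);

-- Add another equally sized data file (hypothetical name and path).
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = N'T:\TempDB\tempdev2.ndf',
          SIZE = 4096MB,
          FILEGROWTH = 512MB);

-- Look for page latch waits on tempdb pages (tempdb is database_id 2).
SELECT session_id, wait_type, wait_duration_ms, resource_description
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE N'PAGELATCH%'
  AND resource_description LIKE N'2:%';
```

Keeping all data files the same size matters here, because proportional fill spreads allocations evenly only when the files have equal free space.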
That is all for now. Let me know any remarks you may have about the tempdb database. Thanks for reading.

Monday, 8 February 2016

Using KEEP_CDC option to keep CDC metadata

When a database has Change Data Capture (CDC) enabled, we need to take the following recommendation into consideration during the restore process. The RESTORE syntax includes the KEEP_CDC option, which allows us to restore the CDC metadata as well when the database is restored to another SQL Server instance (or to the same instance under a different name). So, here is the example:
RESTORE DATABASE [TESTDB2] FROM  DISK = N'D:SQLBackupTESTDB.bak' WITH  FILE = 1, KEEP_CDC
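
When the copy is restored to the same instance under a different name, the data and log files usually have to be relocated as well. The following is a sketch under assumptions: the X:\ paths and the logical file names TESTDB and TESTDB_log are hypothetical, so run RESTORE FILELISTONLY first to get the real logical names from your backup.

```sql
-- Find the logical file names stored in the backup (hypothetical backup path).
RESTORE FILELISTONLY
FROM DISK = N'X:\Backup\TESTDB.bak';

-- Restore under a new name, relocating the files and keeping the CDC metadata.
RESTORE DATABASE [TESTDB2]
FROM DISK = N'X:\Backup\TESTDB.bak'
WITH FILE = 1,
     MOVE N'TESTDB'     TO N'X:\Data\TESTDB2.mdf',      -- assumed logical data file name
     MOVE N'TESTDB_log' TO N'X:\Log\TESTDB2_log.ldf',   -- assumed logical log file name
     KEEP_CDC;
```

Note that KEEP_CDC cannot be combined with NORECOVERY, so it belongs on the final restore step that brings the database online.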

When we verify that the CDC metadata was restored as well, we will see not only that the database and tables still have CDC enabled, but also that all the data captured by CDC is still there.
-- Checking if CDC is enabled for database TESTDB2.

select is_cdc_enabled,name from sys.databases where name='TESTDB2'

-- Checking if CDC is enabled for table MyTable.

select is_replicated, is_tracked_by_cdc, * from sys.tables

select * from [cdc].[change_tables]  

-- checking the data tracked for table MyTable.

select * from cdc.dbo_MyTable_CT 

This is not all. Now we must create the CDC jobs by executing:
USE TESTDB2

EXEC sys.sp_cdc_add_job 'capture'

EXEC sys.sp_cdc_add_job 'cleanup' 

Finally, you can also verify that the jobs were created for CDC:
USE TESTDB2

EXEC [sys].[sp_cdc_help_jobs] 

Having done that, the restore of the database (CDC included) has been completed successfully. Thanks for reading!
HELLO, I'M PERCY REYES! I've been working as a senior SQL Server Database Engineer for over 20 years; I'm a three-time Microsoft Data Platform MVP. I'm a cryptographer conducting research on cryptographic Boolean functions and their applications.