Monday, 13 May 2024
Boosting SQL Server Efficiency: Why You Should Update Statistics Manually
Saturday, 20 April 2024
Detailed Explanation of SQL Server Query Hints
SQL Server query hints are directives that can be included in your SQL queries to guide the SQL Server Query Optimiser in executing them. Typically, SQL Server automatically selects the most efficient execution plan based on the query and underlying data. However, query hints enable you to override the optimiser’s default choices, giving you greater control over specific aspects of query execution.
While these hints can significantly enhance performance in particular scenarios, they must be used cautiously and strategically. Incorrect application of hints can lead to degraded performance or unpredictable outcomes. Therefore, it is essential to thoroughly test your queries both with and without hints to evaluate their impact. Ensure that hints are employed appropriately and suited to your specific workloads to achieve optimal results.
Below is a detailed look at common SQL Server query hints, examples of their usage, and the potential risks associated with each.
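As a taster, here is a minimal, hedged sketch of three commonly used hints; the dbo.Orders table and the parameter values are purely illustrative:

-- Hypothetical table dbo.Orders used only for illustration.
DECLARE @CustomerID INT = 42;

-- OPTION (RECOMPILE): build a fresh plan for this execution using the current parameter value.
SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE CustomerID = @CustomerID
OPTION (RECOMPILE);

-- OPTION (MAXDOP 1): cap this query at a single degree of parallelism.
SELECT COUNT(*) AS OrderCount
FROM dbo.Orders
OPTION (MAXDOP 1);

-- WITH (FORCESEEK): a table hint that forces the optimiser to use a seek rather than a scan.
SELECT OrderID
FROM dbo.Orders WITH (FORCESEEK)
WHERE CustomerID = @CustomerID;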
Monday, 4 March 2024
Understanding the SQL Server Query Optimiser: A Comprehensive Overview
The primary goals of the Query Optimiser include:
- Generating Execution Plans: The optimiser creates a roadmap that describes how SQL Server will access the data required by the query.
- Maximising Efficiency: By evaluating different potential execution paths, the optimiser selects the one that minimises resource utilisation, such as CPU, memory, and disk I/O.
- Improving Response Times: A well-optimised query results in faster execution, contributing to better overall system performance.
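To see the roadmap the optimiser actually produces for a given query, you can ask SQL Server for the plan itself. A minimal sketch follows; the table and filter are hypothetical, and SET SHOWPLAN_XML returns the estimated plan without executing the query (it must be the only statement in its batch):

SET SHOWPLAN_XML ON;
GO
-- The query is compiled but not executed; the XML plan is returned instead.
SELECT TOP (10) OrderID, OrderDate
FROM dbo.Orders
WHERE CustomerID = 42;
GO
SET SHOWPLAN_XML OFF;
GO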
Monday, 12 February 2024
Detecting Poor Cursor Usage in SQL Server
Thursday, 5 January 2023
New Features in SQL Server 2022 for DBAs
SQL Server 2022 brings a wealth of new features that empower DBAs to enhance database performance, security, and usability. By leveraging these capabilities, organizations can optimise their data management strategies, ensuring better compliance, improved efficiency, and advanced analytics capabilities. The introduction of intelligent query processing, enhanced security features, and support for serverless architectures positions SQL Server 2022 as a robust choice for modern data solutions.
Here’s a summary of the key features and their benefits:
1. Intelligent Query Processing Enhancements
Overview: SQL Server 2022 expands on the Intelligent Query Processing (IQP) capabilities introduced in SQL Server 2019. New features include:
- Parameter Sensitive Plan Optimization: This feature allows SQL Server to create multiple execution plans for a single query based on the specific parameter values used during execution, leading to better performance.
Case Scenario: In a scenario where a stored procedure is called with highly variable input parameters (e.g., a sales report for different regions), the parameter-sensitive optimisation ensures that SQL Server selects the best execution plan tailored to the input, reducing execution time and resource consumption.
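A hedged sketch of how this could be switched on for a user database; it also requires the SQL Server 2022 compatibility level (160), and the database name is illustrative:

USE [SalesDB];
GO
ALTER DATABASE [SalesDB] SET COMPATIBILITY_LEVEL = 160;
GO
-- Enable Parameter Sensitive Plan optimization for the database (it is on by default at level 160).
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SENSITIVE_PLAN_OPTIMIZATION = ON;
GO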
2. SQL Server Ledger
Overview: The SQL Server Ledger feature provides blockchain-like capabilities, ensuring data integrity and immutability through ledger tables. This feature uses cryptographic methods to verify data integrity, making it ideal for applications requiring audit trails.
Case Scenario: For financial institutions that need to maintain accurate records of transactions, SQL Server Ledger enables the creation of a ledger table where each transaction is logged. Any modifications are tracked, ensuring an immutable record that can be audited for compliance purposes.
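A minimal sketch of an append-only ledger table for such a scenario; the table and columns are hypothetical:

CREATE TABLE dbo.AccountTransactions
(
    TransactionID   BIGINT         NOT NULL,
    AccountID       INT            NOT NULL,
    Amount          DECIMAL(19, 4) NOT NULL,
    TransactionDate DATETIME2      NOT NULL
)
WITH (LEDGER = ON (APPEND_ONLY = ON));  -- rows can be inserted but never updated or deleted
GO
-- Generate a database digest that can be stored externally and later used to verify integrity.
EXEC sys.sp_generate_database_ledger_digest;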
3. Enhanced Security Features
Overview: SQL Server 2022 includes several security enhancements:
- Always Encrypted with Secure Enclaves: This allows more operations to be performed on encrypted data without exposing it in plaintext, enhancing security for sensitive information.
- Dynamic Data Masking Enhancements: Provides more flexible masking options, allowing DBAs to control how sensitive data is displayed to different users.
Case Scenario: In a healthcare application, patient data can be stored securely using Always Encrypted. The application can perform queries and calculations on encrypted data without revealing sensitive information to unauthorised users.
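On the Dynamic Data Masking side, a hedged sketch of masking a couple of columns and granting an exception; the table, columns and role are hypothetical:

-- Mask the e-mail address and show only the last four digits of the phone number.
ALTER TABLE dbo.Patients ALTER COLUMN Email       ADD MASKED WITH (FUNCTION = 'email()');
ALTER TABLE dbo.Patients ALTER COLUMN PhoneNumber ADD MASKED WITH (FUNCTION = 'partial(0,"XXX-XXX-",4)');

-- Members of this role see the real values; everyone else sees the masked output.
GRANT UNMASK TO ComplianceReaders;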
4. Query Store Enhancements
Overview: The Query Store feature has been enhanced to provide deeper insights into query performance over time. It now includes capabilities for identifying and managing query performance regressions.
Case Scenario: DBAs can utilise the enhanced Query Store to track performance changes after application deployments. If a new release introduces slow-running queries, the Query Store can help identify regressions, allowing quick remediation by reverting to previously optimised execution plans.
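A minimal sketch of the typical remediation step: finding the plans recorded for a regressed query and forcing the one that used to perform well. The query_id and plan_id values are placeholders that would come from the first query:

-- List the plans and average duration recorded for a given query.
SELECT p.query_id, p.plan_id, AVG(rs.avg_duration) AS avg_duration_us
FROM sys.query_store_plan AS p
JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
WHERE p.query_id = 42                -- placeholder query_id
GROUP BY p.query_id, p.plan_id
ORDER BY avg_duration_us;

-- Force the previously well-behaved plan for that query.
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 17;   -- placeholder ids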
5. Built-in Machine Learning Services
Overview: SQL Server 2022 integrates built-in support for machine learning services, allowing DBAs to run R and Python scripts directly in the database engine. This feature enables data scientists and analysts to perform advanced analytics without moving data outside the SQL Server environment.
Case Scenario: A retail company can analyse customer purchasing patterns using machine learning algorithms stored in SQL Server. DBAs can schedule these analyses to run during off-peak hours, optimizing resource usage while delivering timely insights.
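A hedged sketch of running a Python script in-database through sp_execute_external_script; Machine Learning Services must be installed and the 'external scripts enabled' configuration turned on, and the sales table is hypothetical:

EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;
GO
EXEC sp_execute_external_script
     @language = N'Python',
     @script = N'
import pandas as pd
df = InputDataSet                      # the result of @input_data_1 arrives as a data frame
df["revenue"] = df["qty"] * df["unit_price"]
OutputDataSet = df                     # whatever is assigned here is returned as a result set
',
     @input_data_1 = N'SELECT CustomerID, qty, unit_price FROM dbo.Sales';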
6. Serverless SQL Database
Overview: SQL Server 2022 introduces serverless capabilities for databases, allowing automatic scaling and cost-effective resource usage based on demand. This feature is ideal for workloads that experience variable usage patterns.
Case Scenario: A startup that experiences fluctuating traffic on its web application can benefit from serverless SQL databases, which automatically scale up during peak traffic and scale down during low usage periods, ensuring cost-efficiency.
7. Enhanced Availability Groups
Overview: New features for Always On Availability Groups improve failover performance and management. This includes automatic failover for databases in availability groups without the need for manual intervention.
Case Scenario: In a mission-critical application, the DBA can configure availability groups with automatic failover capabilities, ensuring minimal downtime during server maintenance or unexpected failures, thus enhancing application reliability.
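A minimal sketch of declaring two synchronous replicas with automatic failover (run on the primary, with the endpoints and the secondary join configured separately); server names, endpoint URLs and the database are hypothetical:

CREATE AVAILABILITY GROUP [AG_Orders]
FOR DATABASE [OrdersDB]
REPLICA ON
    N'SQLNODE1' WITH (ENDPOINT_URL = N'TCP://sqlnode1.contoso.local:5022',
                      AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
                      FAILOVER_MODE = AUTOMATIC),
    N'SQLNODE2' WITH (ENDPOINT_URL = N'TCP://sqlnode2.contoso.local:5022',
                      AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
                      FAILOVER_MODE = AUTOMATIC);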
8. Improved Performance Insights and Monitoring Tools
Overview: SQL Server 2022 provides enhanced monitoring tools with better insights into system performance. Features like the Database Health Monitor offer real-time health checks and performance tuning recommendations.
Case Scenario: DBAs can utilize the Database Health Monitor to receive alerts and recommendations for performance tuning based on workload analysis, enabling proactive management and optimization of database performance.
9. Support for JSON and XML Enhancements
Overview: SQL Server 2022 introduces improvements to support for JSON and XML data types, including better indexing and querying capabilities.
Case Scenario: A content management system that relies on JSON documents can leverage the new indexing features to improve query performance on JSON fields, allowing faster data retrieval and reporting.
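A hedged sketch of the usual pattern for speeding up JSON queries: expose a property as a computed column, index it, and shred arrays with OPENJSON. The table and JSON properties are hypothetical:

-- Expose a JSON property as a computed column and index it.
ALTER TABLE dbo.Articles
    ADD ArticleTitle AS CAST(JSON_VALUE(ArticleJson, '$.title') AS NVARCHAR(200));
CREATE INDEX IX_Articles_Title ON dbo.Articles (ArticleTitle);

-- Shred the tags array of a matching article into rows.
SELECT a.ArticleID, tag.[value] AS TagName
FROM dbo.Articles AS a
CROSS APPLY OPENJSON(a.ArticleJson, '$.tags') AS tag
WHERE a.ArticleTitle = N'Quarterly report';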
That's all for now.
Thursday, 4 March 2021
Don't Place the Blame on SQL Server
No matter what technology we are working with, we are the ones at the wheel. Technology is just a tool, so it is unfair to blame it for our own inefficiencies.
Saturday, 17 February 2018
Quickly Checking for Stale Statistics in SQL Server
- How often have you faced unforeseen performance issues, even after completing your regular database maintenance tasks?
- Have you wondered why tasks like index rebuilding and statistics updates are sometimes insufficient?
- If everything was running smoothly before, why has performance suddenly declined?
- Why are indexes not being utilised as expected? Should more indexes be created?
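One quick way to start answering these questions is to check how stale the statistics really are. A minimal sketch using sys.dm_db_stats_properties (available from SQL Server 2008 R2 SP2 / 2012 SP1 onwards); treat the filter and ordering as a starting point, not a rule:

SELECT OBJECT_NAME(s.[object_id]) AS table_name,
       s.name                     AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.modification_counter    -- rows changed since the last statistics update
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.[object_id], s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
  AND sp.modification_counter > 0
ORDER BY sp.modification_counter DESC, sp.last_updated ASC;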
Tuesday, 13 February 2018
Avoid changing default ANSI database options
Msg 8152, Level 16, State 14, Line 5
String or binary data would be truncated.
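The reason this error appears is that, with the default ANSI_WARNINGS setting (ON), SQL Server refuses to silently truncate data. A minimal sketch of what changes when the option is flipped; the table and values are purely illustrative:

CREATE TABLE dbo.TruncationDemo (Val VARCHAR(5));
GO
SET ANSI_WARNINGS ON;    -- the default
INSERT INTO dbo.TruncationDemo VALUES ('1234567890');   -- fails with Msg 8152
GO
SET ANSI_WARNINGS OFF;
INSERT INTO dbo.TruncationDemo VALUES ('1234567890');   -- succeeds, silently stores '12345'
GO
SET ANSI_WARNINGS ON;    -- put the default back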
Friday, 1 December 2017
SQL Server Buffer Manager: Buffer cache hit ratio, Page life expectancy, Lazy writes/sec, Checkpoint pages/sec
To begin with, we can analyse the SQL Server: Buffer Manager: Buffer cache hit ratio performance counter, which gives us the percentage of page requests served from the buffer cache. This percentage should generally stay above 99% for OLTP database servers, which is why 4GB or 8GB of RAM is no longer enough for the vast majority of them. The knock-on effect of this lack of RAM is a huge increase in disk I/O activity, because data pages move in and out of memory very frequently when there is not enough space to accommodate new ones. This also causes long disk queues that keep the disk subsystem busy and, consequently, directly harm database performance. In this context, we also need to look into other SQL Server performance counters such as SQL Server: Buffer Manager: Page life expectancy, SQL Server: Buffer Manager: Lazy writes/sec, and SQL Server: Buffer Manager: Checkpoint pages/sec.
As I mentioned before, there are other useful performance counters for diagnosing and tuning cache memory issues, such as Buffer Manager: Page life expectancy, which indicates the time in seconds that a page stays in memory. This value should generally be above 300 seconds, that is, 5 minutes. Values below this threshold should be treated as an alert: SQL Server is under high memory pressure, quite possibly because poorly written queries are consuming a lot of memory. Clearly, the first step is not to add more memory but to identify and optimise those problematic queries. Only after that should you consider adding memory.
The other performance counter to check is SQL Server: Buffer Manager: Lazy writes/sec, which gives the number of pages flushed out of memory per second by the lazy writer. A high value usually means a lack of memory; sustained memory pressure can also push plans out of the plan cache, forcing stored procedures to be recompiled. The value of this counter should generally stay below 20. If you notice values above 20, check the SQL Server: Buffer Manager: Checkpoint pages/sec counter as well, since checkpoints flush all dirty pages from memory to disk.
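These counters can be read straight from the engine with sys.dm_os_performance_counters. A minimal sketch follows; keep in mind that Buffer cache hit ratio has to be divided by its base counter, and the per-second counters are cumulative, so two samples are needed to derive a real rate:

SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE N'%Buffer Manager%'
  AND counter_name IN (N'Buffer cache hit ratio',
                       N'Buffer cache hit ratio base',
                       N'Page life expectancy',
                       N'Lazy writes/sec',
                       N'Checkpoint pages/sec');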
Finally, keep in mind that it is really important to carry out these diagnostic tasks regularly, as they allow you to take preventive actions and improve overall performance over time. I do hope you find this post helpful. Let me know any remarks you may have. Thanks for reading. Stay tuned.
Thursday, 23 November 2017
Dropping column statistics after migrating to SQL Server 2014/2016/2017
Here I will share a script to drop column statistics. Be cautious and only drop them if you have the AUTO_CREATE_STATISTICS option enabled; otherwise no statistics will be recreated and database performance will suffer tremendously. That is all for now. Let me know any remarks you may have.
SET NOCOUNT ON
IF db_name() NOT IN ('model','master','distribution','msdb','tempdb')
BEGIN
    DECLARE @schema_name varchar(max)
    DECLARE @table_name varchar(max)
    DECLARE @stat_name varchar(max)
    DECLARE @update_stat_cmd varchar(max)
    DECLARE @update_stat_msg_header varchar(max)

    DECLARE update_stat_cursor CURSOR FOR
        select schema_name(o.[schema_id]), object_name(s1.[object_id]), s1.name
        from (
            select s.[object_id], s.name
            from sys.stats s
            left join sys.indexes i on s.name = i.name
            where i.name is null
        ) s1
        inner join sys.objects o on o.[object_id] = s1.[object_id]
        where o.type = 'U'
        order by schema_name(o.[schema_id]), object_name(s1.[object_id]), s1.name

    OPEN update_stat_cursor
    FETCH NEXT FROM update_stat_cursor INTO @schema_name, @table_name, @stat_name
    WHILE (@@fetch_status = 0)
    BEGIN
        DECLARE @ini DATETIME, @fin DATETIME
        SET @update_stat_msg_header = '->Dropping [' + RTRIM(@schema_name) + '].[' + RTRIM(@table_name) + '].[' + @stat_name + ']'
        PRINT @update_stat_msg_header
        SET @update_stat_cmd = 'DROP STATISTICS [' + RTRIM(@schema_name) + '].[' + RTRIM(@table_name) + '].[' + @stat_name + ']'
        SET @ini = GETDATE()
        EXEC (@update_stat_cmd)
        SET @fin = GETDATE()
        FETCH NEXT FROM update_stat_cursor INTO @schema_name, @table_name, @stat_name
    END

    PRINT ' '
    PRINT '----------------------------------------------------------------------------- '
    SET @update_stat_msg_header = '************* THERE ARE NO MORE STATISTICS TO BE DROPPED **************'
    PRINT @update_stat_msg_header
    PRINT ' '
    PRINT 'All statistics not linked to any index were dropped!'
    CLOSE update_stat_cursor
    DEALLOCATE update_stat_cursor
END
SET NOCOUNT OFF
Friday, 3 November 2017
AUTO_CLOSE database option and its impact on the performance
ALTER DATABASE [UserDBInProduction] SET AUTO_CLOSE OFF WITH NO_WAIT
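Before fixing databases one by one, it is handy to know which ones still have the option enabled; a quick check against sys.databases:

SELECT name, is_auto_close_on
FROM sys.databases
WHERE is_auto_close_on = 1;   -- databases that will repeatedly close and reopen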
That is all for now, let me know any remarks you may have. Thanks for reading again. Stay tuned.
Friday, 4 August 2017
Troubleshooting timeout expired errors
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

In order to get this fixed, developers should first check the .NET code within their applications. In some cases, the developers I worked with found infinite loops or slow code paths that spent a long time before sending SQL queries to the database engine; as a result, the timeout threshold was reached at the application level and, obviously, the error was raised. By the way, the following errors could also be related to timeouts:
Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
Unable to connect to SQL Server session database.

Having checked the application, you can increase the 'CommandTimeout' property value to 60 seconds (if that is not enough, change it to 120 or 180, and also consider using 0, unlimited, as a temporary workaround while the problem is being traced and fixed). That is all for now, let me know any remarks you may have. Thanks for reading. Stay tuned.
Thursday, 20 July 2017
Detecting excessive compilation and recompilation issues
select spid, StartTime, Textdata, EventSubclass, ObjectID, DatabaseID, SQLHandle
from fn_trace_gettable('C:\RecompilationTrace_01.trc', 1)
where EventClass in (37, 75, 166) -- 37 = SP:Recompile, 75 = CursorRecompile, 166 = SQL:StmtRecompile
Talking of DMVs to diagnose recompilation issues, looking into 'sys.dm_exec_query_optimizer_info' is very helpful; in particular, look at the elapsed time, which is the time spent on optimisations, and also the final cost. If you see that the elapsed time is very close to the CPU time, you might conclude that compilation and recompilation time is responsible for that high CPU use. Another DMV to use is 'sys.dm_exec_query_stats', whose most important columns are sql_handle, total_worker_time, plan_generation_num (the number of times the query has been recompiled), and statement_start_offset. Here is an example that returns the top 20 queries that have been recompiled the most.
select top 20 SQLText.text, sql_handle, plan_generation_num, execution_count, dbid, objectid
from sys.dm_exec_query_stats
cross apply sys.dm_exec_sql_text(sql_handle) as SQLText
where plan_generation_num > 1
order by plan_generation_num desc
- Check whether the stored procedure was created with the WITH RECOMPILE option or whether the RECOMPILE query hint was used. If a procedure was created with the WITH RECOMPILE option, since SQL Server 2005 you may be able to take advantage of a statement-level RECOMPILE hint if only a particular statement within that procedure needs to be recompiled. Using the hint at the statement level avoids recompiling the whole procedure each time it executes, while still allowing that individual statement to be recompiled (see the sketch after this list).
- Recompilations can occur due to changes in statistics, and you can use the KEEPFIXED PLAN query hint to make recompilations occur only when there is the need to ensure correctness and not to respond to changes in statistics. For instance, in this context recompilation can only occur if the underlying table structure or its schema that is referenced by a statement changes, or if a table is marked with the sp_recompile stored procedure, all resulting in the fact that the plan no longer applies and consequently triggering the recompilation event.
- Using the KEEP PLAN query hint is useful to set the recompilation threshold of temporary tables to be the same as permanent tables. Take a look at the EventSubclass column which displays 'Statistics Changed' for an operation on a temporary table.
- Turning off the automatic updates of statistics for indexes and statistics that are defined on a table or indexed view prevents recompilations that are due to statistics changes on that object. Note, however, that turning off the auto-update statistics option is not always a good idea, because the query optimizer is no longer sensitive to data changes in those objects, which can result in suboptimal query plans. To be honest, I have never turned this option off: I prefer to trust SQL Server and work on optimising the queries instead.
- Keep in mind that recompilation thresholds for temporary tables are lower than for normal tables, so if the recompilations on a temporary table are due to statistics changes, you can change the temporary tables to table variables. A change in the cardinality of a table variable does not cause a recompilation. The side effect of this approach is that the query optimizer does not keep track of a table variable's cardinality, because statistics are not created or maintained on table variables, which can result in less optimal query plans. Test the different options and choose the best one. Generally, temporary tables provide much better performance than table variables when lots of data is involved.
- Recompilation might also occur as a result of SET option changes, so you can diagnose it by using SQL Server Profiler to determine which SET option changed. It is highly advisable to avoid changing SET options within stored procedures; it is much better to set them at the connection level, and for the overwhelming majority of cases the default SET options work well. It is also very important to ensure that SET options are not changed during the lifetime of the connection.
- To avoid recompilations that are due to deferred compilations, do not interleave DML with DDL and do not create DDL inside conditional IF statements.
- To avoid recompilation and also to avoid ambiguity between objects, batches should have qualified object names, for example, dbo.Table1, User1.MySP, etc.
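To close, here is a minimal sketch of the statement-level hints mentioned above inside a procedure; the procedure and the dbo.Orders table are hypothetical:

CREATE OR ALTER PROCEDURE dbo.GetOrdersByCustomer
    @CustomerID INT
AS
BEGIN
    -- Statement-level recompile: only this statement gets a fresh plan on every execution,
    -- instead of creating the whole procedure WITH RECOMPILE.
    SELECT OrderID, OrderDate, TotalDue
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
    OPTION (RECOMPILE);

    -- KEEPFIXED PLAN: recompile only for correctness (schema change, sp_recompile),
    -- not in response to statistics changes.
    SELECT COUNT(*) AS RecentOrders
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
      AND OrderDate >= DATEADD(MONTH, -1, GETDATE())
    OPTION (KEEPFIXED PLAN);
END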
Friday, 26 May 2017
How to know which non-clustered indexes are not being used any longer
This post shows code to find out exactly which non-clustered indexes have not been used since the last restart of the SQL Server engine. The code also filters out indexes related to primary keys and unique constraints. As you can see in the code, we identify the indexes by checking the values of the columns user_seeks, user_scans, and user_lookups: if they are all equal to zero, the index has not been used so far. Please be cautious before using this code to decide which indexes to delete; make sure that the database engine has not been restarted recently, otherwise you may delete the wrong indexes and hurt performance tremendously. I strongly suggest using this code only if the latest restart of SQL Server happened around three months ago or more.
SELECT i.type_desc, last_user_seek, last_user_scan, last_user_lookup,
       OBJECT_NAME(s.[object_id]) AS [Table Name],
       i.name AS [Index Name],
       i.index_id,
       user_updates AS [Total Writes],
       user_seeks + user_scans + user_lookups AS [Total Reads],
       user_updates - (user_seeks + user_scans + user_lookups) AS [Difference],
       'DROP INDEX [' + schema_name(o.[schema_id]) + '].[' + OBJECT_NAME(s.[object_id]) + '].[' + i.name + '];' AS Statement_Drop_Index,
       i.is_unique, i.is_unique_constraint, i.is_primary_key
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
    INNER JOIN sys.objects o ON o.object_id = i.object_id
    ON s.[object_id] = i.[object_id] AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
  AND s.database_id = DB_ID()
  AND i.index_id > 1
  AND i.is_primary_key = 0
  AND i.is_unique = 0
  AND i.is_unique_constraint = 0
  AND ISNULL(user_seeks, 0) + ISNULL(user_scans, 0) + ISNULL(user_lookups, 0) = 0
ORDER BY OBJECT_NAME(s.[object_id]), [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;
When you execute it, you also get a column containing the T-SQL code to drop each index that has not been used since the latest restart of SQL Server, which you can then run to delete them. To sum up, do not get me wrong, I am not against indexes; on the contrary, I am in favour of using them properly, and if there are indexes that have not been used for a long time, they should be deleted. That's all for now. Let me know any remarks you may have. Thanks for reading.
Monday, 26 December 2016
How to delete a large quantity of data with no Transaction Log growth and no blocking issues
To begin with, the technique consists of using SQL Server partition switching, which allows subsets of data to be accessed and transferred quickly and efficiently. We could alter the existing large table and partition it, but in this example we are going to create a new one. Now let's say there is a need to keep only the data from the latest three months. Here is the structure of the table 'LogTracker', which will be partitioned shortly. It includes a specific computed column named 'MonthNumber' that stores the month number, which will also be the partition number.
CREATE TABLE [dbo].[LogTracker](
    [LogID] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime] NULL,
    [ErrorDesc] [nvarchar](MAX) NULL,
    [User] [nvarchar](40) NULL,
    [MonthNumber] AS DATEPART(MONTH, [CreateDate]) PERSISTED NOT NULL,
    CONSTRAINT [PK_LogTracker] PRIMARY KEY CLUSTERED ([LogID] ASC, [MonthNumber] ASC)
)

CREATE TABLE [dbo].[LogTracker_Temp](
    [LogID] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime] NULL,
    [ErrorDesc] [nvarchar](MAX) NULL,
    [User] [nvarchar](40) NULL,
    [MonthNumber] AS DATEPART(MONTH, [CreateDate]) PERSISTED NOT NULL,
    CONSTRAINT [PK_LogTracker_Temp] PRIMARY KEY CLUSTERED ([LogID] ASC, [MonthNumber] ASC)
)
CREATE PARTITION FUNCTION [upf_LogTracker](INT)
AS RANGE LEFT FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
GO
CREATE PARTITION SCHEME [ups_LogTracker]
AS PARTITION [upf_LogTracker] ALL TO ([PRIMARY])
GO

ALTER TABLE dbo.LogTracker DROP CONSTRAINT PK_LogTracker WITH (MOVE TO ups_LogTracker([MonthNumber]))
GO
ALTER TABLE dbo.LogTracker ADD CONSTRAINT PK_LogTracker PRIMARY KEY ([LogID], [MonthNumber])
GO
INSERT INTO LogTracker SELECT NEWID(), '2016-06-23 13:13:56', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-07-24 04:18:12', 'Error C Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-08-20 01:34:57', 'Error B Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-09-21 08:09:16', 'Error D Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-11-22 05:26:45', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-11-23 15:43:56', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-12-24 23:28:12', 'Error C Desc', 'preyes'
SELECT $PARTITION.[upf_LogTracker](MonthNumber),* FROM LogTracker
With the following code we delete the historic data and keep only the most recent three months. It switches the data of each old month out of 'LogTracker' into the stage table 'LogTracker_Temp', which is then immediately truncated. It starts moving data from month number one up to 'the current month - 3'.
DECLARE @M INT = CASE WHEN DATEPART(MONTH, GETDATE()) - 3 > 0
                      THEN DATEPART(MONTH, GETDATE()) - 3
                      ELSE 12 + DATEPART(MONTH, GETDATE()) - 3 END
DECLARE @PartitionNumberToArchive INT = 1
WHILE (@PartitionNumberToArchive <= @M)
BEGIN
    ALTER TABLE dbo.LogTracker SWITCH PARTITION @PartitionNumberToArchive TO dbo.LogTracker_Temp
    TRUNCATE TABLE dbo.LogTracker_Temp
    UPDATE STATISTICS dbo.LogTracker WITH FULLSCAN
    SET @PartitionNumberToArchive = @PartitionNumberToArchive + 1
END
Naturally, the code needs to be scheduled and executed via a SQL Agent job at the end of each month. To be perfectly honest, not only will this deletion finish in one or two seconds, but it will also have no impact on performance, and the Transaction Log will not grow as it does with traditional techniques. I hope this post is truly useful for you and you make the most out of it. Let me know any remarks you may have. That is all for now. Thanks for reading.
Wednesday, 7 December 2016
New per-operator level performance stats included in SQL2016 SP1 and SQL2014 SP2
This is its actual execution plan.
Everything that is new in SQL2016, SQL2016 SP1 and SQL2014 SP2 is highlighted. We can appreciate CPU and I/O runtime performance stats for the Clustered Index Scan operator on the 'PurchaseOrderDetail' table, such as I/O reads, CPU time and elapsed time. In a parallel plan you will be able to see this information per thread in each operator.
And now we will see not only CPU and I/O performance stats for the Hash operator but also Memory performance stats.
All this information can also be seen in the Properties window of each operator inside the actual execution plan, using the latest version of SSMS. If you expand the detail of each statistic, the per-thread detail will be shown when working with a parallel execution plan.
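If you prefer pulling the same per-operator stats with T-SQL while a query is still running, here is a minimal sketch against sys.dm_exec_query_profiles; note that the session being watched needs runtime profiling enabled (for example by running SET STATISTICS XML ON in that session):

SELECT session_id, node_id, physical_operator_name,
       row_count, estimate_row_count,
       cpu_time_ms, elapsed_time_ms,
       logical_read_count, physical_read_count
FROM sys.dm_exec_query_profiles
ORDER BY session_id, node_id;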
Having seen this, not only are we now able to look into useful performance stats by using Showplan XML and SSMS, but we can also create scripts for administration and optimisation purposes in order to pick up queries with poor performance and then optimise them. That is all for now. Let me know any remarks you may have. Thanks for reading.
Friday, 4 March 2016
How to split the tempdb database into more files
Today we are not going to discuss it in more detail. I would just like to suggest splitting your tempdb database into four data files (if you have four or eight cores) or eight data files (for 16, 32, 64, or more cores) and only ONE log file. Ideally, we should also locate them on different drives, RAID1, RAID5 or RAID10. With the situation now clear, here is a small script to split your default tempdb database into eight data files while keeping the single log file.
USE [master]
GO
ALTER DATABASE [tempdb] MODIFY FILE (NAME = N'templog', NEWNAME = N'tempdev_Log')
GO
ALTER DATABASE [tempdb] MODIFY FILE (NAME = N'tempdev', NEWNAME = N'tempdev_Data01')
GO
ALTER DATABASE [tempdb] MODIFY FILE (NAME = N'tempdev_Log', FILENAME = N'D:\SQLTempDB\tempdev_Log.ldf', SIZE = 2048MB, FILEGROWTH = 2048MB)
GO
ALTER DATABASE [tempdb] MODIFY FILE (NAME = N'tempdev_Data01', FILENAME = N'D:\SQLTempDB\tempdev_Data01.mdf', SIZE = 512MB, FILEGROWTH = 1024MB)
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_Data02', FILENAME = N'D:\SQLTempDB\tempdev_Data02.ndf', SIZE = 512MB, FILEGROWTH = 1024MB)
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_Data03', FILENAME = N'D:\SQLTempDB\tempdev_Data03.ndf', SIZE = 512MB, FILEGROWTH = 1024MB)
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_Data04', FILENAME = N'D:\SQLTempDB\tempdev_Data04.ndf', SIZE = 512MB, FILEGROWTH = 1024MB)
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_Data05', FILENAME = N'D:\SQLTempDB\tempdev_Data05.ndf', SIZE = 512MB, FILEGROWTH = 1024MB)
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_Data06', FILENAME = N'D:\SQLTempDB\tempdev_Data06.ndf', SIZE = 512MB, FILEGROWTH = 1024MB)
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_Data07', FILENAME = N'D:\SQLTempDB\tempdev_Data07.ndf', SIZE = 512MB, FILEGROWTH = 1024MB)
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_Data08', FILENAME = N'D:\SQLTempDB\tempdev_Data08.ndf', SIZE = 512MB, FILEGROWTH = 1024MB)
Saturday, 13 February 2016
Looking into some concepts about the well-known tempdb database
In a user database, transactions have the ACID attributes: atomicity, consistency, isolation, and durability, whereas in the tempdb database transactions lose the durability attribute, which basically means that they do not persist after a SQL Server shutdown. Most internal operations on tempdb do not generate log records because there is no need to roll them back, so these operations are faster. Furthermore, some database options cannot be modified for tempdb and others are limited or restricted. Here they are:
- Auto Shrink is not allowed for tempdb.
- Database Shrink and File Shrink capabilities are also limited.
- The database CHECKSUM option cannot be enabled.
- A database snapshot cannot be created on tempdb.
- DBCC CHECKALLOC and DBCC CHECKCATALOG are not supported.
- Only offline checking for DBCC CHECKTABLE is performed.
Tempdb Space Usage: The following types of objects can occupy tempdb space: Internal Objects, Version Stores, and User Objects.
Internal Objects: Internal object metadata is stored in memory (which means it is hidden in tempdb and does not appear in catalog views such as 'sys.all_objects'), and each internal object occupies at least nine pages (one IAM page and eight data pages) in tempdb. Page allocations and update operations on internal objects do not generate log records. Internal objects are used
- to store intermediate runs for sort
- to store intermediate results for hash joins and hash aggregates
- to store XML variables or other large object (LOB) data type variables (text, image, ntext, varchar(max), varbinary(max), and all others)
- by queries that need a spool to store intermediate results
- by keyset cursors to store the keys
- by static cursors to store a query result
- by Service Broker (Query Notification and Event Notification) to store messages in transit
- by INSTEAD OF triggers to store data for internal processing
- by DBCC CHECKDB (it internally uses a query that may need to spool intermediate results)
- Version stores are used to store row versions generated by transactions for features such as snapshot isolation, triggers, MARS (multiple active result sets), and online index build.
- The online index build version store is for row versions from tables that have online index build operations on them.
- The common version store is for row versions from all other tables in all databases.
- The version store consists of append-only store units which are highly optimized for sequential inserts and random look up. Inserts into the version store do not generate log records.
- Each unit can store many row versions. If there are versions to be stored, a new store unit is created about every minute.
- SNAPSHOT isolation and read committed snapshot isolation (the versions are generated by DML operations in the database when the respective database options are enabled)
- AFTER triggers (the versions are generated for all the update operations by the transaction that fired the trigger during the INSERT, DELETE, or UPDATE statement in any database, independent of database options. INSTEAD OF triggers do not generate versions)
- MARS (the versions are generated by the UPDATE or DELETE statement when there is a pending SELECT on the same transaction and the same connection)
- building an ONLINE index
- The 'sp_spaceused' system stored procedure can show the size occupied by these objects.
- User Objects include both user-defined tables and indexes, and system catalog tables and indexes.
- Operations on User Objects in tempdb are mostly logged. Bulk copy program (BCP), bulk insert, SELECT INTO, and index rebuild operations are bulk logged.
- User-defined tables include the global temporary tables such as ##t, and local temporary tables such as #t.
- Local temporary tables also include table variables such as @t and the mapping index for online clustered index build with the SORT_IN_TEMPDB option.
- The tempdb logging optimization avoids logging the “after value” in certain log records in tempdb.
- Instant data file initialization works by not zeroing out the NTFS file when the file is created or when the size of the file is increased.
- There is less use of the UP type page latch when allocating pages and extents in tempdb. Proportional fill has been optimized to reduce UP latch contention.
- There is now deferred drop in tempdb.
- Worktable caching is improved.
- SQL Server 2005 or later caches the temporary table that is created by using a CREATE TABLE or SELECT INTO statement.
- The tempdb files must be configured with an initial size and auto-growth based on your workload. Do not leave them at the default sizes.
- The tempdb files should be located on RAID0 (for better performance), RAID1 (if you have more writes than reads) or RAID5 (if you have more reads than writes). RAID10 is the best option, but not all companies can justify it.
- The tempdb files should be located on separate disks to avoid contention issues and improve performance.
- The tempdb database should be created with one data file per physical processor if there are eight or fewer CPUs. If there are more than eight CPUs, start with eight data files and then increase the number of files by four whenever there is PAGELATCH contention on one of the allocation bitmap pages (including PFS, GAM and SGAM pages), repeating until the issue is solved.
- Do not forget that the tempdb data files and log file should not grow much in a short time; if this happens, something is working wrongly, so you need to analyse the workload, find the offending query and optimise it. A quick way to see what is consuming tempdb space is sketched below.
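Here is the promised sketch for keeping an eye on what is actually consuming tempdb space, using sys.dm_db_file_space_usage (pages are converted to MB):

SELECT SUM(user_object_reserved_page_count)     * 8 / 1024 AS user_objects_mb,
       SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb,
       SUM(version_store_reserved_page_count)   * 8 / 1024 AS version_store_mb,
       SUM(unallocated_extent_page_count)       * 8 / 1024 AS free_space_mb
FROM tempdb.sys.dm_db_file_space_usage;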
Thursday, 6 November 2014
Microsoft SQL Server 2014 In-Memory OLTP: How fast is it?
Friday, 12 November 2010
SQL11 (Denali): OFFSET and FETCH (New options for ORDER BY)
The OFFSET keyword is used for excluding the first N rows (the default value is zero), whereas FETCH is used for reading/fetching M rows per window of results starting from position (N+1). Therefore, OFFSET and FETCH allow us to get only the rows from position (N+1) to (N+M). I do recommend creating an index (preferably a clustered index) on the ORDER BY columns and using these columns in the WHERE clause as much as possible in order to avoid costly operations such as index scans or table scans. I have performed many tests on 10 million rows and, to be honest, it worked very well. To illustrate, the following query will start fetching from position 4+1=5 (OFFSET position=4 for SQL Server) to the end of the table.
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address FROM dbo.Customers ORDER BY CustomerID OFFSET (4) ROWS
Now, with DESC, the results will be different (excluding the last 4 rows).
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address FROM dbo.Customers ORDER BY CustomerID DESC OFFSET (4) ROWS
Basically, each execution of the query is done independently, which means that not only are the results sent back to the client, but the resources are also released immediately. Here is another example of how to get only the top 3 rows while excluding the first 4 rows (OFFSET=4), which means that it only reads the fifth, sixth, and seventh rows.
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../ FROM dbo.Customers ORDER BY CustomerID OFFSET (4) ROWS FETCH NEXT 3 ROWS ONLY -- FETCH FIRST 3 ROWS ONLY GO
If we need to read the next three rows (the eighth, ninth, and tenth rows), we have to increase the OFFSET value by three.
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../
FROM dbo.Customers
ORDER BY CustomerID
OFFSET (7) ROWS -- or OFFSET (7) ROW
FETCH NEXT 3 ROWS ONLY -- or FETCH FIRST 3 ROWS ONLY
GO
Here the script to read dynamically each result by using variables (and optionally OPTIMIZE FOR hint to optimise this Ad-Hoc query).
DECLARE @Start INT, @Next INT
SET @Start = 0
SET @Next = 3
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../
FROM dbo.Customers
ORDER BY CustomerID
OFFSET (@Start) ROWS
FETCH NEXT @Next ROWS ONLY
OPTION (OPTIMIZE FOR (@Start = 4, @Next = 5))
GO
CREATE PROCEDURE dbo.PageFetch(@PageNumber INT, @PageSize INT)
AS
BEGIN
    SELECT ROW_NUMBER() OVER (ORDER BY CustomerID) AS 'Nro',
           CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../
    FROM dbo.Customers
    ORDER BY CustomerID
    OFFSET (@PageSize * (@PageNumber - 1)) ROWS
    FETCH NEXT @PageSize ROWS ONLY
END
EXEC dbo.PageFetch 2,3;
GO
EXEC dbo.PageFetch 3,3;
GO
EXEC dbo.PageFetch 5,3;
Luckily, OFFSET and FETCH are supported in subqueries, functions and derived tables, but not in indexed views or directly with TOP, OVER, INSERT, UPDATE, MERGE, or DELETE (except inside independent lower-level queries such as subqueries). Here is an illustration.
SELECT TOP(2) ROW_NUMBER() OVER (ORDER BY CustomerID) AS 'NRO', CompanyName, ContactName, ContactTitle
FROM (
    SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../
    FROM dbo.Customers
    ORDER BY CustomerID
    OFFSET (4) ROWS -- or OFFSET (4) ROW
    FETCH NEXT 3 ROWS ONLY -- or FETCH FIRST 3 ROWS ONLY
) AS T1
Finally, using it with Views.
CREATE VIEW dbo.v1 AS
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../
FROM dbo.Customers
ORDER BY CustomerID
OFFSET (0) ROWS -- or OFFSET (0) ROW
GO
SELECT CustomerID, CompanyName, ContactName
FROM dbo.v1
WHERE CustomerID LIKE 'AN%'
To sum up, OFFSET and FETCH are excellent improvements that make it easy to fetch only specific pages/windows of rows from a result set. Let me know any remarks you may have. Thanks for reading!