Monday, 4 March 2024

Understanding the SQL Server Query Optimiser: A Comprehensive Overview

The SQL Server Query Optimiser is a crucial component of Microsoft SQL Server, responsible for determining the most efficient way to execute a given query. The optimiser takes into consideration various factors, including the structure of the query, available indexes, statistics about the data, and the overall database schema. Here is a detailed breakdown of its functionalities, processes, and related concepts. 

1. Purpose of the Query Optimiser

The primary goals of the Query Optimiser include:

  • Generating Execution Plans: The optimiser creates a roadmap that describes how SQL Server will access the data required by the query.
  • Maximising Efficiency: By evaluating different potential execution paths, the optimiser selects the one that minimises resource utilisation, such as CPU, memory, and disk I/O.
  • Improving Response Times: A well-optimised query results in faster execution, contributing to better overall system performance.
2. Types of Query Optimisation

SQL Server employs mainly Cost-Based Optimisation (CBO), which evaluates multiple execution plans based on estimated costs and selects the most efficient one. While SQL Server predominantly uses CBO, it is useful to know about older methodologies such as Rule-Based Optimisation (RBO), which relies on fixed rules rather than cost estimation.

3. Execution Plans

The execution plan generated by the Query Optimiser outlines the specific operations that SQL Server will perform to execute a query. Key elements of an execution plan include:

  • Access Methods: Indicates how SQL Server retrieves data, e.g. using an index seek or a table scan.
  • Join Methods: Describes how multiple tables are combined, e.g. nested loops, hash joins, or merge joins.
  • Order of Operations: Shows the sequence in which operations are performed to optimise performance.
Example of an Execution Plan: Consider a simple SQL query:
select emp_id, first_name, last_name, department 
from employees 
where department = 'sales';

The optimiser might evaluate several potential execution plans:

  1. Index Seek: If there is an index on the department column, an index seek will be the preferred method, as it efficiently retrieves only the relevant rows.
  2. Table Scan: If no suitable index exists, SQL Server may perform a table scan, reading every row in the employees table, which is less efficient, especially with large datasets.

You can view the execution plan in SQL Server Management Studio (SSMS) by using the following commands:

set showplan_xml on;  
go  
select emp_id, first_name, last_name, department 
from employees 
where department = 'sales';  
go  
set showplan_xml off;

This will return the execution plan in XML format, allowing you to analyse the chosen strategy.

4. Statistics

Statistics are critical for the Query Optimiser as they provide insights into the distribution of data within tables and indexes. SQL Server uses these statistics to estimate the cardinality (the number of distinct values) and density (the distribution of values) of columns.

Example of Statistics: For a table named employees, SQL Server might maintain statistics on the salary column to understand its distribution, such as the minimum, maximum, and number of distinct values.

When executing a query like:

select emp_id, first_name, last_name, salary 
from employees 
where salary between 30000 and 50000;

The optimiser will refer to the statistics for the salary column to estimate the number of rows meeting the condition. This estimation helps it decide the most efficient execution plan, such as whether to use an index seek or a table scan.

5. Query Recompilation

SQL Server can recompile queries based on changes in the database environment. This can occur when:

  • The underlying data changes significantly.
  • Indexes or statistics are added or modified.
  • Query parameters vary widely.

While recompilation ensures that the most optimal execution plan is used, excessive recompilation can introduce overhead and degrade performance. It is essential to monitor and control recompilation frequency for critical queries.

6. Optimisation Hints

Sometimes, developers may want to influence the behaviour of the Query Optimiser by using optimisation hints. These hints are directives that can be added to SQL statements to guide the optimiser towards a specific execution path. Some commonly used hints include:

  • NOLOCK: The NOLOCK hint allows a query to read data without acquiring shared locks. This can help improve performance by reducing blocking, but it may return uncommitted data (also known as dirty reads).
  • FORCESEEK: This hint forces the optimiser to use an index seek operation rather than a scan, which can improve performance in certain scenarios.
  • OPTION (RECOMPILE): This hint instructs SQL Server to recompile the query each time it is executed, which is useful for queries with highly variable parameters.
Examples of a Hints: here is an example of using the FORCESEEK hint, this forces SQL Server to use an index seek on the department column, potentially improving performance. 
select emp_id, first_name, last_name, department
from employees 
with (forceseek) 
where department = 'sales';
Here is an example of using the NOLOCK hint:
select emp_id, first_name, last_name, department
from employees 
with (nolock) 
where department = 'sales';
Both hints can be used in the same query 
select emp_id, first_name, last_name, department
from employees 
with (nolock, forceseek) 
where department = 'sales';
You can see more examples here

7. Query Execution Steps

The execution of a query by SQL Server involves several steps, including:

  1. Parsing: SQL Server first parses the SQL statement to check for syntax errors and creates a parse tree.
  2. Binding: SQL Server binds the parse tree against the database schema to ensure that all referenced objects (tables, columns, etc.) exist.
  3. Optimisation: The Query Optimiser evaluates various execution plans and selects the optimal one based on cost estimation.
  4. Execution: The selected execution plan is executed, and results are returned to the user.
8. Tools for Monitoring and Optimising Queries

To effectively monitor and optimise query performance, SQL Server provides various tools and features:

  • SQL Server Management Studio (SSMS): Use SSMS to view execution plans, analyse query performance, and identify potential bottlenecks.
  • Dynamic Management Views (DMVs): DMVs, such as sys.dm_exec_query_stats, can provide insights into query execution statistics, helping you identify slow-running queries.
  • SQL Server Profiler: This tool allows you to capture and analyse events occurring in SQL Server, aiding in performance tuning efforts.
  • SQL Server Extended Events: A lightweight performance monitoring system that captures and responds to SQL Server events, aiding in performance analysis.
  • SQL Server Activity Monitor: A built-in tool that provides real-time insights into SQL Server performance, showing active processes and resource usage.
  • Query Store: Captures query performance data over time, enabling comparisons of execution plans and runtime statistics.
  • Performance Monitor (PerfMon): Tracks various SQL Server performance counters, such as CPU usage and disk I/O.
  • Database Engine Tuning Advisor (DTA): Analyses workloads and provides recommendations for improving performance.
That's all for now.

No comments:

Post a Comment

Let me know any remarks or questions you may have. Please write down your name.

HELLO, I'M PERCY REYES! I've been working as a senior SQL Server Database Engineer for over 20 years; I'm a three-time Microsoft Data Platform MVP. I'm a cryptographer conducting research on cryptographic Boolean functions and their applications.