Percy Reyes: Collation

Sunday, 21 January 2018

How to make uniform all the collations of table columns for all databases

It is important to standardise the collations of all databases in a SQL Server instance in order to avoid unforeseen code page compatibility conflicts. We may find ourselves in a complicated situation when databases with different collations are migrated from other environments into a consolidated environment that was not prepared to host them, yet they still have to live on a single server. The problem is compounded when many character columns in a database use collations that differ from each other and from the database-level collation. What's more, tempdb may be using yet another collation. So the issue can turn out to be not only complex but also time-consuming.
To give just one example, we can start by finding out which character columns use a collation other than SQL_Latin1_General_CP1_CI_AS, so that we know what needs to change to make everything uniform. The following script does that. In this example I am assuming that we want to use SQL_Latin1_General_CP1_CI_AS for all objects on the database server.

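-- sp_MSforeachdb is an undocumented procedure; ? is replaced with each database name.
EXEC sp_MSforeachdb '
USE [?]
select db_name() as database_name, schema_name(tb.schema_id) as table_schema,
       tb.name as table_name, c.name as column_name, c.collation_name
from sys.columns c
inner join sys.types t on t.user_type_id = c.user_type_id
inner join sys.tables tb on c.object_id = tb.object_id
where c.collation_name is not null
and t.is_user_defined = 0 and tb.is_ms_shipped = 0 and tb.name <> ''sysdiagrams''
and c.collation_name <> ''SQL_Latin1_General_CP1_CI_AS''
order by tb.name, c.column_id'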
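
Alongside the column-level check, it can help to see the collation of each database and of the instance itself, since tempdb normally takes the instance collation. The following is a minimal sketch using the sys.databases catalog view and SERVERPROPERTY; the target collation is the one assumed throughout this post and the status labels are illustrative.

```sql
-- Collation of every database, flagging those that differ from the target.
SELECT name AS database_name,
       collation_name,
       CASE WHEN collation_name = 'SQL_Latin1_General_CP1_CI_AS'
            THEN 'OK' ELSE 'DIFFERENT' END AS status
FROM sys.databases
ORDER BY name;

-- Collation of the instance.
SELECT SERVERPROPERTY('Collation') AS server_collation;
```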

After that, we can decide to change all character columns in all databases to SQL_Latin1_General_CP1_CI_AS, and then make the same change at database level. Be cautious and make sure that tempdb's collation is also set to SQL_Latin1_General_CP1_CI_AS.

-- sp_MSforeachdb is undocumented; its command parameter is nvarchar(2000), so keep the text short.
-- max_length is in bytes: -1 means (max), and nchar/nvarchar use 2 bytes per character.
-- Legacy text columns are converted to varchar(max).
EXEC sp_MSforeachdb '
USE [?]
if db_name() not in (''master'',''tempdb'',''msdb'',''model'')
begin
    select ''ALTER TABLE '' + QUOTENAME(SCHEMA_NAME(tb.schema_id)) + ''.''
    + QUOTENAME(tb.name) + '' ALTER COLUMN '' + QUOTENAME(c.name) + '' ''
    + case when t.name = ''text'' then ''[varchar](max)''
      else QUOTENAME(t.name) + ''('' + case when c.max_length = -1 then ''max''
        when t.name in (''nchar'',''nvarchar'') then CAST(c.max_length/2 AS VARCHAR(10))
        else CAST(c.max_length AS VARCHAR(10)) end + '')'' end
    + '' COLLATE SQL_Latin1_General_CP1_CI_AS''
    + case when c.is_nullable = 1 then '' NULL;'' else '' NOT NULL;'' end as cmd
    INTO #TblTMP
    from sys.columns c
    inner join sys.types t on t.user_type_id = c.user_type_id
    inner join sys.tables tb on c.object_id = tb.object_id
    where c.collation_name is not null
    and t.is_user_defined = 0 and tb.is_ms_shipped = 0 and tb.name <> ''sysdiagrams''
    and c.collation_name <> ''SQL_Latin1_General_CP1_CI_AS''

    declare @cmd nvarchar(max)
    declare c_cmd cursor for
        select cmd from #TblTMP
    open c_cmd
    fetch next from c_cmd into @cmd
    while (@@fetch_status = 0)
    begin
        exec (@cmd)
        fetch next from c_cmd into @cmd
    end
    close c_cmd
    deallocate c_cmd
    drop table #TblTMP
end'

It is worth noting that the script above may raise errors because some indexes use one of the columns we are trying to alter. In that likely event, it is advisable to drop those indexes, run the script again, and then recreate them. That is all for the time being. Let me know if you have any remarks.
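
To see in advance which indexes would block the change, a query along these lines may help. It is a sketch, meant to be run inside one database, that lists indexes containing a column whose collation is not the target one; the column aliases are illustrative.

```sql
-- Indexes that include a character column with a non-target collation.
SELECT SCHEMA_NAME(tb.schema_id) AS table_schema,
       tb.name AS table_name,
       i.name AS index_name,
       i.is_primary_key,
       i.is_unique_constraint,
       c.name AS column_name,
       c.collation_name
FROM sys.indexes i
INNER JOIN sys.index_columns ic
        ON ic.object_id = i.object_id AND ic.index_id = i.index_id
INNER JOIN sys.columns c
        ON c.object_id = ic.object_id AND c.column_id = ic.column_id
INNER JOIN sys.tables tb
        ON tb.object_id = i.object_id
WHERE c.collation_name IS NOT NULL
  AND c.collation_name <> 'SQL_Latin1_General_CP1_CI_AS'
  AND tb.is_ms_shipped = 0
ORDER BY tb.name, i.name, ic.index_column_id;
```

Rows with is_primary_key or is_unique_constraint set to 1 belong to constraints, so those would have to be dropped and recreated as constraints rather than as plain indexes.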

Thursday, 4 January 2018

Table-valued user-defined functions and the database collation

When it comes to altering database collations, we may run into problems that slow us down at first, and even after digging into the error messages we might not spot the cause easily. For instance, executing the following script to change the collation at database level may raise an error saying that some objects depend on the database collation, so the change cannot be made. Here is the script.

ALTER DATABASE MyDBUser SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE; 
ALTER DATABASE MyDBUser COLLATE SQL_Latin1_General_CP1_CI_AS;  
ALTER DATABASE MyDBUser SET MULTI_USER WITH ROLLBACK IMMEDIATE;

As far as we know, collations are closely tied to character columns, and that includes the columns returned by table-valued user-defined functions, since their definitions may contain character columns. When these functions are created, their columns inherit the database collation by design. Here is the error message:

Msg 5075, Level 16, State 1, Line 1
The object 'TVFUserTable' is dependent on database collation. The database collation cannot be changed if a schema-bound object depends on it. Remove the dependencies on the database collation and then retry the operation.
Msg 5072, Level 16, State 1, Line 1
ALTER DATABASE failed. The default collation of database 'MyDBUser' cannot be set to SQL_Latin1_General_CP1_CI_AS.

All we need to do to change the collation at database level is first drop every schema-bound object that depends on it, then make the change, and finally create the objects again. That is all for now. Let me know if you have any remarks.
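
To find candidates before running ALTER DATABASE, a query along these lines may help. It is a sketch that lists table-valued functions returning character columns, together with their definitions so that they can be recreated afterwards. Other object types (for example computed columns or check constraints) may also depend on the database collation and are not covered here.

```sql
-- Table-valued functions (multi-statement TF and inline IF) with character columns.
SELECT SCHEMA_NAME(o.schema_id) AS function_schema,
       o.name AS function_name,
       o.type_desc,
       c.name AS column_name,
       c.collation_name,
       m.definition          -- keep this text to recreate the function later
FROM sys.objects o
INNER JOIN sys.columns c ON c.object_id = o.object_id
INNER JOIN sys.sql_modules m ON m.object_id = o.object_id
WHERE o.type IN ('TF', 'IF')
  AND c.collation_name IS NOT NULL
ORDER BY o.name, c.column_id;
```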

Wednesday, 5 April 2017

Looking into Modern_Spanish and Latin1_General

The collations Modern_Spanish and Latin1_General are both Windows collations and support the same character set (code page 1252), but they differ in how they treat some characters when sorting and comparing. Some of the differences are especially noticeable when accent-insensitive comparison is used: in Modern_Spanish, n and ñ are considered different characters, whereas in Latin1_General ñ is considered an accented version of n.

Today I am going to show an example to make this clear, so that you can take it on board and pre-empt unexpected results or collation conflicts. This example is based on Modern_Spanish_CI_AI and Latin1_General_CI_AI.

create table TempTable (
   Text_in_Latin1  varchar(100) collate Latin1_General_CI_AI,
   Text_in_Modern varchar(100) collate Modern_Spanish_CI_AI
)
go
insert into TempTable values ('Español', 'Español')
insert into TempTable values ('Espanol', 'Espanol')
insert into TempTable values ('Espanól', 'Espanól')
go
select Text_in_Latin1 from TempTable where Text_in_Latin1='Espanol'
go
select Text_in_Modern from TempTable where Text_in_Modern='Espanol'
go
drop table TempTable

Text_in_Latin1
--------------------------------------
Español
Espanol
Espanól

Text_in_Modern
--------------------------------------
Espanol
Espanól

As you can see, the first result shows that Español, Espanol and Espanól are treated as the same word under the accent-insensitive Latin1_General collation, whereas in the second one Español is treated as a different word from Espanol and only the accent in Espanól is ignored. It is worth noting that you should change a collation only if you really understand the resulting change in behaviour. For example, one might think that Latin1_General_CI_AS (a Windows collation) and SQL_Latin1_General_CP1_CI_AS (a SQL collation) are the same, but they are not: there is a slight difference in sort behaviour between them for specific special characters such as '-'. Here is another example:

create table TempTable (
   Text_in_Latin1  varchar(100) collate Latin1_General_CI_AS,
   Text_in_SQL_Latin1 varchar(100) collate SQL_Latin1_General_CP1_CI_AS
)
go
insert into TempTable values ('es-pe', 'es-pe')
insert into TempTable values ('espe', 'espe')
go
select Text_in_Latin1 from TempTable order by Text_in_Latin1
go
select Text_in_SQL_Latin1 from TempTable order by Text_in_SQL_Latin1
go
drop table TempTable

Text_in_Latin1
---------------------------------------
espe
es-pe

Text_in_SQL_Latin1
---------------------------------------
es-pe
espe
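
The collation conflict mentioned earlier can be illustrated with the same two collations. This is a minimal sketch that reuses the table from the example above: comparing the two columns directly fails, and an explicit COLLATE clause on one side tells SQL Server which collation to use for the comparison.

```sql
create table TempTable (
   Text_in_Latin1  varchar(100) collate Latin1_General_CI_AS,
   Text_in_SQL_Latin1 varchar(100) collate SQL_Latin1_General_CP1_CI_AS
)
go
insert into TempTable values ('es-pe', 'es-pe')
go
-- Fails with a collation conflict error: the two columns have different collations.
select * from TempTable where Text_in_Latin1 = Text_in_SQL_Latin1
go
-- Works: COLLATE states which collation the comparison should use.
select * from TempTable
where Text_in_Latin1 = Text_in_SQL_Latin1 collate Latin1_General_CI_AS
go
drop table TempTable
```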

To be honest, I personally like having the same collation for all databases, but the change has to be done carefully and, more importantly, only if you really want it. That is all for now. Let me know if you have any remarks.

Tuesday, 24 May 2016

How to change collation of all columns

While thinking about tools that could be useful for DBA tasks, I decided to share my code to change the collation of all columns in all SQL Server tables (user tables, not system tables). To begin with, here is a basic query that filters columns by a specific collation:

select tb.schema_id, tb.name, c.name, c.collation_name, t.name, c.max_length, c.is_nullable, c.column_id
from sys.columns c
inner join sys.types t on t.user_type_id = c.user_type_id
inner join sys.tables tb on c.object_id = tb.object_id
where c.collation_name is not null
and t.is_user_defined = 0 and tb.is_ms_shipped = 0 and tb.name <> 'sysdiagrams'
and c.collation_name <> 'SQL_Latin1_General_CP1_CI_AS'
order by tb.name, c.column_id

Executing it lists every column whose collation differs from the one you want. Next is the code that generates the statements to change the collation of those columns. After executing it, you must take the output and execute it to have the column collations changed. In this example I am using SQL_Latin1_General_CP1_CI_AS as the target collation; replace it according to your requirements.

-- Generates one ALTER TABLE ... ALTER COLUMN statement per column to fix.
-- max_length is in bytes: -1 means (max), and nchar/nvarchar use 2 bytes per character.
-- Legacy text/ntext columns are not handled here; the statements generated for them will fail.
select 'ALTER TABLE ' + QUOTENAME(SCHEMA_NAME(tb.schema_id)) + '.' + QUOTENAME(tb.name)
 + ' ALTER COLUMN ' + QUOTENAME(c.name) + ' ' + QUOTENAME(t.name)
 + '(' + case when c.max_length = -1 then 'max'
              when t.name in ('nchar', 'nvarchar') then CAST(c.max_length/2 AS VARCHAR(10))
              else CAST(c.max_length AS VARCHAR(10)) end + ')'
 + ' COLLATE SQL_Latin1_General_CP1_CI_AS'
 + case when c.is_nullable = 1 then ' NULL;' else ' NOT NULL;' end as cmd
from sys.columns c
inner join sys.types t on t.user_type_id = c.user_type_id
inner join sys.tables tb on c.object_id = tb.object_id
where c.collation_name is not null
and t.is_user_defined = 0 and tb.is_ms_shipped = 0 and tb.name <> 'sysdiagrams'
and c.collation_name <> 'SQL_Latin1_General_CP1_CI_AS'
order by tb.name, c.column_id

Be careful, because executing the generated code may fail: some columns may be referenced by constraints (for instance foreign keys, primary keys or unique constraints) or by indexes, which you need to drop first and then recreate after changing the collation of the column.
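
As a starting point for spotting such dependencies, the following sketch lists foreign keys in which either the referencing or the referenced column still has a collation other than the target one. It relies on the sys.foreign_keys and sys.foreign_key_columns catalog views; the column aliases are illustrative.

```sql
-- Foreign keys touching a character column with a non-target collation.
SELECT fk.name AS foreign_key_name,
       OBJECT_SCHEMA_NAME(fkc.parent_object_id) AS child_schema,
       OBJECT_NAME(fkc.parent_object_id) AS child_table,
       pc.name AS child_column,
       OBJECT_SCHEMA_NAME(fkc.referenced_object_id) AS parent_schema,
       OBJECT_NAME(fkc.referenced_object_id) AS parent_table,
       rc.name AS parent_column
FROM sys.foreign_key_columns fkc
INNER JOIN sys.foreign_keys fk
        ON fk.object_id = fkc.constraint_object_id
INNER JOIN sys.columns pc
        ON pc.object_id = fkc.parent_object_id AND pc.column_id = fkc.parent_column_id
INNER JOIN sys.columns rc
        ON rc.object_id = fkc.referenced_object_id AND rc.column_id = fkc.referenced_column_id
WHERE pc.collation_name <> 'SQL_Latin1_General_CP1_CI_AS'
   OR rc.collation_name <> 'SQL_Latin1_General_CP1_CI_AS'
ORDER BY fk.name, fkc.constraint_column_id;
```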
To finish, I highly recommend testing the generated code on a copy of your database to check whether any errors appear for the reasons explained above, and then doing whatever is necessary. Once you are sure that there are no errors, you can proceed with the execution on your production database. Please let me know if you have any remarks or questions. Thanks for reading!
