Customer Question: "Why I am getting different results when is change collation compatibility from true to false?"
This actually turned out to be an interesting issue because no error was raised but the results are different.
The query looked like the following:
select * from
REMOTESERVER...tblTest as t
where t.strID = (select TOP 1 strID from myLocalTable)
INCORRECT RESULTS: When collation compatibility was set to TRUE
CORRECT RESULTS: When collation compatibility was set to FALSE
A linked server operation attempts to determine what predicates can be pushed to the remote server, what indexes are present and other such properties. For the example query if the strID predicate can be pushed to the remote server it saves the SQL Server from having to pull the entire table to the local server, sort and process it.
When collation compatibility is enabled (true) the linked server interrogates the remote table information. The tblTest reports it has a unique primary key so SQL Server makes the assumption that the data can be ordered by the strID column when retrieving it from the remote server.
The statistics profile snippet shows the remote query.
1 1 |--Merge Join(Inner Join, MANY-TO-MANY MERGE:([tblTest].[strID]=[myLocalTable].strId)
3 1 |--Remote Query(SOURCE:(REMOTESERVER), QUERY:(SELECT "tblTest"."strId" ...
When collation compatibility is disabled (false) SQL Server must retrieve a larger set of rows, store them in tempdb and sort them. Notice this statistics profile snippet indicating that 30 rows are retrieved, sorted and then joined.
3 1 |--Merge Join(Inner Join, MANY-TO-MANY MERGE:([tblTest].[strID]=[myLocalTable].strId)
30 1 |--SORT:([#tbl_0001]:[strId] asc) 30 1 |--Remote Query(SOURCE:(REMOTESERVER), QUERY:(SELECT "tblTest"."strId" ...
The problem was that the remote collation was EBCDIC and SQL Server was set to Latin General 1.
From the remote query here is how the strID is sorted:
Then if strID is sorted locally here is the order:
What happens is the merge Join is going to process in a control/break way. When it advances to a value beyond what is required to complete the TOP 1 join request execution is terminated. Because the sort order was actually different telling the SQL Server to trust the remote collation results in unexpected results.
So when the collation is not compatible set be sure you have collation compatible set OFF.
If you pull the results to the SQL Server and sort it performance can become an issue.
Be sure to test this carefully with you data but a workaround may be to enable 'Use Remote Collation' and specify a matching 'Collation Name'. In this case it was SQL_EBCDIC037_CP1_CS_AS.
The Linked server query behavior now uses EBCDIC sorting rules when communicating with the remote server. SQL Server can push predicates and assume ordering, keeping the results accurate and performance high.
Areas like bulk insert and a few others have been updated to detect such sorting problems at runtime. If the next row is not in sorted order an error is raised.
Sarah will be following up with the SQL Server development team for this linked server issue. Adding similar logic to linked server query processing would provide an error message instead of the unexpected results set.
SQL Server Support Escalation Engineer
SQL Server Senior Escalation Engineer