When working with linked servers, the most expensive factor to consider is the amount of data that will have to travel over the network.
It’s important to write the right code on the right server, since each mistake can be very expensive in this area.
Below are a few common mistakes when working with linked servers:
1. Using push rather than pull
Surprisingly, pushing data using linked server is much slower than pulling data. Linchi Shea wrote a good post about it.
* Linchi uses openquery to illustrate the difference, but this also happens when using four part name.
2. Using Join
In order to perform a join between two servers, SQL Server has to transfer the data from one server to another. When working with very large tables, this can be very painful. By dafault, the data will be transferred from the remote server to the local server.
In order to prevent this, limit the amount of data that will be transferred between the servers. You can do this by filtering the data in the where clause, having a remote stored procedure that will bring only the relevant data, or, in case you have an inner join where the local table is much smaller than the remote one, by using the REMOTE join hint, that will transfer data from the local server to the remote server in order to perform the join.
3. Using Union
As with joins, uniting two sets between servers will of course lead to transferring the data from the remote one.
But even if you run a remote query that units two sets from the same remote server, the sets will first be transferred to the local server, and only then be united.
Prevent this by having a remote procedure, function or view that will do the union.
4.Writing too complex queries
The optimizer cannot always know what you mean, especially when working with linked server queries.
For example, I encountered a query that looked like this one and ran for 10 minutes:
SELECT * FROM LocalTable WHERE SomeColumn < (SELECT COUNT(*) FROM RemoteServer.SomeDB.dbo.SomeTable WHERE SomeColumn > 100)
I changed the query to look like this:
DECLARE @Count INT SELECT @Count = COUNT(*) FROM RemoteServer.SomeDB.dbo.SomeTable WHERE SomeColumn > 100 SELECT * FROM LocalTable WHERE SomeColumn < @Count
After the rewrite, the query ran for 1 only second..
Keep it simple!
5. Working with linked server when the databases are on the same instance.
The performance penalty here is not as dramatic as in the other bullets, but working this way is slower than using three part name inside the instance. If you identify that kind of situation, test and compare performance between the two methods on your test system. Then decide if the performance improvement is worth changing the code in production. In some situations it will.