I really dislike database queries which are slow for no apparent reason. I ran into one of those today. It queries over a few thousands of well indexed rows and returned a handful, perhaps 3, records. Time to do this? 33 seconds. Well that’s no good for anybody. Digging into the query I found that it actually used a UNION
to join 3 sets of similar data together. I go by the rule of thumb that SQL operations which treat data as sets and do things with that in mind are efficient. I’m not sure where I read that but it has stuck with me over the years. What it suggests is that you should avoid doing things like looping over rows or calling functions on masses of data.
As it turns out there are actually two different UNION
operators in SQL Server: UNION
and UNION ALL
. They differ in how they handle duplicate entries. UNION
will check each entry to ensure that it exists in the output only one time.