I really dislike database queries which are slow for no apparent reason. I ran into one of those today. It queries over a few thousands of well indexed rows and returned a handful, perhaps 3, records. Time to do this? 33 seconds. Well that’s no good for anybody. Digging into the query I found that it actually used a UNION to join 3 sets of similar data together. I go by the rule of thumb that SQL operations which treat data as sets and do things with that in mind are efficient. I’m not sure where I read that but it has stuck with me over the years. What it suggests is that you should avoid doing things like looping over rows or calling functions on masses of data.
As it turns out there are actually two different UNION operators in SQL Server: UNION and UNION ALL. They differ in how they handle duplicate entries. UNION will check each entry to ensure that it exists in the output only one time.