WHERE SELECT Patients.Name, Doctors.Specialty FROM Patients, Doctors WHERE Patients.DoctorID = Doctors.DoctorID
JOIN SELECT Patients.Name, Doctors.Specialty FROM Patients INNER JOIN Doctors ON Patients.DoctorID = Doctors.DoctorID
They look nearly identical. So which one should you use?
Well to start with, the two methods may be functionally identical in many cases because the database engine's query optimizer is smart enough to see that the tables must be joined using said criteria with either method.
I did some testing on a table in one of my databases that has 1.6 million records and one that has 620k records, and the query optimizer chose the same execution plan for both methods (JOIN and WHERE), even with joining three copies of the same table.
So I searched for help online to find out if there is one I should use. What I found is that JOIN is the preferred method, for several reasons:
- It is the 'correct' way to join tables, at least according to ANSI standards. Newer databases in general support JOIN, older databases in general do not.
- In some databases you can't do LEFT or RIGHT joins with WHERE clauses, so you'd have to rewrite your query. In databases that perform left-join operations using special WHERE syntax, queries sometimes have to be significantly rewritten to function with JOIN syntax. See ANSI Joins by Neil Boyle.
Using the example queries from above, what if you wanted to show patients in your list, even if they did not have a doctor assigned to them (DoctorID = Null)? You must use a LEFT JOIN:
SELECT Patients.Name, Doctors.Specialty FROM Patients LEFT JOIN Doctors ON Patients.DoctorID = Doctors.DoctorID
Patients with no assigned doctor will show Null in the Specialty field, while those with an assignment will show the Specialty as usual. The example query above only shows patients who have a doctor assigned.
- Using JOIN separates the conditions for the query selection from the conditions for the table join. This simplifies the WHERE clause and allows you to more quickly scan it for the conditions in a query.
- Queries to find unmatched rows or 'orphans' using NOT IN or NOT EXISTS with a WHERE clause are less efficient than using JOIN. To find all doctors who are not assigned to a patient:
SELECT Doctors.Name FROM Doctors LEFT JOIN Patients ON Doctors.DoctorID = Patients.DoctorID WHERE Patients.DoctorID IS NULL
In databases that support JOIN, I don't think there is a way to do such an unmatched query using a WHERE clause.
Performance Tuning SQL Server Joins at SQL-Server-Performance.Com (excerpt as follows:) When you create joins using Transact-SQL, you can choose between two different types of syntax: either ANSI or Microsoft. ANSI refers to the ANSI standard for writing joins, and Microsoft refers to the old Microsoft style of writing joins. For example:
ANSI JOIN Syntax
SELECT fname, lname, department FROM names INNER JOIN departments ON names.employeeid = departments.employeeid
Former Microsoft JOIN Syntax
SELECT fname, lname, department FROM names, departments WHERE names.employeeid = departments.employeeid
If written correctly, either format will produce identical results. But that is a big if. The older Microsoft join syntax lends itself to mistakes because the syntax is a little less obvious. On the other hand, the ANSI syntax is very explicit and there is no chance you can make a mistake.
For example, I ran across a slow-performing query from an ERP program. After reviewing the code, which used the Microsoft JOIN syntax, I noticed that instead of creating a LEFT JOIN, the developer had accidentally created a CROSS JOIN instead. In this particular example, less than 10,000 rows should have resulted from the LEFT JOIN, but because a CROSS JOIN was used, over 11 million rows were returned instead. Then the developer used a SELECT DISTINCT to get rid of all the unnecessary rows created by the CROSS JOIN. As you can guess, this made for a very lengthy query. Unfortunately, all I could do was notify the vendor's support department about it.
The moral of this story is that you probably should be using the ANSI syntax, not the old Microsoft syntax. Besides reducing the odds of making silly mistakes, this code is more portable between database, and eventually, I imagine Microsoft will eventually stop supporting the old format, making the ANSI syntax the only option.