Monday, July 30, 2012

Points to remember before Query optimization in SqlServer

  • Table should have primary key
  • Table should have minimum of one clustered index
  • Table should have appropriate amount of non-clustered index
  • Non-clustered index should be created on columns of table based on query which is running
  • Following priority order should be followed when any index is created a) WHERE clause, b) JOIN clause, c) ORDER BY clause, d) SELECT clause
  • Do not to use Views or replace views with original source table
  • Triggers should not be used if possible, incorporate the logic of trigger in stored procedure
  • Remove any adhoc queries and use Stored Procedure instead
  • Check if there is atleast 30% HHD is empty – it improves the performance a bit
  • If possible move the logic of UDF to SP as well
  • Always use WHERE Clause in SELECT Queries while we don’t need all the rows to be returned. This will help to narrow the return rows else it will perform a whole table scan and waste the Sql server resources with increasing the network traffic. While scanning the whole it will lock the Table which may prevent other users to access the table.
  • Remove any unnecessary joins from table
  • If there is cursor used in query, see if there is any other way to avoid the usage of this (either by SELECT … INTO or INSERT … INTO, etc)
  • It is seen many times developers use codes like   SELECT * FROM OrderTable WHERE LOWER(UserName)='telsa'
    Instead of writing it like the below
    SELECT * FROM OrderTable WHERE UserName='telsa'
    Infact both the queries does the same work but the 2nd one is better and retrieves rows more speedly than the first query. Because Sql Server is not case sensitive
     
  • While running a query, the operators used with the WHERE clause directly affect the performance. The operators shown below are in their decreasing order of their performance.
    • =
    • >,>=,<, <=
    • LIKE
    • <>  
  • When we are writing queries containing NOT IN, then this is going to offer poor performance as the optimizer need to use nested table scan to perform this activity. This can be avoided by using EXISTS or NOT EXISTS.When there is a choice to use IN or EXIST, we should go with EXIST clause for better performance. 
  • It is always best practice to use the Index seek while the columns are covered by an index, this will force the Query Optimizer to use the index while using IN or OR clauses as a part of our WHERE clause. 
    • SELECT * FROM OrderTable WHERE Status = 1 AND OrderID IN (406,530,956) Takes more time than 
      SELECT * FROM OrderTable (INDEX=IX_OrderID) WHERE Status = 1 AND OrderID IN (406,530,956)
  • While we use IN, in the sql query it better to use one or more leading characters in the clause instead of using the wildcard character at the starting.

    SELECT * FROM CustomerTable WHERE CustomerName LIKE 'm%'
    SELECT * FROM CustomerTable WHERE CustomerName LIKE '%m'
    In the first query the Query optimizer is having the ability to use an index to perform the query and there by reducing the load on sql server. But in the second query, no suitable index can be created while running the query.
  • While there is case to use IN or BETWEEN clause in the query, it is always advisable to use BETWEEN for better result.
  • Always avoid the use of SUBSTRING function in the query.
  • The queries having WHERE clause connected by AND operators are evaluated from left to right in the order they are written. So certain things should be taken care of like
  • Provide the least likely true expressions first in the AND. By doing this if the AND expression is false at the initial stage the clause will end immediately. So it will save execution time
  • If all the parts of the AND expression are equally like being false then better to put the Complex expression first. So if the complex works are false then less works to be done.
  • While the select statement contains a HAVING clause, its better to make the WHERE clause to do most of the works (removing the undesired rows) for the Query instead of letting the HAVING clause to do the works. 
    • e.g. in a SELECT statement with GROUP BY and HAVING clause, things happens like first WHERE clause will select appropriate rows then GROUP BY divide them to group of rows and finally the HAVING clause have less works to perform, which will boost the performance.