http://csharpwithdj.blogspot.com/
SQL Server Performance Tuning for Stored Procedures
Whenever a client application needs to send Transact-SQL to SQL Server, send it in the form of a stored procedure instead of a script or embedded Transact-SQL. Stored procedures offer many benefits, including:
• Reduced network traffic and latency, boosting application performance.
• Stored procedure execution plans can be reused, staying cached in SQL Server's memory, reducing server overhead.
• Client execution requests are more efficient. For example, if an application needs to INSERT a large binary value into an image data column without using a stored procedure, it must convert the binary value to a character string (which doubles its size) and send it to SQL Server. When SQL Server receives it, it must convert the character value back to binary format. This is a lot of wasted overhead. A stored procedure eliminates the issue because parameter values stay in binary format all the way from the application to SQL Server, reducing overhead and boosting performance.
• Stored procedures help promote code reuse. While this does not directly boost an application's performance, it can boost the productivity of developers by reducing the amount of code required, along with reducing debugging time.
• Stored procedures can encapsulate logic. You can change stored procedure code without affecting clients (assuming you keep the parameters the same and don't remove any result set columns). This saves developer time.
• Stored procedures provide better security for your data. If you use stored procedures exclusively, you can remove direct SELECT, INSERT, UPDATE, and DELETE rights from the tables and force developers to use stored procedures as the method for data access. This saves the DBA time.
Keep in mind that just because you use a stored procedure does not mean that it will run fast. The code you use within your stored procedure must be well designed for both speed and reuse.
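The security benefit listed above can be sketched in a few statements. This is only an illustration: the table dbo.Orders, the procedure dbo.usp_GetOrder, and the database role AppRole are hypothetical names, not objects from this article.

```sql
-- Hypothetical objects: dbo.Orders (table), dbo.usp_GetOrder (procedure), AppRole (database role).
-- Take away direct table access from the application role...
REVOKE SELECT, INSERT, UPDATE, DELETE ON dbo.Orders FROM AppRole;

-- ...and allow data access only through the stored procedure.
GRANT EXECUTE ON dbo.usp_GetOrder TO AppRole;
```

As long as the procedure and the table have the same owner, members of AppRole can run the procedure even though they hold no rights on the table itself.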
*****
One of the biggest advantages of using stored procedures over not using stored procedures is the ability to significantly reduce network traffic. And the more network traffic that can be reduced, the better the overall performance of your SQL Server-based applications.
Here are some examples of how stored procedures reduce network traffic:
• When an application executes a stored procedure, only a simple, small RPC (remote procedure call) is made from the client to SQL Server. But if the application is not using stored procedures and instead sends Transact-SQL code directly from the client to SQL Server, network traffic can often be very high. For example, if the amount of Transact-SQL code is 500 lines (and this would not be all that unusual), then it would take hundreds of network packets to transmit the Transact-SQL code from the client to SQL Server. On the other hand, if the 500 lines of Transact-SQL code are in a stored procedure, this code never has to travel the network, as it is already located on the server.
• When an application needs to retrieve one or more rows from SQL Server and then take some action on this data, such as inserting, updating, or deleting rows in the database based on the data retrieved, network traffic is significantly reduced if all this code is stored in a stored procedure. As before, it only takes a single RPC to execute a stored procedure. But if the code to perform these steps is located in the application instead, network traffic can be high. First, the application has to send the Transact-SQL code to SQL Server (lots of potential network traffic). Then SQL Server has to return the result set to the client, the client has to use the data and send additional requests (INSERT, UPDATE, DELETE) to SQL Server, SQL Server has to respond to the client, and so on, until the task is completed. As you can see, this can generate a lot of network traffic. But if all the work is done from within a stored procedure, network traffic is greatly reduced.
• Along the same lines as above, putting the business logic of your application in stored procedures can help your application's performance. By locating virtually all of the processing on SQL Server, round-trip network traffic is greatly reduced, helping boost performance.
The goal should be to limit network traffic from the client to SQL Server to simple RPCs, and to limit the traffic from SQL Server to the client to finished results.
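As a rough sketch of the read-then-modify case described above, the procedure below keeps every intermediate row on the server, so the client sends one call and nothing else. The tables dbo.Orders and dbo.OrderAudit, their columns, and the procedure name are assumptions made up for this illustration.

```sql
-- Hypothetical schema: dbo.Orders(OrderID, CustomerID, Status), dbo.OrderAudit(OrderID, Note).
CREATE PROCEDURE dbo.usp_CancelOpenOrders
    @CustomerID int
AS
BEGIN
    SET NOCOUNT ON;

    -- Record which orders are about to change; these rows never travel to the client.
    INSERT INTO dbo.OrderAudit (OrderID, Note)
    SELECT OrderID, 'Cancelled by usp_CancelOpenOrders'
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID AND Status = 'Open';

    -- Apply the change in the same server-side call.
    UPDATE dbo.Orders
    SET Status = 'Cancelled'
    WHERE CustomerID = @CustomerID AND Status = 'Open';
END
GO

-- The only traffic from the client is this single call:
EXEC dbo.usp_CancelOpenOrders @CustomerID = 42;
```

Without the procedure, the client would first have to fetch the open orders, then send an INSERT and an UPDATE, with a round trip for each step.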
*****
To help identify performance problems with stored procedures, use SQL Server Profiler's Create Trace Wizard to run the "Profile the Performance of a Stored Procedure" trace. It provides the data you need to identify poorly performing stored procedures.
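The Create Trace Wizard belongs to older releases of Profiler and may not exist in the version you run. As a swapped-in alternative (not the method described above), SQL Server 2008 and later expose cached procedure statistics through the sys.dm_exec_procedure_stats view. A minimal sketch, assuming you have permission to view server state:

```sql
-- Ten cached procedures in the current database with the highest total elapsed time.
-- The figures cover only the period since each plan was cached.
SELECT TOP (10)
    OBJECT_NAME(ps.object_id, ps.database_id)      AS procedure_name,
    ps.execution_count,
    ps.total_elapsed_time / ps.execution_count     AS avg_elapsed_microseconds,
    ps.total_logical_reads / ps.execution_count    AS avg_logical_reads
FROM sys.dm_exec_procedure_stats AS ps
WHERE ps.database_id = DB_ID()
ORDER BY ps.total_elapsed_time DESC;
```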
*****
By default, every time a stored procedure is executed, a message is sent from the server to the client indicating the number of rows that were affected by the stored procedure. Rarely is this information useful to the client. By turning off this default behavior, you can reduce network traffic between the server and the client, helping to boost overall performance of your server and applications.
There are two main ways to turn this feature off. You can also turn this feature off using a server trace setting, but it is unnecessary as there are easier ways, as described here.
To turn this feature off at the stored procedure level, you can include the statement:
SET NOCOUNT ON
at the beginning of each stored procedure you write.
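In context, the statement sits at the top of the procedure body. The sketch below assumes the pubs sample database (the authors table with its au_id and contract columns); the procedure name is made up for illustration.

```sql
CREATE PROCEDURE dbo.usp_MarkAuthorUnderContract
    @au_id varchar(11)
AS
BEGIN
    SET NOCOUNT ON;  -- suppress the "n rows affected" message for every statement below

    UPDATE dbo.authors
    SET contract = 1
    WHERE au_id = @au_id;
END
GO
```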
If you want this feature turned off for your entire server, you can do this by running these statements at your server:
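-- 512 is the NOCOUNT bit of 'user options'; this value replaces any user options already set.
EXEC sp_configure 'user options', 512;
RECONFIGURE;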
You may or may not want to do this for your entire server, as it affects every transaction on your server. For example, some application programs need the count information and will not work without it. If this is the case, you don't want to turn this option on for the entire server, but only in the stored procedures you write that don't need the count information.
*****
Keep Transact-SQL transactions as short as possible within a stored procedure. This helps to reduce the number of locks, helping to speed up the overall performance of your SQL Server application.
Two ways to help reduce the length of a transaction are to: 1) break up the entire job into smaller steps (or multiple stored procedures) so each step can be committed as soon as possible; and 2) take advantage of SQL Server statement batches, which reduce the number of round-trips between the client and server.
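A minimal sketch of a short transaction follows. Validation happens before the transaction opens, and only the statements that must succeed or fail together sit inside it. The table dbo.Bins, its columns, and the procedure name are hypothetical, and TRY...CATCH assumes SQL Server 2005 or later.

```sql
-- Hypothetical schema: dbo.Bins(BinID, Qty).
CREATE PROCEDURE dbo.usp_TransferStock
    @FromBin int,
    @ToBin   int,
    @Qty     int
AS
BEGIN
    SET NOCOUNT ON;

    -- Work that needs no locks is done before the transaction starts.
    IF @Qty IS NULL OR @Qty <= 0
    BEGIN
        RAISERROR('Quantity must be positive.', 16, 1);
        RETURN 1;
    END

    BEGIN TRY
        BEGIN TRANSACTION;
        -- Only the two updates that belong together are inside the transaction.
        UPDATE dbo.Bins SET Qty = Qty - @Qty WHERE BinID = @FromBin;
        UPDATE dbo.Bins SET Qty = Qty + @Qty WHERE BinID = @ToBin;
        COMMIT TRANSACTION;  -- commit as soon as possible to release locks
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        RAISERROR('Transfer failed; the transaction was rolled back.', 16, 1);
        RETURN 2;
    END CATCH

    RETURN 0;
END
GO
```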
*****
When a stored procedure is first executed (and it does not have the WITH RECOMPILE option), it is optimized and a query plan is compiled and cached in SQL Server's memory. If the same stored procedure is called again, it will use the cached query plan instead of creating a new one, often saving time and boosting performance. This may or may not be what you want.
If the query in the stored procedure is exactly the same each time, and the query plan is the same each time, then this is a good thing. But if the query within the stored procedure is dynamic (for example, the WHERE clause changes from one execution of the stored procedure to the next), then this may not be a good thing. The best plan may differ from one execution to the next, and a cached query plan that is reused for what is essentially a new query may not be appropriate, so performance can suffer greatly.
If you know that your query's query plan will vary each time it is run from a stored procedure, you will want to add the WITH RECOMPILE option when you create the stored procedure. This will force the stored procedure to be re-compiled each time it is run, ensuring the query is optimized with the correct query plan each time it is run. Yes, this will circumvent the reuse of cached query plans, hurting performance a little, but it is more desirable than reusing incorrect query plans.
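The option can be attached to the procedure definition or to a single call. In this sketch, dbo.Orders, its columns, and the procedure name are hypothetical.

```sql
-- Hypothetical table: dbo.Orders(OrderID, OrderDate).
-- Every execution of this procedure compiles a fresh plan.
CREATE PROCEDURE dbo.usp_OrdersByDateRange
    @From datetime,
    @To   datetime
WITH RECOMPILE
AS
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE OrderDate BETWEEN @From AND @To;
GO

-- WITH RECOMPILE can also be requested for one call only,
-- which works on procedures created without the option as well.
EXEC dbo.usp_OrdersByDateRange @From = '20240101', @To = '20240131' WITH RECOMPILE;
```

The per-call form is useful when only a few unusual parameter values need a fresh plan and the cached plan serves the rest well.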
*****
Many stored procedures have the option to accept multiple parameters. This in and of itself is not a bad thing. But what can often cause problems is if the parameters are optional, and the number of parameters varies each time the stored procedure runs. There are two ways to handle this problem, the slow performance way and fast performance way.
If you want to save your development time, but don't care about your application's performance, you can write your stored procedure generically so that it doesn't care how many parameters it gets. The problem with this method is that you may end up unnecessarily joining tables that don't need to be joined based on the parameters submitted for any single execution of the stored procedure.
Another, much better performing way, although it will take you more time to code, is to include IF...ELSE logic in your stored procedure and create separate queries for each possible combination of parameters that can be submitted to the stored procedure. This way, you can be sure your query is as efficient as possible each time it runs.
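A small sketch of the IF...ELSE approach, assuming the pubs sample database (authors, titleauthor, and titles tables). The procedure name and its two optional parameters are made up for illustration. The joins run only in the branch that needs them.

```sql
CREATE PROCEDURE dbo.usp_FindAuthors
    @state  char(2) = NULL,   -- optional filter on the author's state
    @pub_id char(4) = NULL    -- optional filter on the publisher of the author's titles
AS
BEGIN
    SET NOCOUNT ON;

    IF @pub_id IS NULL AND @state IS NULL
        -- No filters: a plain scan of one table.
        SELECT a.au_id, a.au_lname
        FROM dbo.authors AS a;
    ELSE IF @pub_id IS NULL
        -- State filter only: still no joins required.
        SELECT a.au_id, a.au_lname
        FROM dbo.authors AS a
        WHERE a.state = @state;
    ELSE
        -- Publisher filter: only this branch pays for the joins.
        SELECT DISTINCT a.au_id, a.au_lname
        FROM dbo.authors AS a
        JOIN dbo.titleauthor AS ta ON ta.au_id = a.au_id
        JOIN dbo.titles AS t ON t.title_id = ta.title_id
        WHERE t.pub_id = @pub_id
          AND (@state IS NULL OR a.state = @state);
END
GO
```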
*****
Here's a way to handle the problem of not knowing what parameters your stored procedure might face. The problem lies with query plans, the pre-compilation of stored procedures that SQL Server does for you. As you know, one of the biggest reasons to use stored procedures instead of ad-hoc queries is the performance gained by using them. The problem is that SQL Server will only generate a query-plan for the path taken through your stored procedure when you first call it, not for all possible paths.
Let me illustrate this with an example. Consider the following procedure (pre-compilation doesn't really have a huge effect on the queries used here, but these are just for illustration purposes):
CREATE PROCEDURE dbo.spTest (@query bit) AS
IF @query = 0
SELECT * FROM authors
ELSE
SELECT * FROM publishers
GO
Suppose I make my first call to this procedure with the @query parameter set to 0. The query-plan that SQL Server will generate will be optimized for the first query ("SELECT * FROM authors"), because the path followed on the first call will result in that query being executed.
Now, if I next call the stored procedure with @query set to 1, the query plan that SQL Server has in memory will not be of any use in executing the second query, since the query-plan is optimized for the authors table, not the publishers table. Result: SQL Server will have to compile a new query plan, the one needed for the second query. If I next call the procedure with @query set to 0 again, the whole path will have to be followed from the start again, since only one query-plan will be kept in memory for each stored procedure. This will result in sub-optimal performance. (The exact behavior varies by SQL Server version. Later versions compile every statement in the procedure on the first call, using that call's parameter values, so the branch not taken first may end up with a poorly suited plan rather than no plan at all.)
As it happens I have a solution, one that I've used a lot with success. It involves the creation of what I like to call a 'delegator'. Consider again spTest. I propose to rewrite it like this:
CREATE PROCEDURE dbo.spTestDelegator (@query bit) AS
-- The delegator holds no queries of its own; it only picks the sub-procedure to run.
IF @query = 0
EXEC dbo.spTestFromAuthors
ELSE
EXEC dbo.spTestFromPublishers
GO
CREATE PROCEDURE dbo.spTestFromAuthors AS
SELECT * FROM authors
GO
CREATE PROCEDURE dbo.spTestFromPublishers AS
SELECT * FROM publishers
GO
The result of this restructuring will be that there will always be an optimized query-plan for spTestFromAuthors and spTestFromPublishers, since they only hold one query. The only one getting re-compiled over and over again is the delegator, but since this stored procedure doesn't actually hold any queries, that won't have a noticeable effect on execution time. Of course re-compiling a plan for a simple 'SELECT *' from a single table will not give you a noticeable delay either (in fact, the overhead of an extra stored procedure call may be bigger than the re-compilation of "SELECT * FROM AnyTable"), but as soon as the queries get bigger, this method certainly pays off.
The only downside to this method is that now you have to manage three stored procedures instead of one. This is not much of a problem, though, as the different stored procedures can be considered one single 'system', so it would be logical to keep all of them together in the same script, which would be just as easy to edit as a single stored procedure would be. As far as security is concerned, this method shouldn't give you any extra headaches either. The delegator is the only stored procedure directly called by the client, so it is the only one you need to manage permissions on. The rest will only be called by the delegator, which will always work as long as those stored procedures are owned by the same user as the delegator.
I've had large successes using this technique. Recently I developed a (partial full-text) search engine for our reports database, which resulted in a stored procedure that originally ran about 20 seconds. After employing the above technique, the stored procedure only took about 2 seconds to run, resulting in a ten-fold increase in performance!
*****
While temporary stored procedures can provide a small performance boost in some circumstances, using a lot of temporary stored procedures in your application can actually create contention in the system tables and hurt performance.
Instead of using temporary stored procedures, you may want to consider using the sp_executesql stored procedure. It provides the same benefits as temporary stored procedures, but it does not store data in the system tables, avoiding contention problems.
*****
If you are creating a stored procedure to run in a database other than the Master database, don't use the prefix "sp_" in its name. This special prefix is reserved for system stored procedures. Although using this prefix will not prevent a user defined stored procedure from working, what it can do is to slow down its execution ever so slightly.
The reason for this is that by default, any stored procedure executed by SQL Server that begins with the prefix "sp_", is first attempted to be resolved in the Master database. Since it is not there, time is wasted looking for the stored procedure.
If SQL Server cannot find the stored procedure in the Master database, then it next tries to resolve the stored procedure name as if the owner of the object is "dbo". Assuming the stored procedure is in the current database, it will then execute. To avoid this unnecessary delay, don't name any of your stored procedures with the prefix "sp_".
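To check an existing database for procedures that carry the reserved prefix, a query along these lines can help. It is a sketch that assumes SQL Server 2005 or later, where the sys.procedures catalog view is available.

```sql
-- List user stored procedures in the current database whose names start with "sp_".
-- The underscore is wrapped in brackets because a bare "_" is a single-character wildcard in LIKE.
SELECT SCHEMA_NAME(schema_id) AS schema_name,
       name                   AS procedure_name
FROM sys.procedures
WHERE name LIKE 'sp[_]%'
ORDER BY name;
```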
*****
Before you are done with your stored procedure code, review it for any unused code, parameters, or variables that you may have forgotten to remove while you were making changes, and remove them. Unused code just adds unnecessary bloat to your stored procedures, although it will not necessarily negatively affect performance of the stored procedure.
*****
For best performance, all objects that are called within the same stored procedure should be owned by the same object owner or schema, preferably dbo, and should also be referred to in the format of object_owner.object_name or schema_name.object_name.
If the object owners or schemas are not specified for objects, then SQL Server must perform name resolution on the objects, which causes a small performance hit.
And if objects referred to in the stored procedure have different owners or schemas, SQL Server must check object permissions before it can access any object in the database, which adds unnecessary overhead. Ideally, the owner or schema of the stored procedure should own all of the objects referred to in the stored procedure.
In addition, SQL Server cannot reuse a stored procedure's "in-memory plan" over and over if the object owner or schema is not used consistently. If a stored procedure is sometimes referred to with its object owner or schema name, and sometimes it is not, then SQL Server has to do extra work to locate the plan and may have to compile it again, which also hinders performance.
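A short sketch of consistent qualification, assuming the pubs sample database. The procedure name is hypothetical. Every object is referenced as schema.object, both inside the procedure and in the call.

```sql
CREATE PROCEDURE dbo.usp_AuthorTitles
    @au_id varchar(11)
AS
BEGIN
    SET NOCOUNT ON;

    -- Both tables are schema-qualified, so no name resolution is needed at run time.
    SELECT t.title
    FROM dbo.titleauthor AS ta
    JOIN dbo.titles AS t ON t.title_id = ta.title_id
    WHERE ta.au_id = @au_id;
END
GO

-- Call the procedure with the same qualification every time.
EXEC dbo.usp_AuthorTitles @au_id = '172-32-1176';
```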
*****
When you need to execute a string of Transact-SQL, you should use the sp_executesql stored procedure instead of the EXECUTE statement. Sp_executesql offers two major advantages over EXECUTE. First, it supports parameter substitution, which gives you more options when creating your code. Second, it creates query execution plans that are more likely to be reused by SQL Server, which in turn reduces overhead on the server, boosting performance.
Sp_executesql executes a string of Transact-SQL in its own self-contained batch. When it is run, SQL Server compiles the code in the string into an execution plan that is separate from the batch that contained the sp_executesql and its string.
Learn more about how to use sp_executesql in the SQL Server Books Online.
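A minimal sketch of parameter substitution with sp_executesql, assuming the pubs sample database. The statement text stays identical from call to call, and only the parameter value changes, which is what makes the plan reusable.

```sql
DECLARE @sql nvarchar(200);

-- The statement is an nvarchar string with a parameter marker, not a concatenated value.
SET @sql = N'SELECT au_id, au_lname FROM dbo.authors WHERE state = @state';

-- Second argument: the parameter definitions. Remaining arguments: the values.
EXEC sp_executesql @sql, N'@state char(2)', @state = 'CA';
EXEC sp_executesql @sql, N'@state char(2)', @state = 'UT';  -- same text, so the plan can be reused
```

With EXECUTE, the value would have to be concatenated into the string, producing different statement text for each value and making plan reuse much less likely.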
*****
SQL Server will automatically recompile a stored procedure if any of the following happens:
• If you include a WITH RECOMPILE clause in a CREATE PROCEDURE or EXECUTE statement.
• If you run sp_recompile for any table referenced by the stored procedure.
• If any schema changes occur to any of the objects referenced in the stored procedure. This includes adding or dropping rules, defaults, and constraints.
• New distribution statistics are generated.
• If you restore a database that includes the stored procedure or any of the objects it references.
• If the stored procedure is aged out of SQL Server's cache.
• An index used by the execution plan of the stored procedure is dropped.
• A large number of INSERTs, UPDATEs, or DELETEs are made to a table referenced by a stored procedure.
• The stored procedure includes both DDL (Data Definition Language) and DML (Data Manipulation Language) statements, and they are interleaved with each other.
• If the stored procedure performs certain actions on temporary tables.
*****
One hidden performance problem of using stored procedures is when a stored procedure recompiles too often. Normally, you want a stored procedure to compile once and for it to be stored in SQL Server's cache so that it can be re-used without it having to recompile each time it is used. This is one of the major benefits of using stored procedures.
But in some cases, a stored procedure is recompiled much more often than it needs to be recompiled, hurting your server's performance. In fact, it is possible for a stored procedure to have to be recompiled while it is executing!
Here are three potential problems you want to look out for when writing stored procedures.
Unnecessary Stored Procedure Recompilations Due to Row Modifications and Automated Statistics Update
If your database has the "Auto Update Statistics" database option turned on, SQL Server will periodically automatically update the index statistics. On a busy database, this could happen many times each hour. Normally, this is a good thing because the Query Optimizer needs current index statistics if it is to make good query plan decisions. One side effect of this is that this also causes any stored procedures that reference these tables to be recompiled. Again, this is normal, as you don't want a stored procedure to be running an outdated query plan. But again, sometimes stored procedures recompile more than they have to. Here are some suggestions on how to reduce some of the unnecessary recompilations:
• Use sp_executesql instead of EXECUTE to run Transact-SQL strings in your stored procedures.
• Instead of writing one very large stored procedure, break it down into two or more sub-procedures, and call them from a controlling stored procedure.
• If your stored procedure is using temporary tables, use the KEEP PLAN query hint, which stops the stored procedure recompilations that are normally triggered by more than six changes to a temporary table. This hint should only be used for stored procedures that access temporary tables a lot but don't make many changes to them. If many changes are made, then don't use this hint.
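The KEEP PLAN hint mentioned in the last suggestion goes in an OPTION clause at the end of the query that reads the temporary table. The sketch below assumes the pubs sample database. The procedure and temporary table names are made up.

```sql
CREATE PROCEDURE dbo.usp_AuthorsPerState
AS
BEGIN
    SET NOCOUNT ON;

    CREATE TABLE #state_counts (
        state        char(2) NOT NULL,
        author_count int     NOT NULL
    );

    -- The temporary table is filled once and not changed much afterwards.
    INSERT INTO #state_counts (state, author_count)
    SELECT state, COUNT(*)
    FROM dbo.authors
    WHERE state IS NOT NULL
    GROUP BY state;

    -- KEEP PLAN relaxes the recompile threshold for this query.
    SELECT state, author_count
    FROM #state_counts
    ORDER BY author_count DESC
    OPTION (KEEP PLAN);
END
GO
```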
Unnecessary Stored Procedure Recompilations Due to Mixing DDL and DML Statements in the Same Stored Procedure
If you have a DDL (Data Definition Language) statement in your stored procedure, the stored procedure will automatically recompile when it runs across a DML (Data Manipulation Language) statement for the first time. And if you intermix both DDL and DML many times in your stored procedure, this will force a recompilation every time it happens, hurting performance.
To prevent unnecessary stored procedure recompilations, you should place all of your DDL statements at the beginning of the stored procedure so they are not intermingled with DML statements.
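As an illustration of that layout, the sketch below groups all DDL at the top and all DML after it. It assumes the pubs sample database. The procedure name and the temporary tables are hypothetical.

```sql
CREATE PROCEDURE dbo.usp_TitleReport
AS
BEGIN
    SET NOCOUNT ON;

    -- All DDL first: every temporary object is created before any data is touched.
    CREATE TABLE #titles_by_pub (pub_id char(4) NOT NULL, title_count int NOT NULL);
    CREATE TABLE #pub_names (pub_id char(4) NOT NULL, pub_name varchar(40) NULL);
    CREATE INDEX ix_titles_by_pub ON #titles_by_pub (pub_id);

    -- All DML afterwards, with no further DDL mixed in.
    INSERT INTO #titles_by_pub (pub_id, title_count)
    SELECT pub_id, COUNT(*)
    FROM dbo.titles
    WHERE pub_id IS NOT NULL
    GROUP BY pub_id;

    INSERT INTO #pub_names (pub_id, pub_name)
    SELECT pub_id, pub_name
    FROM dbo.publishers;

    SELECT p.pub_name, t.title_count
    FROM #titles_by_pub AS t
    JOIN #pub_names AS p ON p.pub_id = t.pub_id;
END
GO
```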
Unnecessary Stored Procedure Recompilations Due to Specific Temporary Table Operations
Improper use of temporary tables in a stored procedure can force them to be recompiled every time the stored procedure is run. Here's how to prevent this from happening:
• Any references to temporary tables in your stored procedure should only refer to tables created by that stored procedure, not to temporary tables created outside your stored procedure, or in a string executed using either the sp_executesql or the EXECUTE statement.
• All of the statements in your stored procedure that include the name of a temporary table should appear syntactically after the statement that creates the temporary table.
• The stored procedure should not declare any cursors that refer to a temporary table.
• Any statements in a stored procedure that refer to a temporary table should precede any DROP TABLE statement found in the stored procedure.
• The stored procedure should not create temporary tables inside a control-of-flow statement.
*****
To find out if your SQL Server is experiencing excessive recompilations of stored procedures, a common cause of poor performance, create a trace using Profiler and track the SP:Recompile event. A large number of recompilations indicates that you may have a problem. Identify which stored procedures are causing the problem, and then take corrective action (if possible) to reduce or eliminate these excessive recompilations.
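Besides the Profiler trace described above, SQL Server 2005 and later keep a plan_generation_num counter in the sys.dm_exec_query_stats view, which goes up each time a cached statement is recompiled. The query below is a swapped-in alternative, not the Profiler method itself, and is a sketch that assumes you have permission to view server state.

```sql
-- Cached statements that have been recompiled most often.
SELECT TOP (20)
    qs.plan_generation_num,                         -- grows with each recompile
    qs.execution_count,
    OBJECT_NAME(st.objectid, st.dbid) AS object_name,  -- NULL for ad hoc statements
    st.text                           AS batch_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.plan_generation_num DESC;
```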
*****
Stored procedures can better boost performance if they are called via Microsoft Transaction Server (MTS) instead of being called directly from your application. A stored procedure plan can be reused from the procedure cache only if the settings of the connection calling the stored procedure match those the plan was compiled under. If connections with different settings call the same stored procedure, SQL Server must compile and cache a separate copy of the plan for each combination of settings, which somewhat defeats the purpose of stored procedures.
The advantage of Transaction Server is that it reuses connections, so calls arrive over a small set of connections with consistent settings and cached plans can be reused more often. If you write an application where every user opens their own connection, each possibly with its own settings, then plans cannot be reused as often, reducing performance.
*****
Avoid nesting stored procedures, although it is perfectly legal to do so. Nesting not only makes debugging more difficult, but also makes it much more difficult to identify and resolve performance-related problems.
*****
If you use input parameters in your stored procedures, you should validate all of them at the beginning of your stored procedure. This way, if there is a validation problem and the client application needs to be notified of the problem, it happens before any stored procedure processing takes place, preventing wasted effort and boosting performance.
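A sketch of up-front validation, assuming the pubs sample database. The procedure name, error text, and return codes are made up for illustration.

```sql
CREATE PROCEDURE dbo.usp_GetAuthorsByState
    @state char(2)
AS
BEGIN
    SET NOCOUNT ON;

    -- Validate first, so a bad call costs almost nothing.
    -- LEN ignores trailing spaces, so a one-letter value padded to char(2) is rejected too.
    IF @state IS NULL OR LEN(@state) <> 2
    BEGIN
        RAISERROR('The state parameter must be a two-letter code.', 16, 1);
        RETURN 1;  -- non-zero return code signals failure to the caller
    END

    SELECT au_id, au_lname
    FROM dbo.authors
    WHERE state = @state;

    RETURN 0;
END
GO
```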
*****
When calling a stored procedure from your application, it is important that you call it using its qualified name, such as:
exec dbo.myProcedure
instead of:
exec myProcedure
Why? There are a couple of reasons, one of which relates to performance. First, using fully qualified names helps to eliminate any potential confusion about which stored procedure you want to run, helping to prevent bugs and other potential problems.
But more importantly, doing so allows SQL Server to access the stored procedure's execution plan more directly, which in turn speeds up the stored procedure. Yes, the performance boost is very small, but if your server is running tens of thousands or more stored procedures every hour, these little time savings can add up.
*****
If a stored procedure needs to return only a single value, and not a recordset, consider returning the single value as an output parameter. While output parameters are generally used for error-checking, they can actually be used for any reason you like. Returning a single value as an output parameter is faster than returning a single value as part of a recordset.
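A minimal sketch of returning one value through an output parameter rather than a recordset, assuming the pubs sample database. The procedure and parameter names are hypothetical.

```sql
CREATE PROCEDURE dbo.usp_CountAuthorsInState
    @state        char(2),
    @author_count int OUTPUT   -- the single value comes back here, not in a result set
AS
BEGIN
    SET NOCOUNT ON;

    SELECT @author_count = COUNT(*)
    FROM dbo.authors
    WHERE state = @state;
END
GO

-- Caller: declare a variable, pass it with OUTPUT, then read it.
DECLARE @n int;
EXEC dbo.usp_CountAuthorsInState @state = 'CA', @author_count = @n OUTPUT;
PRINT @n;
```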
*****
One way to help ensure that cached plans are reused from one execution of a stored procedure to the next is to make sure that SET options, database options, and SQL Server configuration options don't change between executions. If they do, SQL Server may consider the same stored procedure to be different, and not be able to reuse the current query plan stored in cache.
Some examples of this include changing the language used by the stored procedure (using SET LANGUAGE) and changing the date format (using SET DATEFORMAT).
*****
If you find that a particular stored procedure recompiles every time it executes, you have determined that there is nothing you can do about the recompiles, and the stored procedure is very large, consider the following option. Try to determine what part or parts of the stored procedure are causing the recompiles. Once you have done this, break out this troublesome code into its own stored procedure, and then call this stored procedure from the main stored procedure. The advantage of this is that it takes much less time to recompile a smaller stored procedure than a larger one.