Blocking Monitor

October 2, 2009

I wanted to share a little something we put together to monitor blocking on our SQL 2005 servers. Every once in awhile the blocking gets out of control and when it does we get these emails so we can quickly jump in and troubleshoot before the Help desk tickets start rolling in.  Create this in your master database be sure to change the database name where it says YOURDB. Create a SQL Server job that runs constantly. In the Job SQL put the following infinite loop to call the proc below:

declare @count int
set @count = 5000000
WHIle @count > 0
BEGIN
waitfor delay ‘00:00:15′
exec [Proc_LeadBlocker] 0, ‘YOUR MAIL PROFILE’, ‘Youremail@email.com’
end

 

CREATE  PROCEDURE [dbo].[Proc_LeadBlocker]
(@maxtime int = 0,@MailProfile varchar(100),@recipients varchar(255) )   
as   
--=======================================================================   
-- Proc Name : sp__leadblocker   
-- Description : This procedure will report processes that are blocking   
--    other processes and processes that are being blocked   
--    
-- Author : Motherofallgeeks.com
--=======================================================================   
 
--DECLARE VARIABLES   
DECLARE @BLKemailmsg varchar(1000)   
DECLARE @WAITemailmsg varchar(1000)   
DECLARE @message varchar(max)
   
--CHECK FOR BLOCKING   
IF EXISTS (select 1 FROM sys.dm_exec_requests  
     WHERE Session_ID in
      (SELECT blocking_session_id 
      FROM sys.dm_exec_requests 
      WHERE (wait_time/15000) > @maxtime))    
BEGIN  
 --BUILD BLOCKING PROCESSES EMAIL MSG   
 select @BLKemailmsg = 'The Following Processes are Blocking other Processes:(BLOCKER)  <br>' 
 select @BLKemailmsg = @BLKemailmsg+    
   '<b> SPID: </b>'+ convert(varchar(5),DR.Session_ID) +  '<br>' +
   '<b> STATUS: </b>' + DS.status +    '<br>' +
   ' <b>CMD: </b>' +DR.command+    '<br>' +
   '<b> DB:</b> '+ db_name(DR.database_id) +    '<br>' +
   '<b> LOGIN:</b> '+ DS.login_name +    '<br>' +
   '<b> HOSTNAME:</b> '+ DS.host_name +    '<br>' +
   '<b> LAST_BATCH_TIME: </b>'+ convert(varchar(20),Ds.last_request_Start_time,113)+   '<br>' +
 '<b> wAIT rESOURCE: </b>'+wait_resource + '<br>' +
 '<b> last_wait_type: </b>'+last_wait_type + '<br>' +
   ' <b> SQL Statement: </b>' +  (select name from [YOURDB].sys.objects where object_id=s2.objectid)+ ' '+ SUBSTRING(s2.text,  statement_start_offset / 2, ( (CASE WHEN statement_end_offset = -1 THEN (LEN(CONVERT(nvarchar(max),s2.text)) * 2) ELSE statement_end_offset END)  - statement_start_offset) / 2)    +  --convert(varchar(8000),s2.text) +   '<br>'
   '<HR>'
FROM sys.dm_exec_requests DR  INNER JOIN sys.dm_exec_sessions DS  ON DR.Session_id = DS.Session_id
CROSS APPLY sys.dm_exec_sql_text(dr.sql_handle) AS s2 
WHERE DR.Session_ID in
 (SELECT blocking_session_id
   FROM sys.dm_exec_requests 
   WHERE (wait_time/15000) > @maxtime)   
         
   
 --BUILD WAITING PROCESSES EMAIL MSG   
 select @WAITemailmsg = ' The Following Processes are Waiting for Resources:(BEING BLOCKED) <br> '
 select @WAITemailmsg = @WAITemailmsg +
   '<b> SPID: </b>'+ convert(varchar(5),DR.Session_ID) +    '<br>' +
   '<b>  CMD: </b>' +DR.command+    '<br>'+
   '<b>  DB:</b> '+ db_name(DR.database_id) +   '<br>' + 
   '<b>  LOGIN:</b> '+ DS.login_name +   '<br>' + 
   ' <b> WAITTIME:</b> '+ convert(varchar(4),DR.wait_time/15000)+   '<br>' +
   ' <b> BLOCKED BY:</b> '+ convert(varchar(5),DR.blocking_session_id) +   '<br>'   + 
 '<b> wAIT rESOURCE: </b>'+wait_resource + '<br>' +
 '<b> last_wait_type: </b>'+last_wait_type + '<br>' +
 ' <b> SQL Statement: </b>' +  (select name from [YOURDB].sys.objects where object_id=s2.objectid)+' '+  SUBSTRING(s2.text,  statement_start_offset / 2, ( (CASE WHEN statement_end_offset = -1 THEN (LEN(CONVERT(nvarchar(max),s2.text)) * 2) ELSE statement_end_offset END)  - statement_start_offset) / 2)    + '<br>' --convert(varchar(8000),s2.text) +   '<br>'
FROM sys.dm_exec_requests DR 
INNER JOIN sys.dm_exec_sessions DS  ON DR.Session_id = DS.Session_id
CROSS APPLY sys.dm_exec_sql_text(dr.sql_handle) AS s2 
WHERE   (DR.wait_time/15000) > @maxtime
 and DR.blocking_session_id <> 0
SELECT @message =  @BLKemailmsg + ' ' + @WAITemailmsg
      exec msdb..sp_send_dbmail  @profile_name =  @MailProfile
     ,  @recipients =  @recipients
     ,  @subject =  'Blocking Processes'
     ,  @body_format =  'HTML'
     ,  @body = @message
  ,@query_result_header = 0
 END
--

Yet another Performance issue….

September 28, 2009

This article is related to the article I wrote a while back on parameter sniffing. We recently had another problem with a production application that started to experience poor performance. The C# web application using a SQL 2005 database starting timing out. The timeout was set to 30 seconds. I wasn’t seeing deadlocks on the database and we had some blocking but it wasn’t extreme. So this problem lasted over about two weeks. Here are the steps taken to troubleshoot.

1) Began running SQL Profiler. I started tracing only RPC calls using a filter on the trace for duration > 30000 (30 seconds). My first thoughts were I better figure out which procs are taking the time. The trace showed one particular proc which was executed many times and always took alot of time on the DB. This procedure was also blocking others. We spent some time tuning this procedure and testing it and pushed it into production.

2) The tuned procedure reduced the timeouts from around 150 a day to about 50 a day. A huge improvement but still not a great user experience. I changed my trace for duration > 10,000 and monitored it for many days. I started seeing that one proc to fetch information used for a drop down list was taking more and more time each day. If I took the call from the trace and executed it from SSMS the query would execute in 1 second. If I ran the same call from OSQL on a different machine it would take 30 seconds. The application timeout setting was also increased to 60 seconds.  

3) Next I checked my indexes for fragmentation and rebuilt them old style. I mean I dropped and recreated them I didn’t use DBCC commands.

4) I increased the number of times update statistics with full scan from once a week to twice a week.

We still had issues with this one proc and it is now causing the majority of the timeouts.  When I looked at the query plan from SSMS in the GUI format it’s doing two index seeks. But when I look at the plan generated from OSQL it’s doing an index scan and generating over 3 million reads on a table that has only about 400K rows in it.

5) At this point an advisory case with Microsoft was opened.  My gut was telling me that it has to be related to the query plan but I couldn’t understand why SSMS is picking an efficient plan and OSQL is not.

I checked the connection settings for the database properties coming from my application and I compared these to the settings used by SSMS and OSQL. They are all the same. To check the login set options you use profiler to trace the Audit:Login event. Microsoft told me that these settings can affect your query plan.

6) Next thing to do is start looking at the trace when I execute the stored procedure call from SSMS and OSQL. What I found partially answered my question as to why the procedure call executes quickly from SSMS.  I traced all the events under Stored Procedures for my spid and I found that whenever I execute the proc from SSMS it was doing a cache miss. AKA not finding it in the procedure cache. Ok Why? When I do a call from OSQL it’s also doing a cache miss but then picking an inefficient plan. Well at least I can recreate the issue….

7) Next thing is to start looking at is the plans that are actually in the cache. You can do this by using the Dynamic management views. Create the function and view below for an easy look into the cache.

create function SqlAndPlan(@handle varbinary(max))
returns table
as
return select sql.text, cp.usecounts,cp.cacheobjtype,
cp.objtype, cp.size_in_bytes,
qp.query_plan
 from
 sys.dm_exec_sql_text(@handle) as sql cross join
 sys.dm_exec_query_plan(@handle) as qp
 join sys.dm_exec_cached_plans as cp
 on cp.plan_handle = @handle;

create view PlanCache
as
select sp.* from sys.dm_exec_cached_plans   as cp
cross apply SqlAndPlan(cp.plan_handle) as sp

8 ) When viewing the results you will get the XMLversion of your query plan. If you click on the results you can open it and save it as .sqlplan and when you open it again it will be in the graphical format. However the XML Plan gives you views into some things that the graphical plan does not. Some of this information is extremely useful. The thing I personally thought is the greatest is that I can actually see what parameters are used for my procedure execution when the plan is stored in the cache. You can always reverse engineer your procedure calls from this. To find the parameter values search the xml plan for parameter list. You will see something similar to the following:

<ParameterList>ColumnReference Column=”@Default” ParameterCompiledValue=”(1)” />
ColumnReference Column=”@Role” ParameterCompiledValue=”(8)” />
ColumnReference Column=”@Page” ParameterCompiledValue=”Start” />
ColumnReference Column=”@user” ParameterCompiledValue=”(1)” />
…. n  

From this information I can see if my plan is an atypical call. For example if 90% of my users have role 1 and the plan cached is for role 8 perhaps you should remove this plan from the cache and replace it with a more efficient one. This happened to be the situation in my case. To change the value in the cache you need to recompile the stored proc and execute it with different set of parameters then verify that is stored in your cache. Another useful bit of information in the cache is the compile time. The proc which was causing us issues took .001 seconds to compile the plan. We decided with the advice of Microsoft to add the recompile option to this proc to ensure the correct plan is always chosen.  The over head in this case was .001 seconds. I don’t recommend adding this option all the time and this should be carefully weighed based on the stored procedure and other tuning options available.  


Simple Space Checker

March 7, 2009

I recently had someone ask me how to I see my actual usage of data and index files without using the GUI.  Well I am going to tell you that and I am also going to tell you how to set up a monitor job to alert  you when your file is approaching it’s capacity.  Why would you want to do that when you have AutoGrow? The first reason is that Auto grow will trigger an ALTER Database statement. You have no control over the time of day that this alter will fire.  Alter will Lock your system up until it has completed. This type of maintenance should be monitored and performed at a time when system impact will be less.  If auto grow is enabled and you are not monitoring your database space you could eat up all of the drive space and crash the server. DBA’s should be aware of how fast your systems are growing and proactively add the space as needed.

That being said. This script was developed for SQL Server 2005. It uses undocumented processes which are not recommended for use by microsoft because they may be removed at anytime.

There is an undocumented DBCC command which has been around for quite some time. We will use this command dbcc showfilestats as the basis of our script.

We are going to create a system stored procedure. That way you can run this proc from any database without having to create it in all databases. It makes maintenance much easier. You can schedule it to run from SQL Agent every 30 minutes and it will send you a mail based on your file reaching 90%.

Enjoy!

set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
go

/*******************************************************************************************
* proc_checkspace
* Created By: MotherofallGeeks
* Created Date: March 2008
* Purpose check available Free space in database and send alert
* exec proc_CheckSpace ‘MyMailProfile’ , ‘Mother@motherofallgeeks.com’
*/

CREATE PROC [dbo].[sp_CheckSpace]

@MailProfile varchar(max),
@recipients varchar(max)
AS
BEGIN
SET NOCOUNT ON

DECLARE @DataUsedPercent float
DECLARE @IndexUsedPercent float
DECLARE @FileName varchar(max)
DECLARE @Size numeric
DECLARE @IndexSize float
DECLARE @UsedSpace float

DECLARE @tmpspc table
(Fileid int, FileGroup int,
TotalExtents int,
UsedExtents int,
Name sysname,
FileName nchar(520))

INSERT @tmpspc EXEC (‘dbcc showfilestats’)

SELECT
@FileName = s.physical_name ,
@Size = convert (dec (15,2),s.size) * 8192 / 1048576,
@UsedSpace =CAST(tspc.UsedExtents*convert(float,64) /1024 AS float) ,
@DataUsedPercent =(CAST(tspc.UsedExtents*convert(float,64) /1024 AS float) / (convert (dec (15,2),s.size) * 8192 / 1048576) ) * 100
FROM
sys.filegroups AS g
INNER JOIN sys.master_files AS s ON (s.type = 0 and s.database_id = db_id() and (s.drop_lsn IS NULL)) AND (s.data_space_id=g.data_space_id)
INNER JOIN @tmpspc tspc ON tspc.Fileid = s.file_id
WHERE
(CAST(cast(g.name as varbinary(256)) AS sysname)=N’PRIMARY’)

IF (@DataUsedPercent > 90)
BEGIN
DECLARE @Body varchar(max)
SET @Body = db_name()+’ Free Space is < 10% in the following File: ‘ + char(13) + @FileName +char(13)+
+ ‘Size MB =’+convert(varchar(10),@Size) + ‘ Used Space MB = ‘+convert(varchar(10),@UsedSpace)
+ ‘ %Used = ‘+ convert(varchar(10),@DataUsedPercent)

exec msdb..sp_send_dbmail @profile_name = @MailProfile
, @recipients = @recipients
, @subject = ‘Free Space Alert’
, @body_format = ‘TEXT’
, @body = @Body
,@query_result_header = 0
END

SET @FileName = ”
SET @Size = 0
SET @UsedSpace =0

SELECT
@FileName = s.physical_name ,
@Size = convert (dec (15,2),s.size) * 8192 / 1048576,
@UsedSpace =CAST(tspc.UsedExtents*convert(float,64) /1024 AS float) ,
@IndexUsedPercent =(CAST(tspc.UsedExtents*convert(float,64) /1024 AS float) / (convert (dec (15,2),s.size) * 8192 / 1048576) ) * 100
FROM
sys.filegroups AS g
INNER JOIN sys.master_files AS s ON (s.type = 0 and s.database_id = db_id() and (s.drop_lsn IS NULL
)) AND (s.data_space_id=g.data_space_id)
INNER JOIN @tmpspc tspc ON tspc.Fileid = s.file_id
WHERE
(CAST(cast(g.name as varbinary(256)) AS sysname)=N’INDEX’)
IF (@IndexUsedPercent > 90)
BEGIN
SET @Body = db_name()+’ Free Space is < 10% in the following File: ‘ + char(13) + @FileName +char(13)+
+ ‘Size MB =’+convert(varchar(10),@Size) + ‘ Used Space MB = ‘+convert(varchar(10),@UsedSpace)
+ ‘ %Used = ‘+ convert(varchar(10),@indexUsedPercent)

exec msdb..sp_send_dbmail @profile_name = @MailProfile
, @recipients = @recipients
, @subject = ‘Free Space Alert’
, @body_format = ‘TEXT’
, @body = @Body
,@query_result_header = 0
END

END

 

GO

EXECUTE sp_ms_marksystemobject '[sp_CheckSpace]'

Index This Index That

December 30, 2008

So let me ask you this… Do you really know how SQL Server storage engine works? You would be surprised how many people can’t answer simple questions about indexes on interviews. Sorry Folks if you can’t tell me what the difference between and clustered and a non-clustered index your interview score just went down.

That being said this is a very simple question I like to ask people on interviews. If your going to rebuild the indexes on a large table which has no Forgein keys. How would you do it and why? You cannot use any  dbcc or online index operations. We are going old school. To keep it simple assume your table is 5GB, has a clustered index and 5 Non-clustered indexes some of which are composite.

When I ask this my goal is to find out if you truly understand how the data is stored. If you do then the answer is quite easy.

I’ll give you the answer I am looking for first then I will explain why.

“You would drop the NON-Clustered indexes first. Then drop the Clustered index. Next you would create the CLustered index then the Non-CLustered indexes.”

Now the why.

A clustered index in the simplest explaination is that it is your data. A classic example of a clustered index is a phone book. Now of course we can get into B-Tree structures but lets save that for another post. That being said If I drop my clustered index first the operation will also drop and recreate my non-clustered indexes and can potentially take alot of time. To speed up the process drop the NC indexes first. (AFTER YOU SCRIPTED them of course)


I like to Move it Move it

December 6, 2008

I’m sure of all of us have had to move our databases around from time to time. I just came into a situation where we are doing a POC test for DR mirroring software called Double Take. I had to uninstall and reinstall my SQL destination (or failover machine) because the first install was not done properly.  For example the System databases needed to mimic the source server. So I started along my install and I specified the path for the databases.

After the install was done I checked and I forgot that the install will create an MSSQL directory on top of whatever path you specify in the install. UGH!!!

For example I wanted my system and user databases to go Z:\SQLServer\MSSQL5\DATA  and that is what I told the install so the install put the databased here Z:\SQLServer\MSSQL5\DATA \MSSQL1\DATA. What an annoyance!!!!

Rather then redoing the install I thought I will just move the system databases. Well I haven’t done this task in SQL 2005 yet and I thought ok I will go and just modify the Startup parameters as I would in SQL 2000. Not the CASE.

Here is what I need to do.

For Non Master databases

ALTER DATABASE model MODIFY FILE ( NAME = modeldev, FILENAME = Z:\SQLServer\MSSQL5\DATA \model.mdf’)
ALTER DATABASE model MODIFY FILE ( NAME = modellog, FILENAME = Z:\SQLServer\MSSQL5\DATA \ modellog.ldf’)

Repeat for tempdb and msdb if needed
Stop SQL Server

Move the files

Start SQL Server and the databases will be in the new location and you can delete the old files.

Moving Master is a little trickier

Perform the same ALTER database statement executed above for master

ALTER DATABASE master MODIFY FILE ( NAME = master, FILENAME = x:\newlocation\master.mdf’)
ALTER DATABASE master MODIFY FILE ( NAME = mastlog, FILENAME = x:\newlocation\mastlog.ldf’)

Next you need to change the startup parameter in the REGISTRY.  (You can search the registery for ERRORLOG or master.mdf to find the proper key for your server) Edit the values for the new location

Stop the SQL Server Service

Move the Files

Start the SQL Server Service in minimal config mode -f and trace flag -T3608 (prevents auto recovery)

From the command prompt run the following. You need to be in the location of the sqlservr.exe.

net start MSSQLServer /f/T3608

net start MSSQL$InstanceA /f/T3608

Connect VIA Query Analyzer and Alter mssqlsystemresource
ALTER DATABASE mssqlsystemresource MODIFY FILE ( NAME = data, FILENAME = x:\newlocation\mssqlsystemresource.mdf’)
ALTER DATABASE mssqlsystemresource MODIFY FILE ( NAME = log, FILENAME = x:\newlocation\mssqlsystemresource.ldf’)

Move the mssqlsystemresource files.
Set mssqlsystemresource database to read only.  ALTER DATABASE mssqlsystemresource SET READ_ONLY
Stop and Start the SQL Server Service in normal mode.


Poor performance from Application

December 1, 2008

Have you ever come across the senario where a query or stored procedure performs just fine from Query Analyzer but performs terrible from the application? Most of us have. How do you go about troubling shooting this? Here is the senario I ran into and the steps taken.

During the stress testing phase of our application release we came to a situation when loadrunner hit two particular tabs in our application and it would report back timeout errors. The web developers were called in as well as the DBA team.  So obviously our first step was to run a trace and find out the exact call to the stored procedure which is taking time. (I assume you already know how to use SQL Profiler if not check it out in books online) . The events I wanted to see for this run of the trace are:

SP:Starting
SP:Completed
SP:StmtStarting
SP:StmtCompleted

The reason I chose these are because my application is stored procedure driven so I want to know which proc and which statement in that proc is taking time. I need to look at the Duration column, Reads and writes as well. Remeber to pay attention to column/event compatibility in your trace. For example if I want to see the duration value populated in my trace. I need to make sure that I am capturing the XXX:completed event. That being said, I reached my goal of getting the statement and the call to the stored procedure which is taking time. I noticed in my trace that the reads were through the roof. (Over three million)

So my next step obviously was to run the call to this stored procedure. This particular stored proc is a read proc. I ran it and of course it came back in 1 second while in the trace it reported 32 seconds. Hmmmmm, intersting.

Now lets examine the procedure and see whats going on and what the tables look like. The stored procedure is pretty straight forward. It’s a child proc and its parents have most of the logic in it. It’s checking if data exists in a particular table which has 42 million rows and if it does it returns a flag which is processed by the outer proc. So now my first thought is to check when statistics have last been updated on this table and if there is any index fragmentation.

I ran showcontig (this is a SQL 2000 app) and saw my scan density to be about 30 percent.  So next I planned and rebuilt the indexes on this table. I started to think maybe the fragmentation was negatively affecting the query plan when it was coming from the app. On examination of the query plan when I ran the proc from Query Analyzer I didn’t see anything negative.

After the index maintenance the test was run again and again the same thing happened but now the issue was sparatic. This makes it even more difficult now because the same call still takes one second for me. So next I started to think that maybe there was some excessive locking or blocking happening on this table. So I had my testor run the test again for a single user to reduce contention. Even when running the test as a single user the issue appeared.  I used the Deadlock even in profiler and that was not it either.

I started to go back to the idea of the query plan. Maybe when the call is coming from the application a different query plan is being used but how do I verify this. SQL Profiler has an event class for the query plan so that was my next step.  I captured the plan during the load test and compared it with the results from my machine. Sure enough when run through the application the query plan being used was scanning the table. You can imaging scanning 42 million rows will take some time but why when I run it from my machine does it do an index seek. I started to think my procedure cache was somehow corrupted so you guessed it. My next step was to clear the procedure cache.

Same results again. Ok I am ready to pull my hair out now. Why is this happening. Back to google and I found a topic called Parameter sniffing. This is something new to me but it was the exact reason for my problem. In a nut shell parameter sniffing is a side effect of parameter use. When you use parameters in your stored procedures SQL Server will generate plans based on common parameter use.  (It’s a bit more complicated this this. check out his link http://blogs.msdn.com/queryoptteam/archive/2006/03/31/565991.aspx)  If a atypical call to the proc is made and then a typical call. The typical call will have a plan which is not optimal for that situation.

Now that I know what is happening how do I avoid this? We need to fool the optimizer. Here is some simple code

Create procedure proc_myparasniffer

AS @office varchar(255), @offfice_template varchar(255) =”Main Street”

SET @office_template=@office

SELECT A,B,C from Order where OfficeName = @office_template

It’s pretty easy we are filtering the query on office_Template because @office_template is set with a default value when SQL sniffs the parameters it will always generate the same plan.

 


Welcome

November 29, 2008

Welcome to Mother of All Geeks! What you will find on this site are articles related to Microsoft SQL Server topics. My goal is to give back to the technology communitity some of what I have learned over the years. I also hope to learn some from my readers as well.