Friday, August 24, 2012

T-SQL: Performance of SQL Agent Duration Calculations

In my article published yesterday on using Powershell for SQL Durations (Working with SQL Agent Durations), I proposed using a combination of math functions to extract the durations of SQL Agent jobs in a dependable and accurate manner. While the article sought to provide a better alternative for those using Powershell, the discussion of the article was primarily about T-SQL solutions. The question of both capability and performance were both discussed, so I decided to research the limitations and performance metrics of the various solutions posted.

A combination of string and math based solutions were posted as potential alternatives. A recent post highlighted the use of the msdb function agent_datetime. All of these math functions posted will provide an accurate second value from a SQL Agent duration without regard to sign (positive or negative) or size (greater than 6 digits).

All of the string methods mentioned, including the agent_datetime function, rely on three facts to be true.
1) All values in the table must either be zero or positive integers
2) No value greater than 235959 can exist in the table because of datetime conversion issues.
3) Positions 2-3 and Positions 4-5 must not be greater than 60 due to datetime conversion issues.
While I understand that the existence of any of these values is a sign of a larger issue – if the code generating the report fails, the report is incapable of telling me there is a problem. I generally run reports on new environments to understand what the current state is and what needs to be worked on. It is not uncommon to find jobs that have run more than a day in a new environment. While less common, I have also had to deal with negative duration values stored in the sysjobhistory. I agree – neither is ideal and neither should exist. However, it is much more common for me to be called to fix what is broken, not to watch an already stable and perfect environment.

Testing Performance

I have never actually tested the performance of using the math based solution vs. the string manipulation solutions. So, I devised a process to put all of these suggestions to a performance test. I first created a table with integers –1971000 through 1971000. I understand this is extreme, in both directions, but I had to use something large enough to get some idea of performance.

You can examine the T-SQL at the end of this post. At each test, the T-SQL represents a proposed solution on the forum thread. Each test converts each of the integer values in the table to a number of seconds based upon the HHMMSS rule of how these durations are stored in sysjobhistory. These are populated into a variable so that any transfer to the client or IO issues are eliminated in the performance test. The only thing being tested should be CPU and memory throughput. In order to adjust for the performance of any given environment, I establish a control. This control does no conversion on the integer value, simply selects each one into a variable. This should provide a good baseline for what the maximum performance could be.

Given that each of the string functions proposed fail unless provided very specific numeric values, I eliminated any negative values, any values > 235959, and any values where the HH and SS values were > 59 to avoid datetime conversion errors.

Below is the average of each of these methods (and a control) over 10 executions for both my laptop and an enterprise class server.


AvgDurationMS
RunType Laptop Server
235959 Control 60 27
235959 Mod Math Implicit 87 38
235959 Mod Math 90 44
235959 Mod Math Intrope 123 44
235959 SQL-Tucker 163 79
235959 rmechaber 182 90
Full Control 983 406
235959 agent_datetime 1516 658
Full Mod Math 1755 790
Full Mod Math Implicit 1814 867
Full Mod Math Intrope 2377 1077

Admittedly, the difference in these test scenarios is negligible. However, in all cases, using the mod operator and the implicit truncation within SQL Server was the fastest conversion method. In addition to being the fastest – it’s the most compact visibly and works for just about any values – not just those less than a single day. I encourage you to run these tests in your environment and see which solution comes out on top.

As with anything in SQL Server, when asked which method is “better” I would have to say “It depends.” However, if you ask which appears to be fastest, which works for most conceivable situations, and which will I use? I’m going to stick with math – and leave the strings be.

About Kyle Neier
Husband of a magnificent woman, father of 5, SQL Server geek,
IndyPASS Vice President and Food Guy, DBA automation zealot, amateur Powershell evangelist. Follow Me on Twitter

T-SQL Used to Test

USE tempdb
GO
/*
Adapted from 
http://archive.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=NumbersTable&referringTitle=Home
*/
-- Suppress data loading messages
SET NOCOUNT ON

-- Create Sample Data using a Table Varable
/*
    Positive and Negative values
*/
SELECT TOP 3942000 IDENTITY(INT,-1971000,1) AS N
INTO Numbers
FROM sys.all_objects a, sys.all_objects b, sys.all_objects c, sys.all_objects d

-- Create a Primary Key and Clustered Index to control data order
ALTER TABLE dbo.Numbers ADD CONSTRAINT
PK_Numbers PRIMARY KEY CLUSTERED (N) 
GO

CREATE TABLE DurationMetrics(
RunType VARCHAR(100) NOT NULL, StartTime DATETIME NOT NULL, EndTime DATETIME NOT NULL)
GO
DECLARE @StartTime DATETIME,
@EndTime DATETIME,
@DurationSeconds int

DECLARE @Loops INT = 10,
@i INT


/*Control*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds = N
FROM Numbers
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT 'Control' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*Mod Math*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds =  
FLOOR(N/10000) * 3600 +
FLOOR(N/100%100) * 60 +
N%100 
FROM Numbers
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT 'Mod Math' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*Mod Math Implicit*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds =  
N/10000 * 3600 +
N/100%100 * 60 +
N%100 
FROM Numbers
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT 'Mod Math Implicit' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*Mod Math Intrope*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds =  
((N - N % 10000) /10000) * 60 * 60 
+ ((N % 10000 - N % 100) /100) * 60
+ N % 100
FROM Numbers
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT 'Mod Math Intrope' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END


/*235959 Control*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds = N
FROM Numbers WHERE N BETWEEN 0 AND 235959
AND N%100 < 60 AND N/100%100 < 60
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT '235959 Control' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*235959 SQL-Tucker*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds = 
datediff(ss,0,cast(stuff(stuff(left('000000',6-len(N))+cast(N AS VARCHAR),5,0,':'),3,0,':') AS DATETIME))
FROM Numbers WHERE N BETWEEN 0 AND 235959
AND N%100 < 60 AND N/100%100 < 60
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT '235959 SQL-Tucker' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*235959 Mod Math*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds =  
FLOOR(N/10000) * 3600 +
FLOOR(N/100%100) * 60 +
N%100 
FROM Numbers WHERE N BETWEEN 0 AND 235959
AND N%100 < 60 AND N/100%100 < 60
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT '235959 Mod Math' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*235959 Mod Math Implicit*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds =  
N/10000 * 3600 +
N/100%100 * 60 +
N%100 
FROM Numbers  WHERE N BETWEEN 0 AND 235959
AND N%100 < 60 AND N/100%100 < 60
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT '235959 Mod Math Implicit' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*235959 Mod Math Intrope*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds =  
((N - N % 10000) /10000) * 60 * 60 
+ ((N % 10000 - N % 100) /100) * 60
+ N % 100
FROM Numbers WHERE N BETWEEN 0 AND 235959
AND N%100 < 60 AND N/100%100 < 60
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT '235959 Mod Math Intrope' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*235959 rmechaber*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds =  
LEFT(RIGHT('000000' + CAST(N AS VARCHAR(6)), 6), 2) *3600 +
Substring(RIGHT('000000' + CAST(N AS VARCHAR(6)), 6), 3, 2) *60 +
RIGHT(RIGHT('000000' + CAST(N AS VARCHAR(6)), 6), 2)
FROM Numbers WHERE N BETWEEN 0 AND 235959
AND N%100 < 60 AND N/100%100 < 60
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT '235959 rmechaber' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END

/*235959 agent_datetime*/
SELECT @i = 0

WHILE @Loops > @i
BEGIN
SELECT @StartTime = GETDATE()
SELECT @DurationSeconds =  
datediff(s,msdb.dbo.agent_datetime(19000101,0),msdb.dbo.agent_datetime(19000101,N))
FROM Numbers WHERE N BETWEEN 0 AND 235959
AND N%100 < 60 AND N/100%100 < 60
SELECT @EndTime = GETDATE()

SELECT @i = @i + 1

INSERT INTO DurationMetrics (RunType, StartTime, EndTime)
SELECT '235959 agent_datetime' AS RunType, @StartTime AS StartTime, @EndTime AS Endtime
END



SELECT RunType, 
AVG(DATEDIFF(ms, StartTime, EndTime)) AS AvgDurationMS,
MIN(DATEDIFF(ms, StartTime, EndTime)) AS MaxDurationMS,
MAX(DATEDIFF(ms, StartTime, EndTime)) AS MaxDurationMS
FROM DurationMetrics
GROUP BY RunType
ORDER BY AvgDurationMS

/*Clean Up*/
DROP TABLE DurationMetrics
DROP TABLE Numbers



No comments: