Send alert when SQL agent job runs too long

Posted by
Estimated Reading Time: 2 minutes

Looking for a solution to send an email alert when a SQL agent job is running too long? Read on 🙂

Recently I did some maintenance to a SQL Server database instance, where I discovered one agent job was periodically running too long. This job copied data on a daily basis from a transactional database to a staging database for BI purposes. The owner was depending on actual BI data every morning, so you can already guess it needs to be reliable and proactively monitored.

By the way, the root cause for the unexpected random delays wasn’t found yet so I decided to create an alert mechanism to inform the IT staff proactively when the specific job duration exceeds a treshold of a given amount of time.

I do realize it’s just a workaround and not a solution for the root cause of delays, however, the owner is now able to act proactively when the job gets out of hand again.

Let’s jump to the fun part: building the solution!

    1. Start SQL Server Management studio (with Run as admin priv.) and login to the database instance.
    2. Configure Database Mail. You can follow this official Microsoft tutorial to accomplish.

 

    1. Create a SQL Agent monitoring job which will execute the T-SQL script at the bottom of this post. Make sure to modify the script to your needs. (search for TODO tag)
    2. Make sure to let the monitoring job on a recurring base, i.e. every 15 minutes.
    3. To test the monitoring job, you can create a second job which will only run T-SQL command:
      WAITFOR DELAY '00:30:00' -- let the job hang for 30 minutes
    4. Let the second (hanging) job run, and make sure the monitoring job periodically runs to check the duration of the hanging job. It should send a mail when the configured treshold get exceeded.
    5. That’s it! 😉 having questions or comments, please feel free to post your comment(s).

Happy querying!

Monitoring job T-SQL script:

-- Use this SQL script to alert when a SQL agent job is overdue (duration too long)
 
-- To test the behaviour, create a agent job containing T-SQL query:
-- WAITFOR DELAY '00:30:00'
-- This will hang the query for 30 minutes
 
-- Make sure you've configured SQL Database Mail profile and Operator(s)
-- Then find and modify all rows containing TODO:
-- At last, put this query in a job to run i.e. every 15 minutes
 
USE [msdb]
 
IF (SELECT
    DATEDIFF(SECOND, aj.start_execution_date, GETDATE())
  FROM msdb..sysjobactivity aj
  INNER JOIN msdb..sysjobs sj
    ON sj.job_id = aj.job_id
  WHERE aj.stop_execution_date IS NULL -- condition: job hasn't stopped running
  AND aj.start_execution_date IS NOT NULL -- condition: job is currently running
  AND sj.name = 'Daily refresh staging job' -- TODO: fill in the exact job name to monitor
  AND NOT EXISTS (SELECT
    1
  FROM msdb..sysjobactivity NEW
  WHERE NEW.job_id = aj.job_id
  AND NEW.start_execution_date > aj.start_execution_date)
  -- TODO: change 3600 below to desired treshold value in seconds
  )
  > 3600
  EXEC dbo.sp_notify_operator @profile_name = N'MailAlert', -- TODO
                              @name = N'IT-Support@domain.com', -- TODO
                              @subject = N'Warning: Refresh staging data warehouse job overdue', -- TODO
                              @body = N'WARNING! Job Daily refresh staging data on server XXXXX took longer than expected! '; -- TODO
ELSE
  PRINT 'Job not running or not (yet) exceeding treshold';
GO

Sources:

Leave a Reply

Your email address will not be published. Required fields are marked *