Saturday, December 29, 2007

SQL Server Availability Monitoring - Part II

    In this part of the article I will show you how to deploy the solution that I built in the previous article SQLServer Availability Monitoring Part I . The solution was designed to monitor SQL Server in an environment from another monitoring SQL Server. Here I have assumed that I have a dedicated box with SQL Server setup for monitoring. In my case I had a MOM database server which was used only for monitoring purposes. I created my solution specific table(s) in the same server by creating another database called DBA_Benchmark.

    Below are the steps involved in the solution deployment.

    Create a table to store logging info that comes from Sub.vbs,following is the script for the same. The column names are self explanatory,

    3 Tbl_isAlive Table creation script

      CREATE TABLE [dbo].[Tbl_isAlive2] (
      [RecDateTime] [datetime] NULL ,
      [RecInstName] [varchar] (50) NULL ,
      [ResSeconds] [numeric](8, 6) NULL ,
      [ResDesc] [varchar] (500) NULL
      ) ON [PRIMARY]

    Next create a SQL job that triggers SQLPing.vbs. Instead of calling the vbs directly I have used a wrapper batch file. Batch file actually came into picture because earlier I had an idea of using it as a NT Scheduled job, however I changed my mind and decided to play with SQL Server only.

    4 Launch.bat

      :Path on my the server where VBS is located
      CD "C:\isAlive\UsingVB"
      SQLPing.vbs SrvList.txt

    In the above DOS command the SrvList.txt is the txt file that holds my entire list of SQL Server one per line. Thecontents as are shown below.

    5 SrvList.txt

      Sun\SQLSrv_2K
      Sun\SQLSrv_2K5
      Sun\MSDE

    Now, its time to create a job in SQL Server to trigger theLaunch.bat. This layer of batch file can be surely avoided ifrequired.

    6 SQL Batch creation script

      BEGIN TRANSACTION
      DECLARE @JobID BINARY(16)
      DECLARE @ReturnCode INT
      SELECT @ReturnCode = 0
      IF (SELECT COUNT(*) FROM msdb.dbo.syscategories WHERE name = N'[Uncategorized (Local)]') < 1
      EXECUTE msdb.dbo.sp_add_category @name = N'[Uncategorized (Local)]'
      IF (SELECT COUNT(*) FROM msdb.dbo.sysjobs WHERE name = N'isAliveChk') > 0
      PRINT N'The job "isAliveChk" already exists so will not be replaced.'
      ELSE
      BEGIN

      -- Add the job

      EXECUTE @ReturnCode = msdb.dbo.sp_add_job @job_id = @JobID OUTPUT , @job_name = N'isAliveChk', @owner_login_name = N'sun\girish', @description = N'Job Comments.',
      @category_name = N'[Uncategorized (Local)]', @enabled = 1, @notify_level_email = 0,
      @notify_level_page = 0, @notify_level_netsend = 0, @notify_level_eventlog = 2,
      @delete_level= 0

      IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

      -- Add the job steps

      EXECUTE @ReturnCode = msdb.dbo.sp_add_jobstep @job_id = @JobID, @step_id = 1, @step_name = N'step_01_01', @command = N'-- run for 3 times in a minute.

      DECLARE @iCount INT

      SET @iCount=0

      WHILE (@iCount < 5)

      BEGIN

      EXEC master..xp_cmdshell ''D:\isAliveChk\Launch.bat'', no_output

      WAITFOR DELAY ''00:00:10''

      SET @iCount=@iCount+1

      END

      -- end

      ', @database_name = N'master', @server = N'', @database_user_name = N'', @subsystem = N'TSQL', @cmdexec_success_code = 0, @flags = 0, @retry_attempts = 0, @retry_interval = 1, @output_file_name = N'', @on_success_step_id = 0, @on_success_action = 1, @on_fail_step_id = 0, @on_fail_action = 2

      IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

      EXECUTE @ReturnCode = msdb.dbo.sp_update_job @job_id = @JobID, @start_step_id = 1



      IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback


      -- Add the job schedules

      EXECUTE @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id = @JobID, @name = N'Sch_isAlive', @enabled = 1, @freq_type = 4, @active_start_date = 20070607, @active_start_time = 0, @freq_interval = 1, @freq_subday_type = 4, @freq_subday_interval = 1, @freq_relative_interval = 0, @freq_recurrence_factor = 0, @active_end_date = 99991231, @active_end_time = 235959

      IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

      -- Add the Target Servers

      EXECUTE @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @JobID, @server_name = N'(local)'

      IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

      END

      COMMIT TRANSACTION
      GOTO EndSave
      QuitWithRollback:
      IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
      EndSave:

    In the job I execute the following T-SQL Code

      DECLARE @iCount INT
      SET @iCount=0
      WHILE (@iCount < 5)
      BEGIN
      EXEC master..xp_cmdshell ''D:\isAliveChk\Launch.bat'', no_output
      WAITFOR DELAY ''00:00:10''
      SET @iCount=@iCount+1
      END

    This is basically I want to check the service availabilityevery 10 seconds and SQL Agent in SQL 2000 does not allow to run any task withfrequency of less than a minute. So I use above code to overcome thatlimitation.

    After implementation is over, query the table Tbl_isAliveand see if results are getting logged, at least there should some rows, becausewe query and log results every 10 seconds, unless you have changed it tosomething different.

    Troubleshooting: In case if you see that its notworking correctly then disable the SQL Job first and try to launch the batchfile manually by dbl clicking on the file. If it works fine then looks like thepath what you have specified in the SQL Job may not be correct so fix the pathto point at the correct location.

    In case there is any other issue, the error details willbe logged in the file called as AppErrorLog.txt. If there iserror connecting to the server that it is logging the results to then you willsee the following kind of error message in the Errorlog,

    06/15/2007@ 11:52:08 AM -- Error : -2147467259 ==> [Microsoft][ODBC SQL ServerDriver][DBNETLIB]SQL Server does not exist or access denied.


    Otherwise if its having an issue with connecting to theservers to be monitored then entry in log file or SQL Server table would looklike,

    06/15/2007@ 11:52:08 AM INSERT INTO DBA_BenchMark.dbo.Tbl_isAlive VALUES('06/15/200711:51:44 AM', 'NonExistant', 12.1875, '[Microsoft][ODBC SQL ServerDriver][DBNETLIB]SQL Server does not exist or access denied.')

No comments:

About Me

By profession I am a Database Administrator (DBA) with total 13 yrs. of experience in the field of Information Technology, out of that 9 yrs as SQL DBA and last 3 years in IBM System i/iSeries and DB2 LUW 9. I have handled Developer, plus production support roles, and I like both the roles. I love and live information technology hence the name "Techonologyyogi" Apart from that I am a small, retail investor, with small investments in India and United States in the form of Equity holdings via common stocks. Don't ask me if I have made money, I have been loosing money in stocks.