Monitoring Agiloft Services

Monitoring Agiloft services in an on-premise installation requires customizing a provided script. The script is not guaranteed to work as-is for every on-premise installation, so tune it to your specific environment before relying on it.

To access and run the script:

  1. Download the script from: http://database-download.com/gitlab/Agiloft-Monitoring.py.zip
  2. Unzip the files.
  3. If necessary, install these additional libraries:

    python -m pip install --user psutil
    python -m pip install --user requests
  4. Edit the monitoring script and make any necessary changes to suit your environment. The script contains in-line comments to help explain the structure.
  5. When you've made all your changes, run the script as:

    python Monitoring.py
  6. If the script runs successfully, the output should look something like this:

    Validating parameters
    Check modes: ['wildfly', 'db', 'login', 'sql']
      Checking wildfly
        wildfly service: EW Server
      Checking db
        DB service: EW Database
      Checking login
        base url: https://master1.test1.local/ui/
      Checking sql
        SQL validate: echo "placeholder for common action"
    Running monitoring
    ...
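
Before running the script, it can help to confirm that the libraries from step 3 are importable by the same Python interpreter. The following is a minimal sketch that is separate from the Agiloft script; the helper name `missing_modules` is hypothetical and only illustrates the check:

```python
import importlib.util

# Libraries installed in step 3 of the procedure above.
REQUIRED_MODULES = ["psutil", "requests"]


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```

Run it with the same `python` command you use for the monitoring script, because `pip install --user` installs libraries per interpreter and per user.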

Examples

This example is a modified version of the script configuration that enables cluster checking for a manually configured cluster:

Example of script edited to support custom cluster configuration
[MAIN]
    #list of modes that monitoring should check and run in
    #the format is a JSON list, so keep the quotation marks
    #all options in this section are required and must be defined in this config
    #NOTE: if 'cluster' is specified, all other modes are ignored
modes=["cluster"]
    #how long monitoring sleeps between checks
    #i.e. checks run once per period; in the example below, every minute
    #seconds
period=60
    #global timeout: how long monitoring waits for a check or action (possibly hung or very slow) before treating it as failed
    #seconds
timeout=30
    #for login checking: how many logins must fail in a row before monitoring treats login to the server as not possible
    #in the example below, if login fails 10 times in a row, monitoring performs the additional actions specified for login mode
    #every successful login resets this counter
failed_login_limit=10
 
    #this is a special monitoring mode which disables all others and should be used with a manually configured Agiloft cluster
[cluster]
    #number of servers: the failover servers plus the main one
count=2
    #defines additional counted (numbered) actions, such as a DB check
    #one action is performed for each machine counted by the property above
1.sql.action=sqlcmd -S dw0216.0216.df.agiloft.net -U sa  -P password1 -V1 -Q "SELECT 1"
2.sql.action=sqlcmd -S dw0219.0219.df.agiloft.net -U sa  -P password1 -V1 -Q "SELECT 1"
    #common timeout for these counted actions
sql.timeout=20
    #what to do if any counted check fails
sql.notification=powershell .\\db.notification.ps1
    #what to do after a counted check has failed; can contain a restore action such as stopping/starting the server, sending a signal to hyperterminal, etc.
sql.repair=echo "placeholder for restore action in case of DB failure"
 
    #defines the action used to check login
    #should contain the URL of the login page
    #to check login, please use the systemmonitoring kb
login=https://master1.test1.local/ui/
    #what to do if login failed N times in a row
login.notification=powershell .\\login.notification.ps1
login.notification.timeout=15
login.repair=echo "placeholder for restore action in case of login failure"
login.repair.timeout=15
 
    #defines an additional action executed once at the end of cluster checking
action=echo "placeholder for common action"
action.timeout=30
notification=
notification.timeout=5
repair=
repair.timeout=5
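
To make the structure of this configuration easier to follow, here is a minimal sketch of how such a file could be read. It assumes the file is compatible with Python's standard `configparser` module; whether the Agiloft script actually parses it this way is not confirmed by this article. The function name `load_settings` and the host names in the sample are hypothetical.

```python
import configparser
import json

# Shortened sample in the same layout as the example above (hypothetical hosts).
SAMPLE_CONFIG = """
[MAIN]
    #indented comment lines are skipped by configparser
modes=["cluster"]
period=60
timeout=30
failed_login_limit=10

[cluster]
count=2
1.sql.action=sqlcmd -S db1.example.local -V1 -Q "SELECT 1"
2.sql.action=sqlcmd -S db2.example.local -V1 -Q "SELECT 1"
sql.timeout=20
"""


def load_settings(text):
    """Parse the config text and return the values a monitoring loop would need."""
    # interpolation=None keeps any '%' characters in commands literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)

    main = parser["MAIN"]
    settings = {
        "modes": json.loads(main["modes"]),  # modes is written as a JSON list
        "period": main.getint("period"),
        "timeout": main.getint("timeout"),
        "failed_login_limit": main.getint("failed_login_limit"),
    }

    if "cluster" in settings["modes"]:
        cluster = parser["cluster"]
        actions = []
        # Collect one numbered action per server: 1.sql.action, 2.sql.action, ...
        for index in range(1, cluster.getint("count") + 1):
            key = f"{index}.sql.action"
            if key not in cluster:
                raise ValueError(f"count is {cluster['count']} but {key} is missing")
            actions.append(cluster[key])
        settings["sql_actions"] = actions
        settings["sql_timeout"] = cluster.getint("sql.timeout")

    return settings


if __name__ == "__main__":
    print(load_settings(SAMPLE_CONFIG))
```

The sketch also shows why `count` must match the numbered entries: with `count=2`, both `1.sql.action` and `2.sql.action` are expected to exist.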

This example leverages login.notification.ps1 to show pop-up notifications:

Example of pop-up notifications
# Load the Windows Forms and Drawing assemblies (Add-Type replaces the obsolete LoadWithPartialName)
Add-Type -AssemblyName System.Windows.Forms
Add-Type -AssemblyName System.Drawing
# Create a notification-area icon; it must be visible for the balloon tip to appear
$notify = new-object system.windows.forms.notifyicon
$notify.icon = [System.Drawing.SystemIcons]::Information
$notify.visible = $true
# Arguments: timeout in milliseconds, title, text, icon
$notify.showballoontip(10,"Login FAILED!","Wildfly is not accessible, please perform restore actions!",[system.windows.forms.tooltipicon]::None)
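
The notification script is meant to run only after login has failed `failed_login_limit` times in a row, and each configured action has its own timeout. The following is a minimal sketch of that logic, not the Agiloft implementation: the names `run_action` and `LoginFailureTracker` are hypothetical, and the choice to keep signaling on every failure after the limit is reached is an assumption.

```python
import subprocess


def run_action(command, timeout):
    """Run a configured shell command; succeed only on exit code 0 within the timeout."""
    if not command:
        return True  # empty setting such as `notification=` means nothing to do
    try:
        result = subprocess.run(command, shell=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # a hung or very slow action is treated as failed
    return result.returncode == 0


class LoginFailureTracker:
    """Count consecutive login failures and report when the limit is reached."""

    def __init__(self, limit):
        self.limit = limit
        self.failures = 0

    def record(self, success):
        """Record one login check; return True if notification/repair should run."""
        if success:
            self.failures = 0  # every successful login resets the counter
            return False
        self.failures += 1
        # Assumption: keep returning True for each failure at or past the limit.
        return self.failures >= self.limit


if __name__ == "__main__":
    tracker = LoginFailureTracker(limit=3)
    for outcome in [False, False, True, False, False, False]:
        if tracker.record(outcome):
            # In the config above this would be login.notification, then login.repair.
            run_action('echo "login limit reached"', timeout=15)
```

Keeping the counter separate from the command runner makes it straightforward to test the threshold logic without starting any external process.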