Monitoring
Agiloft services in an on-premise installation requires leveraging and customizing a script to suit your specific installation. The provided script is not guaranteed to work as-is for all on-premise installations, and it should be tuned and customized for your specific environment.
To access and run the script:
- Download the script from: http://database-download.com/gitlab/Agiloft-Monitoring.py.zip
- Unzip the files.
If necessary, install these additional libraries:
python -m pip install --user psutil
python -m pip install --user requests
- Edit the monitoring script and make any necessary changes to suit your environment. The script contains in-line comments to help explain the structure.
When you've made all your changes, run the script as:
If the script runs successfully, the output should look something like this:
Validating parameters
Check modes: ['wildfly', 'db', 'login', 'sql']
Checking wildfly
wildfly service: EW Server
Checking db
DB service: EW Database
Checking login
base url: https://master1.test1.local/ui/
Checking sql
SQL validate: echo "placeholder for common action"
Running monitoring
...
Examples
This example is a modified version of the script that includes cluster checking to accommodate a manual custom configuration:
[MAIN]
#list of modes monitoring should check and work as
#format is json list, so keep quotations
#all options in this section must persist and be defined in this config
#NOTE: in case if 'cluster' specified all others are ignored
modes=["cluster"]
#time monitoring is in sleep
#i.e. it performs checking with specified period. for example below it is every minute.
#seconds
period=60
#global timeout specified how long monitoring should wait (possible hanged or very long) checking or execution of action to treat it failed.
#seconds
timeout=30
#for login checking: how many logins should fail (in a row) while monitoring treat that login to the server is not possible
#for example below if login fails 10 times in a row monitoring performs additional actions specified for login mode
#every success login reset this counter.
failed_login_limit=10
#this is special monitoring mode which disables all others and should be used with manually configured Agiloft cluster
[cluster]
#amount of fail over servers and main one.
count=2
#defines additional counted actions like check DB.
#the only action performed for over every machine specified by the above property
1.sql.action=sqlcmd -S dw0216.0216.df.agiloft.net -U sa -P password1 -V1 -Q "SELECT 1"
2.sql.action=sqlcmd -S dw0219.0219.df.agiloft.net -U sa -P password1 -V1 -Q "SELECT 1"
#common timeout for this counted actions
sql.timeout=20
#what to do if any counted check fails
sql.notification=powershell .\\db.notification.ps1
#what to do after counted check failed. can contain restore action like stop/start server, send signal to hyperterminal, etc.
sql.repair=echo "placeholder for restore action in case of DB failure"
#defines action to check login
#should contain URL to login page
#to check login please use systemmonitoring kb
login=https://master1.test1.local/ui/
#what to do if login failed N time in a row
login.notification=powershell .\\login.notification.ps1
login.notification.timeout=15
login.repair=echo "placeholder for restore action in case of DB failure"
login.repair.timeout=15
#defines additional action executed once at the end of checking cluster
action=echo "placeholder for common action"
action.timeout=30
notification=
notification.timeout=5
repair=
repair.timeout=5
This example leverages login.notification.ps1 to show pop-up notifications:
[reflection.assembly]::loadwithpartialname("System.Windows.Forms")
[reflection.assembly]::loadwithpartialname("System.Drawing")
$notify = new-object system.windows.forms.notifyicon
$notify.icon = [System.Drawing.SystemIcons]::Information
$notify.visible = $true
$notify.showballoontip(10,"Login FAILED!","Wildfly is not accessible, please perform restore actions!",[system.windows.forms.tooltipicon]::None)