Time based view check #23

Open. Wants to merge 3 commits into base: master.
47 changes: 46 additions & 1 deletion README.md
@@ -1,10 +1,13 @@

# Overview

This repository contains three nagios plugins:
This repository contains six nagios plugins:
* check_jenkins_job_extended.pl - The original, as documented below. Designed to check for failures, not how long since success.
* check_jenkins_cron.pl - A from-scratch copy designed to check jobs that *should* build periodically.
* check_jenkins_nodes.pl - Checks the number of nodes with a status of "offline".
* check_jenkins_view_last_success.pl - Checks the time of the last successful build of every job in a specified Jenkins view.
* check_jenkins_view.pl - Checks the health of every job in a specified Jenkins view.
* check_jenkins_job.pl - Checks the health of a single specified job.

# check_jenkins_cron.pl

@@ -167,3 +170,45 @@ define service {
contacts bob,bill
}
```

# check_jenkins_view_last_success.pl

A Nagios plugin that checks how long ago each job in a specified Jenkins view was last built successfully.

It returns the worst (highest) job status found in the view.
A job's status depends on how long ago its last successful build finished:
- more than critical_days_ago: CRITICAL
- otherwise, more than warning_days_ago: WARNING
- otherwise: OK

It produces descriptive output with Nagios performance data.

## Usage

```
Usage: check_jenkins_view_last_success.pl <Jenkins URL> <user_name> <password> <view_name> <critical_days_ago> <warning_days_ago>
```
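
The threshold rule above can be sketched as a small standalone function. This is only an illustration of the comparison the plugin performs, not the plugin itself: the helper name `status_for_last_success` and the sample thresholds (7 and 2 days) are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $DAY_MS = 24 * 3600 * 1000;    # milliseconds per day, as in the plugin

# Hypothetical helper (not part of the plugin): map the age of the last
# successful build to a Nagios status code (0 = OK, 1 = WARNING, 2 = CRITICAL).
sub status_for_last_success {
    my ( $finished_ms, $now_ms, $critical_days, $warning_days ) = @_;
    my $days_ago = ( $now_ms - $finished_ms ) / $DAY_MS;
    return 2 if $days_ago > $critical_days;
    return 1 if $days_ago > $warning_days;
    return 0;
}

# Example: a build that finished three days ago, checked against
# critical_days_ago = 7 and warning_days_ago = 2 (made-up values).
my $now_ms         = time() * 1000;
my $three_days_ago = $now_ms - 3 * $DAY_MS;
print status_for_last_success( $three_days_ago, $now_ms, 7, 2 ), "\n";    # prints 1 (WARNING)
```

Because the critical check runs first, critical_days_ago is expected to be the larger of the two values.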

# check_jenkins_view.pl

A Nagios plugin for checking the health of all jobs in a specified Jenkins view.

It returns the worst (highest) job status found in the view.
A job is CRITICAL when its health is below the critical threshold, WARNING when its health is below the warning threshold, and OK otherwise.
It produces descriptive output with Nagios performance data.

## Usage

```
check_jenkins_view.pl <Jenkins URL> <user_name> <password> <view_name> <critical_health_threshold> <warning_health_threshold>
```
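
As a rough sketch of the rule described above, the per-job classification and the "worst status wins" aggregation could look like this. The helper name `status_for_health`, the job names, the health scores, and the thresholds (40 and 80) are all made up for illustration; treating a missing score as UNKNOWN is also an assumption of this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);

my @labels = ( "OK", "WARNING", "CRITICAL", "UNKNOWN" );

# Hypothetical helper (not part of the plugin): classify one job's health score.
sub status_for_health {
    my ( $health, $critical, $warning ) = @_;
    return 3 unless defined $health;    # assumption: no score means UNKNOWN
    return 2 if $health < $critical;
    return 1 if $health < $warning;
    return 0;
}

# Made-up health scores for three jobs in a view.
my %health_by_job = ( build => 100, deploy => 60, nightly => 20 );

# The view status is the highest status code among its jobs.
my $worst = max map { status_for_health( $_, 40, 80 ) } values %health_by_job;
print "$labels[$worst]\n";    # prints CRITICAL
```

Since UNKNOWN has the highest code, a single job whose status cannot be retrieved makes the whole view UNKNOWN.
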

# check_jenkins_job.pl

A Nagios plugin for checking the health of a specified job.
The job status is OK when health is 100%, WARNING when health is below 100% but above the specified critical threshold, and CRITICAL otherwise.

## Usage

```
check_jenkins_job.pl <Jenkins URL> <user_name> <password> <job_name> <critical_health_threshold>
```
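
To make the single-job rule concrete, here is a minimal sketch that reads the score from a hand-written JSON payload and classifies it. The `healthReport` and `score` field names are the ones the plugin reads. The payload itself, the helper name `job_status`, and the threshold of 50 are assumptions for the example, and it uses the core `JSON::PP` module rather than `JSON` so that it runs without extra dependencies.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper (not part of the plugin): 0 = OK, 1 = WARNING, 2 = CRITICAL.
sub job_status {
    my ( $health, $critical ) = @_;
    return 0 if $health == 100;
    return $health > $critical ? 1 : 2;
}

# Hand-written stand-in for the job's /api/json response.
my $payload = '{"healthReport":[{"score":80}]}';
my $health  = decode_json($payload)->{healthReport}[0]{score};

printf "%d%% -> %d\n", $health, job_status( $health, 50 );    # prints "80% -> 1"
```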
79 changes: 79 additions & 0 deletions check_jenkins_job.pl
@@ -0,0 +1,79 @@
#!/usr/bin/perl
use strict;
use LWP::UserAgent;
use JSON;
use DateTime;
use URI::Escape;

#
# Check Jenkins job health using the JSON API
#
# (c) 2011 Jon Cowie, Etsy Inc
# (c) 2015 Piotr Chromiec, RTBHouse
#
# Plugin that reports the health score of a job: OK at 100%, WARNING when the
# score is above critical_health_threshold, CRITICAL otherwise.
#
# Usage: check_jenkins_job.pl url user_name password job_name critical_health_threshold

# Nagios return values
# OK = 0
# WARNING = 1
# CRITICAL = 2
# UNKNOWN = 3

my $retStr = "Unknown - plugin error";
my @alertStrs = ("OK", "WARNING", "CRITICAL", "UNKNOWN");
my $exitCode = 3;
my $numArgs = $#ARGV + 1;

my $ciMasterUrl;
my $jobName;

my $userName;
my $password;

my $criticalThreshold;

if ( $numArgs == 5 ) {
    $ciMasterUrl = $ARGV[0];
    $userName = $ARGV[1];
    $password = $ARGV[2];
    $jobName = $ARGV[3];
    $criticalThreshold = $ARGV[4];
} else {
    print "\nA nagios plugin for checking specified job\n";
    print "\nUsage: check_jenkins_job.pl url user_name password job_name critical_health_threshold\n";
    exit $exitCode;
}

my $jobStatusUrlPrefix = $ciMasterUrl . "/job/" . uri_escape($jobName);
my $jobStatusURL = $jobStatusUrlPrefix . "/api/json";

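# TLS certificate verification is disabled. SSL_verify_mode expects a number;
# 0 is the value of SSL_VERIFY_NONE (the quoted string is not the constant).
my $ua = LWP::UserAgent->new(
    ssl_opts => { verify_hostname => 0, SSL_verify_mode => 0 },
);
my $req = HTTP::Request->new( GET => $jobStatusURL );
$req->authorization_basic( $userName, $password );
my $res = $ua->request($req);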

if ( $res->is_success ) {
    my $json = JSON->new;

    my $obj = $json->decode( $res->content );

    # Score of the first health report returned for the job
    my $health = $obj->{healthReport}->[0]->{score};

    if ( $health == 100 ) {
        $retStr = "Last build OK";
        $exitCode = 0;
    } else {
        $retStr = "Health score is: " . $health . "%";
        $exitCode = ( $health > $criticalThreshold ? 1 : 2 );
    }
} else {
    # status_line is a method of HTTP::Response, not a hash key
    $retStr = "Failed retrieving status for job $jobName via API (API status line: " . $res->status_line . ")";
    $exitCode = 3;
}

print $alertStrs[$exitCode] . " - $retStr\n";
exit $exitCode;
99 changes: 99 additions & 0 deletions check_jenkins_view.pl
@@ -0,0 +1,99 @@
#!/usr/bin/perl
# A nagios plugin for checking all jobs in specified Jenkins view
#
# (c) 2015 Piotr Chromiec, RTBHouse

use strict;
use LWP::UserAgent;
use JSON;
use DateTime;
use URI::Escape;

my $retStr = "Unknown - plugin error";
my $perfData = "";
my @alertStrs = ("OK", "WARNING", "CRITICAL", "UNKNOWN");
my $exitCode = 3;
my $numArgs = $#ARGV + 1;

my $ciMasterUrl;
my $viewName;

my $userName;
my $password;

my $criticalThreshold;
my $warningThreshold;

if ( $numArgs == 6 ) {
    $ciMasterUrl = $ARGV[0];
    $userName = $ARGV[1];
    $password = $ARGV[2];
    $viewName = $ARGV[3];
    $criticalThreshold = $ARGV[4];
    $warningThreshold = $ARGV[5];
} else {
    print "\nA nagios plugin for checking all jobs in specified Jenkins view\n";
    print "\nUsage: check_jenkins_view.pl url user_name password view_name critical_health_threshold warning_health_threshold\n";
    exit $exitCode;
}

my $viewStatusURL = $ciMasterUrl . "/view/" . uri_escape($viewName) . "/api/json";

my $req = HTTP::Request->new( GET => $viewStatusURL );
$req->authorization_basic( $userName, $password );
# TLS certificate verification is disabled; 0 is the value of SSL_VERIFY_NONE.
my $ua = LWP::UserAgent->new( ssl_opts => { verify_hostname => 0, SSL_verify_mode => 0 } );
my $res = $ua->request($req);

my $jobNo = 0;
my @alertCnts = (0, 0, 0, 0);

if ( $res->is_success ) {
    $exitCode = 0;
    $retStr = "";
    my $json = JSON->new;

    my $viewJSON = $json->decode( $res->content );
    for my $job ( @{ $viewJSON->{jobs} } ) {
        $jobNo += 1;
        my $jobStatus = 3;
        my $health = -1;
        my $msg = "";
        my $jobURL = $job->{url} . "/api/json";
        $req->uri( $jobURL );
        $res = $ua->request($req);

        if ( $res->is_success ) {
            my $jobJSON = $json->decode( $res->content );
            $health = $jobJSON->{healthReport}->[0]->{score};

            if ( $health < $criticalThreshold ) {
                $jobStatus = 2;
            } elsif ( $health < $warningThreshold ) {
                $jobStatus = 1;
            } else {
                $jobStatus = 0;
            }
        } else {
            # status_line is a method of HTTP::Response, not a hash key
            $msg = "UNKNOWN, status retrieval failure: " . $res->status_line;
        }

        # Perfdata label: replace spaces and parentheses with underscores
        my $jobName = $job->{name};
        $jobName =~ tr/ ()/_/;
        $perfData = $perfData . sprintf("\n%s=%d", $jobName, $health);
        $retStr = $retStr . sprintf("\n %d. %-50s - %-10s health: %d%% %s", $jobNo, $job->{name}, $alertStrs[$jobStatus], $health, $msg);
        #$retStr = $retStr . "\n[" . $jobNo . "] " . $job->{name} . " - " . $alertStrs[$jobStatus] . ", health: " . $health . "%" ;
        # The view status is the worst (highest) job status seen so far
        $exitCode = ( $exitCode > $jobStatus ? $exitCode : $jobStatus );
        $alertCnts[$jobStatus] += 1;
    }
} else {
    $retStr = "Failed retrieving status for view $viewName (" . $res->status_line . ")";
    $exitCode = 3;
}

print $alertStrs[$exitCode] . " - '" . $viewName . "' view: ";
print ($jobNo > 0 ? $jobNo . " jobs checked" : "" );
for my $i ( 0 .. $#alertCnts ) {
print ($alertCnts[$i] > 0 ? ", ". $alertCnts[$i] . " " . $alertStrs[$i] : "" );
}
print $retStr . " | " . $perfData . "\n";
exit $exitCode;
116 changes: 116 additions & 0 deletions check_jenkins_view_last_success.pl
@@ -0,0 +1,116 @@
#!/usr/bin/perl
# A nagios plugin for checking the last successful build time of all jobs in specified Jenkins view
#
# (c) 2017 Piotr Chromiec @ RTBHouse com

use strict;
use LWP::UserAgent;
use JSON;
use DateTime;
use URI::Escape;

my $dayMilliseconds = 24*3600*1000;
my $retStr = "Unknown - plugin error";
my $perfData = "dummy=0";
my @alertStrs = ("OK", "WARNING", "CRITICAL", "UNKNOWN");
my $exitCode = 3;
my $numArgs = $#ARGV + 1;

my $ciMasterUrl;
my $viewName;

my $userName;
my $password;
my $criticalDaysAgo;
my $warningDaysAgo;

if ( $numArgs == 6 ) {
    $ciMasterUrl = $ARGV[0];
    $userName = $ARGV[1];
    $password = $ARGV[2];
    $viewName = $ARGV[3];
    $criticalDaysAgo = $ARGV[4];
    $warningDaysAgo = $ARGV[5];
} else {
    print "
A nagios plugin for checking all jobs in specified Jenkins view.

It returns status as worst (highest) job status within the view.
When job was last successfully built:
- more than critical_days_ago then its status is CRITICAL
- else if more than warning_days_ago - WARNING
- else - OK

Produces descriptive output with Nagios performance data.

Usage: check_jenkins_view_last_success.pl url user_name password view_name critical_days_ago warning_days_ago\n";
    exit $exitCode;
}

my $viewStatusURL = $ciMasterUrl . "/view/" . uri_escape($viewName) . "/api/json";

my $req = HTTP::Request->new( GET => $viewStatusURL );
$req->authorization_basic( $userName, $password );
# TLS certificate verification is disabled; 0 is the value of SSL_VERIFY_NONE.
my $ua = LWP::UserAgent->new( ssl_opts => { verify_hostname => 0, SSL_verify_mode => 0 } );
my $res = $ua->request($req);

my $jobNo = 0;
my @alertCnts = (0, 0, 0, 0);

if ( $res->is_success ) {
    $exitCode = 0;
    $retStr = "";
    my $json = JSON->new;
    my $viewJSON = $json->decode( $res->content );

    for my $job ( @{ $viewJSON->{jobs} } ) {
        $jobNo += 1;
        my $jobStatus = 3;
        my $lastSuccessDaysAgo = -1;
        my $msg = "";
        my $jobURL = $job->{url} . "lastSuccessfulBuild/api/json?tree=timestamp,duration";

        $req->uri( $jobURL );
        $res = $ua->request($req);

        if ( $res->is_success ) {
            my $jobJSON = $json->decode( $res->content );
            # Build start time plus duration gives the finish time, in milliseconds
            my $lastSuccessfulBuildTs = $jobJSON->{timestamp} + $jobJSON->{duration};
            my $nowTs = time() * 1000;

            $lastSuccessDaysAgo = ( $nowTs - $lastSuccessfulBuildTs ) / $dayMilliseconds;

            if ( $lastSuccessDaysAgo > $criticalDaysAgo ) {
                $jobStatus = 2;
            } elsif ( $lastSuccessDaysAgo > $warningDaysAgo ) {
                $jobStatus = 1;
            } else {
                $jobStatus = 0;
            }
        } else {
            # status_line is a method of HTTP::Response, not a hash key
            $msg = "UNKNOWN, status retrieval failure: " . $res->status_line;
        }

        # Perfdata label: replace spaces and parentheses with underscores
        my $jobName = $job->{name};

        $jobName =~ tr/ ()/_/;
        $perfData = $perfData . sprintf("\n%s=%.2f", $jobName, $lastSuccessDaysAgo);
        $retStr = $retStr . sprintf("\n %2d. %-60s - %-10s days ago: %.1f %s", $jobNo, $job->{name}, $alertStrs[$jobStatus], $lastSuccessDaysAgo, $msg);
        #$retStr = $retStr . "\n[" . $jobNo . "] " . $job->{name} . " - " . $alertStrs[$jobStatus] . ", health: " . $health . "%" ;
        # The view status is the worst (highest) job status seen so far
        $exitCode = ( $exitCode > $jobStatus ? $exitCode : $jobStatus );
        $alertCnts[$jobStatus] += 1;
    }
} else {
    $retStr = "Failed retrieving status for view $viewName (" . $res->status_line . ")";
    $exitCode = 3;
}

print $alertStrs[$exitCode] . " - '" . $viewName . "' view: ";
print ($jobNo > 0 ? $jobNo . " jobs checked" : "" );

for my $i ( 0 .. $#alertCnts ) {
print ($alertCnts[$i] > 0 ? ", ". $alertCnts[$i] . " " . $alertStrs[$i] : "" );
}

print $retStr . " | " . $perfData . "\n";
exit $exitCode;