7 Cloud-based tools for the modern sysadmin

The days of spending long hours building on-premise infrastructure are over. There are so many cloud-based tools, of such good quality and so easy to deploy, that valuable sysadmin time no longer needs to be spent implementing complex tools in-house. Prices are accessible, and most vendors offer several licensing tiers so companies can find the right fit.

Most of these tools can be categorized as Software as a Service (SaaS).

New Relic

New Relic focuses on real-time application performance monitoring. It covers the most widely used platforms, such as Java, Python, PHP and .NET. The insights the tool can provide are excellent, and in some cases almost seem magical, for example automatically linking the different services of your platform.

It is not limited to application stats: New Relic has a series of agents and plugins to monitor everything from the OS level to the database level. It is not an inexpensive tool, though.

In addition to metrics, the tool can capture application errors. By setting thresholds on performance metrics and error rates it is possible to trigger alarms, which allows much more intelligent control of the infrastructure.

PagerDuty

Nowadays, receiving alerts by e-mail is not enough. A tool like PagerDuty can manage alarms from different sources and deliver them via e-mail, SMS or phone call. It is a great tool to centralize events and distribute them intelligently, for example based on an on-call schedule.

The most basic way to receive alerts is by e-mail, but more sophisticated sources can be used: the APIs provided by PagerDuty enable integration with tools like New Relic and services like AWS, among others.

Loggly

Log analysis is a central task when troubleshooting issues, and even for proactively detecting problems.

The Unix power tools, like grep, are ever present, but sometimes something more powerful is required. A graphical representation of events can be key to identifying root causes, and easy, fast searches save a lot of time.

Loggly can aid the sysadmin in these matters. There are excellent open-source tools for dealing with logs, like Logstash, but they have a fairly steep learning curve and can add more complexity than needed. Loggly is a SaaS tool where you can have log analysis up and running in a matter of minutes.

Confluence Atlassian OnDemand

Documenting, at least for me, is an enjoyable task, even more so with neat tools.

The wiki concept has been around for many years. There are plenty of free wiki systems, and they may be enough for most environments. But if you want an enterprise-class tool, I think Confluence is the right choice, and at an affordable price.

Atlassian is also a leader with tools like Jira, which integrates very well with Confluence.

Draw.io

Network diagramming has always been a choice between two extremes: on one side, a feature-rich but expensive tool (Visio); on the other, rustic tools with poor usability.

Draw.io is quite a good tool, with the basic features required to make good-looking diagrams. Although it doesn’t have as wide a selection of stencils as Visio, nor can it be extended with new ones, if you don’t need such fancy graphics this tool is more than enough.

Regarding the cloud features, it allows saving your files in Google Drive, Dropbox and other cloud storage services.

In addition… it’s free

Toggl

Time tracking and logging is always unattractive to the technical expert, and it can be an awful task if the available tools are old, hard to use, or too rigid. Not to mention the multitasking nature of any sysadmin role, which makes it even harder to associate each task with a given issue, project, or client.

A tool like Toggl makes the task easier and a bit more attractive. In the web version, a single click starts tracking time for your daily tasks, and the mobile app removes the excuse of not having a computer nearby.

Checkvist

Finally, a simple tool to keep to-do lists.



Automating backup software configuration with Puppet

This post will show how to get started with the module bacula, to implement backups for the managed systems.
It can also be used as a complement to the previous labs: Automate Nagios Configuration with Puppet – Part 1, Part 2 and Part 3.
The bacula module used is the one published by RHarrison: https://forge.puppetlabs.com/rharrison/bacula

Install the module:

puppet module install rharrison/bacula

Create a new module with the following manifests. They are based on the example code shown in the README of the module, with a few lines added for this lab.

class my_bacula::client {
  include my_bacula::client::firewall
  $director_password = 'abcdefghijk'
  $director_server   = "core.${::domain}"

  # First install and configure bacula-fd pointing to the director.
  class { 'bacula':
    director_password => $director_password,
    director_server   => $director_server,
    is_client         => true,
    storage_server    => $director_server,
  }
}
class my_bacula {
  class { '::mysql::server':
  }
  include my_bacula::firewall
  $director_password = 'abcdefghijk'
  $director_server   = $::fqdn
  $bacula_clients = {
    "web01.${::domain}" => {
      client_schedule => 'WeeklyCycle',
      fileset         => 'Basic:noHome',
    },
    "core.${::domain}" => {
      client_schedule => 'WeeklyCycle',
      fileset         => 'Basic:noHome',
    },
  }
  # Lets set up the director server.
  class { '::bacula':
    clients           => $bacula_clients,
    console_password  => 'abcdefghijk',
    director_password => $director_password,
    director_server   => $director_server,
    is_client         => true,
    is_director       => true,
    is_storage        => true,
    mail_to           => "admin@${::domain}",
    storage_server    => $director_server,
    manage_db         => true,
    db_backend        => "mysql",
    db_user           => "bacula",
    db_password       => "strongpassword",
  }
  # Now lets realize all of the exported client config resources configured to
  # backup to this director server.
  Bacula::Client::Config <<| director_server == $::fqdn |>>
}

Then add the firewall subclasses to allow communication among the Bacula components:

class my_bacula::firewall {
    firewall { '200 allow bacula-dir access':
     port   => 9101,
     proto  => tcp,
     action => accept,
    }
    firewall { '201 allow bacula-fd access':
     port   => 9102,
     proto  => tcp,
     action => accept,
    }
    firewall { '202 allow bacula-sd access':
     port   => 9103,
     proto  => tcp,
     action => accept,
    }
}
class my_bacula::client::firewall {
    firewall { '201 allow bacula-fd access':
     port   => 9102,
     proto  => tcp,
     action => accept,
    }
}
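
To tie it all together, assign the classes to the nodes. This is just a minimal sketch, assuming the node names from the previous labs (core.example.local as the Bacula director/storage server and web01.example.local as a backup client):

node 'core.example.local' {
    include my_bacula
}
node 'web01.example.local' {
    include my_bacula::client
}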

Once the Puppet agent runs on the server dedicated to the Bacula director (core.example.local), it will install MySQL, the Bacula daemons and their configuration.
This is the output:

Info: Retrieving plugin
Info: Loading facts in /etc/puppet/modules/firewall/lib/facter/iptables_persistent_version.rb
....
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Caching catalog for core.example.local
Info: Applying configuration version '1391989429'
Notice: /Stage[main]/Bacula::Director/Package[bacula-director-mysql]/ensure: created
Notice: /Stage[main]/Bacula::Console/Package[bacula-console]/ensure: created
Notice: /Stage[main]/Mysql::Server::Install/Package[mysql-server]/ensure: created
Notice: /Stage[main]/My_bacula::Firewall/Firewall[202 allow bacula-sd access]/ensure: created
Notice: /Stage[main]/Bacula::Storage/Package[bacula-storage-mysql]/ensure: created
Notice: /Stage[main]/My_bacula::Firewall/Firewall[200 allow bacula-dir access]/ensure: created
Notice: /Stage[main]/My_bacula::Firewall/Firewall[201 allow bacula-fd access]/ensure: created
Notice: /Stage[main]/Mysql::Server::Service/Service[mysqld]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Mysql::Server::Service/Service[mysqld]: Unscheduling refresh on Service[mysqld]
Notice: /Stage[main]/Bacula::Director::Mysql/Mysql::Db[bacula]/Mysql_database[bacula]/ensure: created
Notice: /Stage[main]/Bacula::Director::Mysql/Mysql::Db[bacula]/Mysql_user[bacula@localhost]/ensure: created
Notice: /Stage[main]/Bacula::Director::Mysql/Mysql::Db[bacula]/Mysql_grant[bacula@localhost/bacula.*]/ensure: created
Info: Mysql::Db[bacula]: Scheduling refresh of Exec[make_db_tables]
Notice: /Stage[main]/Bacula::Director::Mysql/Exec[make_db_tables]/returns: Creation of Bacula MySQL tables succeeded.
Notice: /Stage[main]/Bacula::Director::Mysql/Exec[make_db_tables]: Triggered 'refresh' from 1 events
Notice: /Stage[main]/Bacula::Client/Package[bacula-client]/ensure: created
Notice: /Stage[main]/Bacula::Common/File[/etc/bacula]/owner: owner changed 'root' to 'bacula'
Notice: /Stage[main]/Bacula::Common/File[/etc/bacula]/group: group changed 'root' to 'bacula'
Notice: /Stage[main]/Bacula::Common/File[/etc/bacula]/mode: mode changed '0755' to '0750'
Notice: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.d]/ensure: created
Info: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.d]: Scheduling refresh of Exec[bacula-dir reload]
Notice: /Stage[main]/Bacula::Director/Bacula::Client::Config[core.example.local]/File[/etc/bacula/bacula-dir.d/core.example.local.conf]/ensure: defined content as '{md5}c3e07c7132515720e377d914f4f6b20e'
Info: /Stage[main]/Bacula::Director/Bacula::Client::Config[core.example.local]/File[/etc/bacula/bacula-dir.d/core.example.local.conf]: Scheduling refresh of Exec[bacula-dir reload]
Notice: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.d]/ensure: created
Notice: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.d/empty.conf]/ensure: created
Notice: /Stage[main]/Bacula::Common/File[/etc/bacula/scripts]/ensure: created
Notice: /Stage[main]/Bacula::Common/File[/var/log/bacula]/mode: mode changed '0750' to '0755'
Notice: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.d/empty.conf]/ensure: defined content as '{md5}d41d8cd98f00b204e9800998ecf8427e'
Notice: /Stage[main]/Bacula::Client/File[/etc/bacula/bacula-fd.conf]/content:
Info: FileBucket got a duplicate file {md5}112f572bee101e0ef044d84dbcf21f79
Info: /Stage[main]/Bacula::Client/File[/etc/bacula/bacula-fd.conf]: Filebucketed /etc/bacula/bacula-fd.conf to puppet with sum 112f572bee101e0ef044d84dbcf21f79
Notice: /Stage[main]/Bacula::Client/File[/etc/bacula/bacula-fd.conf]/content: content changed '{md5}112f572bee101e0ef044d84dbcf21f79' to '{md5}66928e796655744dde50eef35549f496'
Info: /Stage[main]/Bacula::Client/File[/etc/bacula/bacula-fd.conf]: Scheduling refresh of Service[bacula-fd]
Notice: /Stage[main]/Bacula::Client/Service[bacula-fd]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Bacula::Client/Service[bacula-fd]: Unscheduling refresh on Service[bacula-fd]
Notice: /Stage[main]/Bacula::Director/Bacula::Client::Config[web01.example.local]/File[/etc/bacula/bacula-dir.d/web01.example.local.conf]/ensure: defined content as '{md5}1fca035a7381da40d8ebb2c10132c811'
Info: /Stage[main]/Bacula::Director/Bacula::Client::Config[web01.example.local]/File[/etc/bacula/bacula-dir.d/web01.example.local.conf]: Scheduling refresh of Exec[bacula-dir reload]
Notice: /Stage[main]/Bacula::Console/File[/etc/bacula/bconsole.conf]/content:
Info: FileBucket got a duplicate file {md5}addbc05edbfb1f0cbe21ef9fe3186558
Info: /Stage[main]/Bacula::Console/File[/etc/bacula/bconsole.conf]: Filebucketed /etc/bacula/bconsole.conf to puppet with sum addbc05edbfb1f0cbe21ef9fe3186558
Notice: /Stage[main]/Bacula::Console/File[/etc/bacula/bconsole.conf]/content: content changed '{md5}addbc05edbfb1f0cbe21ef9fe3186558' to '{md5}79cca4ee0032cdb8efc0f31ca379ef4c'
Notice: /Stage[main]/Bacula::Console/File[/etc/bacula/bconsole.conf]/owner: owner changed 'root' to 'bacula'
Notice: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.conf]/content:
Info: FileBucket got a duplicate file {md5}3cfcdf02982f6f95404e027e62ea3c97
Info: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.conf]: Filebucketed /etc/bacula/bacula-sd.conf to puppet with sum 3cfcdf02982f6f95404e027e62ea3c97
Notice: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.conf]/content: content changed '{md5}3cfcdf02982f6f95404e027e62ea3c97' to '{md5}f18175befe88baa7ce7cc663de31b025'
Notice: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.conf]/owner: owner changed 'root' to 'bacula'
Notice: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.conf]/group: group changed 'root' to 'bacula'
Info: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.conf]: Scheduling refresh of Service[bacula-sd]
Info: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.conf]: Scheduling refresh of Service[bacula-sd]
Info: /Stage[main]/Bacula::Storage/File[/etc/bacula/bacula-sd.conf]: Scheduling refresh of Service[bacula-sd]
Notice: /Stage[main]/Bacula::Storage/Service[bacula-sd]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Bacula::Storage/Service[bacula-sd]: Unscheduling refresh on Service[bacula-sd]
Notice: /Stage[main]/Bacula::Common/File[/var/spool/bacula]/mode: mode changed '0750' to '0755'
Notice: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.conf]/content:
Info: FileBucket got a duplicate file {md5}ceb7f14ef7cf8eeb83306948c46298e8
Info: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.conf]: Filebucketed /etc/bacula/bacula-dir.conf to puppet with sum ceb7f14ef7cf8eeb83306948c46298e8
Notice: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.conf]/content: content changed '{md5}ceb7f14ef7cf8eeb83306948c46298e8' to '{md5}76c793f43c8fe56103858d2a4e000994'
Notice: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.conf]/owner: owner changed 'root' to 'bacula'
Info: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.conf]: Scheduling refresh of Exec[bacula-dir reload]
Info: /Stage[main]/Bacula::Director/File[/etc/bacula/bacula-dir.conf]: Scheduling refresh of Exec[bacula-dir reload]
Notice: /Stage[main]/Bacula::Director/Service[bacula-dir]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Bacula::Director/Service[bacula-dir]: Unscheduling refresh on Service[bacula-dir]
Error: /Stage[main]/Bacula::Director/Exec[bacula-dir reload]: Failed to call refresh: Command exceeded timeout
Error: /Stage[main]/Bacula::Director/Exec[bacula-dir reload]: Command exceeded timeout
Notice: /Stage[main]/Bacula::Director::Logwatch/File[/etc/logwatch/conf/logfiles/bacula.conf]/content:
Info: FileBucket got a duplicate file {md5}04bda5e85b3e07983bf98df5d9460212
Info: /Stage[main]/Bacula::Director::Logwatch/File[/etc/logwatch/conf/logfiles/bacula.conf]: Filebucketed /etc/logwatch/conf/logfiles/bacula.conf to puppet with sum 04bda5e85b3e07983bf98df5d9460212
Notice: /Stage[main]/Bacula::Director::Logwatch/File[/etc/logwatch/conf/logfiles/bacula.conf]/content: content changed '{md5}04bda5e85b3e07983bf98df5d9460212' to '{md5}ffa7b35051e50e1a4c24be53da5ca413'
Notice: Finished catalog run in 137.90 seconds

And once the Puppet agent runs on the client, it will install and configure bacula-client:

Info: Retrieving plugin
Info: Applying configuration version '1391990071'
Notice: /Stage[main]/My_fw::Pre/Firewall[000 accept all icmp]/ensure: created
Notice: /Stage[main]/Httpd::Firewall/Firewall[100 allow http access]/ensure: created
Notice: /Stage[main]/Nagios::Nrpe::Firewall/Firewall[10 allow nrpe access]/ensure: created
Notice: /Stage[main]/My_fw::Pre/Firewall[001 accept all to lo interface]/ensure: created
Notice: /Stage[main]/My_fw::Pre/Firewall[002 accept related established rules]/ensure: created
Notice: /Stage[main]/My_fw::Pre/Firewall[003 allow SSH access]/ensure: created
Notice: /Stage[main]/My_fw::Post/Firewall[999 drop all]/ensure: created
Notice: /Stage[main]/My_bacula::Client::Firewall/Firewall[201 allow bacula-fd access]/ensure: created
Notice: /Stage[main]/Bacula::Client/Package[bacula-client]/ensure: created
Notice: /Stage[main]/Bacula::Common/File[/etc/bacula]/owner: owner changed 'root' to 'bacula'
Notice: /Stage[main]/Bacula::Common/File[/etc/bacula]/group: group changed 'root' to 'bacula'
Notice: /Stage[main]/Bacula::Common/File[/etc/bacula]/mode: mode changed '0755' to '0750'
Notice: /Stage[main]/Bacula::Common/File[/etc/bacula/scripts]/ensure: created
Notice: /Stage[main]/Bacula::Client/File[/etc/bacula/bacula-fd.conf]/content:
Info: FileBucket got a duplicate file {md5}112f572bee101e0ef044d84dbcf21f79
Info: /Stage[main]/Bacula::Client/File[/etc/bacula/bacula-fd.conf]: Filebucketed /etc/bacula/bacula-fd.conf to puppet with sum 112f572bee101e0ef044d84dbcf21f79
Notice: /Stage[main]/Bacula::Client/File[/etc/bacula/bacula-fd.conf]/content: content changed '{md5}112f572bee101e0ef044d84dbcf21f79' to '{md5}1f983d94cd7836c690bcbbc9db3059b8'
Info: /Stage[main]/Bacula::Client/File[/etc/bacula/bacula-fd.conf]: Scheduling refresh of Service[bacula-fd]
Notice: /Stage[main]/Bacula::Client/Service[bacula-fd]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Bacula::Client/Service[bacula-fd]: Unscheduling refresh on Service[bacula-fd]
Notice: Finished catalog run in 31.19 seconds

Now everything is ready to run the first backup jobs

[root@core ~]# bconsole
Connecting to Director core.example.local:9101
1000 OK: core.example.local:director Version: 5.0.0 (26 January 2010)
Enter a period to cancel a command.
*run
Automatically selected Catalog: core.example.local:mysql
Using Catalog "core.example.local:mysql"
A job name must be specified.
The defined Job resources are:
     1: BackupCatalog
     2: core.example.local
     3: core.example.local Restore
     4: web01.example.local
     5: web01.example.local Restore
Select Job resource (1-5): 4
...
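
After launching the job, its progress and result can be followed from the same bconsole session with standard commands, for example:

*messages
*list jobs
*status director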

Automating iptables with puppet

This post will show how to get started with the module puppetlabs/firewall, to maintain iptables rules on the managed systems.
It can also be used as a complement to the previous labs: Automate Nagios Configuration with Puppet – Part 1, Part 2 and Part 3. In fact, it will add subclasses for the daemons already implemented.
It uses most of the examples shown in the module documentation https://forge.puppetlabs.com/puppetlabs/firewall

Install the module:

puppet module install puppetlabs/firewall

Create a new module with following manifests:

class my_fw {
    resources { "firewall":
        purge => true
    }
    Firewall {
        before  => Class['my_fw::post'],
        require => Class['my_fw::pre'],
    }
    class { ['my_fw::pre', 'my_fw::post']: }
    class { 'firewall': }
}
class my_fw::pre {
  Firewall {
    require => undef,
  }

  # Default firewall rules
  firewall { '000 accept all icmp':
    proto   => 'icmp',
    action  => 'accept',
  }->
  firewall { '001 accept all to lo interface':
    proto   => 'all',
    iniface => 'lo',
    action  => 'accept',
  }->
  firewall { '002 accept related established rules':
    proto   => 'all',
    state   => ['RELATED', 'ESTABLISHED'],
    action  => 'accept',
  }->
  firewall { '003 allow SSH access':
    port   => 22,
    proto  => tcp,
    action => accept,
  }
}
class my_fw::post {
  firewall { '999 drop all':
    proto   => 'all',
    action  => 'drop',
    before  => undef,
  }
}

Now create the different subclasses for each service:

class puppetmaster::firewall {
    firewall { '99 allow puppetmaster access':
        port   => 8140,
        proto  => tcp,
        action => accept,
    }
}
class httpd::firewall {
    firewall { '100 allow http access':
     port   => 80,
     proto  => tcp,
     action => accept,
    }
}
class nagios::nrpe::firewall {
    firewall { '10 allow nrpe access':
     port   => 5666,
     proto  => tcp,
     action => accept,
    }
}

Assign the corresponding classes to each node

node 'core.example.local' {
    include my_fw
    include nagios::monitor
    include nagios::nrpe-command
    include httpd
    include puppetmaster::firewall
}
node 'web01.example.local' {
    include my_fw
    include nagios::target
    include nagios::nrpe
    include httpd
    include nagios::target::httpd
}

The firewall rule subclasses for each service are included in the parent classes:

class nagios::nrpe {
    include nagios::nrpe::firewall
    package { [ nrpe, nagios-plugins, nagios-plugins-all ]:
    ensure => installed, }
    service { nrpe:
        ensure => running,
        enable => true,
        require => Package[nrpe],
    }
......
class httpd {
    include httpd::install
    include httpd::testfile
    include httpd::firewall
}
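
As a quick sanity check, run the agent on one of the nodes and list the resulting iptables rules; something along these lines:

puppet agent --test --server core.example.local
iptables -L INPUT -n --line-numbers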

That’s all!

Automate Nagios configuration with Puppet – Part 3

In this third post about automating Nagios configuration with Puppet, we will add services to our managed hosts and streamline their monitoring with new Nagios checks.
It’s a continuation of the previous labs: Automate Nagios Configuration with Puppet – Part 1 and Part 2

First we will deploy the Apache daemon to our managed hosts. For this, let’s create the httpd module with two subclasses: one to install and enable the service, and the other to copy a test file from the Puppet fileserver.

class httpd {
    include httpd::install
    include httpd::testfile
}
class httpd::install {
    package { [ httpd, php ]:
        ensure => installed, }
    service { httpd:
        ensure => running,
        enable => true,
        require => Package[httpd],
    }
}
class httpd::testfile {
    file { "/var/www/html/test.html":
        mode => 440,
        owner => apache,
        group => apache,
        source => "puppet:///modules/httpd/test.html"
    }
}
The content of test.html, served from the module's files directory:

<h1>Test Page</h1>
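
Once Apache and the test file are in place on web01, the page that check_http will later poll can be verified by hand, for example:

curl http://web01.example.local/test.html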

To define the new Nagios checks, we will use a subclass in the Nagios module already built in the previous posts. We use a check_http command to connect remotely to the test page, and a check_procs command run via NRPE to verify that the httpd processes are present.

class nagios::target::httpd {
   @@nagios_service { "check_http_${hostname}":
        check_command => "check_http!-u /test.html",
        use => "generic-service",
        host_name => "$fqdn",
        notification_period => "24x7",
        service_description => "${hostname}_check_http"
   }
   file_line { "command_check_httpd":
        line => "command[check_httpd]=/usr/lib64/nagios/plugins/check_procs -C httpd -c 1:",
        path => "/etc/nagios/nrpe.cfg",
        ensure => present,
        notify  => Service["nrpe"],
   }
   @@nagios_service { "check_httpd_${hostname}":
        check_command => "check_nrpe!check_httpd",
        use => "generic-service",
        host_name => "$fqdn",
        service_description => "${hostname}_check_httpd"
   }
}
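
The new NRPE command can also be tested manually from the Nagios server before the checks show up in the console, using the check_nrpe plugin path from the nrpe-command class of Part 2:

/usr/lib64/nagios/plugins/check_nrpe -H web01.example.local -c check_httpd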

Finally, we add the corresponding classes to our nodes using the manifest file:

node 'core.example.local' {
    include nagios::monitor
    include nagios::nrpe-command
}
node 'web01.example.local' {
    include nagios::target
    include nagios::nrpe
    include httpd
    include nagios::target::httpd
}

Once the Puppet agents run on the managed node and the Nagios server, the monitoring of Apache will be ready:
[Screenshot: Nagios service list showing the new httpd checks]

Automate Nagios configuration with Puppet – Part 2

In this second post about automating Nagios configuration with Puppet, we will add monitoring through NRPE.
As in the first post, the idea of the lab is to use the simplest resources in Puppet. In this case we are introducing a new feature: downloading Puppet modules from the Puppet Forge.

Add the stdlib module, used in this lab to edit the NRPE configuration file on each monitored server:

puppet module install puppetlabs/stdlib

Create new classes for NRPE:
First, a class for NRPE installation and configuration editing, plus new service definitions for the default checks included in the CentOS nrpe package.

vi /etc/puppet/modules/nagios/manifests/nrpe.pp
class nagios::nrpe {
    package { [ nrpe, nagios-plugins, nagios-plugins-all ]:
        ensure => installed, }
    service { nrpe:
        ensure => running,
        enable => true,
        require => Package[nrpe],
    }
   file_line { "allowed_hosts":
        line => "allowed_hosts = 127.0.0.1,192.168.112.14",
        path => "/etc/nagios/nrpe.cfg",
        match => "allowed_hosts",
        ensure => present,
        notify  => Service["nrpe"],
   }
   @@nagios_service { "check_load_${hostname}":
        check_command => "check_nrpe!check_load",
        use => "generic-service",
        host_name => "$fqdn",
        service_description => "${hostname}_check_load"
   }
   @@nagios_service { "check_total_procs_${hostname}":
        check_command => "check_nrpe!check_total_procs",
        use => "generic-service",
        host_name => "$fqdn",
        service_description => "${hostname}_check_total_procs"
   }
   @@nagios_service { "check_zombie_procs_${hostname}":
        check_command => "check_nrpe!check_zombie_procs",
        use => "generic-service",
        host_name => "$fqdn",
        service_description => "${hostname}_check_zombie_procs"
   }
   @@nagios_service { "check_users_${hostname}":
        check_command => "check_nrpe!check_users",
        use => "generic-service",
        host_name => "$fqdn",
        service_description => "${hostname}_check_users"
   }
}

And a class to add the NRPE plugin and a command definition for it on the Nagios server:

vi /etc/puppet/modules/nagios/manifests/nrpe-command.pp
class nagios::nrpe-command {
  package { "nagios-plugins-nrpe" :
        ensure => installed,
  }
  nagios_command { 'check_nrpe':
    command_name => 'check_nrpe',
    ensure       => 'present',
    command_line => '/usr/lib64/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$',
  }
  file_line { "nagios_command.cfg":
        line => "cfg_file=/etc/nagios/nagios_command.cfg",
        path => "/etc/nagios/nagios.cfg",
        ensure => present,
        notify  => Service["nagios"],
   }
   file { "/etc/nagios/nagios_command.cfg":
        mode => 644,
   }
}

Include the new classes on the nodes manifests

vi /etc/puppet/manifests/nodes.pp
node 'core.example.local' {
      include nagios::monitor
      include nagios::nrpe-command
}
node 'web01.example.local' {
      include nagios::target
      include nagios::nrpe
}

Finally, run the agent on the monitored server and then on the Puppet master:

puppet agent --test --server core.example.local
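
Before checking the console, NRPE connectivity can be verified by hand from the Nagios server, using the plugin installed by the nrpe-command class:

/usr/lib64/nagios/plugins/check_nrpe -H web01.example.local
/usr/lib64/nagios/plugins/check_nrpe -H web01.example.local -c check_load

The first call should simply return the NRPE daemon version; the second runs one of the default checks defined above.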

Check the result on the Nagios console:
[Screenshot: Nagios service list showing the new NRPE-based checks]

Automate Nagios Configuration with Puppet

This is a full lab showing how to automate Nagios configuration with Puppet, using CentOS 6.5. It’s based on the examples in the Exported Resources documentation on the PuppetLabs website: http://docs.puppetlabs.com/guides/exported_resources.html
It’s also published in Github: https://github.com/gfolga/puppet-nagios-lab

Lab environment:

Nagios / Puppet server: core
Domain name: example.local
Monitored server: web01
Puppet version: 3.4
Additional Puppet packages: PuppetDB, puppet-dashboard
Nagios version: 3.5
Additional Nagios packages: nagios-plugins, nagios-plugins-all

Puppet/Nagios server configuration

Do a minimal install of CentOS
Configure Networking
Update packages

yum update -y

Add puppet & epel repositories

rpm -ivh http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
rpm -ivh https://yum.puppetlabs.com/el/6/products/x86_64/puppetlabs-release-6-7.noarch.rpm

Install puppet packages

yum install -y puppet puppet-server puppetdb puppet-dashboard puppetdb-terminus

Configure PuppetDB and Puppet Master:

cat <<END > /etc/puppet/puppetdb.conf
[main]
   server = core.example.local
   port = 8081
END
cat <<END >> /etc/puppet/puppet.conf
[master]
  storeconfigs = true
  storeconfigs_backend = puppetdb
  reports = store,puppetdb
END
cat <<END > /etc/puppet/routes.yaml
---
master:
  facts:
    terminus: puppetdb
    cache: yaml
END

Run the SSL Configuration Script

 /usr/sbin/puppetdb ssl-setup 

Enable and start services

puppet resource service puppetdb ensure=running enable=true
puppet resource service puppetmaster ensure=running enable=true

Test puppet agent and puppetdb

puppet agent --server core.example.local -t

Module to install and initialize Nagios:

cd /etc/puppet
mkdir modules/nagios
mkdir modules/nagios/manifests
chmod 755 modules/nagios/
chmod 755 modules/nagios/manifests/
vi modules/nagios/manifests/monitor.pp
class nagios::monitor {
    package { [ nagios, nagios-plugins, nagios-plugins-all ]: ensure => installed, }
    service { nagios:
        ensure => running,
        enable => true,
        #subscribe => File[$nagios_cfgdir],
        require => Package[nagios],
    }
    # collect resources and populate /etc/nagios/nagios_*.cfg
    Nagios_host <<||>>
    Nagios_service <<||>>
}
chmod 644 modules/nagios/manifests/monitor.pp

Monitored server configuration

Do a minimal install of CentOS
Configure Networking
Update packages

yum update -y

Add puppet & epel repositories

rpm -ivh http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
rpm -ivh https://yum.puppetlabs.com/el/6/products/x86_64/puppetlabs-release-6-7.noarch.rpm

Install puppet packages

yum install -y puppet

Add a host entry on the puppet server for the monitored host

puppet resource host web01.example.local ip="192.168.112.15"

Add a host entry on the monitored host for the puppet server

puppet resource host core.example.local ip="192.168.112.14"

Initialize agent

puppet agent --test --server core.example.local

Sign cert on the puppet server

puppet cert sign web01.example.local

Create the Puppet module file and include it in the node definitions:

vi modules/nagios/manifests/target.pp
class nagios::target {
   @@nagios_host { $fqdn:
        ensure => present,
        alias => $hostname,
        address => $ipaddress,
        use => "linux-server",
   }
   @@nagios_service { "check_ping_${hostname}":
        check_command => "check_ping!100.0,20%!500.0,60%",
        use => "generic-service",
        host_name => "$fqdn",
        notification_period => "24x7",
        service_description => "${hostname}_check_ping"
   }
}
vi /etc/puppet/manifests/site.pp
import "nodes"
vi /etc/puppet/manifests/nodes.pp
 node 'core.example.local' {
      include nagios::monitor
    }
 node 'web01.example.local' {
      include nagios::target
    }

Run the agent on the monitored server and then on puppet master

puppet agent --test --server core.example.local

Add nagios_host.cfg and nagios_service.cfg to the main Nagios configuration file:

cat <<END >> /etc/nagios/nagios.cfg
cfg_file=/etc/nagios/nagios_host.cfg
cfg_file=/etc/nagios/nagios_service.cfg
END
chmod 644 /etc/nagios/nagios_host.cfg
chmod 644 /etc/nagios/nagios_service.cfg
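
Optionally, validate the generated configuration with Nagios’ preflight check before restarting the service:

nagios -v /etc/nagios/nagios.cfg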

Restart Nagios & Apache, set nagiosadmin password

service nagios restart
service httpd restart
htpasswd -c /etc/nagios/passwd nagiosadmin

Access Nagios and verify that the monitored host and the defined service appear on the console:
[Screenshots: Nagios host and service views showing web01 and its check_ping service]

That’s all. Just add the nagios::target class to new servers and Puppet will take care of the Nagios definitions.
In the next post I will extend the lab with advanced monitoring of the servers using NRPE.