Pandora: QuickGuides EN: Fast deployment

From Pandora FMS Wiki

This work is under development (not translated!)

 


1 Introduction

This guide aims to show the user how to quickly and efficiently administer a large number of machines (5, 10... or 500) using the Pandora FMS features designed for this purpose. The document is divided into four parts:


  • Network device monitoring, using Recon Server and templates.
  • SNMP network device monitoring, using Recon Script SNMP.
  • Agent monitoring, using policies (only Enterprise).
  • Remote monitoring with customized scripts, using an agent generator via XML.

2 Network device Monitoring, using Recon Server and templates

Situation

You need to monitor two hundred servers, twenty switches and ten routers, which can't be configured one by one. The "general" monitoring is simple, but there is neither much time nor the possibility of installing agents on the machines.

Solution

Pandora FMS will detect the systems and apply a different template depending on whether each one is a switch, a router or a server. The templates contain remote checks that are applied once the kind of machine has been detected.


How long will it take?

A class C network (up to 254 hosts) can be scanned in less than one minute using version 6.0. Applying a monitoring standard to the detected machines is almost immediate, so you could have those 230 machines completely configured in less than ten minutes.


2.1 Step 1. Defining monitoring profiles

First, we are going to define the monitoring template, which Pandora FMS calls a "Module Template". Go to the following menu:

Quick mon 1.png

Here we see some pre-defined profiles that contain generic checks. We are going to edit one of them (Linux Server), a profile that is useful for monitoring generic Linux servers remotely.

Quick mon 2.png

Quick mon 3.png

As you can see in the screenshot, this profile has some basic TCP checks, for example "Check SSH Server", a basic ICMP check ("Host Alive") and different SNMP modules that use the Linux MIB, which make up the rest of the checks.


These "template" checks are defined in the Pandora FMS basic module library that comes with the simple installation and which contains generic module definitions.


The IP value isn't set in this module, because it is auto-assigned from the agent IP when the module is applied. The rest of the fields (e.g. thresholds, SNMP community) hold default values that are applied to all the agents that receive a template with this module. If you want to modify one of them (e.g. to change the community), you need to change it agent by agent, or with the massive change tool.

Now that we know what monitoring templates, generic modules for templates or network components are, we can look at some of the other templates, specifically those involved with WMI generic monitoring and those corresponding to basic monitoring.

The first has three WMI modules for Windows. These modules need to be customized, by editing either the original component or the generated modules, because they need a username and password with permissions to perform remote WMI queries.

The second one only has a basic check for ICMP connectivity. We can add other basic checks, as you can see in the following screenshot:

Quick mon 5.png


2.2 Step 2. Using a Network task with a Recon Server

Now we have three basic monitoring profiles: Linux, Windows and network.

Supposing that we have to monitor all the machines in a network group, for example:

  • 192.168.50.0/24 for servers.
  • 192.168.50.0/24,192.168.1.0/24 for communications.

We want the Recon Server to identify all the machines on those networks and, depending on their OS, apply one template or another. When the switches are of several different brands and models, another option is to "identify" them through a standard criterion based on whether a port is open, e.g. machines with port 23 (telnet) open are identified as generic network devices (switches, routers).

Go to the recon servers section to create a new one:

Quick mon 6.png

We're going to create one to look for, and register, Windows servers by applying the Windows machine's monitoring standard:

Quick mon 7.png

Here we can see that Windows is selected in the "OS" field. This means the monitoring profile is applied only to machines detected as running a Windows OS; all others are ignored.

Since OS detection isn't 100% reliable (it depends on the services running on the machine), you can select another method, such as filtering on a specific port.

This way, the template is applied to every machine with that port open. The following example shows another task, which uses a port filter instead of an OS filter to apply the generic network device monitoring:

Quick mon 8.png

To specify two networks, separate them by commas: 192.168.50.0/24,192.168.1.0/24
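As a rough sizing aid, the number of addresses a recon task will sweep can be estimated from the comma-separated network list. This is a minimal sketch in plain shell arithmetic; the function names are hypothetical (they are not part of Pandora FMS), and the formula only holds for prefixes of /30 or shorter:

```shell
# Usable host addresses in an IPv4 network with the given prefix length
# (network and broadcast addresses excluded; valid for prefixes <= 30).
usable_hosts() {
    echo $(( (1 << (32 - $1)) - 2 ))
}

# Total usable hosts across a comma-separated list of CIDR networks.
count_scan_targets() {
    total=0
    old_ifs=$IFS
    IFS=','
    for net in $1; do
        prefix=${net#*/}          # keep only the part after the slash
        total=$(( total + $(usable_hosts "$prefix") ))
    done
    IFS=$old_ifs
    echo "$total"
}

count_scan_targets "192.168.50.0/24,192.168.1.0/24"   # prints 508
```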

Finally, we'll configure the Linux one in a similar way. When you are finished defining the three groups it should look like this:

Quick mon 9.png

Once the recon tasks are defined, they will start on their own, but let's check their status and force them to start if necessary. To do this, click on the eye icon to go to the Recon server operation view.

Quick mon 10.png

By default, the recon server has one execution thread, so only one task can run at a time; the rest wait for the active exploration task to end. The number of threads can be changed in the server configuration file (pandora_server.conf). You can force an exploration task to start by pressing the circular green icon to the left of the task.
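To see how many threads the recon server is currently configured with, the setting can be read straight from the configuration file. The sketch below assumes the file uses plain "key value" lines and that the parameter is named recon_threads; both the parameter name and the /etc/pandora path are assumptions to verify against your own installation, and get_conf_value is a hypothetical helper:

```shell
# Print the value of a "key value" setting from a Pandora-style config file.
# Commented-out lines are skipped because their first field starts with "#".
get_conf_value() {
    # $1 = config file, $2 = parameter name
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Example call (path and parameter name are assumptions):
# get_conf_value /etc/pandora/pandora_server.conf recon_threads
```

After changing the value in the file, the server normally needs a restart for the new setting to take effect.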

This will make the Recon server search for new machines that don't exist in the active monitoring scheme. If it finds them, it will register them automatically (trying to resolve the name, if this option is activated) and assign all the modules that were contained in the profile to it.

Be aware that some of the modules assigned by a profile may make no sense, or may not be correctly configured, for a specific agent. In the agent below, a Linux system was correctly detected, but the server has no SNMP, so the SNMP modules are not reporting. Since they could not retrieve data on the first attempt, they remain in "Non-init status" (not initialized). The next time the database maintenance script runs, they will be deleted automatically:

Quick mon 11.png


3 SNMP Network Device Monitoring, using SNMP Recon Script

In this case, we need to monitor an SNMP device with many interfaces "automatically" and in depth, retrieving the status of each interface, the traffic on each one, the error rate, etc.

To do this, we're going to use a system known as Recon Script. It's a modular system that allows you to execute complex actions through a single script. Pandora FMS includes a script to detect this kind of SNMP device.

To execute it, create a network task, like this:

Quick mon 12.png

In the "first field", write the target network or networks. In the "second field", write the SNMP community to use when exploring these devices. In the "third field", write any optional parameters. In this case, -n also registers interfaces that are down; by default only active interfaces are registered.

On each execution, this script registers the interfaces that did not exist previously and are now active on each machine, so newly started interfaces are detected. Network tasks can be scheduled to run once a day, once an hour, and so on.

This is the way that the Recon Script Task looks once it has been created:

Quick mon 14.png

And this is how the Recon Script Task looks in execution:

Quick mon 13.png


4 Agent Monitoring through monitoring templates and massive operations

Not written yet

5 Agent Monitoring through Policies

To manage monitoring on a massive scale with software agents installed, we can make use of policies. This is an Enterprise feature.

Firstly, the software agents must be installed with remote_config enabled, otherwise execution modules cannot be created.

  remote_config 1
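When many agents are already installed, the setting can be applied with a small helper instead of editing each file by hand. This is a sketch only: set_conf_value is a hypothetical function, it assumes the agent configuration uses plain "key value" lines, and the /etc/pandora/pandora_agent.conf path in the example is an assumption about a default Linux installation:

```shell
# Set "key value" in a config file: replace the active line if it exists,
# otherwise append it. Commented-out lines are left untouched.
set_conf_value() {
    # $1 = config file, $2 = parameter name, $3 = new value
    tmp=$(mktemp) || return 1
    awk -v key="$2" -v val="$3" '
        $1 == key { print key " " val; found = 1; next }
        { print }
        END { if (!found) print key " " val }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"   # cat keeps the file's permissions
    rm -f "$tmp"
}

# Example call (path is an assumption):
# set_conf_value /etc/pandora/pandora_agent.conf remote_config 1
```

The agent has to be restarted afterwards so that it reads the new value.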

Next, navigate to the Add policy section and create a new policy, filling out the informative parameters such as name, group and description:

Policy1.JPG

From here, navigate to the section for creating new modules in the policies and create a new local module (dataserver module):

Policy2.JPG

Once you have created the modules you need, which can be of local (dataserver module) or remote execution, you can start adding as many agents as you need to the policy. To do this, navigate to the corresponding tab in your policy and move the agents to the section "Agents in policy":


Policy3.JPG

Once the agents are added, apply the changes made in Queue. Apply all the changes and wait for the progress bar to be completed.


Policy4.JPG

Once it's done, all the modules created in the policy are deployed on the selected agents.

Policies allow us not only to add modules to groups of agents, but also to include other kinds of elements such as alerts, file collections, plugins, etc. Furthermore, any modifications you make to the policies, like modifying thresholds on a module, are automatically inherited by all the agents included in that policy once it is applied.

6 Agent Monitoring using Customized Scripts

This is an advanced way to monitor large numbers of similar systems in a completely "ad-hoc" way. To do this you need pre-existing tools that give you information about your systems. Some examples are:

  • Scripts already installed which give information about remote systems.
  • Other monitoring systems already in use which generate data that could be recycled.
  • Small checks that are similar for a group of X machines and that return several pieces of data at once rather than a single value. If they returned data piece by piece, they could instead be reused as plugins for the remote server.

The idea is simple: a main script generates the agent's XML headers, writing the agent name that you want, and fills in the module data by running an external script that it receives as an argument. This external script should generate the data in the Pandora XML format (extremely simple!). The main script then closes the XML and moves it to the standard path for processing XML data files (/var/spool/pandora/data_in). Schedule the script through cron. There is more information about the XML format that Pandora FMS uses to report data in our technical annexes.
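The module data that the external script has to print is a short XML block per check. As a minimal sketch, a hypothetical helper (pandora_module is not part of Pandora FMS) can emit that block in the same layout used by the sample script in this guide. It does no XML escaping, so it assumes names and values contain no special characters:

```shell
# Print one <module> block: name, type and a single data value.
pandora_module() {
    # $1 = module name, $2 = module type, $3 = data value
    printf '<module>\n<name>%s</name>\n<type>%s</type>\n<data>%s</data>\n</module>\n' \
        "$1" "$2" "$3"
}

pandora_module "Status" "generic_proc" 1
```

Wrapping the block in a function keeps every check in the external script down to one line and avoids mismatched tags.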


Remote agent Script

There is a small script at /usr/share/pandora_server/util/pandora_remote_agent.sh that takes two parameters:


-a <agent name>
-f <script file it'll execute>

For example, suppose the script /tmp/sample_remote.sh contains:

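#!/bin/bash

# Module 1: 1 if the host answers a single ping without packet loss, 0 otherwise.
PING=$(ping -c 1 192.168.50.1 | grep -c " 0% packet loss")

echo "<module>"
echo "<name>Status</name>"
echo "<type>generic_proc</type>"
echo "<data>$PING</data>"
echo "</module>"

# Module 2: 1 if the SNMP uptime is at least 24 hours (8640000 timeticks), 0 otherwise.
# -Ot prints the timeticks as a raw number, which ends up in field 3.
# The comparison is parenthesized because an unparenthesized comparison
# inside print is not portable across awk implementations.
ALIVE=$(snmpget -Ot -v 1 -c artica06 192.168.70.100 DISMAN-EVENT-MIB::sysUpTimeInstance | awk '{ print ($3 >= 8640000) }')

echo "<module>"
echo "<name>Alive_More_than_24Hr</name>"
echo "<type>generic_proc</type>"
echo "<data>$ALIVE</data>"
echo "</module>"

# Optionally run another script that prints more <module> blocks.
EXT_FILE=/tmp/myscript.sh

if [ -x "$EXT_FILE" ]
then
	"$EXT_FILE"
fi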

You can generate a complete XML with the agent name "agent_test" by executing the remote agent script in the following way:

/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f /tmp/sample_remote.sh


Suppose you want to execute the same script against X machines. You can pass the data it needs (e.g. user, IP and password) as arguments to that script:

/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f "/tmp/sample_remote.sh 192.168.50.1"

You have to parametrize the script /tmp/sample_remote.sh to get the command line parameters and use them correctly.
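A parametrized variant might look like the following sketch, which takes the target IP as its first argument. It is an illustration rather than the script shipped with Pandora FMS: the function names and the PING_CMD override are hypothetical, and it relies on ping's exit status instead of grepping its output as the sample above does:

```shell
#!/bin/bash
# Hypothetical parametrized variant of /tmp/sample_remote.sh.
# Usage: sample_remote.sh <target IP>

# The ping command can be overridden, which lets the logic be tested offline.
PING_CMD=${PING_CMD:-ping}

# Print 1 if the target answers one ping, 0 otherwise.
host_status() {
    if "$PING_CMD" -c 1 "$1" >/dev/null 2>&1; then
        echo 1
    else
        echo 0
    fi
}

# Print the Status module for the given target.
print_status_module() {
    echo "<module>"
    echo "<name>Status</name>"
    echo "<type>generic_proc</type>"
    echo "<data>$(host_status "$1")</data>"
    echo "</module>"
}

# Only produce output when a target was passed on the command line.
if [ $# -ge 1 ]; then
    print_status_module "$1"
fi
```

Further arguments such as a user or an SNMP community would be read the same way, from "$2", "$3" and so on.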

Scheduling the script with cron

Imagine that you have 10 machines monitored like this:


/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test1 -f "/tmp/sample_remote.sh 192.168.50.1"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test2 -f "/tmp/sample_remote.sh 192.168.50.2"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test3 -f "/tmp/sample_remote.sh 192.168.50.3"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test4 -f "/tmp/sample_remote.sh 192.168.50.4"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test5 -f "/tmp/sample_remote.sh 192.168.50.5"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test6 -f "/tmp/sample_remote.sh 192.168.50.6"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test7 -f "/tmp/sample_remote.sh 192.168.50.7"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test8 -f "/tmp/sample_remote.sh 192.168.50.8"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test9 -f "/tmp/sample_remote.sh 192.168.50.9"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test10 -f "/tmp/sample_remote.sh 192.168.50.10"


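Put all these lines in a new script, e.g. "/tmp/my_remote_mon.sh", give it execution permissions and add the following line to the system crontab (/etc/crontab), whose entries name the user (here root) that runs the command. In a per-user crontab edited with crontab -e, omit the root field:

*/5 * * * *   root /tmp/my_remote_mon.sh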

This makes the system execute the script every 5 minutes. You can keep adding machines to the script.
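Since the ten lines differ only in a counter, they can be generated rather than typed. The following sketch prints the same command lines as the list above; generate_remote_mon is a hypothetical helper and assumes, as in that list, that agent names and the last IP octet follow the same numbering:

```shell
# Print one pandora_remote_agent.sh command line per machine, numbered from 1.
generate_remote_mon() {
    # $1 = number of machines
    i=1
    while [ "$i" -le "$1" ]; do
        echo "/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test$i -f \"/tmp/sample_remote.sh 192.168.50.$i\""
        i=$(( i + 1 ))
    done
}

generate_remote_mon 10
```

To build the cron script from it, redirect the output together with a shebang line, for example { echo '#!/bin/bash'; generate_remote_mon 10; } > /tmp/my_remote_mon.sh, and then make the file executable.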

