Pandora: Documentation en: Satellite

1 Satellite Server

1.1 Introduction

The Satellite Server is used to monitor and discover networks and remote systems. It can discover network elements (routers, switches, etc.) using SNMP or ICMP, as well as Windows servers (using WMI) and Linux servers (using SNMP). It is not an ordinary server; it can be thought of as a broker agent with extended functions. It is available only in the Enterprise version. It is especially useful for monitoring remote networks that the Pandora Server cannot reach and where a software agent is not an option.

Esquema-satellite.png

This server does not require a connection to the Pandora database. It sends all information in XML format using the Tentacle protocol, like an agent.

The Satellite Server can be used on both Windows and Linux, although the installation process differs slightly between the two.

This server has some characteristics which make it unique and highly recommended in many situations:

  • It can execute network tests (ICMP, latency, and SNMP v1 and v2) at a very high rate (500 checks per second).
  • It only sends information to the server after a certain period (300 seconds by default), but it can execute the latency, ICMP and SNMP tests at a shorter interval (30 seconds, for example). This way it can notify the Pandora Server almost instantly when a status change is detected. These status changes must be defined beforehand unless the module is of a generic_proc type (network interfaces or general network connectivity, for example).
  • It does not require a connection to the database. It sends all files in XML format, the same way an independent server does, similar in many ways to a broker agent or an export server.
  • It has an autodiscovery mechanism for SNMP and WMI. Once an agent is detected (by IP address), it detects the dynamic elements (network interfaces, storage) and monitors them automatically.
  • On Windows systems it can detect disk drives, CPU and memory.
  • On systems with SNMP it can detect the status of the interfaces, the inbound and outbound traffic of each interface, and the system name.
  • The autogenerated modules can be modified like any other module: by administering the agent from the console like any other agent, or from the Satellite section of the massive operations menu.
  • Agents can be created directly by creating an agent configuration file in the Satellite Server's directory for agent configuration files.

1.1.1 Capacity and performance of Satellite Server

It is difficult to pinpoint the maximum capacity of the Satellite Server, as it depends entirely on the machine it runs on and the type of checks to be performed. In the best case, we have reached 500 ICMP/SNMP checks per second, but that depends heavily on the response times of the remote devices (a device which answers in 0.5 ms is not the same as one that takes 2 seconds to respond). Under ideal conditions, a single server could monitor about 150,000 checks. In real conditions, we have tested about 50,000 modules in controlled environments (LAN) with a single Satellite Server on low-end hardware (Intel i5, 2 GHz, 4 GB RAM).
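
The figures above can be related with some simple arithmetic. The sketch below assumes that every module is checked once per agent interval (300 seconds by default) and that the peak rate of 500 checks per second is sustained. Both are simplifying assumptions, and the function names are illustrative only.

```python
# Back-of-the-envelope capacity estimate for a single Satellite Server.
# Assumption: each module is checked exactly once per interval.

def max_modules(checks_per_second, interval_seconds):
    """Upper bound on modules that fit in one interval at a given check rate."""
    return int(checks_per_second * interval_seconds)


def required_rate(modules, interval_seconds):
    """Checks per second needed to cover all modules once per interval."""
    return modules / interval_seconds


print(max_modules(500, 300))                  # 150000
print(round(required_rate(50_000, 300), 1))   # 166.7
```

Under these assumptions, the 50,000 modules tested in real conditions correspond to a sustained rate of roughly a third of the best-case figure.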

1.2 Installation

The Satellite Server is distributed in binary format, so no additional libraries are required. The functionality of this server is the same in the Windows and Linux versions. On Windows systems it is installed as a service, and on Linux systems it is installed as a daemon. The configuration file and specifications are the same in both cases.

1.3 Satellite Server Installation in Linux Systems

Once the binary containing the Satellite Server has been downloaded, go to the download directory with root privileges and extract the files:

Desarchivar.png

A satellite_server folder will be created. Enter that folder by typing:

cd satellite_server/

Before proceeding with the installation, note that fping, nmap, wmic and braa are required by the Satellite Server. The Braa and Wmic packages are included in the installer; Fping and Nmap must be installed separately.

To install the Satellite Server, follow the instructions in the following image:

Instalacion linux.png

Once finished, edit the satellite_server.conf file, located in /etc/pandora/. To start the Satellite Server, type the following:

sudo /etc/init.d/satellite_serverd start

In case of an error, check the satellite_server.log file, located in /var/log/.

1.4 Windows Installation

The Satellite Server can be installed following these simple steps:

We start by choosing the installation language:

English installation1.png

Then we click on Next

English installation2.png

Then we choose where to install the Satellite Server:

English installation3.png

Installation of WinPCap is required. The WinPCap installation window appears at this step of the installation process:

Instalación wincap1.png

Then we must configure WinPCap to start when the system starts.

Instalación wincap2.png

Once the installation of WinPCap has finished, the following window is shown:

Instalación wincap3.png

Then the license number must be entered:

English installation4.png

Then the parameters of the recon task must be configured:

English installation5.png

At the end, a system restart is required for all changes to take effect.

English installation6.png

Once finished, the Satellite Server can be started from the Start menu.

1.4.1 WMI module operation in some Windows versions

For security reasons, some Windows versions restrict which users can query WMI remotely. If the WMI modules fail to run, the solution is to run the Satellite Server service as an Administrator user.

The process to follow is:

Open services:

Instalacion windows7e.png

Right-click on the service and open Properties:

Instalacion windows8e.png

On the Log On tab, select an account with Administrator permissions and apply the changes:

Instalacion windows9e.png

After making these changes, restart the service.

1.5 Configuration

All parameters that specify a timeout or other time period are given in seconds; for example, 300 = 5 minutes.

It is important to keep in mind that the latency and SNMP intervals apply specifically to status changes. For Boolean checks (port or machine status), the threshold which defines the change of state is automatic. For numerical values (latency, network traffic on an interface, disk space, CPU, etc.), it is based on a threshold that must be defined in each module.

1.5.1 agent_interval xxx

300 seconds by default (5 minutes). Agents are created with this interval. Information is not sent to the server until this time has passed, regardless of whether the checks performed by the network server have a lower interval.

1.5.2 agent_threads xxx

Number of threads used for sending agent XML data files.

1.5.3 xxxxxx_interval xxx

Executes all checks of the given type (latency, SNMP, etc.) at this interval. If the collected data differs from the previous result, it is sent immediately. If it is the same, it is sent when the agent interval has passed. This is useful for running intensive checks and notifying only on a status change.
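
The relationship between a check interval and agent_interval can be sketched as follows. This is a minimal illustration of the rule described above, not the Satellite Server's actual code; the function name and arguments are hypothetical.

```python
# Decide whether a freshly collected value should be sent to the server.
# Rule from the text: send at once on change, otherwise wait for agent_interval.

def should_send(previous, current, seconds_since_last_send, agent_interval=300):
    """Return True if the new value should be sent now."""
    if current != previous:
        return True  # status changed: notify immediately
    return seconds_since_last_send >= agent_interval  # unchanged: periodic report


# A check running every 30 seconds with the default agent_interval of 300:
print(should_send(1, 1, 30))    # False
print(should_send(1, 0, 30))    # True
print(should_send(1, 1, 300))   # True
```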

1.5.4 xxxxx_retries xxx

Number of retries in checks (latency, snmp, ping...)

1.5.5 xxxxx_timeout xxx

Timeout in seconds for the SNMP, Latency and Ping checks.

1.5.6 xxxxx_block xxx

Forces the server to execute the checks in blocks of XXX checks. The higher the number (500 at most), the more capacity it has, but at the cost of increased latency. In some cases it might be advisable to lower this number (applies to latency, ping and SNMP).
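
As an illustration of what executing checks in blocks means, the sketch below splits a list of pending checks into groups of a given size. It is a generic batching helper under the assumption that the limit of 500 is a hard maximum; it does not reproduce the server's internal scheduler, and the names are hypothetical.

```python
# Split pending checks into blocks, as a xxxxx_block setting would.
MAX_BLOCK = 500  # upper limit mentioned in the text

def in_blocks(checks, block_size):
    """Yield consecutive slices of at most block_size checks."""
    if not 1 <= block_size <= MAX_BLOCK:
        raise ValueError("block size must be between 1 and %d" % MAX_BLOCK)
    for start in range(0, len(checks), block_size):
        yield checks[start:start + block_size]


hosts = ["192.168.50.%d" % n for n in range(1, 8)]
for block in in_blocks(hosts, 3):
    print(block)
```

Larger blocks mean fewer rounds for the same number of checks, but every result in a block is only available once the whole block has finished, which is the latency trade-off mentioned above.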

1.5.7 xxxxx_threads n

Number of threads assigned to each type of check. It depends on the capacity (CPU and memory) of the machine. More threads put more load on the machine, but increase the processing speed of the Satellite Server.

1.5.8 log_file /dev/null

Satellite Server log file. It can grow quickly, so if it is not going to be used it is recommended to redirect it to /dev/null. It is useful at the beginning for tracking down possible errors, and can be commented out later on.

1.5.9 recon_task xxxxx[,yyyy]

IP addresses and network addresses for autodiscovery, separated by commas. For example:

192.168.50.0/24,10.0.1.0/22,192.168.70.64/26
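
To get a feel for how many addresses such a list covers, it can be expanded with Python's standard ipaddress module. This is only a sketch for sizing a scan; how the Satellite Server itself interprets an entry whose host bits are set (such as 10.0.1.0/22) is not stated here, so the example normalises it with strict=False as an assumption.

```python
import ipaddress

def parse_recon_task(value):
    """Turn a comma-separated recon_task value into network objects."""
    return [
        ipaddress.ip_network(item.strip(), strict=False)  # tolerate host bits
        for item in value.split(",")
        if item.strip()
    ]


networks = parse_recon_task("192.168.50.0/24,10.0.1.0/22,192.168.70.64/26")
for net in networks:
    print(net, net.num_addresses)
print("total addresses:", sum(net.num_addresses for net in networks))
```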

1.5.10 server_ip <ip>

IP address of the Pandora FMS Server to which the information is sent using the Tentacle protocol (port 41121/tcp).

1.5.11 recon_mode [icmp,snmp,wmi]

Autodiscovery mode. The system uses the following protocols in recon checks:

  • ICMP: It only checks whether the host is alive and measures the latency.
  • SNMP: If the host supports it, it looks for all the interfaces and gets their traffic, general status, etc. Only SNMP v1 and v2 can be used.
  • WMI: Similar to the previous one, but in this case it shows CPU usage, memory and disk drives.

1.5.12 recon_community aaa,bbb,ccc...

States a list of SNMP communities to be used in autodiscovery mode.

1.5.13 wmi_auth Administrator%password

Specifies a comma-separated list of User%Password pairs, e.g. admin%1234,super%qwerty. This list is used in autodiscovery mode.
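
The list format can be illustrated with a small parser. This is a sketch of the syntax only, assuming that user names contain neither "%" nor "," and that passwords contain no ","; the function name is hypothetical.

```python
def parse_wmi_auth(value):
    """Split a wmi_auth value into (user, password) tuples."""
    pairs = []
    for item in value.split(","):
        user, sep, password = item.strip().partition("%")
        if not sep or not user:
            raise ValueError("expected User%%Password, got %r" % item)
        pairs.append((user, password))
    return pairs


print(parse_wmi_auth("admin%1234,super%qwerty"))
# [('admin', '1234'), ('super', 'qwerty')]
```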

1.5.14 agent_conf_dir <path to agent conf dir>

The configuration file of each agent discovered by the Satellite Server is automatically stored in this directory.

1.5.15 group <group>

Specifies the default group for the agents created by the Satellite Server.

1.5.16 daemon 1|0

When set to 1, starts the daemon in the background (the default).

1.5.17 hostfile <file>

An alternative method for network scanning. A file is provided with one address per line. Each line can include the hostname as well.

1.5.18 pandora_license xxxxxxx

Enter the license number of your Pandora FMS server exactly as it appears in the Setup->License section. The total number of agents is verified in the Pandora console.

1.5.19 remote_config 1|0

Specifies whether autodiscovered agents have remote configuration enabled, so that they can be edited from the console. It also enables remote configuration for the Satellite Server itself.

1.5.20 temporal_min_size

If the free space (in MB) on the partition where the temporary directory is located is smaller than this value, the server stops generating data packages. This prevents the disk from filling up if the connection with the server is lost for an extended period for any reason.
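
A check of this kind can be sketched with the standard library. The example assumes the intended behaviour is that data generation is suspended while free space is below the threshold; the function names and the example values are illustrative, not part of the product.

```python
import shutil

def free_mb(path):
    """Free space, in MB, on the partition that holds path."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def may_generate_data(free_space_mb, temporal_min_size_mb):
    """True while there is at least temporal_min_size MB free (assumed rule)."""
    return free_space_mb >= temporal_min_size_mb


# "/tmp" is the default temporal directory in this document; "." is used here
# so the snippet runs on any system.
print(may_generate_data(free_mb("."), 1))
```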

1.5.21 xml_buffer

The default value is '0'. If set to '1', any XML data files which could not be sent are saved and sent again later.

If you are in a secured environment under UNIX and want to enable the XML buffer, you should consider changing the temporal directory, since anyone has the right to write into '/tmp'.

1.5.22 snmp_version

SNMP version to use by default (only 1 and 2c are supported). 1 by default.

Warning: Some modules could stop working if you change this setting.

1.5.23 braa <path to braa>

Path to the braa binary (/usr/bin/braa by default).

1.5.24 fping <path to fping>

Path to the fping binary (/usr/sbin/fping by default).

1.5.25 latency_packets xxx

Number of ICMP packets to send per latency request.

1.5.26 nmap <path to nmap>

Path to the nmap binary (/usr/bin/nmap by default).

1.5.27 nmap_timing_template xxx

A value that specifies how aggressive nmap should be from 1 to 5. 1 means slower but more reliable, 5 means faster but less reliable. 2 by default.

1.5.28 ping_packets xxx

Number of ICMP packets to send per ping request.

1.5.29 recon_enabled 0|1

Enable (1) or disable (0) host auto-discovery.

1.5.30 recon_timing_template xxx

Like nmap_timing_template, but applies to Satellite Server and Recon Server network scans. 3 by default.

1.5.31 server_port xxxxx

Tentacle server port.

1.5.32 Secondary Server

A special kind of general configuration parameter is the definition of a secondary server. It defines an additional server to send data to, complementing the server defined in the standard way. The secondary server mode works in two different ways:

  • on_error: Send data to the secondary server only when it could not be sent to the primary one.
  • always: Always send data to the secondary server, whether or not the primary server can be contacted.

Configuration example:

secondary_server_ip     192.168.1.123
secondary_server_path   /var/spool/pandora/data_in
secondary_mode          on_error
secondary_transfer_mode tentacle
secondary_server_port   41121
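
The two modes can be summarised in a few lines of logic. The sketch below is only an illustration of the behaviour described above: send stands in for whatever transfer mechanism is used (Tentacle in the example configuration) and is a hypothetical callable, as are the other names.

```python
def deliver(xml_file, send, primary, secondary, mode):
    """Send xml_file according to secondary_mode; return servers that got it.

    send(server, xml_file) must return True on success (hypothetical helper).
    """
    delivered = []
    primary_ok = send(primary, xml_file)
    if primary_ok:
        delivered.append(primary)
    # "always": duplicate every file; "on_error": only when the primary failed.
    if mode == "always" or (mode == "on_error" and not primary_ok):
        if send(secondary, xml_file):
            delivered.append(secondary)
    return delivered


def primary_down(server, xml_file):
    """Fake transport in which only the secondary server is reachable."""
    return server == "192.168.1.123"


print(deliver("agent.data", primary_down, "192.168.1.1", "192.168.1.123", "on_error"))
```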

1.5.33 snmp_verify 0|1

Enable (1) or disable (0) the verification of SNMPv1 modules that break braa in realtime. These modules will be discarded and stop being executed.

1.5.34 snmp2_verify 0|1

Enable (1) or disable (0) the verification of SNMPv2 modules that break braa in realtime. These modules will be discarded and stop being executed.

Warning: Verifying SNMP version 2 modules can be very slow!

1.5.35 startup_delay xxx

Wait startup_delay seconds before sending XML data files for the first time.

1.5.36 temporal /tmp

Temporal directory where XML files are created.

1.5.37 tentacle_client <path to tentacle_client>

Full path to the Tentacle client (/usr/bin/tentacle_client by default).

1.5.38 wmi_client <path to wmic>

Full path to the WMI client binary (/usr/bin/wmic by default).

1.5.39 snmp_blacklist <path to the blacklist>

Path to the SNMP blacklist file (/etc/pandora/satellite_server.blacklist by default).

1.5.40 add_host <IP address> [agent name] (Version >= 6.0)

Adds the given host to the list of monitored agents. The name for the agent can be specified after the IP address. Multiple hosts may be added, one per line. For example:

add_host 192.168.0.1
add_host 192.168.0.2 localhost.localdomain

1.5.41 ignore_host <agent name> (Version >= 6.0)

Removes the given host from the list of monitored agents, even if it is found in a network scan by a recon task. The host must be identified by the name of the agent. Multiple hosts may be ignored, one per line. For example:

ignore_host 192.168.0.1
ignore_host localhost.localdomain

1.5.42 keepalive xxx (Version >= 6.0)

The Satellite Server reports its status to the Pandora Server and checks remote configurations (those of the agents it generates and its own) every keepalive seconds. It is 30 seconds by default.

1.5.43 credential_pass xxx (Version >= 6.0)

Password used to encrypt credential box passwords. It must match the one defined in the Pandora FMS Console. The hostname is used by default.

1.5.44 timeout_bin <path to timeout> (Version > 6.0SP3)

If defined, the timeout program (usually /usr/bin/timeout) will be used to call the Tentacle client.

1.5.45 timeout_seconds xxx (Version > 6.0SP3)

Timeout in seconds for the timeout command. timeout_bin must be configured.

1.5.46 proxy_traps_to <address[:port]> (Version > 6.0SP3)

Proxy SNMP traps received by the Satellite Server to the given address (and port). Port 162 is used by default.

1.5.47 proxy_tentacle_to <address[:port]> (Version > 6.0SP3)

Proxy Tentacle client requests received by the Satellite Server to the given address (and port). Port 41121 is used by default.

1.5.48 dynamic_inc 0|1 (Version > 6.0SP4)

Set to 1 to move dynamic auto-discovered modules (SNMP, WMI...) to separate files so that they don't interfere with remote agent configuration.

1.5.49 verbosity <0-10> (Version > 7.0OUM204)

Log verbosity level from 0 (less verbose) to 10 (more verbose).

1.6 Specific Configurations (per agent)

In addition to autodiscovered modules, all kinds of TCP, SNMP or WMI tests can be added, using a similar syntax to the local modules in software agents.

Warning: Make sure OIDs start with a leading dot, otherwise SNMP modules will not work!

Status of the interface (SNMP). The Satellite Server automatically detects each interface.

module_begin
module_name if eth1 OperStatus
module_description IP address N/A. Description: The current operational state of the interface. The testing(3) state indicates that no operational packets can be passed.
module_type remote_snmp_string
module_snmp 192.168.70.225
module_oid .1.3.6.1.2.1.2.2.1.8.3
module_community artica06
module_end

To force the module to use SNMP version 2c add the line:

module_version 2c

To force the module to use SNMP version 1 add the line:

module_version 1

For example:

module_begin
module_name if eth1 OperStatus
module_description IP address N/A. Description: The current operational state of the interface. The testing(3) state indicates that no operational packets can be passed.
module_type remote_snmp_string
module_snmp 192.168.70.225
module_version 2c
module_oid .1.3.6.1.2.1.2.2.1.8.3
module_community artica06
module_end
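
When many such definitions are needed, they can be generated with a small script. The helper below is a hypothetical convenience, not a Pandora FMS tool: it emits only the fields used in the examples above and adds the leading dot to the OID if it is missing.

```python
def snmp_module(name, host, oid, community,
                module_type="remote_snmp_string", version=None):
    """Render an SNMP module block like the ones shown in this section."""
    if not oid.startswith("."):
        oid = "." + oid  # SNMP modules need a leading dot on the OID
    lines = [
        "module_begin",
        "module_name %s" % name,
        "module_type %s" % module_type,
        "module_snmp %s" % host,
    ]
    if version is not None:
        lines.append("module_version %s" % version)  # "1" or "2c"
    lines += [
        "module_oid %s" % oid,
        "module_community %s" % community,
        "module_end",
    ]
    return "\n".join(lines)


print(snmp_module("if eth1 OperStatus", "192.168.70.225",
                  "1.3.6.1.2.1.2.2.1.8.3", "artica06", version="2c"))
```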

Connectivity to a machine (using PING)

module_begin
module_name ping
module_type generic_data
module_ping 192.168.70.225
module_end

General SNMP check. In this case the server automatically extracts the traffic for each interface, along with its descriptive name.

module_begin
module_name if eth0 OutOctets
module_description The total number of octets transmitted out of the interface, including framing characters.
module_type remote_snmp_inc
module_snmp 192.168.70.225
module_oid .1.3.6.1.2.1.2.2.1.16.2
module_community artica06
module_end

CPU usage check via WMI (percentage).

module_begin
module_name CPU
module_type generic_data
module_wmicpu 192.168.30.3
module_wmiauth admin%none
module_end

Free memory check via WMI (percentage).

module_begin
module_name FreeMemory
module_type generic_data
module_wmimem 192.168.30.3
module_wmiauth admin%none
module_end

General WMI query

module_begin
module_name GenericWMI
module_type generic_data_string
module_wmi 192.168.30.3
module_wmiquery SELECT Name FROM Win32_ComputerSystem
module_wmiauth admin%none
module_end

Generic SSH command (version > 6.0)

module_begin
module_name GenericSSH
module_type generic_data
module_ssh 192.168.30.3
module_command ls /tmp | wc -l
module_end

To set a threshold, it must be defined both in the text definition of the module and in the definition of the module in the console (module_min_warning, module_min_critical):

module_begin
module_name latency
module_type generic_data
module_latency 192.168.70.225
module_min_warning 80
module_min_critical 120
module_end

Execution modules can be created manually. The scripts or commands that the Satellite Server executes must already be in place and available for the server to use. Using module_exec can reduce the performance of the Satellite Server.

module_begin
module_name Sample_Remote_Exec
module_type generic_data
module_exec /usr/share/test/test.sh 192.168.50.20
module_min_warning 90
module_min_critical 95
module_end

1.7 Credential boxes (> 6.0)

Unless key-based authentication is properly configured, SSH modules require a username and a password in order to work. These are configured in the main configuration file, satellite_server.conf, using credential boxes with the following format:

credential_box network/mask,username,password
credential_box network/mask,username,[[encrypted password]]

For example:

credential_box 192.168.1.1/32,user,pass1
credential_box 192.168.1.0/24,user,pass2

Credential boxes are searched from more restrictive to less restrictive masks.
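
The search order can be pictured as a most-specific-match lookup. The sketch below illustrates that idea using Python's ipaddress module; it assumes the value after the credential_box keyword has already been extracted and that user names contain no commas, and the function names are hypothetical.

```python
import ipaddress

def parse_credential_box(value):
    """Parse 'network/mask,username,password' into (network, user, password)."""
    network, user, password = value.split(",", 2)
    return ipaddress.ip_network(network.strip(), strict=False), user, password


def find_credentials(host, boxes):
    """Return (user, password) from the most restrictive box containing host."""
    address = ipaddress.ip_address(host)
    matches = [box for box in boxes if address in box[0]]
    if not matches:
        return None
    _, user, password = max(matches, key=lambda box: box[0].prefixlen)
    return user, password


boxes = [
    parse_credential_box("192.168.1.0/24,user,pass2"),
    parse_credential_box("192.168.1.1/32,user,pass1"),
]
print(find_credentials("192.168.1.1", boxes))   # ('user', 'pass1')
print(find_credentials("192.168.1.7", boxes))   # ('user', 'pass2')
```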

Passwords can be encrypted using Blowfish in ECB mode. Make sure credential_pass is defined, otherwise the hostname will be used as the default encryption password. The hexadecimal representation of the ciphertext should be enclosed in double brackets:

credential_box 192.168.1.0/24,user,[[80b51b60786b3de2]]

1.8 General view of all agents in the console

If the Satellite Server is configured correctly, we should see something like the following in Agent Detail:

Selección 146.png

Generally, ICMP modules (ping and latency) are created for all machines, while SNMP and WMI modules are created only for some of them.

On machines which have WMI enabled, the following modules are generated:

Modulos.png

On machines with SNMP enabled, the following modules are generated:

Modulos1.png

In the massive operations menu of the Pandora console there is a specific section for the Satellite Server, where agents and modules can be edited or deleted in bulk.

Operación massivas.png

1.9 SNMP blacklist

When monitoring big networks, SNMP modules that return invalid data can affect the performance of the Satellite Server, and many modules may become unknown. The Satellite Server can read a blacklist of SNMP modules that will be discarded at startup, before execution.

To create a new blacklist, edit the /etc/pandora/satellite_server.conf configuration file and make sure snmp_blacklist is configured. Then run:

   satellite_server -v /etc/pandora/satellite_server.conf

And restart the Satellite Server. The blacklist can be regenerated as many times as needed.

The format of the blacklist file is:

agent:OID
agent:OID
...

For example:

192.168.0.1:1.3.6.1.4.1.9.9.27
192.168.0.2:1.3.6.1.4.1.9.9.27
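
A file in this format is easy to inspect with a short script. The sketch below is an illustration only: it assumes agents are identified by IPv4 address or name (no colons), compares OIDs exactly, and ignores a leading dot on either side. How the Satellite Server itself matches entries is not specified here.

```python
def parse_blacklist(text):
    """Return a set of (agent, oid) pairs from blacklist file contents."""
    entries = set()
    for raw in text.splitlines():
        agent, sep, oid = raw.strip().partition(":")
        if sep and agent and oid:
            entries.add((agent, oid.lstrip(".")))
    return entries


def is_blacklisted(agent, oid, entries):
    """Exact-match lookup, tolerant of a leading dot on the OID."""
    return (agent, oid.lstrip(".")) in entries


sample = "192.168.0.1:1.3.6.1.4.1.9.9.27\n192.168.0.2:1.3.6.1.4.1.9.9.27\n"
entries = parse_blacklist(sample)
print(is_blacklisted("192.168.0.1", ".1.3.6.1.4.1.9.9.27", entries))  # True
print(is_blacklisted("192.168.0.3", ".1.3.6.1.4.1.9.9.27", entries))  # False
```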