Pandora:Documentation en:Optimization

Optimization and problem solving of Pandora FMS

Introduction

A Pandora FMS server can monitor about 2000 devices. To reach that figure, the database configuration must be tuned.

This section also explains some techniques to detect and solve problems in your Pandora FMS installation.

Optimizing Pandora FMS

MySQL Optimization for enterprise grade systems

General Advice

If you really want a HUGE system, with tables bigger than 2 GiB, the first thing to do is what MySQL itself recommends: use a 64-bit system. Also, the more RAM and CPU available, the better the performance.

In our experience, RAM is more important than CPU. If you are thinking about using 1 GiB or less for your SQL system, please think again. The minimum for an enterprise system should be 2 GiB, and 4 GiB is a good option for a big system. Remember that more RAM can speed up key updates by keeping the most used key pages in memory.

Another piece of advice: if you are using non-transaction-safe tables, or you have very big hard disks and want to avoid long file checks, use a UPS so that the system can be shut down cleanly in case of a power failure. For systems where the database runs on a dedicated server, consider 1G Ethernet. Latency is as important as throughput.

Disk optimization is very important for very big databases: split the databases and tables across different disks. In MySQL you can use symbolic links for this. Use different disks for the system and the database and, very importantly, use disks with a low seek time, because the application will be limited by disk seek speed, and the number of seeks grows as N log N as the data grows.

Under GNU/Linux, run hdparm -m16 -d1 on the disks at boot to enable reading and writing of multiple sectors at a time, and also DMA. This can improve response time by 5-50%. Another excellent idea is to mount the disks with async (the default) and noatime, so that the access time of files is not updated on each read or write. For some specific applications it can be a good idea to keep a few very specific tables on a RAM disk. This is a slightly risky option, because the data is lost if the machine is switched off before it has been stored on a non-volatile disk. Please consider it carefully.

Use --skip-locking (enabled by default on some systems) if possible. This disables external locking and gives better performance.

If you run the client and the MySQL server on the same machine, use sockets instead of TCP/IP connections when connecting to MySQL (this can give an improvement of 7.5%). You can do this by not specifying a host name, or by using localhost, when connecting to MySQL. Also disable binary logging and replication if you only run one MySQL server on a single host.

As general advice for better performance, check these items:

  • Don't use binary replication logs if you will not use replication.
  • Don't use slowquery or debug logs.
  • Check your MySQL configuration files, default values are *SLOW*.

About MySQL Versions

Some people who run heavily loaded Pandora FMS servers use Percona's modified MySQL versions, which offer better performance.

MySQL performance is also better in recent versions (5.5): you can get a performance improvement of about 20% compared with version 5.0.

Tools for MySQL configuration check

There are many tools to "optimize" the setup of your MySQL server. Some of them can be very useful to take a look and make sure you have not overlooked any important parameter.

MySQL Tuning Primer, by Matthew Montgomery, is a command-line tool that checks your MySQL performance and gives you a few tips and suggestions to improve it. Check it at https://bugs.launchpad.net/mysql-tuning-primer

Disable binary replication

Binary logging is enabled by default on most Linux distros. To disable it, edit the my.cnf file, usually /etc/my.cnf, and comment out the following lines:

 # log-bin=mysql-bin
 # binlog_format=mixed

Comment out both lines, and then restart the MySQL server.
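
The edit described above can also be scripted. The following is only a minimal sketch in Python: the function name comment_out_options is hypothetical, and it works on the text of the file rather than on /etc/my.cnf itself, so you can review the result before writing anything back.

```python
import re

def comment_out_options(cnf_text, options=("log-bin", "binlog_format")):
    """Return cnf_text with the given option lines commented out."""
    out = []
    for line in cnf_text.splitlines():
        stripped = line.lstrip()
        # Match "option", "option=value" or "option = value" at line start.
        # Lines that already start with "#" never match.
        if any(re.match(rf"{re.escape(opt)}\s*(=|$)", stripped) for opt in options):
            out.append("# " + line)
        else:
            out.append(line)
    return "\n".join(out)

sample = "[mysqld]\nlog-bin=mysql-bin\nbinlog_format=mixed\nuser=mysql"
result = comment_out_options(sample)
print(result)
```

Remember that MySQL still has to be restarted for the change to take effect.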

Avoiding Disk Flush in Every Transaction

By default, MySQL sets autocommit=1 for each connection. This is not so bad for MyISAM, where a write is not guaranteed to reach the disk anyway, but for InnoDB it means that every insert, update or delete on an InnoDB table results in a flush to disk.

So, what is bad about flushing to disk? Nothing at all. It guarantees that once a transaction is committed, the data will be there when the database restarts after a crash. The problem is that database performance is then limited by the physical speed of the disk. Since the data has to be written to disk before the commit is confirmed, every commit takes some time.

Even assuming an average seek time of 9 ms for a disk write, we are limited to approximately 67 commits/sec, which is very slow. And while the disk is busy writing that sector, it is not reading. InnoDB can work around part of this limitation by grouping some writes together, but the restriction still exists.

We can avoid the flush at the end of each transaction by using an "automatic" flush that happens approximately once per second. In case of failure we could lose the data from the last second, which is acceptable considering the performance we gain. To do this, use the following configuration token:


innodb_flush_log_at_trx_commit = 0


Reference: http://tag1consulting.com/InnoDB_Performance_Tuning

Bigger Size for the KeyBuffer

This is a very important global parameter that speeds up DELETE and INSERT operations. Its value depends on the total RAM of the system.

key_buffer = 400M

Other important buffers

Some buffers are not configured by default in some MySQL/Linux distributions. Setting these parameters can be very important for the final performance. Check whether they are present in the my.cnf file; if not, add them, and adjust the values (raise them a bit if you have lots of RAM).

query_cache_size = 64M
query_cache_limit = 2M 
join_buffer_size = 16M

Improving InnoDB Concurrency

One parameter that can strongly affect the performance of the Pandora FMS MySQL server is innodb_thread_concurrency. It specifies how many concurrent threads MySQL can run. A wrong value can make the server slower than the default, so pay special attention to the following:

  • MySQL version. In different versions of MySQL this parameter behaves VERY differently.
  • Real number of physical processors.

Here you can read the official MySQL documentation [1].

The recommended value is the number of physical CPUs multiplied by 2, plus the number of disks where InnoDB is located. In later versions of MySQL (> 5.0.21) the default is 8. A value of 0 means "open as many threads as possible".
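
As a quick worked example of that rule, here is a small Python sketch. The function name is made up for illustration; only the formula comes from the text above.

```python
def recommended_thread_concurrency(physical_cpus, innodb_disks):
    """Physical CPUs times 2, plus the number of disks holding InnoDB data."""
    return physical_cpus * 2 + innodb_disks

# A server with 4 physical CPUs and InnoDB on a single disk.
value = recommended_thread_concurrency(4, 1)
print(f"innodb_thread_concurrency = {value}")
```

Treat the result as a starting point to test, since the behavior of this parameter depends on the MySQL version.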

Different people [2] [3] have run tests and found performance problems on servers with multiple physical CPUs when using a very high value with relatively old versions of MySQL (from around 2008).

Using a table space for each table

(From the MySQL manual at http://dev.mysql.com/doc/refman/5.0/en/innodb-multiple-tablespaces.html)

In MySQL 5.0, it's possible to store each InnoDB table and its indexes in its own file. This feature is called "multiple tablespaces" because each table has its own tablespace.

Multiple tablespaces can be useful for users who want to move specific tables to separate physical disks, or who want to restore backups of single tables without interrupting the use of the remaining InnoDB tables.

To enable multiple tablespaces, add this line to the [mysqld] section of my.cnf:

[mysqld]
innodb_file_per_table

After restarting the server, InnoDB stores each newly created table in its own file tbl_name.ibd in the database directory to which the table belongs. This is similar to what the MyISAM storage engine does, but MyISAM divides the table into a tbl_name.MYD data file and a tbl_name.MYI index file. For InnoDB, the data and the indexes are stored together in the .ibd file. The tbl_name.frm file is still created as usual.

If you remove the innodb_file_per_table line from my.cnf and restart the server, InnoDB creates tables in the shared tablespace files again.

innodb_file_per_table affects only table creation. If you start the server with this option, new tables are created using .ibd files, but you can still access existing tables in the shared tablespace. If you remove the option, new tables are created in the shared tablespace, but you can still access any tables created in multiple tablespaces.

MySQL Fragmentation

Like filesystems, databases also become fragmented, which slows down the whole system. A high performance system like Pandora FMS needs a fast and reliable database. On overloaded systems, the database could "die" and force the monitoring system to stop.

An easy way to check how fragmented your database is, is to run this SQL query (you can use the SQL manager at Administration -> Database -> SQL Manager in the Pandora console):

select table_schema, table_name, data_free, engine  from information_schema.tables where table_schema  not in  ('information_schema', 'mysql') and data_free > 0;

This will show the fragmented tables of the "pandora" database (schema, table name, free data space and engine), like these:

pandora	taddress	15911092224	InnoDB
pandora	taddress_agent	15911092224	InnoDB
pandora	tagent_access	15911092224	InnoDB
pandora	tagent_custom_data	15911092224	InnoDB
pandora	tagent_custom_fields	15911092224	InnoDB
pandora	tagent_module_inventory	15911092224	InnoDB
pandora	tagente	15911092224	InnoDB
pandora	tagente_datos	15911092224	InnoDB
pandora	tagente_datos_inc	15911092224	InnoDB
pandora	tagente_datos_inventory	15911092224	InnoDB
pandora	tagente_datos_log4x	15911092224	InnoDB
pandora	tagente_datos_string	15911092224	InnoDB
pandora	tagente_estado	15911092224	InnoDB
pandora	tagente_modulo	15911092224	InnoDB
pandora	talert_actions	15911092224	InnoDB

In this case, lots of tables are fragmented. To optimize one of them, use this command:

OPTIMIZE table tagente;

You should not optimize big tables, because you can lock the system. MySQL locks the whole table during OPTIMIZE in order to rewrite it. On small tables this takes just a few seconds, but on huge tables such as tagente_datos or tagente_datos_string it could take hours... without service.
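
The idea of optimizing only the small fragmented tables can be sketched in Python as follows. This is an illustration under assumptions: the function name and the skip list are hypothetical, and the rows are expected in the same column order as the fragmentation query above (schema, table, data_free, engine).

```python
def optimize_statements(rows, skip_tables=("tagente_datos", "tagente_datos_string")):
    """Build OPTIMIZE statements for fragmented tables, skipping the huge ones."""
    statements = []
    for schema, table, data_free, engine in rows:
        if data_free > 0 and table not in skip_tables:
            statements.append(f"OPTIMIZE TABLE `{schema}`.`{table}`;")
    return statements

rows = [
    ("pandora", "tagente", 15911092224, "InnoDB"),
    ("pandora", "tagente_datos", 15911092224, "InnoDB"),
    ("pandora", "tagente_estado", 15911092224, "InnoDB"),
]
statements = optimize_statements(rows)
for stmt in statements:
    print(stmt)
```

The script only prints the statements; running them is left to the administrator, who can check the table sizes first.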

We recommend optimizing the following tables:

OPTIMIZE table tagente;
OPTIMIZE table tagente_estado;
OPTIMIZE table tagente_modulo;
OPTIMIZE table taddress;
OPTIMIZE table tserver;
OPTIMIZE table tsesion;

Using MySQL Table Partitioning

To use MySQL table partitioning, you should also use the "multiple tablespaces" feature described above.

MySQL 5.1 supports table partitioning, which allows you to split a large table into multiple smaller logical sub-tables. (See the MySQL manual for more details: http://dev.mysql.com/doc/refman/5.1/en/partitioning-overview.html)

If you have large amounts of data in your Pandora FMS database and many console operations that read this data (e.g. drawing graphs) feel quite slow, table partitioning can improve their performance.

For example, to split the tagente_datos table (which typically grows very large) into 100 logical partitions based on module id, run the following query:

ALTER TABLE tagente_datos PARTITION BY HASH(id_agente_modulo) PARTITIONS 100;

This operation may take a long time depending on the table size. As an example, it took about one and a half hours to split a table holding 100 days of data for about 7500 modules (more than 50,000,000 rows):

mysql> ALTER TABLE tagente_datos
    ->  PARTITION BY HASH(id_agente_modulo)
    -> PARTITIONS 100;
Query OK, 53391880 rows affected (1 hour 35 min 3.41 sec)
Records: 53391880  Duplicates: 0  Warnings: 0

On this table, the following query took about one second on the partitioned table, while it took more than 8 minutes on the non-partitioned one.

SELECT datos,utimestamp FROM tagente_datos  WHERE `id_agente_modulo` = '6332' AND utimestamp > 1322838000 AND utimestamp < 1338390000 ORDER BY utimestamp ASC
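
The reason such a query gets faster is that, with PARTITION BY HASH on an integer column, MySQL picks the partition with a modulus operation, so all the rows of one module live in a single partition. A small Python sketch of that calculation (the function name is made up for illustration):

```python
def hash_partition(id_agente_modulo, partitions=100):
    """Partition number used by PARTITION BY HASH on an integer column."""
    return id_agente_modulo % partitions

# Module 6332 from the query above, with 100 partitions.
partition = hash_partition(6332)
print(partition)
```

With 100 partitions, the query for module 6332 only has to read partition 32 instead of the whole table.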

Database Rebuilding

Partial Rebuilding

The MySQL database management system, like other SQL engines such as Oracle (tm), degrades over time due to causes such as the data fragmentation produced by continuous deletes and inserts on large tables. In large environments with a lot of traffic, there is a very easy way to improve performance and keep it from degrading: rebuild the database periodically.

To do this, schedule a service stop, which could last approximately one hour.

During this service stop, stop the Pandora FMS web console and also the server (be careful: leave the Tentacle server running so that it can still receive data, which will be processed as soon as the server is working again).

Once they have been stopped, dump (export) the database:

mysqldump -u root -p pandora3 > /tmp/pandora3.sql
Enter password:

Delete the database:

> mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3279346
Server version: 5.0.67-Max SUSE MySQL RPM
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> drop database pandora3;
Query OK, 87 rows affected (1 min 34.37 sec)

Create the database again and import the previous export:

mysql> create database pandora3;
Query OK, 1 row affected (0.01 sec)
mysql> use pandora3;
mysql> source /tmp/pandora3.sql

This could last approximately 10-30 minutes for a system with 1500 agents and approximately 100,000 modules, and a little more if the system is larger or the hardware is not very powerful. It is possible to automate this process but, because it is very delicate, the best option is to do it manually every month or month and a half.
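
If you still want a checklist to follow during the service stop, the procedure above can be written down as an ordered plan. The Python sketch below only builds and prints the commands, it does not execute anything. The function name is hypothetical, and the steps for stopping and starting Pandora FMS are left as comments because they depend on your installation.

```python
def partial_rebuild_plan(db_name="pandora3", dump_path="/tmp/pandora3.sql"):
    """Return the ordered list of steps for a partial rebuild (dry run only)."""
    return [
        "# stop the Pandora FMS console and server (leave Tentacle running)",
        f"mysqldump -u root -p {db_name} > {dump_path}",
        f'mysql -u root -p -e "DROP DATABASE {db_name}; CREATE DATABASE {db_name};"',
        f"mysql -u root -p {db_name} < {dump_path}",
        "# start the Pandora FMS server and console again",
    ]

plan = partial_rebuild_plan()
for step in plan:
    print(step)
```

Check that the dump file exists and has a sensible size before running the step that drops the database.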

Total Rebuilding

This section applies only to InnoDB databases. Pandora FMS is built on InnoDB databases.

Unfortunately, MySQL degrades a lot over time, and this affects the global performance of the system. The only solution is to rebuild all the database schemas from scratch, recreating the binary data file that MySQL uses to store all the information and the log files used to replay transactions.

If you look at the /var/lib/mysql directory, you will see three files that always have the same names and that, depending on the severity of the case, can be huge. In our example:

-rw-rw----  1 mysql mysql 4.8G 2012-01-12 14:00 ibdata1
-rw-rw----  1 mysql mysql 5.0M 2012-01-12 14:00 ib_logfile0
-rw-rw----  1 mysql mysql 5.0M 2012-01-12 14:00 ib_logfile1


ibdata1 is the file that stores all the InnoDB system data. On a very fragmented system that has gone a long time without being rebuilt or reinstalled, this file will be big and inefficient. The innodb_file_per_table parameter, mentioned before, controls part of this behavior.

Likewise, each database has a directory under /var/lib/mysql that defines its structure. You should delete these directories too.

The process is very easy:

  1. Dump (via mysqldump) all the schemas to disk:
 mysqldump -u root -p -A > all.sql
  2. Stop MySQL.
  3. Delete ibdata1, ib_logfile0, ib_logfile1 and the InnoDB database directories.
  4. Restart MySQL.
  5. Create the pandora database again (create database pandora;).
  6. Import the backup file (all.sql):
mysql -u root -p
mysql> source all.sql;


The system should go much faster now.

Optional Indexes

In some situations you can optimize MySQL performance at the cost of other system resources.

This index speeds up graph rendering (a lot), but it uses more disk space and can slightly slow down INSERT/DELETE operations due to the index overhead:

ALTER TABLE `pandora`.`tagente_datos`  ADD  INDEX  `id_agente_modulo_utimestamp`  (  `id_agente_modulo`  , `utimestamp`  );

Slow queries study

On some systems, depending on the kind of information stored, some "slow queries" can make the system perform worse than normal. You can enable logging of this kind of query for a short period of time (the logging itself hurts system performance) in order to study them and optimize the affected tables with indexes. To enable these settings, do the following:

Edit my.cnf and add the following lines:

 slow_query_log = 1
 long_query_time = 2
 slow_query_log_file = /var/log/mysql_slow.log

In the OS:

 touch /var/log/mysql_slow.log
 chmod 777 /var/log/mysql_slow.log

Restart mysql.
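
Once the log has collected some entries, a short script can rank them. The following Python sketch assumes the usual slow query log layout, where each entry has a comment line starting with "# Query_time:" followed by the SQL text. The function name and the sample entries are made up for illustration.

```python
import re

QUERY_TIME = re.compile(r"^# Query_time: (\d+(?:\.\d+)?)")

def slowest_queries(log_text, top=3):
    """Return (seconds, sql) pairs for the slowest entries in the log text."""
    entries = []
    current_time, current_sql = None, []
    for line in log_text.splitlines():
        match = QUERY_TIME.match(line)
        if match:
            # A new entry starts: store the previous one, if any.
            if current_time is not None:
                entries.append((current_time, " ".join(current_sql)))
            current_time, current_sql = float(match.group(1)), []
        elif current_time is not None and not line.startswith("#"):
            current_sql.append(line.strip())
    if current_time is not None:
        entries.append((current_time, " ".join(current_sql)))
    return sorted(entries, reverse=True)[:top]

sample = """# Query_time: 8.2  Lock_time: 0.0 Rows_sent: 10  Rows_examined: 900000
SELECT datos FROM tagente_datos WHERE id_agente_modulo = 6332;
# Query_time: 2.5  Lock_time: 0.0 Rows_sent: 1  Rows_examined: 5000
SELECT * FROM tagente_estado;
"""
ranking = slowest_queries(sample)
for seconds, sql in ranking:
    print(seconds, sql)
```

To use it on a real system, read the contents of /var/log/mysql_slow.log and pass them to slowest_queries. Remember to disable the slow query log again once the study is finished.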

Optimizing Specific tables

A less "drastic" solution to the fragmentation problem is to use the MySQL OPTIMIZE statement on certain Pandora FMS tables. To do so, run directly from MySQL:

OPTIMIZE table tagente_datos;
OPTIMIZE table tagente;
OPTIMIZE table tagente_datos_string;
OPTIMIZE table tagent_access;
OPTIMIZE table tagente_modulo;
OPTIMIZE table tagente_estado;

This will improve performance, and it should not be necessary to run it more than once a week. It can be done "hot", while the system is working. In very big environments the OPTIMIZE could block the system and may not be an option; in that case the best option is to rebuild the database.

After doing these operations, you should execute:

FLUSH TABLES;

From the MySQL manual:

For InnoDB tables, OPTIMIZE TABLE is mapped to ALTER TABLE, which rebuilds the table to update index statistics and free unused space in the clustered index.

MySQL special tokens

Some MySQL tokens are very "special": they can either help or degrade performance. There is no fixed rule and you will need to test them yourself, BUT they usually help more than they hurt.

# Set to 0 in mysql 5.1.12 or higher
innodb_thread_concurrency            = 20

In versions 5.1.12 or higher, setting innodb_thread_concurrency to 0 means there is no limit on concurrency, BUT in previous versions the same effect is achieved with the value 20.
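
This version rule is easy to get wrong when maintaining several servers, so here it is as a small Python sketch. The function name is hypothetical; the version is passed as a tuple such as (5, 1, 12).

```python
def unlimited_thread_concurrency(version):
    """Value of innodb_thread_concurrency that means 'no limit' for a version."""
    return 0 if tuple(version) >= (5, 1, 12) else 20

for version in [(5, 0, 67), (5, 1, 12), (5, 5, 0)]:
    print(version, unlimited_thread_concurrency(version))
```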

innodb_flush_method = O_DIRECT

This important parameter controls how information is written to disk. In most cases, setting it to O_DIRECT helps.

innodb_thread_sleep_delay = 1000
innodb_concurrency_tickets = 250

These affect systems under heavy load, and help to get quicker queue management and locking.

innodb_lock_wait_timeout = 180

This helps when your database gets "stuck" on a lock due to a long transaction ("MySQL has gone away" messages). If you get lock waits longer than 180 seconds, you have a real problem.

Configuration Sample #1

This sample configuration is meant for a system with 4 CPUs and 4 GiB of RAM, using InnoDB tables, hosting two heavily used databases. Keep in mind that the total memory reserved by all the variables should not exceed 80% of the total system memory. Adjust the values to your installation, and check the MySQL logs when the daemon starts.

As a general rule, keep in mind that the following things seriously affect performance:

  • DO NOT use binary logs if you are not going to use a MySQL configuration with replication.
  • DO NOT use query trace logs or slow query logs.

# Sample configuration for a MySQL with ~3GB RAM dedicated to MySQL Process

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql

# -------------------------
# Innodb specific tokens
# -------------------------

# You need to create the tables AFTER setting this parameter
innodb_file_per_table

# Beware, you cannot change the two following parameters
# in an already running system or the database will be corrupted!
#innodb_log_files_in_group            = 2
#innodb_log_file_size                 = 200M

# This parameter gives about 150% more performance, but in case of
# power failure, you will lose about 1-2 secs of data.
innodb_flush_log_at_trx_commit       = 0

innodb_buffer_pool_size              = 2G
innodb_additional_mem_pool_size      = 4M

innodb_flush_method                   = O_DIRECT
innodb_open_files                     = 700
innodb_thread_sleep_delay             = 1000
innodb_concurrency_tickets            = 250

# This will avoid some "MySQL gone away" shutdowns on long locking
innodb_lock_wait_timeout              = 180

# Set to 0 in mysql 5.1.12 or higher
innodb_thread_concurrency            = 20

# -------------------------
# Performance Parameters
# -------------------------

key_buffer 			     = 512M
max_allowed_packet                   = 64M
max_connections                      = 500
max_heap_table_size                  = 512M
read_buffer_size 		     = 16M
read_rnd_buffer_size 		     = 32M
join_buffer 			     = 2M
record_buffer 			     = 1M
table_cache                          = 128
sort_buffer_size                     = 4M
query_cache_min_res_unit             = 1K
query_cache_limit                    = 2M
query_cache_size                     = 100M
thread_stack                         = 128K
thread_cache_size                    = 8
skip-external-locking
max_delayed_threads                  = 25

# -------------------------
# Other parameters
# -------------------------

max_connections                      = 500
# OS Buffer to let connections waiting for Mysql thread.
back_log                             = 100
# Print warnings to the error log file.
log_warnings

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

################# /etc/my.cnf ###################
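
To check a configuration like this one against the 80% guideline mentioned before the sample, the sizes can be added up with a short script. The Python sketch below makes a simplifying assumption that should be stated clearly: it only counts the large global buffers taken from the sample (InnoDB buffer pool, additional memory pool, key buffer and query cache) and ignores the per-connection buffers, which are allocated on demand. The function names are made up for illustration.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def to_bytes(value):
    """Convert a my.cnf size such as '512M' or '2G' to bytes."""
    value = value.strip().upper()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)

def global_buffer_share(settings, total_ram):
    """Fraction of total RAM reserved by the given global buffers."""
    reserved = sum(to_bytes(size) for size in settings.values())
    return reserved / to_bytes(total_ram)

# Global buffers from the sample above, on a machine with 4 GiB of RAM.
sample = {
    "innodb_buffer_pool_size": "2G",
    "innodb_additional_mem_pool_size": "4M",
    "key_buffer": "512M",
    "query_cache_size": "100M",
}
share = global_buffer_share(sample, "4G")
print(f"{share:.0%} of RAM reserved by global buffers")
```

For the sample values this gives about 65%, which leaves some room for the per-connection buffers before reaching the 80% limit.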

External references

References:

MySQL Percona XTraDB

Percona is a "high performance" version of MySQL that greatly improves scalability, uses all the CPUs of the system and also speeds up disk transactions.

Installing Percona is like a custom MySQL installation, very similar to a standard MySQL one. We strongly suggest that you download the binary packages from percona.com and do a manual install from the binary package, using the following procedure:

shell> groupadd mysql
shell> useradd -r -g mysql mysql
shell> cd /usr/local
shell> tar zxvf /path/to/mysql-VERSION-OS.tar.gz
shell> ln -s full-path-to-mysql-VERSION-OS mysql
shell> cd mysql
shell> chown -R mysql .
shell> chgrp -R mysql .
shell> scripts/mysql_install_db --user=mysql
shell> chown -R root .
shell> chown -R mysql data
shell> cp support-files/mysql.server /etc/init.d/mysql.server 
shell> /etc/init.d/mysql.server start

To configure your Percona server, you can use their excellent online configuration wizard, which will generate the file /etc/my.cnf: Percona Wizard Configurator

This is a sample configuration for a Percona XtraDB MySQL server with 10 GB of RAM, a 10K rpm hard drive and a quad-core Xeon.

[mysql]

# CLIENT #
port                           = 3306
socket                         = /var/lib/mysql/data/mysql.sock

[mysqld]

# GENERAL #
user                           = mysql
default_storage_engine         = InnoDB
socket                         = /var/lib/mysql/data/mysql.sock
pid_file                       = /var/lib/mysql/data/mysql.pid

# MyISAM #
key_buffer_size                = 32M
myisam_recover                 = FORCE,BACKUP

# SAFETY #
max_allowed_packet             = 16M
max_connect_errors             = 1000000
skip_name_resolve
sysdate_is_now                 = 1
innodb                         = FORCE

# DATA STORAGE #
datadir                        = /var/lib/mysql/data/

# CACHES AND LIMITS #
tmp_table_size                 = 32M
max_heap_table_size            = 32M
query_cache_size               = 32M
max_connections                = 500
thread_cache_size              = 50
open_files_limit               = 65535
table_definition_cache         = 1024
table_open_cache               = 2048

sort_buffer_size  = 1M
join_buffer_size = 8M
read_rnd_buffer_size = 16M

# INNODB #
innodb_flush_method            = O_DIRECT
innodb_log_files_in_group      = 2
innodb_log_file_size           = 64M
innodb_flush_log_at_trx_commit = 0
innodb_file_per_table          = 1
innodb_buffer_pool_size        = 6G
innodb_locks_unsafe_for_binlog = 1
innodb_lock_wait_timeout = 30
innodb_stats_on_metadata = 0
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_old_blocks_time = 1000

# LOGGING #
log_error                      = /var/lib/mysql/data/mysql-error.log
log_queries_not_using_indexes  = 0
slow_query_log                 = 0
slow_query_log_file            = /var/lib/mysql/data/mysql-slow.log

Measuring Pandora FMS for High Capacity

This section describes different methods to configure Pandora FMS in a high capacity environment. It also describes different tools for running load tests, useful to tune the environment to its maximum processing capacity.

Pandora FMS has been configured to bear a load of 2000 agents on systems where the database, console and server are on the same machine. The recommended number is around 1200 to 1500 agents per system, but this number varies a lot depending on whether they are XML agents or remote modules, whether they use high or low intervals, and whether the system has high capacity or low memory. All these things greatly change the number of agents that one system can manage efficiently.

Use of RAM (tmpfs) disks for the incoming directory

In some high capacity environments, the directory that receives the XML files coming from agents, /var/spool/pandora/data_, has a lot of traffic, and keeping this file system in memory can improve XML processing performance by 25%.

To create a partition at /var/spool/pandora/data_in_RAM, the following command is enough:

mount -t tmpfs -o size=100M,nr_inodes=10k,mode=770 tmpfs /var/spool/pandora/data_in_RAM

You can add an entry like the following to /etc/fstab so that this partition is created at boot. The target directory must exist and be empty.

tmpfs /var/spool/pandora/data_in_RAM tmpfs size=100M,nr_inodes=10k,mode=770 0 0

Of course, as it is limited to 100 MB, it will stop working properly if it fills up. If you are working with policies or remote configurations, the directories that usually hang from /data_in (file collections, md5, conf and others) should be placed as links to their real paths on disk, with a structure based on the following commands:

mv /var/spool/pandora/data_in /var/spool/pandora/data_in_old
ln -s /var/spool/pandora/data_in_RAM /var/spool/pandora/data_in
ln -s /var/spool/pandora/data_in_old/md5 /var/spool/pandora/data_in_RAM/md5
ln -s /var/spool/pandora/data_in_old/conf /var/spool/pandora/data_in_RAM/conf
ln -s /var/spool/pandora/data_in_old/collections /var/spool/pandora/data_in_RAM/collections
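
Because the RAM partition has a fixed size, it is worth watching how full it gets. The Python sketch below uses shutil.disk_usage from the standard library. The function names and the 90% threshold are assumptions for illustration, and the demo call checks "/" only so that the example runs on any machine.

```python
import shutil

def usage_ratio(path):
    """Fraction of the file system at 'path' that is currently in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total

def is_nearly_full(path, threshold=0.9):
    """True when the file system is at or above the given usage threshold."""
    return usage_ratio(path) >= threshold

# On a real server you would check /var/spool/pandora/data_in_RAM instead.
ratio = usage_ratio("/")
print(f"{ratio:.0%} used, nearly full: {is_nearly_full('/')}")
```

A check like this could be run periodically, for example from a Pandora FMS agent module, to raise an alert before the incoming directory stops accepting files.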

Many Request in the Same System

A special way to get more processing power on servers with several processors (two or more physical cores) is to run several instances of specific Pandora FMS servers on the same machine. This is not the same as increasing the number of threads of the server: due to the design of the Linux kernel and of the Perl virtual machine, several processes make better use of the cores than more threads in the same process.

You can use this technique when Pandora FMS cannot process all the information without too much delay. It means installing another Pandora FMS server with a different incoming directory. Of course, it will have its own pandora_server.conf and a different server name. You will also need to make some changes in the server startup script and other smaller customizations in the system.

Example of High Capacity Servers Configuration

For example, consider a machine with 16 GB of RAM and 4 CPUs that we want to optimize for the maximum processing capacity of the Data server (XML).

my.cnf

(Only the most important parameters are shown)

key_buffer              = 1G
innodb_flush_log_at_trx_commit = 0
innodb_file_per_table
skip-locking
innodb_thread_concurrency = 16
max_allowed_packet      = 160M
query_cache_limit       = 50M
query_cache_size        = 360M
innodb_buffer_pool_size=9000M
innodb_additional_mem_pool_size=800M
innodb_log_file_size=2500M
innodb_log_buffer_size=80M
innodb_lock_wait_timeout=50

pandora_server.conf

(Only the most important parameters are shown)

verbose 1
server_threshold 15
dataserver_threads 5
max_queue_files 1000

You should consider these things:

  • A very high number of threads (more than 5) only benefits servers with long I/O waits, such as the network server or the plugin server. For the data server, which is always processing, it could even penalize performance. This is why we use 5 here. On systems with a slow database, use even fewer threads. Test different values between 1 and 10. When optimizing the system for the network server, the number would be higher, between 10 and 30.
  • A high server threshold (15) puts less strain on the database, and the increase in the maximum number of files processed means that every time the server "looks for files" it fills its buffers. These two configuration elements are linked. When optimizing for the network server, it is advisable to lower the server threshold to 5 or 10.
  • Some configuration parameters can greatly affect Pandora FMS performance, such as the agent_access parameter (configurable from the console).
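
To see how server_threshold and max_queue_files are linked, a rough ceiling can be calculated. The Python sketch below rests on two assumptions that are not confirmed by this document: that the data server picks up at most max_queue_files XML files once every server_threshold seconds, and that each agent sends one XML file per interval. The function name and the 300 second interval are chosen for illustration.

```python
def queue_agent_ceiling(max_queue_files, server_threshold, agent_interval=300):
    """Upper bound of agents the file queue settings alone could serve."""
    # Files picked up during one agent interval, using integer arithmetic.
    return max_queue_files * agent_interval // server_threshold

ceiling = queue_agent_ceiling(max_queue_files=1000, server_threshold=15)
print(ceiling)
```

This is only a ceiling for the queue settings. The real capacity is set by XML processing and database speed, which is why the recommended number of agents per system given earlier is much lower.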

Capacity Analysis Tools

Pandora FMS has several tools that can help you size your hardware and software properly for the amount of data you expect to collect. One of them "attacks" the database directly with fictitious data (dbstress) and the other generates fictitious XML files (xml_stress).

Pandora FMS XML Stress

This is a small script that generates XML data files like the ones sent by Pandora FMS agents. It is located at /usr/share/pandora_server/util/pandora_xml_stress.pl

The script reads agent names from a text file and generates XML data files for each agent according to a configuration file, where modules are defined as templates.

Modules are filled with random data. An initial value and the probability of the module data changing may be specified.
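
The agent names file is a plain text file with one name per line, so it can be generated for any number of test agents. A minimal Python sketch follows. The function names and the "stress_agent_" prefix are made up for illustration, and the demo writes to the system temporary directory so that it does not touch a real installation.

```python
import os
import tempfile

def agent_names(count, prefix="stress_agent_"):
    """Return 'count' numbered agent names."""
    return [f"{prefix}{i:04d}" for i in range(1, count + 1)]

def write_agent_file(path, count):
    """Write the agent names to 'path', one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(agent_names(count)) + "\n")

path = os.path.join(tempfile.gettempdir(), "agent_names.txt")
write_agent_file(path, 3)
with open(path, encoding="utf-8") as handle:
    print(handle.read())
```

Point the agent_file token of the configuration file at the generated file.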

Run the script like this:

./pandora_xml_stress.pl <configuration file>

Sample configuration file:

# Maximum number of threads, by default 10.
max_threads 10

# File containing a list of agent names (one per line).
agent_file agent_names.txt

# Directory where XML data files will be placed, by default /tmp.
temporal /var/spool/pandora/data_in

# Pandora FMS XML Stress log file, logs to stdout by default.
log_file pandora_xml_stress.log

# XML version, by default 1.0.
xml_version 1.0

# XML encoding, by default ISO-8859-1.
encoding ISO-8859-1

# Operating system (shared by all agents), by default Linux.
os_name Linux

# Operating system version (shared by all agents), by default 2.6.
os_version 2.6

# Agent interval, by default 300.
agent_interval 300

# Data file generation start date, by default now.
time_from 2009-06-01 00:00:00

# Data file generation end date, by default now.
time_to 2009-06-05 00:00:00

# Delay after generating the first data file for each agent to avoid
# race conditions when auto-creating the agent, by default 2.
startup_delay 2

# Address of the Tentacle server where XML files will be sent (optional).
# server_ip 192.168.50.1

# Port of the Tentacle server, by default 41121.
# server_port 41121

# Module definitions. Similar to pandora_agent.conf.

module_begin
module_name Module 1 
module_type generic_data
module_description A long description.
module_max 100
module_min 10
module_exec type=RANDOM;variation=60;min=20;max=80
module_end
module_begin
module_name Module 2
module_type generic_data
module_description A long description.
module_max 80
module_min 20
module_exec type=SCATTER;prob=1;avg=40;min=0;max=80
module_end
module_begin
module_name Module 3
module_type generic_data
module_description A long description.
module_max 80
module_min 20
module_exec type=CURVE;min=20;max=80;time_wave_length=3600;time_offset=0
module_end


module_begin
module_name Module 4
module_type generic_data_string
module_description A long description.
module_max 100
module_min 10
module_exec type=RANDOM;variation=60;min=20;max=80
module_end

module_begin
module_name Module_3
module_type generic_proc
module_description Module 3 description.
# Initial data.
module_data 1
module_end
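
To make the layout of the sample file above easier to follow, here is a minimal Python sketch that splits such a text into global tokens and module blocks. It is not the parser used by pandora_xml_stress.pl; the function name parse_stress_conf and the shortened sample text are made up for illustration, and it assumes that every line is either a comment, a "token value" pair, or a module_begin/module_end marker.

```python
def parse_stress_conf(text):
    """Split a configuration text into global tokens and module definitions."""
    settings, modules, current = {}, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "module_begin":
            current = {}  # start collecting tokens for a new module
        elif line == "module_end":
            if current is not None:
                modules.append(current)
            current = None
        else:
            # The token name is the first word; the rest of the line is its value.
            key, _, value = line.partition(" ")
            target = current if current is not None else settings
            target[key] = value.strip()
    return settings, modules


# Shortened, hypothetical excerpt in the same layout as the sample file.
sample = """\
max_threads 10
# server_ip 192.168.50.1
module_begin
module_name Module 1
module_type generic_data
module_exec type=RANDOM;variation=60;min=20;max=80
module_end
"""

settings, modules = parse_stress_conf(sample)
print(settings)
print(modules[0]["module_name"])
```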
Send and Receive the Agent Local Configuration

If you set the configuration value "get_and_send_agent_conf" to 1 in your "pandora_xml_stress.conf", the test load agents behave like normal agents: they send their configuration file together with its MD5 hash. From Pandora Console Enterprise you can then change the remote configuration, so that subsequent executions of pandora_xml_stress use the customized configuration from Pandora Console Enterprise instead of the definition in "pandora_xml_stress.conf".

Besides this, you can choose where the configuration of your test agents is stored locally with the "directory_confs" configuration token in the file "pandora_xml_stress.conf".

Configuration File
  • max_threads Number of threads the script runs with. More threads improve I/O throughput.
  • agent_file Path of the file that contains the list of agent names, one per line.
  • temporal Path of the directory where the fictitious XML data files are generated.
  • log_file Path of the log file where the script reports on its execution.
  • xml_version Version of the XML data files (by default 1.0).
  • encoding Encoding of the XML data files (by default ISO-8859-1).
  • os_name Name of the operating system of the fictitious agents (by default Linux).
  • os_version Version of the operating system of the fictitious agents (by default 2.6).
  • agent_interval Interval of the fictitious agents in seconds (by default 300).
  • time_from Time from which fictitious XML data files are generated, in the format "YEAR-MONTH-DAY HOUR:MIN:SEC".
  • time_to Time until which fictitious XML data files are generated, in the format "YEAR-MONTH-DAY HOUR:MIN:SEC".
  • get_and_send_agent_conf Boolean value (0 or 1). When enabled, the fictitious agents try to download, through remote configuration, a newer version of their standard agent configuration file. You can edit these files from the Pandora FMS Enterprise console.
  • startup_delay Delay in seconds after generating the first data file for each agent. It is used to avoid race conditions when the agent is auto-created.
  • timezone_offset Numeric value of the time zone offset.
  • timezone_offset_range Numeric value. The time zone offset of each agent is generated at random within this range.
  • latitude_base Numeric value. The latitude where the fictitious agents will be shown.
  • longitude_base Numeric value. The longitude where the fictitious agents will be shown.
  • altitude_base Numeric value. The altitude where the fictitious agents will be shown.
  • position_radius Numeric value. The radius of the circle around the base position within which each fictitious agent is placed at random.
Module Definition

A module is defined in the same way in the script configuration file and, if you have activated remote configuration, in the remote configuration of the agent. The definition is:


module_begin
module_name <name of the module>
module_type <type, e.g. generic_data>
module_description <description>
module_exec type=<type>;<other options separated by ; >
module_unit <units>
module_min_critical <value>
module_max_critical <value>
module_min_warning <value>
module_max_warning <value>
module_end

Each of these can be configured as follows:

  • <type of exec>: Can have the values RANDOM, SCATTER or CURVE.
  • module_attenuation <value>: The generated module value is multiplied by the specified value, usually between 0.1 and 0.9.
  • module_attenuation_wdays <value> <value> ... <value>: The module value is only attenuated on the given days, ranging from Sunday (0) to Saturday (6). For example, the following module simulates a 50% drop in network traffic on Saturdays and Sundays:
module_begin
module_name Network Traffic
module_type generic_data
module_description Incoming network traffic (Kbit/s)
module_exec type=RANDOM;variation=50;min=0;max=1000000
module_unit Kbit/s
module_min_critical 900000
module_attenuation 0.5
module_attenuation_wdays 0 6
module_end
  • module_incremental <value>: If set to one, the module's previous value is always added to the new value, resulting in an increasing function.
  • Others: See below what options are available, depending on the execution type.

Note that min/max_critical and min/max_warning are only available in version 5.0 or higher.
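
As an illustration of how module_attenuation and module_attenuation_wdays combine, the following Python sketch applies a factor only on the listed weekdays. The function name attenuate and its parameters are hypothetical, and how the stress script applies attenuation internally is an assumption here; only the day numbering (Sunday is 0, Saturday is 6) comes from the description above.

```python
from datetime import date


def attenuate(value, when, factor=0.5, wdays=(0, 6)):
    """Return value scaled by factor when the day falls on one of wdays.

    wdays uses the numbering from the text: Sunday is 0, Saturday is 6.
    """
    # date.weekday() counts Monday as 0, so shift it to Sunday-based numbering.
    sunday_based = (when.weekday() + 1) % 7
    return value * factor if sunday_based in wdays else value


print(attenuate(800000, date(2009, 6, 6)))  # 2009-06-06 was a Saturday
print(attenuate(800000, date(2009, 6, 3)))  # 2009-06-03 was a Wednesday
```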

RANDOM

It has the following options:

  • variation Probability in % that the value changes with respect to the previous value.
  • min Minimum value that the value could have.
  • max Maximum value that the value could have.

Numeric

Generates random numeric values between min and max.

Booleans

Generates random boolean values (0 or 1).

String

Generates a string with a length between min and max. The characters are chosen at random from uppercase and lowercase letters (A to Z) and digits.

External data source (SOURCE)

Allows you to use a plain text file as a data source. Options:

  • src: source data file.

The file contains one value per line; there is no limit on the number of lines. For example:

4
5
6
10

The data can be of two kinds: numeric or string. Modules of this type take their data from the file, reading it sequentially and starting again from the first line when the end is reached. For example, the data above will be shown as follows:

4 5 6 10 4 5 6 10 4 5 6 10 4 5 6 10 4 5 6 10 4 5 6 10
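
The repeating sequence above can be reproduced with a short Python sketch. It is only an illustration of the sequential, wrap-around reading described in the text, not the code of the stress script; the function name source_values and the in-memory sample are hypothetical stand-ins for the file named by src.

```python
from itertools import cycle, islice


def source_values(lines):
    """Return an iterator that repeats the non-empty lines in order, forever."""
    values = [line.strip() for line in lines if line.strip()]
    return cycle(values)


# Stand-in for the contents of the source data file (hypothetical data).
sample = ["4\n", "5\n", "6\n", "10\n"]
print(" ".join(islice(source_values(sample), 8)))
```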
SCATTER

It is only useful for numeric data. The generated graphs look like a heartbeat: a normal value and, from time to time, a "beat".

It has the following options:

  • min Minimum value that the value could have.
  • max Maximum value that the value could have.
  • prob Probability in % that it generates a "beat".
  • avg Average value that it should show by default if there isn't any "beat".
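
A minimal Python sketch of the heartbeat idea, under stated assumptions: the text does not say how the value of a "beat" is chosen, so this sketch simply picks a random integer between min and max, and the function name scatter is hypothetical.

```python
import random


def scatter(prob, avg, minimum, maximum, rng=random):
    """Return avg most of the time and, with probability prob %, a 'beat'."""
    if rng.random() * 100 < prob:
        # Assumption: a beat is any random integer between minimum and maximum.
        return rng.randint(minimum, maximum)
    return avg


# Same options as "Module 2" in the sample configuration file.
print([scatter(prob=1, avg=40, minimum=0, maximum=80) for _ in range(10)])
```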
CURVE

Generates module data following a trigonometric curve. It has the following options:

  • min Minimum value that the value could have.
  • max Maximum value that the value could have.
  • time_wave_length Numeric value in seconds of the duration of the "crest" of the wave.
  • time_offset Numeric value in seconds by which the start of the wave is shifted from time zero, where the module value is zero (similar to the sine graph).
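
The following Python sketch shows one way such a curve could be computed. It is an illustration only: the exact formula used by the stress script is not given in this document, so the sketch assumes a plain sine wave scaled to the range between min and max, and it treats time_wave_length as the full period of the wave. The function name curve is hypothetical.

```python
import math


def curve(t, minimum, maximum, time_wave_length, time_offset=0):
    """Return the value of a sine wave at time t (seconds), scaled to [minimum, maximum]."""
    middle = (maximum + minimum) / 2
    amplitude = (maximum - minimum) / 2
    # Assumption: time_wave_length is one full period of the wave.
    phase = 2 * math.pi * (t + time_offset) / time_wave_length
    return middle + amplitude * math.sin(phase)


# Same options as "Module 3" in the sample configuration file, sampled every 300 s.
for t in range(0, 3600, 300):
    print(t, round(curve(t, 20, 80, 3600), 2))
```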
Notes of Interest

Keep in mind that the number of generated files depends on the span between the start time (time_from) and the end time (time_to) and on the interval set in the agent (agent_interval), so with long periods of time or short intervals the script will generate a lot of XML data files.
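
To get a rough idea of the volume before running the script, the relationship can be sketched in Python as follows. The function name estimated_files is hypothetical, and the result is an estimate: whether the script also writes a file for the final instant is not specified here, so the real count may differ slightly.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def estimated_files(time_from, time_to, agent_interval, agents):
    """Estimate how many XML files are generated for the given period."""
    start = datetime.strptime(time_from, FMT)
    end = datetime.strptime(time_to, FMT)
    span = (end - start).total_seconds()
    # One file per agent for each interval that fits in the time span.
    per_agent = int(span // agent_interval)
    return per_agent * agents


# Values from the sample configuration file, with a hypothetical list of 100 agents.
print(estimated_files("2009-06-01 00:00:00", "2009-06-05 00:00:00", 300, 100))
```

With the sample values, four days at a 300 second interval give 1152 files per agent, so 100 agents would produce about 115200 files.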

How to measure the Data server Processing Capacity

There is a small script called "pandora_count.sh" in the util/ directory of the Pandora FMS server directory. It measures the rate at which the data server processes XML files, using as reference all the files pending processing in /var/spool/pandora/data_in, so to use it you need thousands of packages waiting to be processed (or you can generate them with the tool described above). The script counts the packages that are pending now, subtracts that number from the count taken 10 seconds earlier and divides the result by 10. The result is the number of files processed in the last 10 seconds, expressed as a rate per second. It is a rudimentary solution, but it is good enough for tuning the server configuration.
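
The calculation described above can be sketched in Python as follows. This is not the content of pandora_count.sh, only an illustration of the same arithmetic; the function names pending_files and processing_rate are hypothetical, and the sleep parameter exists only so the waiting step can be replaced when testing.

```python
import os
import time


def pending_files(directory):
    """Count the regular files currently waiting in the directory."""
    return sum(1 for entry in os.scandir(directory) if entry.is_file())


def processing_rate(directory, window=10, sleep=time.sleep):
    """Return the number of files processed per second over the window."""
    before = pending_files(directory)
    sleep(window)  # wait while the data server consumes files
    after = pending_files(directory)
    return (before - after) / window
```

For example, calling processing_rate("/var/spool/pandora/data_in") while the data server is working through a backlog returns the average number of files it removed per second during the 10 second window.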

Pandora FMS DB Stress

This is a small tool to test your database performance. It can also be used to «pregenerate» periodic or random data (using trigonometric functions) and fill in fictitious modules.

You should create an agent and assign modules to it for automatic data injection with this tool. The names should be these ones:

  • random: to generate random data
  • curve: to generate a curve using trigonometric functions. Useful for testing interpolation with different intervals, etc.
  • boolean: To generate random boolean data

You can use any name that contains the word «random», «curve» or «boolean». For example:

  • random_1
  • curve_other

You can only choose the «data_server» module kind.

Pandora FMS DB Stress Fine Adjustment

This tool is preconfigured to search all agents for modules named «random», «curve» or «boolean», using an interval of 300 seconds and a period of 30 days.

If you want to modify this behavior, edit the pandora_dbstress script and change some variables at the start of the file:

# Configure here target (AGENT_ID for Stress)
my $target_module = -1; # -1 for all modules of that agent
my $target_agent = -1;
my $target_interval = 300;
my $target_days = 30;

The first variable, target_module, should be set to a specific module, or to -1 to process all the targets that match. The second variable, target_agent, selects a specific agent. The third variable, target_interval, is defined in seconds and represents the predefined periodic interval of the module. The fourth variable, target_days, represents the number of days to go back in the past, counted from the current timestamp.

Problem Solving and Diagnostic tools in Pandora FMS

Sometimes users run into problems, and the Pandora FMS developers cannot help without more information about the user's system. In version 3.0 we created two small tools to help solve user problems:

pandora_diag.php

This is a web diagnostic tool. You need to have an active session in order to use this resource. It gives information about Pandora FMS database usage, and some setup values and version. This tool is accessible from your console using the following URL:

http://localhost/pandora_console/extras/pandora_diag.php

If your Pandora FMS console is at a different URL, just add /extras/pandora_diag.php to your home URL.

Sample of output

Pandora FMS Build	PC090512
Pandora FMS Version	v3.0-dev
Homedir	/var/www/pandora_console
HomeUrl	/pandora_console
tagente	2385
tagent_access	20049
tagente_datos	4342323
tusuario	19
Update Key	PANDORA-FREE
Updating code path	/var/www/pandora_console
Keygen path	/usr/share/pandora/util/keygen
Current Update #	0

This tool can also be launched from the command line. In that case you need to pass the full path to your Pandora FMS console homedir, for example:

php /var/www/pandora_console/extras/pandora_diag.php /var/www/pandora_console

The output of this script is printed to standard output.

pandora_diagnostic.sh

This tool is located in /usr/share/pandora_server/util and gives a lot of information about the system:

  • CPU information
  • Uptime and CPU load average
  • Memory information
  • Kernel/Release information
  • A full MySQL config file dump
  • A full Pandora FMS Server config file dump (with passwords filtered out)
  • Pandora FMS logs information (but not the full log!)
  • Disk information
  • Pandora FMS processes information
  • The full kernel log (dmesg)

All the information is written to a plain text file, so users can send it to anyone who wants to help them, for example in the Pandora FMS user forums or on the Pandora FMS public mailing lists. The file should not contain any kind of confidential information. Note that you probably want to run the script with root privileges if you want the pandora_server.conf and my.cnf files to be parsed.

This is an example of execution:

$ ./pandora_diagnostic.sh 
 
Pandora FMS Diagnostic Script v1.0 (c) ArticaST 2009
http://pandorafms.org. This script is licensed under GPL2 terms
 
Please wait while this script is collecting data
  
Output file with all information is in '/tmp/pandora_diag.20090601_164511.data'

Here are some parts of the output file:

Information gathered at 20090601_164511
Linux raz0r 2.6.28-12-generic #43-Ubuntu SMP Fri May 1 19:27:06 UTC 2009 i686 GNU/Linux
=========================================================================
-----------------------------------------------------------------
CPUINFO
-----------------------------------------------------------------
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
.
.
-----------------------------------------------------------------
Other System Parameters
-----------------------------------------------------------------
Uptime:  16:45:11 up  5:27,  2 users,  load average: 0.11, 0.12, 0.09
-----------------------------------------------------------------
PROC INFO (Pandora)
-----------------------------------------------------------------
slerena  11875  0.9  2.1 114436 44336 pts/0    Sl   13:14   1:56 gedit pandora_diagnostic.sh
slerena  24357  0.0  0.0   4452  1524 pts/0    S+   16:45   0:00 /bin/bash ./pandora_diagnostic.sh
-----------------------------------------------------------------
MySQL Configuration file
-----------------------------------------------------------------
#
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
.
.
.
-----------------------------------------------------------------
Pandora FMS Logfiles information
-----------------------------------------------------------------
total 3032
drwxr-xrwx  2 root    root       4096 2009-04-30 20:00 .
drwxr-xr-x 17 root    root       4096 2009-06-01 11:24 ..
-rw-r-----  1 root    sys      377322 2009-04-06 00:12 pandora_agent.log
-rw-r--r--  1 root    root          0 2009-04-06 00:15 pandora_agent.log.err
-rw-r--r--  1 root    root      13945 2009-04-02 21:47 pandora_alert.log
-rw-r--r--  1 slerena slerena 2595426 2009-04-30 20:02 pandora_server.error
-rw-rw-rw-  1 root    root       9898 2009-04-30 20:02 pandora_server.log
-rw-rw-rw-  1 root    root      65542 2009-04-30 20:00 pandora_server.log.old
-rw-r--r--  1 root    root         94 2009-04-06 00:19 pandora_snmptrap.log
-rw-rw-rw-  1 root    root          4 2009-04-03 14:16 pandora_snmptrap.log.index
-----------------------------------------------------------------
System disk
-----------------------------------------------------------------
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda6              91G   49G   37G  58% /
tmpfs                1003M     0 1003M   0% /lib/init/rw
varrun               1003M  260K 1002M   1% /var/run
varlock              1003M     0 1003M   0% /var/lock
udev                 1003M  184K 1002M   1% /dev
tmpfs                1003M  480K 1002M   1% /dev/shm
lrm                  1003M  2,4M 1000M   1% /lib/modules/2.6.28-12-generic/volatile
-----------------------------------------------------------------
Vmstat (5 execs)
-----------------------------------------------------------------
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  0      0 684840 119888 619624    0    0    15    10  258  474  3  1 95  0
 0  0      0 684768 119888 619640    0    0     0     0  265  391  0  0 100  0
 0  0      0 684768 119892 619636    0    0     0    56  249  325  1  1 99  0
 0  0      0 684768 119892 619640    0    0     0     0  329  580  0  0 100  0
 0  0      0 684776 119892 619640    0    0     0     0  385 1382  1  0 99  0
-----------------------------------------------------------------
System dmesg
-----------------------------------------------------------------
[    0.000000] BIOS EBDA/lowmem at: 0009f000/0009f000
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 2.6.28-12-generic (buildd@rothera) (gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4) )   #43-Ubuntu SMP Fri May 1 19:27:06 UTC 2009 (Ubuntu 2.6.28-12.43-generic)
.
.
-----------------------------------------------------------------
END OF FILE
-----------------------------------------------------------------
560e8fa02818916d4abb59bb50d91f6a  /tmp/pandora_diag.20090601_164511.data
