A new helper for monitoring our network and servers

sarka's picture

For some time now we have been monitoring our servers and network in every way we can. Problems are reported to us by the familiar Icinga daemon, and operational data is collected by the Collectd daemon. Collectd is very good at collecting data: it gathers it with sufficient granularity without noticeably burdening the machine it runs on. Its only flaw is the frontend, so we had to help ourselves.


The Collectd developers did not include a frontend in the project, so the community took on the job. A frontend shares the database files with collectd, but unlike collectd it only reads from them and produces output suited to the human eye. For collectd data, that usually means graphs. Graphs let us compare the load on a server with the requests hitting Apache. We can see what a server does during planned actions (such as log rotation) and, most importantly, we know what preceded a problem and can anticipate problems before users notice them.


For this reason we monitor, for example:


  • Server load (load average)
  • Load on individual CPUs
  • RAM usage
  • Disk usage
  • Disk operations
  • Processes
  • Communication between mod_fcgid in Apache and PHP
  • Individual ports on all switches
  • Network interfaces of servers and routers
  • DNS queries
  • Queries on MySQL databases

We also track a few other values, depending on what each server is used for. Overall we have 600 different graphs and a complete view of the state of our infrastructure. Some graphs show more than one value, so in total we can compare around 2,500 values.


For everything to run smoothly and quickly, we had to find a frontend for Collectd that would let us work with graphs efficiently. Quite a few such tools exist, but their development has either stalled or is heading in a direction other than the one we need. So we spent some time on a tool of our own, which we called CollectdGraphs.


Our goal was to develop a mobile application that also works well on large displays. We had virtually no restrictions, and I eventually settled on a combination of the Python programming language and the jQuery Mobile framework. jQuery Mobile takes away the worry about graphics, and the application can be optimized for both large and small displays with little effort. The result looks like this:


The application provides the following:

  • A framework for easily writing data-processing plugins
  • Graph drawing
  • Handling of communication with the user
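To give a feel for what a plugin framework of this kind can look like, here is a minimal sketch in Python. It is not the actual CollectdGraphs API: the `PLUGINS` registry, the `plugin` decorator and the `describe` helper are hypothetical names made up for illustration. The RRD file path and the data source names follow the layout that collectd's load plugin typically produces, which is also an assumption about the local setup.

```python
# Hypothetical plugin registry; NOT the real CollectdGraphs interface.
PLUGINS = {}


def plugin(name):
    """Register the decorated function as the plugin called `name`."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


@plugin("load")
def load_plugin(host):
    # Describe one graph: which RRD file to read and which data
    # sources to draw. The path layout is an assumption.
    return {
        "title": f"Load average on {host}",
        "rrd": f"{host}/load/load.rrd",
        "sources": ["shortterm", "midterm", "longterm"],
    }


def describe(name, host):
    """Look up a plugin by name and ask it for its graph description."""
    return PLUGINS[name](host)
```

With a registry like this, adding a new graph type means writing one small function, which fits the idea of a plugin that takes only a few minutes to write.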

The interface works equally well on large computer displays and on tablets and phones. The plugin system lets you write a new data-processing plugin in a few minutes. Graphs are generated with RRDtool, which Collectd uses for storage. The data is stored in RRD databases, which are circular buffers that never change size and are designed to hold values measured at specific times. One database can contain multiple values, such as the incoming and outgoing packets on an interface. RRD databases are also designed for the operations performed on this data, such as aggregation or selecting data from a given interval.
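The circular-buffer idea can be illustrated with a toy model in Python. This is only a sketch of the concept, not how RRDtool is implemented: the `RoundRobinArchive` class and its methods are invented for this example, and real RRD files also consolidate data into several archives with different resolutions.

```python
from collections import deque


class RoundRobinArchive:
    """Toy fixed-size archive: once full, a new sample replaces the oldest."""

    def __init__(self, slots):
        # deque with maxlen silently drops the oldest entry when full
        self.rows = deque(maxlen=slots)

    def update(self, timestamp, value):
        self.rows.append((timestamp, value))

    def fetch(self, start, end):
        """Return the samples whose timestamp lies in [start, end]."""
        return [(t, v) for t, v in self.rows if start <= t <= end]

    def average(self, start, end):
        """Aggregate an interval into one value, or None if it is empty."""
        values = [v for _, v in self.fetch(start, end)]
        return sum(values) / len(values) if values else None


archive = RoundRobinArchive(slots=5)
for i in range(8):  # write 8 samples into 5 slots
    archive.update(timestamp=i * 10, value=float(i))
```

After the loop the archive still holds exactly five rows (timestamps 30 to 70). The three oldest samples have been overwritten, so `archive.average(0, 20)` returns `None`, while `archive.average(40, 60)` returns `5.0`.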


To summarize: thanks to Collectd and our new tool CollectdGraphs, we know exactly what is happening on the network and whether something is about to go wrong at any moment. CollectdGraphs is open source, and the code can be found on GitHub.