collectd is a daemon that periodically collects system performance statistics and provides mechanisms to store the values in a variety of ways, for example in RRD files. Those statistics can then be used to find current performance bottlenecks (performance analysis) and to predict future system load (capacity planning).
Why collectd?
There are plenty of free, open source projects that are similar to collectd, but some of its key features make it a good fit for OpenStack Nova:
- It is written in C for performance and portability, so its resource footprint is small. This is key for nova-compute nodes, for example.
- It includes optimizations and features to handle hundreds of thousands of data sets, which is exactly the scale of a public cloud platform.
- It comes with over 90 plugins which range from standard cases to very specialized and advanced topics.
- It provides powerful networking features and is extensible in numerous ways, so it fits well in the OpenStack Nova network architecture.
- Last but not least, collectd is actively developed, supported, and well documented.
- It does not generate graphs. This ‘limitation’ keeps graph rendering off the controller node.
- It is easy to integrate with existing monitoring tools. There’s a plugin for Nagios, so it can use the values collected by collectd.
How is collectd configured?
When you install the Stackops Distro, every node (controller, volume, network and compute) bundles collectd. The Smart Installer configures the node, and the collectd configuration files are then deployed with the exact configuration needed by each node type. For example, compute nodes gather information about the virtual machines and their performance, while the other node types do not carry that configuration.
The collectd network architecture is the one the manuals describe as the Basic Unicast Setup: all system performance statistics are stored on a centralized server. By default, the controller node acts both as a client, generating its own performance statistics, and as the server, gathering statistics from all the nodes of the cloud architecture.
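The unicast layout described above can be sketched as a small Python helper that renders the `network` plugin stanza for each side. `LoadPlugin`, `Server`, `Listen` and port 25826 are regular collectd network plugin settings; the helper name, the listen address, and the idea that the Smart Installer builds its files this way are assumptions made for illustration only.

```python
DEFAULT_PORT = "25826"  # collectd's default network port


def render_network_stanza(controller_host, is_controller=False, port=DEFAULT_PORT):
    """Return a collectd.conf fragment for the network plugin.

    Hypothetical helper: it only illustrates the client/server split and is
    not the code used by the Smart Installer.
    """
    lines = ["LoadPlugin network", "<Plugin network>"]
    if is_controller:
        # The controller listens for values sent by every other node.
        # Its own values are collected locally, so it needs no Server line.
        lines.append(f'  Listen "0.0.0.0" "{port}"')
    else:
        # Every other node sends its values to the controller.
        lines.append(f'  Server "{controller_host}" "{port}"')
    lines.append("</Plugin>")
    return "\n".join(lines)
```

Calling `render_network_stanza("nova-controller.stackops.org")` would give the fragment for a compute, network or volume node, while passing `is_controller=True` gives the listening side.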
So, if you want to use the generated RRD files, go to the controller node and browse the information stored in /var/lib/collectd/rrd:
stackops@nova-controller:~$ cd /var/lib/collectd/rrd
stackops@nova-controller:/var/lib/collectd/rrd$
This directory contains the list of servers of the Cloud architecture:
stackops@nova-controller:/var/lib/collectd/rrd$ ls
i-0000001
i-0000002
i-0000003
i-0000004
i-0000005
i-0000006
i-0000007
i-0000008
.
.
nova-controller.stackops.org
nova-network.stackops.org
nova-volume.stackops.org
nova-compute-1.stackops.org
nova-compute-2.stackops.org
nova-compute-3.stackops.org
nova-compute-4.stackops.org
nova-compute-5.stackops.org
nova-compute-6.stackops.org
nova-compute-7.stackops.org
nova-compute-8.stackops.org
nova-compute-9.stackops.org
nova-compute-10.stackops.org
The ‘i-XXXXXXXXX’ directories at the top contain the performance statistics of the corresponding virtual instances. The remaining directories are named after the nodes of the cloud and contain the statistics of each node, according to its role.
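As a minimal sketch of how this layout could be consumed by a script, the following Python separates instance directories from node directories. It assumes only what the listing shows: instance directories start with `i-` and everything else is a node. The function names are hypothetical.

```python
import os

RRD_ROOT = "/var/lib/collectd/rrd"  # path taken from the text above


def classify_hosts(names):
    """Split directory names into (virtual instances, physical nodes)."""
    instances = sorted(n for n in names if n.startswith("i-"))
    nodes = sorted(n for n in names if not n.startswith("i-"))
    return instances, nodes


def list_hosts(root=RRD_ROOT):
    """Return the names of the per-host directories under the RRD root."""
    return [
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    ]
```

On the controller node, `classify_hosts(list_hosts())` would return the two groups as sorted lists.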
What information can collectd gather?
Depending on the role of the server node, collectd gathers different performance information through the plugins installed:
All nodes
- interface
- cpu
- memory
- df
- disk
- vmem
- swap
nova-network
- iptables
nova-compute
- libvirt
You can read the details of the installed plugins on the collectd website.
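The per-role plugin sets listed above can be written down as a small lookup table. This is only a sketch: the plugin names come from the lists in this section and `LoadPlugin` is regular collectd.conf syntax, but the function names and the way the lines are rendered are assumptions, not the Smart Installer's actual logic.

```python
# Plugins enabled on every node, as listed in this section.
COMMON_PLUGINS = ["interface", "cpu", "memory", "df", "disk", "vmem", "swap"]

# Extra plugins enabled only for specific roles.
ROLE_PLUGINS = {
    "nova-network": ["iptables"],
    "nova-compute": ["libvirt"],
}


def plugins_for(role):
    """Return the plugin names for a node role (common ones first)."""
    return COMMON_PLUGINS + ROLE_PLUGINS.get(role, [])


def render_load_plugins(role):
    """Render one LoadPlugin line per plugin for the given role."""
    return "\n".join(f"LoadPlugin {name}" for name in plugins_for(role))
```

For instance, `render_load_plugins("nova-compute")` yields the seven common `LoadPlugin` lines followed by `LoadPlugin libvirt`.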
How can I display the RRD files created by collectd?
There are plenty of tools to display RRD files. Check the collectd website for some suggestions.
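If you only need the raw numbers rather than a graph, one option is to read them with the `rrdtool fetch` command and parse its text output. The sketch below assumes the `rrdtool` command-line tool is installed on the controller node and that its output has the usual shape (a header line with the data source names, then `timestamp: value ...` rows). The function names are hypothetical.

```python
import math
import subprocess


def parse_fetch_output(text):
    """Parse `rrdtool fetch` text output into (names, rows)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return [], []
    names = lines[0].split()  # header line: data source names
    rows = []
    for ln in lines[1:]:
        timestamp, _, values = ln.partition(":")
        rows.append((int(timestamp), [float(v) for v in values.split()]))
    return names, rows


def known_rows(rows):
    """Drop rows in which every value is unknown (NaN)."""
    return [
        (ts, vals) for ts, vals in rows
        if not all(math.isnan(v) for v in vals)
    ]


def fetch(rrd_path, start="-1h"):
    """Run `rrdtool fetch` with the AVERAGE consolidation function."""
    result = subprocess.run(
        ["rrdtool", "fetch", rrd_path, "AVERAGE", "--start", start],
        check=True, capture_output=True, text=True,
    )
    return parse_fetch_output(result.stdout)
```

Pass `fetch` the path of any `.rrd` file below `/var/lib/collectd/rrd`, then filter the result with `known_rows` before plotting or exporting it.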
Other resources
- collectd website, list of front-ends http://collectd.org/wiki/index.php/List_of_front-ends
- Related sites http://collectd.org/related.shtml
- Monitoring collectd with Nagios http://www.3open.org/d/collectd/working_with_nagios and http://lefant.net/debienna/collectd_and_nagios/