Software Secret Weapons™  
Basics Of Performance Counters And Software Monitoring
Posted by Pavel Simakov on 2006-03-23 23:58:33 under Linguine Watch
 


There is a sea of RFC documents about performance counters and the monitoring of hardware and software under the umbrella of the Simple Network Management Protocol (SNMP). The most common documents are:

Presented here is a condensed summary that is sufficient to build a complete set of performance counters and enable monitoring of your software product without actually learning all of SNMP. The immediate goal of Linguine Watch is to add performance counters to your growing software product, not to comply with the RFCs (at first, at least). And we are not likely to have enough time or patience to read and internalize all of those important documents anyway.

A software and hardware monitoring solution requires both an agent and a monitoring station. An agent is a piece of software or hardware that holds the performance counter values to be monitored. Performance counters change continuously to reflect the activities that occur in the running program. An agent responds to requests from a monitoring station. A monitoring station is a piece of software that collects the monitored values from various agents and stores them for analysis, raising alarms, discovering trends, etc. Sometimes one server hosts both an agent and a monitoring station. In larger installations a single monitoring station collects data from many agents distributed across the network.
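The division of labor between an agent and a monitoring station can be sketched in a few lines of Python. This is only an illustration of the two roles: the class names `Agent` and `MonitoringStation` and their methods are made up for this example, the "request" is a plain method call rather than an SNMP message, and none of it is the Linguine Watch API.

```python
import time


class Agent:
    """Holds named performance counter values and answers read requests.

    Hypothetical class for illustration only.
    """

    def __init__(self, name):
        self.name = name
        self._values = {}

    def set_value(self, key, value):
        # Called by the running program whenever its activity changes a value.
        self._values[key] = value

    def get(self, key):
        # Called by a monitoring station to read the current value.
        return self._values[key]


class MonitoringStation:
    """Polls agents and stores timestamped samples for later analysis.

    Hypothetical class for illustration only.
    """

    def __init__(self):
        self.samples = []  # list of (timestamp, agent name, key, value)

    def poll(self, agents, keys, now=None):
        # One station can collect the same keys from many agents.
        timestamp = time.time() if now is None else now
        for agent in agents:
            for key in keys:
                self.samples.append((timestamp, agent.name, key, agent.get(key)))


web = Agent("web-1")
web.set_value("bytes_sent", 1200)

station = MonitoringStation()
station.poll([web], ["bytes_sent"], now=0.0)
```

A real station would call `poll` on a timer and run its alarm and trend logic over `samples`.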

The performance counters held by an agent and monitored by a monitoring station can be of several types. They can be either counters or gauges. Both are represented by unsigned 32-bit integers, but with slightly different meanings. Counters are always-increasing numbers. A "bytes sent" monitor, for example, always has an increasing value (or a constant value if nothing is being sent). So does a "bytes received" monitor. The value of a counter never goes down. If the current value of a counter is less than the previous one, the counter has rolled over (exceeded 32 bits).
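On the monitoring station side, the rollover rule above turns into a small piece of arithmetic when computing how much a counter grew between two polls. The sketch below is a minimal illustration; the function name `counter_delta` is made up here, and it assumes the counter wrapped at most once between the two samples and that the agent was not restarted in between.

```python
COUNTER_MODULUS = 2 ** 32  # a 32-bit counter wraps back to 0 after 2^32 - 1


def counter_delta(previous, current):
    """Return the increase between two samples of a 32-bit counter.

    Assumes at most one rollover happened between the samples.
    """
    if current >= previous:
        return current - previous
    # current < previous: the counter rolled over, so add one full cycle.
    return current + COUNTER_MODULUS - previous


normal = counter_delta(100, 250)
wrapped = counter_delta(4294967290, 5)
```

If polls are too far apart for a fast-moving counter, more than one rollover can hide between two samples, and no amount of arithmetic on the two values can recover the true increase.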

Gauges are quite different from counters. A gauge value can go up or down. Gauges do not roll over: if the gauge value held internally by an agent exceeds 32 bits, the agent should return to the monitoring station the maximum value that fits into 32 bits (2^32-1, or 4294967295 decimal). "Amount of free memory in bytes", for example, is a gauge. So is "CPU utilization in percent". Some monitoring tools recognize further value types, namely DERIVE and ABSOLUTE; these names are data source types in RRDtool rather than SNMP types.
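The rule that a gauge latches at its maximum instead of rolling over can be sketched on the agent side as follows. The function name `report_gauge` is made up for this example, and clamping negative values to 0 is an assumption made here because the reported value is unsigned; the text above only describes the upper bound.

```python
GAUGE_MAX = 2 ** 32 - 1  # 4294967295, the largest unsigned 32-bit value


def report_gauge(internal_value):
    """Return the value an agent would report for a gauge.

    Values above the 32-bit range latch at GAUGE_MAX instead of wrapping.
    Negative values are clamped to 0 (an assumption of this sketch).
    """
    return max(0, min(internal_value, GAUGE_MAX))


free_memory = report_gauge(8 * 1024 ** 3)  # 8 GiB does not fit in 32 bits
cpu_percent = report_gauge(37)
```

A station that sees a gauge pinned at 4294967295 therefore knows only that the real value is at least that large.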

In addition to the SCALAR metrics we just discussed, SNMP also defines TABULAR metrics. "A list of running processes" is a good example of a TABULAR metric. Implementing an agent and a monitoring station that both work with TABULAR metrics is quite complex. In my experience with Linguine Watch, plain counters and gauges are quite sufficient for most applications.

SNMP defines a special MIB file format for listing counter and gauge definitions. Each software or hardware vendor that supports SNMP also provides a MIB file that lists the unique id (OID), type, name and description of each counter and gauge. Free online databases of MIBs and OIDs are readily available. There are thousands of individual counters and gauges available from many vendors. Linguine Watch generates MIB files automatically, saving you from having to learn the MIB file format and write the file by hand.
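To give a feel for what one entry in a MIB file looks like, here is a small sketch that renders a counter or gauge definition in the SNMP v1 `OBJECT-TYPE` notation. The function `render_object_type`, the object name `bytesSent` and the parent node `myProduct` are all hypothetical, and the output is not claimed to match what Linguine Watch actually generates.

```python
def render_object_type(name, syntax, description, parent, index):
    """Render one SNMP v1 OBJECT-TYPE definition as text.

    Hypothetical helper for illustration; `parent` is the name of the
    node under which the object is registered, `index` its sub-id.
    """
    if syntax not in ("Counter", "Gauge"):
        raise ValueError("this sketch only handles Counter and Gauge")
    return "\n".join([
        f"{name} OBJECT-TYPE",
        f"    SYNTAX {syntax}",
        "    ACCESS read-only",
        "    STATUS mandatory",
        f'    DESCRIPTION "{description}"',
        f"    ::= {{ {parent} {index} }}",
    ])


entry = render_object_type(
    "bytesSent", "Counter", "Total number of bytes sent.", "myProduct", 1
)
print(entry)
```

The last line of each entry is what ties the readable name to a position in the OID tree: object number 1 under the `myProduct` node in this made-up case.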

Linguine Watch presents a framework for adding performance counters to large software applications. It also has a complete SNMP v1 agent implementation that hides all the complexities of reporting performance counters to an SNMP monitoring station. If you want to add enterprise-quality performance monitoring to your software applications, as the big boys do, this package is for you. The easiest way to learn what this package can offer is to review the practical Tutorial 1.



Copyright © 2004-2007 by Pavel Simakov