Handling SNMP Counter32 overflows on HP1810-G correctly

Most network devices can be queried using SNMP nowadays. Luckily, even for a small HP 1810G-24 Switch, about 2000 SNMP OIDs exist – hooray for graphing.
One of the most interesting graphs on a switch is bits in/out on each port.

The HP 1810G-24 switch does not support Counter64-types for octet counters. This post introduces a middleware to fetch the Counter32-types fast enough to build a software Counter64 object to display the correct bandwidth in a munin graph.

The OIDs we need for this are defined in the IF-MIB (I really like the Cisco SNMP Object Navigator for quick OID <=> Name conversion):

IF-MIB::ifInOctets.1 (1.3.6.1.2.1.2.2.1.10.1)
and
IF-MIB::ifOutOctets.1 (1.3.6.1.2.1.2.2.1.16.1)

Unfortunately, these values are of the SNMP type Counter32, which means they overflow every 4 GB (or, at full gigabit load, approximately every 34 seconds). Fortunately, there are OIDs in a Counter64 variant:

IF-MIB::ifHCInOctets.1 (1.3.6.1.2.1.31.1.1.1.6.1)
and
IF-MIB::ifHCOutOctets.1 (1.3.6.1.2.1.31.1.1.1.10.1)
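
For scripting, the four columns above can be kept in a small lookup table. This is only a sketch: the names IF_MIB_COLUMNS, instance_oid and oid_to_tuple are made up for this illustration, while the OIDs themselves are the ones listed above.

```python
# Column OIDs from the IF-MIB as listed above. The table and the helper
# names are hypothetical, chosen only for this example.
IF_MIB_COLUMNS = {
    "ifInOctets": "1.3.6.1.2.1.2.2.1.10",        # Counter32
    "ifOutOctets": "1.3.6.1.2.1.2.2.1.16",       # Counter32
    "ifHCInOctets": "1.3.6.1.2.1.31.1.1.1.6",    # Counter64
    "ifHCOutOctets": "1.3.6.1.2.1.31.1.1.1.10",  # Counter64
}


def instance_oid(column, if_index):
    """Append the interface index to a column OID (port 1 -> '...10.1')."""
    return "%s.%d" % (IF_MIB_COLUMNS[column], if_index)


def oid_to_tuple(oid):
    """Turn a dotted OID string into a tuple of ints."""
    return tuple(int(part) for part in oid.strip(".").split("."))


print(instance_oid("ifInOctets", 1))
print(oid_to_tuple(instance_oid("ifHCOutOctets", 1)))
```

The trailing number is the interface index, so port 7 would end in .7 instead of .1.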

As stated before, the HP 1810G-24 (even with the newest firmware P2.2) does not support Counter64 objects.

However, our munin graphing fetches the values only every 5 minutes. With a Counter32, the difference between two readings is unambiguous only if less than 4 GB of traffic passed the port in those 5 minutes (roughly 114 Mbit/s sustained); beyond that, aliasing occurs.
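
To make the aliasing concrete, here is a small sketch. It assumes nothing about the switch beyond the fact that a Counter32 reports the octet total modulo 2^32; the function name counter32_reading and the traffic volumes are invented for the example.

```python
COUNTER32_RANGE = 2 ** 32  # a Counter32 holds the values 0 .. 4294967295
GIB = 2 ** 30


def counter32_reading(start, octets_transferred):
    """Value a Counter32 shows after counting `octets_transferred` more octets."""
    return (start + octets_transferred) % COUNTER32_RANGE


# Two very different traffic volumes within one polling interval ...
light = counter32_reading(0, 1 * GIB)
heavy = counter32_reading(0, 9 * GIB)  # wraps around twice on the way

# ... leave exactly the same value in the counter.
print(light == heavy)  # True
```

From the reading alone, the poller cannot tell these two cases apart, which is why the polling interval has to be short enough.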

To support the full bandwidth, we have to fetch the values at least every 34/2 = 17 seconds (according to the Nyquist-Shannon sampling theorem) and detect overflows ourselves. An overflow has occurred if the last value is larger than the current value, because octet counters are monotonically increasing.
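
The numbers above can be reproduced with a few lines. This is a sketch under the assumption that 1 Gbit/s means 10^9 bits per second; wrap_seconds is a name chosen for this example.

```python
COUNTER32_RANGE = 2 ** 32  # octets counted before the counter wraps


def wrap_seconds(link_bits_per_second):
    """Seconds until a Counter32 octet counter wraps at full line rate."""
    octets_per_second = link_bits_per_second / 8.0
    return COUNTER32_RANGE / octets_per_second


gigabit = 1e9  # assumption: 1 Gbit/s = 10**9 bit/s
full_wrap = wrap_seconds(gigabit)
print("wrap after %.1f s, poll at least every %.1f s" % (full_wrap, full_wrap / 2))
```

On a 100 Mbit/s port the same calculation gives about ten times as long, so the polling interval is dictated by the fastest port.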

That means: No overflow – just add the difference between the current value and the last value to the counter. Overflow – add the counter range (2^32, because the value after 4294967295 is 0), decreased by the difference between the last value and the current value.
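
As a minimal sketch of this rule (the function name counter32_delta and the sample readings are invented for the example, and it assumes at most one wrap between two readings):

```python
COUNTER32_RANGE = 2 ** 32  # the value after 4294967295 is 0


def counter32_delta(last, current):
    """Octets counted between two readings, assuming at most one wrap."""
    if last > current:
        # Overflow: the counter passed its maximum and started again at 0.
        return COUNTER32_RANGE - (last - current)
    return current - last


total = 0          # software accumulator (Python ints do not overflow)
last = 4294967290  # example reading shortly before the wrap
for reading in (4294967295, 5, 100):
    total += counter32_delta(last, reading)
    last = reading
print(total)  # 106
```

The same result can be written as (current - last) % 2**32; the explicit branch just makes the overflow case visible.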

This script “switch_getcounters.py” fetches the SNMP values every 10 seconds and accumulates the counters in an external csv file. This csv file will be used instead of real SNMP GETs in the munin plugin.

#!/usr/bin/env python3

import csv
import time

from pysnmp.entity.rfc3413.oneliner import cmdgen

name = 'hpswitch'
community = 'public'

SLEEPTIME = 10
FILENAME = "/tmp/switching.csv"
PORTCOUNT = 24
# A Counter32 counts from 0 to 4294967295 and then restarts at 0,
# so one wrap-around corresponds to 2**32 octets.
COUNTER32_RANGE = 2 ** 32

# Restore the accumulated counters of a previous run, if there are any.
portstatus = []
try:
    with open(FILENAME, newline="") as infile:
        for row in csv.reader(infile):
            portstatus.append({"cur_in": int(row[1]),
                               "cur_out": int(row[2]),
                               "last_in": int(row[3]),
                               "last_out": int(row[4])})
except (IOError, ValueError, IndexError):
    # No usable state file: start from scratch.
    portstatus = []
# Make sure there is one entry per port.
while len(portstatus) < PORTCOUNT:
    portstatus.append({"cur_in": 0, "cur_out": 0,
                       "last_in": 0, "last_out": 0})


def getBulk(host, community, oid):
    """Walk the column below oid with GETBULK and return the values as ints."""
    oid = tuple(map(int, oid.strip('.').split('.')))
    errorIndication, errorStatus, errorIndex, varBindTable = \
        cmdgen.CommandGenerator().bulkCmd(
            cmdgen.CommunityData(host, community),
            cmdgen.UdpTransportTarget((host, 161)),
            0, PORTCOUNT,
            oid
        )
    data = []
    if errorIndication:
        print(errorIndication)
    elif errorStatus:
        print('%s at %s' % (
            errorStatus.prettyPrint(),
            errorIndex and varBindTable[-1][int(errorIndex) - 1] or '?'
        ))
    else:
        for varBindTableRow in varBindTable:
            for _oid, val in varBindTableRow:
                data.append(int(val))
    return data


def counterDelta(last, current):
    """Octets counted between two readings, assuming at most one wrap."""
    if last > current:
        # overflow
        return COUNTER32_RANGE - (last - current)
    return current - last


outfile = open(FILENAME, "w", newline="")
csvwrite = csv.writer(outfile)

try:
    while True:
        start = time.time()
        inOct = getBulk(name, community, "1.3.6.1.2.1.2.2.1.10")[:PORTCOUNT]
        outOct = getBulk(name, community, "1.3.6.1.2.1.2.2.1.16")[:PORTCOUNT]

        # Only update ports for which both counters were returned.
        for i in range(min(len(inOct), len(outOct))):
            status = portstatus[i]
            status["cur_in"] += counterDelta(status["last_in"], inOct[i])
            status["cur_out"] += counterDelta(status["last_out"], outOct[i])
            status["last_in"] = inOct[i]
            status["last_out"] = outOct[i]

        # Rewrite the whole file: port, cur_in, cur_out, last_in, last_out.
        outfile.seek(0)
        for i, status in enumerate(portstatus):
            csvwrite.writerow([i + 1, status["cur_in"], status["cur_out"],
                               status["last_in"], status["last_out"]])
        # Cut off leftovers in case the new content is shorter than the old.
        outfile.truncate()
        outfile.flush()

        stop = time.time()
        sleep = SLEEPTIME - (stop - start)
        if sleep > 0:
            time.sleep(sleep)
finally:
    outfile.close()

The resulting csv file can be used for every graphing tool you can imagine. The new counters last at least 5 minutes ;-).
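
As a rough illustration of using the file outside munin, the following sketch turns two snapshots of the accumulated counters into an average bit rate. The column layout (port, cur_in, cur_out, last_in, last_out) is the one written by the script above; the function names and the sample rows are made up.

```python
import csv
import io


def read_totals(csv_text):
    """Map port number -> (octets in, octets out) from the CSV contents."""
    totals = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue  # skip empty or incomplete rows
        totals[int(row[0])] = (int(row[1]), int(row[2]))
    return totals


def bits_per_second(octets_before, octets_after, seconds):
    """Average rate between two readings of an accumulated octet counter."""
    return (octets_after - octets_before) * 8 / float(seconds)


# Made-up sample data: port, cur_in, cur_out, last_in, last_out
earlier = read_totals("1,1000000,500000,1000000,500000\r\n")
later = read_totals("1,38500000,500000,38500000,500000\r\n")
print(bits_per_second(earlier[1][0], later[1][0], 300))  # 1000000.0
```

In a real setup the two snapshots would be the contents of /tmp/switching.csv read 300 seconds apart.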
For the full munin-example, I just re-used the Perl snmp__if_ plugin for autoconfiguration, but replaced the value-code with an external call to “readsnmp” which converts the snmp value into the munin format.

snmp__if_ (remove the prints for recv. and send.):

print `/usr/share/munin/plugins/readsnmp $iface`

readsnmp

#!/usr/bin/env python3
import csv
import sys

FILENAME = "/tmp/switching.csv"

try:
    infile = open(FILENAME, newline="")
except IOError:
    # No data file (yet): report "unknown" to munin.
    print("recv.value U")
    print("send.value U")
    sys.exit(1)

# The interface number is passed as the first argument, default: port 1.
iface = 1
try:
    if len(sys.argv) > 1:
        iface = int(sys.argv[1])
except ValueError:
    pass

out = False
with infile:
    for i, row in enumerate(csv.reader(infile), start=1):
        # Skip other ports and rows that are not completely written yet.
        if i != iface or len(row) < 3:
            continue

        out = True
        print("recv.value %s" % row[1])
        print("send.value %s" % row[2])

if not out:
    print("recv.value U")
    print("send.value U")

Our nice new munin chart:

[Figure: Correct bandwidth display]

Please note – if you have a switch with proper Counter64-support, just forget this workaround.
Hope this helps for other HP1810G users,
Gregor


One Response to Handling SNMP Counter32 overflows on HP1810-G correctly

  1. Phil says:

    Hi,

    thank you for these very nice scripts.

    I have the following questions:

    1. Can you give me the readsnmp script with the right tab stops at the beginning of the lines? I have never used Python before, so it’s a little bit tricky…

    2. Can you give me some more help with the snmp__if_ file? I tried to change my file but it doesn’t work well. I don’t know where I have to put the code and where I should “remove the prints for recv. and send.”

    Sorry for my English,

    Thank you very much,
    Kind regards from Germany