Descriptions of Tools We Use in Class
SNMP
 
The Simple Network Management Protocol (SNMP) forms part of the Internet
protocol suite as defined by the Internet Engineering Task Force (IETF).
SNMP is used by network management systems to monitor network-attached
devices for conditions that warrant administrative attention. It
consists of a set of standards for network management, including an
Application Layer protocol, a database schema, and a set of data
objects.
SNMP exposes management data in the form of variables on the managed
systems, which describe the system configuration. These variables can
then be queried (and sometimes set) by managing applications.
In typical SNMP usage, there are a number of systems to be
managed, and one or more systems managing them. A software component
called an agent runs on each managed system and reports
information via SNMP to the managing systems.
Essentially, SNMP agents expose management data on the managed systems
as variables (such as "free memory", "system name", "number of running
processes", "default route"). The managing system can retrieve the
information through the GET, GETNEXT and GETBULK protocol operations,
or the agent can send data without being asked using the TRAP or INFORM
protocol operations. Management systems can also send configuration
updates or controlling requests through the SET protocol operation to
actively manage a system. Configuration and control operations are used
only when changes to the network infrastructure are needed, whereas
monitoring operations are performed on a regular basis.
The variables accessible via SNMP are organized in hierarchies. These
hierarchies, and other metadata, are described by Management Information
Bases (MIBs).
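 
As a rough illustration of how GET and GETNEXT relate to the variable
hierarchy, the sketch below models an agent's variables as a sorted set
of OIDs. It is a toy in-memory model, not a real SNMP implementation:
no packets are sent, and the function names and sample values are made
up for this example. Only the three OIDs (sysDescr.0, sysUpTime.0 and
sysName.0 from the standard system group) are real.

```python
from bisect import bisect_right

# Hypothetical agent data: OID tuples mapped to made-up values.
AGENT_VARIABLES = {
    (1, 3, 6, 1, 2, 1, 1, 1, 0): "Example router",   # sysDescr.0
    (1, 3, 6, 1, 2, 1, 1, 3, 0): 123456,             # sysUpTime.0
    (1, 3, 6, 1, 2, 1, 1, 5, 0): "gw1.example.net",  # sysName.0
}


def parse_oid(text):
    """Turn a dotted string such as '1.3.6.1' into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


def format_oid(oid):
    """Turn a tuple of integers back into a dotted string."""
    return ".".join(str(part) for part in oid)


def snmp_get(variables, oid):
    """GET: return the value stored at exactly this OID, or None."""
    return variables.get(parse_oid(oid))


def snmp_getnext(variables, oid):
    """GETNEXT: return (oid, value) for the first OID after the given one."""
    ordered = sorted(variables)  # tuples sort in lexicographic OID order
    index = bisect_right(ordered, parse_oid(oid))
    if index == len(ordered):
        return None  # walked off the end of the agent's variables
    next_oid = ordered[index]
    return format_oid(next_oid), variables[next_oid]


def snmp_walk(variables, root):
    """Repeat GETNEXT to list every variable below the given subtree."""
    root_oid = parse_oid(root)
    results = []
    current = root
    while True:
        found = snmp_getnext(variables, current)
        if found is None or parse_oid(found[0])[:len(root_oid)] != root_oid:
            break
        results.append(found)
        current = found[0]
    return results
```

Calling snmp_walk(AGENT_VARIABLES, "1.3.6.1.2.1.1") visits all three
variables in OID order, which is why a manager can explore a subtree
without knowing in advance which variables the agent holds.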
 
Nagios
 
Nagios is a host and service monitor designed to inform you of
network problems before your clients, end-users or managers do. It
has been designed to run under the Linux operating system, but works
fine under most *NIX variants as well. The monitoring daemon runs
intermittent checks on hosts and services you specify using external
"plugins" which return status information to Nagios. When problems
are encountered, the daemon can send notifications out to administrative
contacts in a variety of different ways (email, instant message,
SMS, etc.). Current status information, historical logs, and reports
can all be accessed via a web browser.
Note, however, that Nagios does not use RRD.
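 
To make the plugin idea more concrete, here is a minimal sketch of the
logic inside a check plugin. It relies on the usual plugin convention
of reporting status through the exit code (0 = OK, 1 = WARNING,
2 = CRITICAL, 3 = UNKNOWN) plus one line of text on standard output.
The disk-usage check, the thresholds and the function names are
assumptions chosen for illustration, not part of Nagios itself.

```python
# Exit codes used by the plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_disk_usage(used_percent, warn_at=80.0, crit_at=90.0):
    """Map a disk usage percentage to (exit code, status line).

    The thresholds are hypothetical defaults for this example.
    """
    if used_percent is None or not 0 <= used_percent <= 100:
        return UNKNOWN, "DISK UNKNOWN - no usable measurement"
    if used_percent >= crit_at:
        status = CRITICAL
    elif used_percent >= warn_at:
        status = WARNING
    else:
        status = OK
    return status, f"DISK {STATUS_NAMES[status]} - {used_percent:.1f}% used"


def main(argv):
    """Read the measurement from the command line and report on it.

    A real plugin would measure the disk itself and finish with
    sys.exit(main(sys.argv)) so the monitoring daemon sees the exit code.
    """
    try:
        used = float(argv[1])
    except (IndexError, ValueError):
        used = None
    status, message = evaluate_disk_usage(used)
    print(message)
    return status
```

Keeping the decision logic in a function separate from main makes the
thresholds easy to test without running the plugin as a process.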
 
Cisco IOS NetFlow
 
Cisco IOS NetFlow efficiently provides a key set of services for
IP applications, including network traffic accounting, usage-based
network billing, network planning, security, Denial of Service
monitoring capabilities, and network monitoring. NetFlow provides
valuable information about network users and applications, peak
usage times, and traffic routing. Cisco invented NetFlow and is the
leader in IP traffic flow technology.
The basic output of NetFlow is a flow record. Several different
formats for flow records have evolved as NetFlow has matured. The
most recent evolution of the NetFlow flow-record format is known
as NetFlow version 9. The distinguishing feature of the NetFlow
version 9 format, which is the basis for an IETF standard, is that
it is template-based. Templates provide an extensible design to the
record format, a feature that should allow future enhancements to
NetFlow services without requiring concurrent changes to the basic
flow-record format. Using templates provides several key benefits:
 
- New features can be added to NetFlow more quickly, without breaking current implementations.
- NetFlow is "future-proofed" against new or developing protocols, because the NetFlow version 9 format can be adapted to provide support for them.
- NetFlow version 9 is the basis for IPFIX, the IETF standard mechanism for flow information export.
- Third-party business partners who produce applications that provide collector or display services for NetFlow will not be required to recompile their applications each time a new NetFlow feature is added; instead, they may be able to use an external data file that documents the known template formats.
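 
The template idea can be sketched as follows: a template is a list of
(field type, length) pairs, and a collector uses it to cut a data
record into fields. This is a simplified illustration, not a full
NetFlow version 9 parser: packet headers, flowsets and template IDs are
left out, and the function and variable names are invented for the
example. The field type numbers shown are taken from the version 9
field definitions.

```python
import ipaddress

# A few NetFlow version 9 field types; a real collector knows many more.
FIELD_NAMES = {
    1: "IN_BYTES",
    2: "IN_PKTS",
    4: "PROTOCOL",
    7: "L4_SRC_PORT",
    8: "IPV4_SRC_ADDR",
    11: "L4_DST_PORT",
    12: "IPV4_DST_ADDR",
}
IPV4_FIELDS = {8, 12}


def decode_record(template, payload):
    """Decode one data record using a list of (field_type, length) pairs."""
    record = {}
    offset = 0
    for field_type, length in template:
        raw = payload[offset:offset + length]
        if len(raw) != length:
            raise ValueError("payload is shorter than the template describes")
        offset += length
        # Unknown field types still decode, under a generic name.
        name = FIELD_NAMES.get(field_type, f"FIELD_{field_type}")
        if field_type in IPV4_FIELDS:
            record[name] = str(ipaddress.IPv4Address(raw))
        else:
            record[name] = int.from_bytes(raw, "big")
    return record


# Hypothetical template and record: source address, destination address,
# protocol, byte count.
EXAMPLE_TEMPLATE = [(8, 4), (12, 4), (4, 1), (1, 4)]
EXAMPLE_PAYLOAD = (
    bytes([192, 0, 2, 1])
    + bytes([198, 51, 100, 7])
    + bytes([6])
    + (1500).to_bytes(4, "big")
)
```

Because the layout comes from the template rather than from the code, a
record containing a field type the collector has never seen can still
be split correctly, which is the point of the last bullet above.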
 
MRTG
 
Multi Router Traffic Grapher
You have a router, you want to know what it does all day long? Then
MRTG is for you. It will monitor SNMP network devices and draw
pretty pictures showing how much traffic has passed through each
interface.
Routers are only the beginning. MRTG is being used to graph all
sorts of network devices as well as everything else from weather
data to vending machines.
MRTG is written in Perl and works on Unix/Linux as well as Windows
and even NetWare systems. MRTG is free software licensed under the
GNU GPL.
MRTG uses the Simple Network Management Protocol (SNMP) to send
requests containing two object identifiers (OIDs) to a device. The
device, which must be SNMP-enabled, has a management information base
(MIB) in which it looks up the specified OIDs. After collecting the
information, it sends back the raw data encapsulated in an SNMP response.
MRTG records this data in a log on the client along with previously
recorded data for the device. The software then creates an HTML
document from the logs, containing a list of graphs detailing traffic
for the selected device.
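 
The raw data returned for a traffic graph is usually a pair of
ever-increasing octet counters (inbound and outbound), so a rate has to
be derived from two successive polls. The sketch below shows one way to
do that arithmetic, assuming 32-bit counters and at most one wrap
between polls. It is not MRTG's actual code, and the function names are
made up for this example.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit counter wraps back to 0 after this


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Octets counted between two polls, allowing for one counter wrap.

    A device reboot also resets the counter and looks like a wrap, so
    a real collector would need an extra sanity check.
    """
    if current >= previous:
        return current - previous
    return current + modulus - previous


def bits_per_second(previous, current, interval_seconds):
    """Average traffic rate over the polling interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return counter_delta(previous, current) * 8 / interval_seconds
```

For example, counter readings of 1000 and 376000 taken 300 seconds
apart give 375000 octets, or 10000 bits per second.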
 
RRD
Round Robin Database
RRDtool is the open-source industry standard, high-performance data
logging and graphing system for time series data. Use it to write
your custom monitoring shell scripts or create whole applications
using its Perl, Python, Ruby, TCL or PHP bindings.
RRDtool is the back end for tools such as Cacti, Munin and others.
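 
The "round robin" in the name refers to storing data in a fixed number
of slots, so that new values overwrite the oldest ones and the store
never grows. The class below is a minimal sketch of that idea only. It
does not reproduce RRDtool's file format, consolidation functions or
API, and the class and method names are invented for this example.

```python
class RoundRobinArchive:
    """Fixed-size store that overwrites its oldest value when full."""

    def __init__(self, slots):
        if slots <= 0:
            raise ValueError("an archive needs at least one slot")
        self._values = [None] * slots
        self._next = 0    # index of the slot the next update will use
        self._count = 0   # number of slots holding real data

    def update(self, value):
        """Store a new sample, replacing the oldest one if necessary."""
        self._values[self._next] = value
        self._next = (self._next + 1) % len(self._values)
        self._count = min(self._count + 1, len(self._values))

    def fetch(self):
        """Return the stored samples, oldest first."""
        slots = len(self._values)
        start = (self._next - self._count) % slots
        return [self._values[(start + i) % slots] for i in range(self._count)]
```

With three slots, storing the values 1 to 5 in order leaves 3, 4 and 5
in the archive; the storage used stays constant however long the
logging runs.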
 
SmokePing
http://oss.oetiker.ch/smokeping/
SmokePing is a deluxe latency measurement tool. It can measure,
store and display latency, latency distribution and packet loss.
SmokePing uses RRDtool to maintain a long-term data store and to
draw pretty graphs, giving up-to-the-minute information on the state
of each network connection.
SmokePing uses latency measurement plug-ins for seamless extensibility.
Smart Alarms
SmokePing comes with a smart alarm system. Apart from simple threshold
alarms, you have the option of defining latency or loss patterns
and using them to trigger alarms. For example, you can define a pattern
that generates a single alarm when the loss goes from below
1% to over 20% and stays over 20% for more than 10 minutes. The
advantage of this approach is the virtual elimination of the duplicate
alarms you would get with a simple threshold-based system.
Alarms can be sent to a mail address or a pager and, if you want, you
can also start an external script to handle the alarms.
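 
The difference between a pattern alarm and a simple threshold alarm can
be sketched as below. The function is an illustration of the idea only
and does not use SmokePing's own pattern syntax or code. It assumes one
loss measurement per polling step and treats "stays over 20%" as a
fixed number of consecutive samples (three by default, a hypothetical
choice).

```python
def find_pattern_alarms(loss_percent, low=1.0, high=20.0, high_samples=3):
    """Return the sample indexes at which the loss pattern completes.

    The pattern is one sample below `low` followed immediately by
    `high_samples` consecutive samples above `high`.
    """
    alarms = []
    window = high_samples + 1
    for end in range(window, len(loss_percent) + 1):
        first, *rest = loss_percent[end - window:end]
        if first < low and all(sample > high for sample in rest):
            alarms.append(end - 1)
    return alarms


def count_threshold_alarms(loss_percent, high=20.0):
    """Count how often a naive threshold check would fire."""
    return sum(1 for sample in loss_percent if sample > high)
```

For the loss series [0, 0, 25, 30, 40, 50, 60, 0, 0], the pattern fires
once, at index 4, whereas the naive threshold check fires five times
during the same outage.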
SmokePing is a network latency monitor which works in a way that
is similar to MRTG. It measures network latency to a configurable
set of destinations on the network, and displays its findings in
easy-to-read Web pages. SmokePing has special support for monitoring
hosts with dynamic IP addresses. SmokePing uses RRDtool as its
logging and graphing back-end, making the system very efficient.
The presentation of the data on the Web is done through a CGI which
creates graphs on demand.
 
Cacti
 
Cacti is a network statistics graphing tool designed as a frontend
to RRDtool's data storage and graphing functionality. It is intended
to be intuitive and easy to use, as well as robust and scalable.
It is generally used to graph time-series data like CPU load and
bandwidth use.
The frontend is written in PHP; it can handle multiple users, each
with their own graph sets, so it is sometimes used by web hosting
providers (especially dedicated server, virtual private server, and
colocation providers) to display bandwidth statistics for their
customers. It can be used to configure the data collection itself,
allowing certain setups to be monitored without any manual configuration
of RRDtool.
 
RT (Request Tracker)
RT is an enterprise-grade ticketing system which enables a group
of people to intelligently and efficiently manage tasks, issues,
and requests submitted by a community of users.
The RT platform has been under development since 1996, and is used
by systems administrators, customer support staffs, IT managers,
developers and marketing departments at thousands of sites around
the world.
Written in object-oriented Perl, RT is a high-level, portable,
platform-independent system that eases collaboration within
organizations and makes it easy for them to take care of their
customers.
RT manages key tasks such as the identification, prioritization,
assignment, resolution and notification required by enterprise-critical
applications including project management, help desk, NOC ticketing,
CRM and software development.
 
