File exercises-nagios-with-router.txt, 29.2 KB (added by brian, 7 years ago)

Nagios Installation and Configuration

Notes:
------
* Commands preceded with "$" imply that you should execute the command as
  a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>")
  imply that you are executing commands on remote equipment, or within
  another program.


Exercises
---------

PART I
------

0. Log in to your PC as the sysadm user.


1. Install Nagios

    Install Nagios version 3:

        $ sudo apt-get install nagios3

   Unless you already have an MTA installed, nagios3 will install
   postfix as a dependency. If you are prompted for this, select the
   "Internet Site" option. (If you wanted to use a different MTA, such as
   exim, you would install it before nagios3.)

   You will be prompted to choose a nagiosadmin password. Give it the normal
   workshop password.

   To get the documentation in /usr/share/doc/nagios3-doc/html/ (which
   can also be read via the nagios web interface), do:

       $ sudo apt-get install nagios3-doc


3. You should already have a working Nagios!

    - Open a browser, and go to

    http://pcX/nagios3/

        Check with the instructor or your neighbor if you are in doubt.

    - At the login prompt, log in as:

        user: nagiosadmin
        pass: <workshop password>

    Browse to the "Host Detail" page to see what's already configured.


4. Let's look at the configuration layout...

    # cd /etc/nagios3
    # ls -l

    -rw-r--r-- 1 root root    1882 2008-12-18 13:42 apache2.conf
    -rw-r--r-- 1 root root   10524 2008-12-18 13:44 cgi.cfg
    -rw-r--r-- 1 root root    2429 2008-12-18 13:44 commands.cfg
    drwxr-xr-x 2 root root    4096 2009-02-14 12:33 conf.d
    -rw-r--r-- 1 root root      26 2009-02-14 12:36 htpasswd.users
    -rw-r--r-- 1 root root   42539 2008-12-18 13:44 nagios.cfg
    -rw-r----- 1 root nagios  1293 2008-12-18 13:42 resource.cfg
    drwxr-xr-x 2 root root    4096 2009-02-14 12:32 stylesheets

    # cd conf.d
    # ls -l

    -rw-r--r-- 1 root root 1695 2008-12-18 13:42 contacts_nagios2.cfg
    -rw-r--r-- 1 root root  418 2008-12-18 13:42 extinfo_nagios2.cfg
    -rw-r--r-- 1 root root 1152 2008-12-18 13:42 generic-host_nagios2.cfg
    -rw-r--r-- 1 root root 1803 2008-12-18 13:42 generic-service_nagios2.cfg
    -rw-r--r-- 1 root root  210 2009-02-14 12:33 host-gateway_nagios3.cfg
    -rw-r--r-- 1 root root  976 2008-12-18 13:42 hostgroups_nagios2.cfg
    -rw-r--r-- 1 root root 2167 2008-12-18 13:42 localhost_nagios2.cfg
    -rw-r--r-- 1 root root 1005 2008-12-18 13:42 services_nagios2.cfg
    -rw-r--r-- 1 root root 1609 2008-12-18 13:42 timeperiods_nagios2.cfg

    Notice that the package installs files with "nagios2" in their name.
    This is because they are the same files as were used for the Nagios
    version 2 Debian package. However there was a change made to the
    host-gateway configuration file, so this has a new name.


5. You have a config which is already monitoring your own system
(localhost_nagios2.cfg) and your upstream default gateway
(host-gateway_nagios3.cfg).

Have a look at the config file for the default gateway: it's very simple.
(Note: tab completion is useful here. Type "cat host-g" then hit tab; the
filename will be filled in for you)

    # cat host-gateway_nagios3.cfg

It should look something like this:

    # a host definition for the gateway of the default route
    define host {
            host_name   gateway
            alias       Default Gateway
            address     10.10.X.254
            use         generic-host
            }

It is monitoring the virtual Cisco router which is upstream of your VM.



PART II
Configuring Equipment
-----------------------------------------------------------------------------

0. Order of configuration

Conceptually we will build our configuration files starting with the
"nearest" devices and then moving on to those further away.

By going in this order you will have defined the devices that act as parents
for other devices.

Your upstream Cisco virtual router (your PC's gateway) is already defined.

1. The three other PCs in your group are directly connected to you with
nothing in between.  So there are no dependencies.

Create a new file, 'pcs.cfg', to list the three other PCs in your group. The
example below is ONLY for pc1, which has pc2/pc3/pc4 in its group, so modify
it for your neighbours.

    # cd /etc/nagios3/conf.d/
    # editor pcs.cfg

define host {
    use         generic-host
    host_name   pc2
    alias       pc2 in group 1
    address     pc2.ws.nsrc.org
}

define host {
    use         generic-host
    host_name   pc3
    alias       pc3 in group 1
    address     pc3.ws.nsrc.org
}

define host {
    use         generic-host
    host_name   pc4
    alias       pc4 in group 1
    address     pc4.ws.nsrc.org
}


THE FOLLOWING STEPS 2a - 2c SHOULD BE REPEATED WHENEVER YOU UPDATE THE CONFIGURATION!


2a. Verify that your configuration files are OK:

    # nagios3 -v /etc/nagios3/nagios.cfg

    ... You should get something like this:

Warning: Host 'pc2' has no services associated with it!
Warning: Host 'pc3' has no services associated with it!
Warning: Host 'pc4' has no services associated with it!
...
Total Warnings: 3
Total Errors:   0

Things look okay - No serious problems were detected during the check.

The warnings are Nagios telling you that it's unusual to monitor a device
just for its existence on the network, without also monitoring some service.


2b. Reload/Restart Nagios

    # service nagios3 restart


HINT: You will be doing this a lot. If you do it all on one line, like this,
then you can hit cursor-up and rerun all in one go:

    # nagios3 -v /etc/nagios3/nagios.cfg && service nagios3 restart

The '&&' ensures that the restart only happens if the config is valid.


2c. Go to the web interface (http://pcX/nagios3) and check that the hosts
   you just added are now visible in the interface. Click on the "Host Detail" item
   on the left of the Nagios screen to see this. You may see them in "PENDING"
   status until the checks are carried out.



3. Let's configure Nagios to start monitoring the classroom switch and then
the backbone router.

Add the switch in a new file:

    # cd /etc/nagios3/conf.d
    # editor switches.cfg

define host {
    use         generic-host
    host_name   bb-sw
    alias       backbone switch
    address     10.10.0.253
    parents     gateway
}


And let's create a file for routers:

    # editor routers.cfg

define host {
    use         generic-host
    host_name   bb-gw
    alias       backbone gw
    address     10.10.0.254
    parents     bb-sw
}

Notice the "parents" entry. This must point at a device or devices which are
also defined somewhere else in the configuration.

From a topology point of view, pcX cannot reach the switch 'bb-sw' if its
gateway is down; so the parent of bb-sw is gateway.  Similarly, you cannot
reach bb-gw if bb-sw is down, so the parent of bb-gw is bb-sw.

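As noted above, "parents" can name more than one device. The following is
only a sketch and should NOT be added to your configuration: it assumes a
hypothetical second switch 'bb-sw2', which does not exist in the classroom,
that also connects to bb-gw. With several parents listed, Nagios treats the
host as unreachable only when every one of its parents is down or
unreachable.

```cfg
# Hypothetical example only: 'bb-sw2' is not a real classroom device.
define host {
    use         generic-host
    host_name   bb-gw
    alias       backbone gw
    address     10.10.0.254
    parents     bb-sw,bb-sw2    ; comma-separated list, no spaces
}
```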

We end up with this relationship from the point of view of Nagios:

    [Nagios]
       |
       |
    gateway   ==>    host-gateway_nagios3.cfg
       |
       |
     bb-sw    ==>    switches.cfg (parent is gateway)
       |
       |
     bb-gw    ==>    routers.cfg (parent is bb-sw)


Once you have created these files, validate the config and restart nagios
(by repeating steps 2a - 2c above) and check the web interface.

Try the "Status Map" option: it gives you a graphical view of the
parent-child relationships you have just defined.


4. Create an entry for the classroom NOC

Open the existing pcs.cfg and add a new entry to the end:

    # editor pcs.cfg

# Our classroom NOC

define host {
    use         generic-host
    host_name   noc
    alias       Workshop NOC machine
    address     10.10.0.250
    parents     bb-sw
}


Question: why is the parent 'bb-sw'?

As usual, validate configuration and restart nagios.


PART III
Configure Service checks for the classroom NOC
-----------------------------------------------------------------------------

0. Configuring

Now that we have our hardware configured, we can start telling Nagios what
services to monitor on the configured hardware.

The most basic way is to define individual service checks.

1. Edit pcs.cfg and add the following service check near the definition for
the 'noc' host:

    # cd /etc/nagios3/conf.d
    # editor pcs.cfg

define service {
        host_name                       noc
        service_description             HTTP
        check_command                   check_http
        use                             generic-service
        notification_interval           0
}


2. Validate the config, restart, and via the nagios web interface check that
the HTTP service is being monitored (go to the "Service Detail" page).


However, when you are checking many identical services, this approach
quickly becomes tedious. For example, you may have many hosts which are
running an ssh server and you wish to monitor that service. In that case
you can create a single service definition and link it to a group of hosts.


3. Look inside the file 'services_nagios2.cfg':

    # cat services_nagios2.cfg

... it should include a section like this:

# check that ssh services are running
define service {
        hostgroup_name                  ssh-servers
        service_description             SSH
        check_command                   check_ssh
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}


4. Open the hostgroups file

    # editor hostgroups_nagios2.cfg

    - Find the hostgroup named "ssh-servers". In the members section of the
    definition, change the line:

members                 localhost

    to

members                 localhost,noc


Save the file and exit.


5. Verify that your changes are OK:

    # nagios3 -v /etc/nagios3/nagios.cfg

Restart Nagios to see the new service association with your host:

    # service nagios3 restart

Click on the "Service Detail" link in the Nagios web interface to see your new entry.


PART IV
Defining more devices
-----------------------------------------------------------------------------

1. Create entries for some other routers and PCs in the classroom

Now that we have our routers and switches defined it is quite easy to create
entries for another group's router and PCs.  Think about the parent
relationships:

                  bb-gw
                    |
          +-------------------+
          |       bb-sw       |
          +-------------------+
           |                 |
        gateway             rtrN
           |                 |
     +---+-+-+---+     +---+-+-+---+
     |   |   |   |     |   |   |   |
    pcA pcB pcC pcD   pcW pcX pcY pcZ

The parent of one of your neighbour's PCs is THEIR router. The parent of
their router is the switch.

If you are in doubt: DRAW this on paper!

So: pick a group to monitor - this example assumes you decided to pick
group 2. Edit routers.cfg to add their router:

define host {
    use         generic-host
    host_name   rtr2
    alias       group 2 router
    address     rtr2.ws.nsrc.org
    parents     bb-sw
}

And edit pcs.cfg to add their PCs:

define host {
    use         generic-host
    host_name   pc5
    alias       pc5 outside interface
    address     pc5.ws.nsrc.org
    parents     rtr2
}
define host {
    use         generic-host
    host_name   pc6
    alias       pc6 outside interface
    address     pc6.ws.nsrc.org
    parents     rtr2
}
define host {
    use         generic-host
    host_name   pc7
    alias       pc7 outside interface
    address     pc7.ws.nsrc.org
    parents     rtr2
}
define host {
    use         generic-host
    host_name   pc8
    alias       pc8 outside interface
    address     pc8.ws.nsrc.org
    parents     rtr2
}


You can review the Network Diagram for the class linked off the classroom wiki
main page.

As before, repeat steps 2a-2c to verify your configuration, correct any
errors, and activate it.

PART V
Defining more services
-----------------------------------------------------------------------------

0. For services, the default normal_check_interval is 5 (minutes) in
   generic-service_nagios2.cfg. You may wish to change this to 1 to speed up
   how quickly service issues are detected, at least in the workshop.

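   If you would rather not edit the packaged template, one alternative (not
   what the step above describes) is a derived template. This is a minimal
   sketch: the template name "fast-service" is made up for this example, and
   the definition could go in any file under /etc/nagios3/conf.d/. Services
   that say "use fast-service" instead of "use generic-service" would then
   be checked every minute.

```cfg
# Hypothetical derived template; the name 'fast-service' is an assumption.
define service {
        name                    fast-service
        use                     generic-service   ; inherit everything else
        normal_check_interval   1                 ; 1 minute instead of 5
        register                0                 ; template only, not a real service
}
```
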
1. Determine what services to define for what devices

   - In this particular class we have:

     routers:  running ssh and snmp
     switches: running telnet and possibly ssh as well as snmp
     pcs:      All PCs are running ssh and http and should be running snmp
               The NOC is currently running an snmp daemon

     So, let's configure Nagios to check for these services for these
     devices.

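     The steps that follow cover SSH and HTTP. For telnet on the switch, a
     minimal sketch of an individual service check is shown here. It assumes
     that a "check_telnet" command is already defined by the installed
     plugins package (look in /etc/nagios-plugins/config/ to confirm) and
     that telnet really is enabled on bb-sw.

```cfg
# Sketch only: assumes the 'check_telnet' command exists on your system.
define service {
        host_name                       bb-sw
        service_description             TELNET
        check_command                   check_telnet
        use                             generic-service
        notification_interval           0
}
```
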
2. Verify that SSH is running on the routers and workshop PCs

   - In the file services_nagios2.cfg there is already an entry for the SSH
     service check, so you do not need to create this. Instead, you
     simply need to re-define the "ssh-servers" entry in the file
     /etc/nagios3/conf.d/hostgroups_nagios2.cfg. After Part III the entry in
     the file looks like:

# A list of your ssh-accessible servers
define hostgroup {
        hostgroup_name  ssh-servers
                alias           SSH servers
                members         localhost,noc
        }

     What do you think you should change? Correct, the "members" line. You should
     add the other group's router and PCs that you defined above. You can
     also add "bb-sw" and "bb-gw" since they are also running SSH servers.

     The entry will look something like this:

define hostgroup {
        hostgroup_name  ssh-servers
                alias           SSH servers
                members         localhost,noc,rtr2,pc5,pc6,pc7,pc8,bb-sw,bb-gw
        }

     Note: leave in "localhost" - This is your PC and represents Nagios' network
     point of view. Leave in "noc" as well, so that the check you added in
     Part III keeps running.

     The "members" entry will be a long line and might wrap on the screen.

   - Once you are done, run the pre-flight check:

    # nagios3 -v /etc/nagios3/nagios.cfg

    If everything looks good, then restart Nagios

    # service nagios3 restart

    and view your changes in the Nagios web interface.

3. Check that http is running on all the classroom PCs.

    - This is almost identical to the previous exercise.  There is already
      a hostgroup called 'http-servers' in the hostgroups_nagios2.cfg
      file, so you just need to add the PCs there as members of the
      http-servers group. (The list in step 1 shows the routers running
      only ssh and snmp, so leave rtr2 out unless you know it runs a web
      server.)

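      A sketch of how the edited entry might look for someone in group 1 who
      is also monitoring group 2. The comment and alias shown here are
      assumed to match the packaged file; keep whatever your copy already
      has and change only the "members" line, adjusting the PC names to your
      own classroom.

```cfg
# A list of your web servers
define hostgroup {
        hostgroup_name  http-servers
        alias           HTTP servers
        members         localhost,pc2,pc3,pc4,pc5,pc6,pc7,pc8
}
```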


PART VI
Create More Host Groups
-----------------------------------------------------------------------------

0. In the web view, look at the pages "Hostgroup Overview", "Hostgroup
   Summary", "Hostgroup Grid". This gives a convenient way to group together
   hosts which are related (e.g. in the same site, serving the same purpose).

1. Update /etc/nagios3/conf.d/hostgroups_nagios2.cfg

    - For the following exercises it will be very useful if we have created
      or updated the following hostgroups:

      debian-servers
      routers
      switches

      If you edit the file /etc/nagios3/conf.d/hostgroups_nagios2.cfg you
      will see an entry for debian-servers that just contains localhost.
      Update this entry to include all the classroom PCs you are monitoring,
      including the NOC, but not including the routers.

    # editor /etc/nagios3/conf.d/hostgroups_nagios2.cfg

      Update the entry that says:

# A list of your Debian GNU/Linux servers
define hostgroup {
        hostgroup_name  debian-servers
                alias           Debian GNU/Linux Servers
                members         localhost
        }

      So that the "members" parameter contains something like this. Use your
      classroom network diagram to confirm the exact number of machines and names
      in your workshop.

                members         localhost,noc,pc5,pc6,pc7,pc8

      Be sure that this stays one long line (it may wrap on screen) and is
      not split across two separate lines. Otherwise you will get an error
      when you go to restart Nagios. Remember that your own PC is
      "localhost".

    - Once you have done this, add in two more host groups, one for routers and
      one for switches. Call these entries "routers" and "switches".
      Include the routers and switches you are monitoring.

    - When you are done be sure to verify your work and restart Nagios.

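      A sketch of the two new host groups, using the routers and switches
      defined in Parts I, II and IV of this example (gateway, bb-gw, rtr2 and
      bb-sw). The aliases are only suggestions; adjust the member lists to
      the devices you have actually defined.

```cfg
# Sketch: host groups for the routers and switches defined earlier
define hostgroup {
        hostgroup_name  routers
        alias           Cisco routers
        members         gateway,bb-gw,rtr2
}

define hostgroup {
        hostgroup_name  switches
        alias           Cisco switches
        members         bb-sw
}
```
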
2. Go back to the web interface and look at your new hostgroups.


PART VII
Extended Host Information ("making your graphs pretty")
-----------------------------------------------------------------------------

1. Update extinfo_nagios2.cfg

    - If you would like to use appropriate icons for your defined hosts in
      Nagios this is where you do this. We have three types of devices:

      Cisco routers
      Cisco switches
      Ubuntu servers

      There is a fairly large repository of icon images available for you to
      use located here:

      /usr/share/nagios/htdocs/images/logos/

      These were installed by default as dependent packages of the nagios3
      package in Ubuntu. In some cases you can find model-specific icons for
      your hardware, but to make things simpler we will use the following
      icons for our hardware:

      /usr/share/nagios/htdocs/images/logos/base/debian.*
      /usr/share/nagios/htdocs/images/logos/cook/router.*
      /usr/share/nagios/htdocs/images/logos/cook/switch.*

    - The next step is to edit the file /etc/nagios3/conf.d/extinfo_nagios2.cfg
      and tell nagios what image you would like to use to represent your devices.

    # editor /etc/nagios3/conf.d/extinfo_nagios2.cfg

      Here is what an entry for your routers looks like (there is already an entry
      for debian-servers that will work as is). Note that the router model (3600)
      is not all that important. The image used represents a router in general.

define hostextinfo {
        hostgroup_name   routers
        icon_image       cook/router.png
        icon_image_alt   Cisco Routers (3600)
        vrml_image       router.png
        statusmap_image  cook/router.gd2
}

      Now add an entry for your switches. Once you are done check your
      work and restart Nagios. Take a look at the Status Map in the web interface.
      It should be much nicer, with real icons instead of question marks.

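      A sketch of the matching entry for the switches, modelled directly on
      the routers entry above. It assumes you named the hostgroup "switches"
      and that switch.png and switch.gd2 are among the cook/switch.* files
      listed earlier; check that directory if the icon does not appear.

```cfg
define hostextinfo {
        hostgroup_name   switches
        icon_image       cook/switch.png
        icon_image_alt   Cisco Switches
        vrml_image       switch.png
        statusmap_image  cook/switch.gd2
}
```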

PART VIII
Create Service Groups
-----------------------------------------------------------------------------

1. Create service groups for HTTP for your group's PCs.

   - The idea is to create groups of services for display; one for the
     HTTP servers in your own group, and one for the HTTP servers in the
     other group you are monitoring. To do this create a new file:

   # editor /etc/nagios3/conf.d/servicegroups.cfg

# My group (example is for group 1)
define servicegroup {
        servicegroup_name       group1-http
        alias                   group 1 HTTP services
        members                 localhost,HTTP,pc2,HTTP,pc3,HTTP,pc4,HTTP
        }

# Another group (example is for group 2)
define servicegroup {
        servicegroup_name       group2-http
        alias                   group 2 HTTP services
        members                 pc5,HTTP,pc6,HTTP,pc7,HTTP,pc8,HTTP
        }

   - Note that "members" is a comma-separated list of host,service pairs,
     and that "HTTP" needs to be uppercase as this is how the
     service_description is written in the file
     /etc/nagios3/conf.d/services_nagios2.cfg

   - Save your changes, verify your work and restart Nagios. Now if you click on
     the Servicegroup menu items in the Nagios web interface you should see
     this information grouped together.

   - If you like you can also create SSH service groups for each of the
     groups in the same way.

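     For instance, a minimal sketch of an SSH service group for group 2,
     which could go in the same servicegroups.cfg. The name "group2-ssh" is
     only a suggestion, and the entry assumes pc5 to pc8 are members of the
     "ssh-servers" hostgroup (as set up in Part V) so that each of those
     hosts actually has a service called "SSH".

```cfg
# Sketch: SSH service group for the group 2 PCs
define servicegroup {
        servicegroup_name       group2-ssh
        alias                   group 2 SSH services
        members                 pc5,SSH,pc6,SSH,pc7,SSH,pc8,SSH
        }
```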

PART IX
Configure Guest Access to the Nagios Web Interface
-----------------------------------------------------------------------------

1. Edit /etc/nagios3/cgi.cfg to give read-only guest user access to the Nagios
   web interface.

    - By default Nagios is configured to give full r/w access via the Nagios
      web interface to the user nagiosadmin. You can change the name of this
      user, add other users, change how you authenticate users, what users
      have access to what resources and more via the cgi.cfg file.

    - First, let's create a "guest" user and password in the htpasswd.users
      file.

    # htpasswd /etc/nagios3/htpasswd.users guest

      You can use any password you want (or none). A password of "guest" is
      not a bad choice.

    - Next, edit the file /etc/nagios3/cgi.cfg and look for what type of access
      has been given to the nagiosadmin user. By default you will see the following
      directives (note, there are comments between each directive):

      authorized_for_system_information=nagiosadmin
      authorized_for_configuration_information=nagiosadmin
      authorized_for_system_commands=nagiosadmin
      authorized_for_all_services=nagiosadmin
      authorized_for_all_hosts=nagiosadmin
      authorized_for_all_service_commands=nagiosadmin
      authorized_for_all_host_commands=nagiosadmin

      Now let's tell Nagios to allow the "guest" user some access to
      information via the web interface. You can choose whatever you would
      like, but what is pretty typical is this:

      authorized_for_system_information=nagiosadmin,guest
      authorized_for_configuration_information=nagiosadmin,guest
      authorized_for_system_commands=nagiosadmin
      authorized_for_all_services=nagiosadmin,guest
      authorized_for_all_hosts=nagiosadmin,guest
      authorized_for_all_service_commands=nagiosadmin
      authorized_for_all_host_commands=nagiosadmin

    - Once you make the changes, save the file cgi.cfg, verify your
      work and restart Nagios.

    - To see if you can log in as the "guest" user you may need to clear
      the cookies in your web browser. The web interface will look the
      same as before. The difference is that a number of items that
      are available via the web interface (forcing a service/host check,
      scheduling checks, comments, etc.) will not work for the guest
      user.


OPTIONAL
--------

You can now look at configuring different plugins for monitoring
services.

*    As opposed to just checking that a web server is running on the
     classroom PCs, you could also check that the nagios3 service is
     available, by requesting the /nagios3/ path. This means passing
     extra options to the check_http plugin.

     For a description of the available options, type this:

      # /usr/lib/nagios/plugins/check_http
      # /usr/lib/nagios/plugins/check_http --help

     and of course you can browse the online nagios documentation or google
     for information on check_http. You can even run the plugin by hand to
     perform a one-shot service check:

     # /usr/lib/nagios/plugins/check_http -H localhost -u /nagios3/

     So the goal is to configure nagios to call check_http in this way.

There is no suitable command definition available, so we need to create one.

    # editor /etc/nagios-plugins/config/local.cfg

define command{
        command_name    check_http_arg
        command_line    /usr/lib/nagios/plugins/check_http -H '$HOSTADDRESS$' $ARG1$
        }

    # editor /etc/nagios3/conf.d/services_nagios2.cfg

define service {
        hostgroup_name                  nagios-servers
        service_description             NAGIOS
        check_command                   check_http_arg!-u /nagios3/
        use                             generic-service
}

     and of course you'll need to create a hostgroup called nagios-servers (in
     hostgroups_nagios2.cfg) to link to this service check.

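     A minimal sketch of such a hostgroup, assuming you are in group 1 and
     want to check your own PC and your three neighbours. The alias is
     arbitrary and the member list is only an example.

```cfg
# Sketch: hosts whose /nagios3/ page should be checked
define hostgroup {
        hostgroup_name  nagios-servers
        alias           Nagios servers
        members         localhost,pc2,pc3,pc4
}
```
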
     Once you have done this, check that Nagios warns you about failing
     authentication (because it's trying to fetch the page without providing
     the username/password). There's an extra parameter you can pass to
     check_http_arg to provide that info; see if you can find it.

      WARNING: in the tradition of "Debian Knows Best", their definition of the
      check_http command in /etc/nagios-plugins/config/http.cfg
      is *not* the same as that recommended in the nagios3 documentation.
      It is missing $ARG1$, so any parameters to pass to check_http are
      ignored. So you might think you are monitoring /nagios3/ but actually
      you are monitoring root!

     This is why we had to make a new command definition "check_http_arg".
     You could make a more specific one like "check_nagios", or you could
     modify the Ubuntu check_http definition to fit the standard usage.

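     A minimal sketch of the "more specific" option, which could go in the
     same local.cfg. The command name "check_nagios_web" is made up for this
     example (chosen so that it does not clash with an existing command or
     plugin name). A service would then use "check_command check_nagios_web"
     with no arguments.

```cfg
# Sketch: a more specific command with the /nagios3/ path hard-coded.
# The name 'check_nagios_web' is hypothetical.
define command{
        command_name    check_nagios_web
        command_line    /usr/lib/nagios/plugins/check_http -H '$HOSTADDRESS$' -u /nagios3/
        }
```
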
* Check that SNMP is running on the classroom NOC

    - First you will need to add in the appropriate service check for SNMP in the file
      /etc/nagios3/conf.d/services_nagios2.cfg. This is where Nagios is impressive. There
      are hundreds, if not thousands, of service checks available via the various Nagios
      sites on the web. You can see what plugins are installed by Ubuntu in the nagios3
      package that we've installed by looking in the following directory:

    # ls /usr/lib/nagios/plugins

      As you'll see there is already a check_snmp plugin available to us. If you are
      interested in the options the plugin takes you can execute the plugin from the
      command line by typing:

    # /usr/lib/nagios/plugins/check_snmp
    # /usr/lib/nagios/plugins/check_snmp --help

      to see what options are available, etc. You can use the check_snmp plugin and
      Nagios to create very complex or specific system checks.

    - Now to see all the various service/host checks that have been created using the
      check_snmp plugin you can look in /etc/nagios-plugins/config/snmp.cfg. You will
      see that there are a lot of preconfigured checks using snmp, including:

      snmp_load
      snmp_cpustats
      snmp_procname
      snmp_disk
      snmp_mem
      snmp_swap
      snmp_procs
      snmp_users
      snmp_mem2
      snmp_swap2
      snmp_mem3
      snmp_swap3
      snmp_disk2
      snmp_tcpopen
      snmp_tcpstats
      snmp_bgpstate
      check_netapp_uptime
      check_netapp_cpuload
      check_netapp_numdisks
      check_compaq_thermalCondition

      And, even better, you can create additional service checks quite easily.
      For the case of verifying that snmpd (the SNMP service on Linux) is running we
      need to ask SNMP a question. If we don't get an answer, then Nagios can assume
      that the SNMP service is down on that host. When you use service checks such as
      check_http, check_ssh and check_telnet this is what they are doing as well.

    - In our case, let's create a new service check and call it "check_system". This
      service check will connect with the specified host, use the private community
      string we have defined in class and ask a question of snmp on that host - in this
      case we'll ask about the System Description, or the OID "sysDescr.0".

    - To do this start by editing the file /etc/nagios-plugins/config/snmp.cfg:

    # editor /etc/nagios-plugins/config/snmp.cfg

      At the top (or the bottom, your choice) add the following entry to the file:

# 'check_system' command definition
define command{
       command_name    check_system
       command_line    /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -C '$ARG1$' -o sysDescr.0
        }

      You may wish to copy and paste this rather than trying to type it out.

      Note that "command_line" is a single line. If you copy and paste in joe the line
      may not wrap properly and you may have to manually add the part:

                        '$ARG1$' -o sysDescr.0

      to the end of the line.

    - Now you need to edit the file /etc/nagios3/conf.d/services_nagios2.cfg and add
      in this service check. We'll run this check against the hostgroup
      "snmp-servers", which we will create in a moment.

    # editor /etc/nagios3/conf.d/services_nagios2.cfg

      At the bottom of the file add the following definition:

# check that snmp is up on all servers
define service {
        hostgroup_name                  snmp-servers
        service_description             SNMP
        check_command                   check_system!xxxxxx
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}

      The "xxxxxx" is the community string previously (or to be) defined in class.

      Note that we have included our private community string here rather than
      hard-coding it in the snmp.cfg file earlier. You must change the "xxxxxx" to be
      the snmp community string given in class or this check will not work.

    - Now we must create the "snmp-servers" group in our hostgroups_nagios2.cfg file.
      Edit the file /etc/nagios3/conf.d/hostgroups_nagios2.cfg and go to the end of the
      file. Add in the following hostgroup definition:

# A list of snmp-enabled devices on which we wish to run the snmp service check
define hostgroup {
           hostgroup_name       snmp-servers
                   alias        snmp servers
                   members      noc
          }

      Note that for "members" you could also add in the switches and routers for
      group 1 and 2. But the particular item (OID) we are checking for, "sysDescr.0",
      may not be available on the switches and/or routers, so the check would then fail.

    - Now verify that your changes are correct and restart Nagios.

    - If you click on the Service Detail menu choice in the web interface you should see
      the SNMP check appear for the noc host.

    - After we do the SNMP presentation and exercises in class, then you could come
      back to this exercise and add in all the classroom PCs to the members list in the
      hostgroups_nagios2.cfg file, snmp-servers hostgroup definition. Remember to list
      your PC as "localhost".

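      A sketch of how the hostgroup might look after that later step,
      assuming you are in group 1 and that pc2 to pc4 are answering SNMP with
      the class community string. Adjust the member list to the PCs you are
      actually monitoring.

```cfg
# Sketch: snmp-servers once the classroom PCs are answering SNMP
define hostgroup {
        hostgroup_name  snmp-servers
        alias           snmp servers
        members         localhost,noc,pc2,pc3,pc4
}
```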