Agenda: exercises-nagios.txt

Nagios Installation and Configuration

Notes:
------
* Commands preceded with "$" imply that you should execute the command as
  a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands preceded with a more specific prompt (e.g. "RTR-GW>" or "mysql>")
  imply that you are executing commands on remote equipment, or within
  another program.

Exercises
---------

Exercises Part I
----------------

0. Log in to your PC or open a terminal window as the sysadm user.

1. You may need to install Nagios version 3. Do this either as root, or as
   the sysadm user with the "sudo" command. As sysadm:

   $ sudo apt-get install nagios3

   Unless you already have an MTA installed, nagios3 will install
   postfix as a dependency. Select the "Internet Site" option. (If you
   wanted to use a different MTA, you would install it before nagios3.)

   You will be prompted for the nagiosadmin password. Give it the normal
   workshop password.

   To get the documentation in /usr/share/doc/nagios3-doc/html/ (which
   can also be read via the Nagios web interface), do:

    $ sudo apt-get install nagios3-doc


2. Look at the file which contains the password. The password is stored
   hashed, not in plain text:

    $ cat /etc/nagios3/htpasswd.users


3. You should already have a working Nagios!

    - Open a browser, and go to

    http://pcN.ws.nsrc.org/nagios3/

        where N is the number of your PC. Check with the instructor or
        your neighbor if you are in doubt.

    - At the login prompt, log in as:

        user: nagiosadmin
        pass: (the password you set in step 1)

    Browse to the "Host Detail" page to see what's already configured.


4. Let's look at the configuration layout. But first, let's become the root
   user on your machine:

    $ sudo bash

    # cd /etc/nagios3
    # ls -l

    -rw-r--r-- 1 root root    1882 2008-12-18 13:42 apache2.conf
    -rw-r--r-- 1 root root   10524 2008-12-18 13:44 cgi.cfg
    -rw-r--r-- 1 root root    2429 2008-12-18 13:44 commands.cfg
    drwxr-xr-x 2 root root    4096 2009-02-14 12:33 conf.d
    -rw-r--r-- 1 root root      26 2009-02-14 12:36 htpasswd.users
    -rw-r--r-- 1 root root   42539 2008-12-18 13:44 nagios.cfg
    -rw-r----- 1 root nagios  1293 2008-12-18 13:42 resource.cfg
    drwxr-xr-x 2 root root    4096 2009-02-14 12:32 stylesheets

    # cd conf.d
    # ls -l

    -rw-r--r-- 1 root root 1695 2008-12-18 13:42 contacts_nagios2.cfg
    -rw-r--r-- 1 root root  418 2008-12-18 13:42 extinfo_nagios2.cfg
    -rw-r--r-- 1 root root 1152 2008-12-18 13:42 generic-host_nagios2.cfg
    -rw-r--r-- 1 root root 1803 2008-12-18 13:42 generic-service_nagios2.cfg
    -rw-r--r-- 1 root root  210 2009-02-14 12:33 host-gateway_nagios3.cfg
    -rw-r--r-- 1 root root  976 2008-12-18 13:42 hostgroups_nagios2.cfg
    -rw-r--r-- 1 root root 2167 2008-12-18 13:42 localhost_nagios2.cfg
    -rw-r--r-- 1 root root 1005 2008-12-18 13:42 services_nagios2.cfg
    -rw-r--r-- 1 root root 1609 2008-12-18 13:42 timeperiods_nagios2.cfg

    Notice that the package installs files with "nagios2" in their name.
    This is because they are the same files as were used for the Nagios
    version 2 Debian package. However, there was a change made to the
    host-gateway configuration file, so this has a new name.


5. You have a config which is already monitoring your own system
(localhost_nagios2.cfg) and your upstream default gateway
(host-gateway_nagios3.cfg).

Have a look at the config file for the default gateway: it's very simple.
(Note: tab completion is useful here. Type "cat host-g" then hit tab; the
filename will be filled in for you.)

    # cat host-gateway_nagios3.cfg

    # a host definition for the gateway of the default route
    define host {
            host_name   gateway
            alias       Default Gateway
            address     10.10.0.254
            use         generic-host
            }



PART II
Configuring Equipment
-----------------------------------------------------------------------------

0. Order of configuration

Conceptually we will build our configuration files starting with the
"nearest" devices and then moving on to those further away.

By going in this order you will have defined the devices that act as parents
before you define the devices that depend on them.

The classroom GW router is already defined (10.10.0.254).

1. First we need to tell Nagios to monitor the gateway for the router instances,
   which is 10.10.254.254 or gw-254.ws.nsrc.org.

   # cd /etc/nagios3/conf.d/

Create the routers gateway like this:

   # editor routers.cfg

define host {
    use         generic-host
    host_name   gw-254
    alias       Routers Gateway
    address     10.10.254.254
    parents     gateway
}

Save the file and exit.

*NOTE* - "gateway" is the same machine as gw.ws.nsrc.org. Nagios has simply given
this machine the name "gateway". While it's nice to have the host_name match the
name in DNS, it is not strictly necessary. This will be our only exception.


2. The final parent we have in our network is our backbone switch. Create
   a file called switches.cfg and add an entry for this item:

   # editor switches.cfg

define host {
    use         generic-host
    host_name   sw
    alias       Backbone Switch
    address     10.10.0.253
    parents     gateway
}

At this point Nagios is configured to monitor whether our core hosts (the parents)
are up on our classroom network. Your next steps are to add in the individual hosts
such as the classroom virtual PC images (pc1 to pc26), the Wireless Access Points
(ap1 and ap2), the virtual router images (r1 through r26) and the classroom noc
host.

Be sure you add in a proper "parents" entry for each host.

To understand the parent relationship in our network review the logical
network diagram located here:

        http://noc.ws.nsrc.org/wiki/wiki/NetworkDiagram

Note the Nagios parent bullet points:

Nagios Parent Relationships

Parents are "gw", "sw" and "gw-254". The parent relations are:

    * gw is the parent of sw and gw-254
    * gw-254 is the parent of r1 through r26
    * sw is the parent of s0, s1, s2, ap1, ap2, noc and pc1 through pc26

(In our Nagios configuration, "gw" is the host named "gateway".)


STEPS 2a - 2c SHOULD BE REPEATED WHENEVER YOU UPDATE THE CONFIGURATION!


2a. Verify that your configuration files are OK:

    # nagios3 -v /etc/nagios3/nagios.cfg

    You should get output similar to the following (the host names in the
    warnings will be those of the hosts you have defined):

Warning: Host 'bb-sw' has no services associated with it!
Warning: Host 'bb-gw' has no services associated with it!
...
Total Warnings: 2
Total Errors:   0

Things look okay - No serious problems were detected during the check.

Nagios is saying that it's unusual to monitor a device just for its
existence on the network, without also monitoring some service.


2b. Reload/Restart Nagios

    # /etc/init.d/nagios3 restart

The "restart" option is not always 100% reliable, due to a bug in the Nagios
init script. To be sure, you may want to get used to doing:

    # /etc/init.d/nagios3 stop
    # /etc/init.d/nagios3 start


2c. Go to the web interface (http://pcN.ws.nsrc.org/nagios3) and check that the hosts
   you just added are now visible in the interface. Click on the "Host Detail" item
   on the left of the Nagios screen to see this. You may see it in "PENDING"
   status until the check is carried out.


HINT: You will be doing this a lot. If you do it all on one line, like this,
then you can hit cursor-up and rerun all in one go:

    nagios3 -v /etc/nagios3/nagios.cfg && /etc/init.d/nagios3 restart

The '&&' ensures that the restart only happens if the config is valid.
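
If you prefer the more reliable stop/start sequence from step 2b, the same
idea can be wrapped in a small shell function. This is only a sketch: the
function name check_and_restart and the variables NAGIOS_BIN, NAGIOS_CFG and
NAGIOS_INIT are made up for this example and are not part of Nagios. The
defaults are the paths used in this exercise.

```shell
# Hypothetical helper: verify the configuration, then stop and start Nagios.
# Each step only runs if the previous one succeeded.
check_and_restart() {
    "${NAGIOS_BIN:-nagios3}" -v "${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}" &&
        "${NAGIOS_INIT:-/etc/init.d/nagios3}" stop &&
        "${NAGIOS_INIT:-/etc/init.d/nagios3}" start
}
```

Define it in your root shell, then run "check_and_restart" after each
configuration change.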


3. Create entries for the PCs in the classroom

Now that we have our routers and switches defined it is quite easy to create
entries for all our PCs. Think about the parent relationships.

Remember, if you do not understand the parent relationship refer back to the
classroom network diagram here:

        http://noc.ws.nsrc.org/wiki/wiki/NetworkDiagram

Below are three sample entries: one for the NOC, one for pc1 and one for
pc26. You should be able to use these examples to create entries for all
classroom PCs plus the NOC.

We could put these entries into separate files, but as our network is small
we'll use a single file called pcs.cfg.

NOTE! You do not add in an entry for your own PC or router. This has already
been defined in the file /etc/nagios3/conf.d/localhost_nagios2.cfg. This
definition is what defines the Nagios network viewpoint. So, when you come to
the spot where you might add an entry for your PC you should skip this and go
on to the next PC in the list.

        # editor pcs.cfg

# Our classroom NOC

define host {
    use         generic-host
    host_name   noc
    alias       Workshop NOC machine
    address     10.10.0.250
    parents     sw
}

# PCs

define host {
    use         generic-host
    host_name   pc1
    alias       pc1
    address     10.10.0.1
    parents     sw
}

define host {
    use         generic-host
    host_name   pc26
    alias       pc26
    address     10.10.0.26
    parents     sw
}

Take the three entries above and expand them to create the remaining
entries for all active PCs. That is, fill in entries for PCs 2 through 25
(remember to skip your PC).


Save the file pcs.cfg and exit.

As before, repeat steps 2a-2c to verify your configuration, correct any
errors, and activate it.
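
Typing two dozen near-identical entries by hand is error-prone, so you may
prefer to generate them. The sketch below assumes the pattern shown above
(pcN has address 10.10.0.N and parent sw). gen_hosts is a hypothetical helper
written for this exercise, not a Nagios tool, and pc3 merely stands in for
"your own PC".

```shell
# gen_hosts PREFIX NET PARENT FIRST LAST [SKIP]
# Print one "define host" block per number from FIRST to LAST, leaving out SKIP.
gen_hosts() {
    prefix=$1 net=$2 parent=$3 i=$4 last=$5 skip=${6:-}
    while [ "$i" -le "$last" ]; do
        if [ "$i" != "$skip" ]; then
            printf 'define host {\n'
            printf '    use         generic-host\n'
            printf '    host_name   %s%s\n' "$prefix" "$i"
            printf '    alias       %s%s\n' "$prefix" "$i"
            printf '    address     %s.%s\n' "$net" "$i"
            printf '    parents     %s\n' "$parent"
            printf '}\n\n'
        fi
        i=$((i + 1))
    done
}

# Example: pc2 through pc25, skipping pc3 (pretending pc3 is your own PC).
gen_hosts pc 10.10.0 sw 2 25 3
```

Look over the output first. When you are happy with it, append it to the
file from within /etc/nagios3/conf.d, for example with
"gen_hosts pc 10.10.0 sw 2 25 3 >> pcs.cfg", and then run steps 2a-2c.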



4. Now configure Nagios to start monitoring the multiple router instances
   we have available. These are from 10.10.254.1 through 10.10.254.26.

Let's create the first router in our file called routers.cfg. Add this entry
to the bottom of the file:

define host {
    use         generic-host
    host_name   r1
    alias       router 1
    address     10.10.254.1
    parents     gw-254
}

Now create the remaining routers 2-26. Or, just create a few if you don't
want to spend too long on this particular part of the exercise. But remember
which router instances you have defined!

Second router:

define host {
    use         generic-host
    host_name   r2
    alias       router 2
    address     10.10.254.2
    parents     gw-254
}

Repeat this until router number 26:

define host {
    use         generic-host
    host_name   r26
    alias       router 26
    address     10.10.254.26
    parents     gw-254
}

Save the file, then repeat steps 2a-2c to verify and activate the new
configuration.
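
To check which router instances you have actually defined, you can list the
host_name values in a file. This is a minimal sketch: defined_hosts is a
hypothetical helper name, and it assumes the files you pass contain only host
definitions with one host_name per line, as routers.cfg, switches.cfg and
pcs.cfg do in this exercise.

```shell
# defined_hosts FILE...
# Print the value of every "host_name" directive found in the given files.
defined_hosts() {
    awk '$1 == "host_name" { print $2 }' "$@"
}
```

For example: "defined_hosts /etc/nagios3/conf.d/routers.cfg". The list is
handy later on, when you may only add hosts that are already defined to a
hostgroup.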


5. Look at your Nagios instance on the web. Note that "Status Map" gives
you a graphical view of the parent-child relationships you have defined.


PART III
Configure Service check for the classroom NOC
-----------------------------------------------------------------------------

0. Configuring

Now that we have our hardware configured we can start telling Nagios what services to monitor
on the configured hardware, how to group the hardware in interesting ways, how to group
services, etc.

1. Associate a service check with our classroom NOC

    # joe hostgroups_nagios2.cfg

    - Find the hostgroup named "ssh-servers". In the members section of the definition
      change the line:

members                 localhost

    to

members                 localhost,noc

Save the file and exit.

Verify that your changes are OK:

        # nagios3 -v /etc/nagios3/nagios.cfg

Restart Nagios to see the new service association with your host:

        # /etc/init.d/nagios3 restart

Click on the "Service Detail" link in the Nagios web interface to see your new entry.


PART IV
Defining Services for all PCs
-----------------------------------------------------------------------------

0. For services, the default normal_check_interval is 5 (minutes) in
   generic-service_nagios2.cfg. You may wish to change this to 1 to speed up
   how quickly service issues are detected, at least in the workshop.

1. Determine what services to define for what devices

   - This is core to how you use Nagios and network monitoring tools in
     general. So far we are simply using ping to verify that physical hosts
     are up on our network and we have started monitoring a single service on
     a single host (your PC). The next step is to decide what services you wish
     to monitor for each host in the classroom.

   - In this particular class we have:

     routers:  running ssh and snmp
     switches: running telnet and possibly ssh as well as snmp
     pcs:      All PCs are running ssh and http and should be running snmp
               The NOC is currently running an snmp daemon

     So, let's configure Nagios to check for these services for these
     devices.

2. Verify that SSH is running on the routers and workshop PC images

   - In the file services_nagios2.cfg there is already an entry for the SSH
     service check, so you do not need to create one. Instead, you
     simply need to re-define the "ssh-servers" entry in the file
     /etc/nagios3/conf.d/hostgroups_nagios2.cfg. The entry in the file
     currently looks like:

# A list of your ssh-accessible servers
define hostgroup {
        hostgroup_name  ssh-servers
        alias           SSH servers
        members         localhost,noc
        }

     What do you think you should change? Correct, the "members" line. You should
     add in entries for all the classroom pcs, routers and the switches that run ssh.
     With this information and the network diagram you should be able to complete this entry.

     The entry will look something like this:

define hostgroup {
        hostgroup_name  ssh-servers
        alias           SSH servers
        members         localhost,pc1,pc2,pc3,pc4....,noc,ap1,ap2,r1,r2,r3....
        }

         Note: leave in "localhost" - This is your PC and represents Nagios' network point of
         view. So, for instance, if you are on "pc3" you would not include "pc3" in the list
         of all the classroom pcs as it is represented by the "localhost" entry.

         The "members" entry will be a long line and will likely wrap on the screen.

         Remember to include all your PCs and all your routers that you have defined. Do not
         include any entries if they are not already defined in pcs.cfg, switches.cfg or
         routers.cfg.

    - Once you are done, run the pre-flight check:

    # nagios3 -v /etc/nagios3/nagios.cfg

    If everything looks good, then restart Nagios

    # /etc/init.d/nagios3 stop
    # /etc/init.d/nagios3 start

    and view your changes in the Nagios web interface.
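
The long "members" line is easy to mistype, so here is one way to generate
it. This is a sketch only: member_list is a hypothetical helper, and the
example assumes you are sitting at pc3 and have defined r1 through r4.
Adjust the ranges so that they match the hosts you really defined.

```shell
# member_list PREFIX FIRST LAST [SKIP]
# Print a comma-separated list such as "pc1,pc2,pc4", leaving out SKIP.
member_list() {
    prefix=$1 i=$2 last=$3 skip=${4:-} list=
    while [ "$i" -le "$last" ]; do
        if [ "$i" != "$skip" ]; then
            list="${list:+$list,}$prefix$i"
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Example for someone at pc3 who has defined r1 through r4:
printf 'members         localhost,%s,noc,%s\n' \
    "$(member_list pc 1 26 3)" "$(member_list r 1 4)"
```

The result is printed as a single line, which you can paste into the
hostgroup definition.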

3. Check that http is running on all the classroom PCs.

    - This is almost identical to the previous exercise. Just make the same change
      for the HTTP service, adding in each PC (no routers or switches). Remember, you
      don't need to add your machine as it is already defined as "localhost".

4. OPTIONAL EXTRA: as opposed to just checking that a web server is
     running on the classroom PCs, you could also check that the nagios3
     service is available, by requesting the /nagios3/ path. This means
     passing extra options to the check_http plugin.

     For a description of the available options, type this:

      # /usr/lib/nagios/plugins/check_http
      # /usr/lib/nagios/plugins/check_http --help

     and of course you can browse the online Nagios documentation or search the
     web for information on check_http. You can even run the plugin by hand to
     perform a one-shot service check:

     # /usr/lib/nagios/plugins/check_http -H localhost -u /nagios3/

     So the goal is to configure Nagios to call check_http in this way.

define command{
        command_name    check_http_arg
        command_line    /usr/lib/nagios/plugins/check_http -H '$HOSTADDRESS$' $ARG1$
        }

define service {
        hostgroup_name                  nagios-servers
        service_description             NAGIOS
        check_command                   check_http_arg!-u /nagios3/
        use                             generic-service
}

     You will also need to create a hostgroup called nagios-servers to
     link to this service check.

     Once you have done this, check that Nagios warns you about failing
     authentication (because it's trying to fetch the page without providing
     the username/password). There's an extra parameter you can pass to
     check_http_arg to provide that info; see if you can find it.

      WARNING: in the tradition of "Debian Knows Best", their definition of the
      check_http command in /etc/nagios-plugins/config/http.cfg
      is *not* the same as that recommended in the nagios3 documentation.
      It is missing $ARG1$, so any parameters to pass to check_http are
      ignored. So you might think you are monitoring /nagios3/ but actually
      you are monitoring the root page ("/")!

     This is why we had to make a new command definition "check_http_arg".
     You could make a more specific one like "check_nagios", or you could
     modify the Debian/Ubuntu check_http definition to fit the standard usage.
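
To see why a missing $ARG1$ matters, it helps to picture how the command line
is built from the check_command. The sketch below only imitates the
substitution with sed. expand_macros is a hypothetical helper for
illustration, it handles just these two macros, and it is not how Nagios is
implemented internally.

```shell
# expand_macros TEMPLATE HOSTADDRESS ARG1
# Replace $HOSTADDRESS$ and $ARG1$ in TEMPLATE with the given values.
# Simplification: values containing "|", "&" or "\" are not handled.
expand_macros() {
    printf '%s\n' "$1" |
        sed -e 's|\$HOSTADDRESS\$|'"$2"'|g' -e 's|\$ARG1\$|'"$3"'|g'
}

# check_http_arg with the argument "-u /nagios3/", for a host at 10.10.0.250:
expand_macros "/usr/lib/nagios/plugins/check_http -H '\$HOSTADDRESS\$' \$ARG1\$" \
    10.10.0.250 '-u /nagios3/'

# A command_line without $ARG1$ has nowhere to put the argument:
expand_macros "/usr/lib/nagios/plugins/check_http -H '\$HOSTADDRESS\$'" \
    10.10.0.250 '-u /nagios3/'
```

The first call prints a command that requests /nagios3/. The second prints a
command with no -u option at all, which is the situation the WARNING above
describes.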



PART V
Create More Host Groups
-----------------------------------------------------------------------------

0. In the web view, look at the pages "Hostgroup Overview", "Hostgroup
   Summary", "Hostgroup Grid". This gives a convenient way to group together
   hosts which are related (e.g. in the same site, serving the same purpose).

1. Update /etc/nagios3/conf.d/hostgroups_nagios2.cfg

    - For the following exercises it will be very useful if we have created
      or updated the following hostgroups:

      debian-servers
      routers
      switches

      If you edit the file /etc/nagios3/conf.d/hostgroups_nagios2.cfg you
      will see an entry for debian-servers that just contains localhost.
      Update this entry to include all the classroom PCs, including the
      noc (this assumes that you created a "noc" entry in your pcs.cfg
      file). Remember to skip your PC entry as it is represented by the
      localhost entry.

    # editor /etc/nagios3/conf.d/hostgroups_nagios2.cfg

     Update the entry that says:

# A list of your Debian GNU/Linux servers
define hostgroup {
        hostgroup_name  debian-servers
        alias           Debian GNU/Linux Servers
        members         localhost
        }

      so that the "members" parameter contains something like this. Use your
      classroom network diagram to confirm the exact number of machines and names
      in your workshop.

                members         localhost,pc1,pc2,pc3,pc4,pc5,pc6,pc7,pc8,pc9,
                                pc10,pc11,pc12,pc13,pc14,pc15,pc16,pc17,pc18,
                                pc19,pc20,pc21,pc22,pc23,pc24,pc25,pc26,noc

        The value is split over three lines here only for readability. In the
        file it must be one long line (it may wrap on screen), not separate
        lines. Otherwise you will get an error when you go to restart Nagios.
        Remember that your own PC is "localhost".

      - Once you have done this, add in two more host groups, one for routers and
        one for switches. Call these entries "routers" and "switches".

      - When you are done be sure to verify your work and restart Nagios.

2. Go back to the web interface and look at your new hostgroups.
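
If you want a starting point for the "routers" and "switches" entries, the
sketch below prints hostgroup definitions in the same layout as the existing
file. gen_hostgroup is a hypothetical helper, and the alias texts and member
lists are only examples. Use the hosts you actually defined in routers.cfg
and switches.cfg.

```shell
# gen_hostgroup NAME ALIAS MEMBERS
# Print a hostgroup definition with the given name, alias and member list.
gen_hostgroup() {
    printf 'define hostgroup {\n'
    printf '        hostgroup_name  %s\n' "$1"
    printf '        alias           %s\n' "$2"
    printf '        members         %s\n' "$3"
    printf '        }\n'
}

# Example values only:
gen_hostgroup routers "Classroom routers" r1,r2
gen_hostgroup switches "Classroom switches" sw
```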


PART VI
Extended Host Information ("making your graphs pretty")
-----------------------------------------------------------------------------

1. Update extinfo_nagios2.cfg

    - If you would like to use appropriate icons for your defined hosts in
      Nagios this is where you do this. We have three types of devices:

      Cisco routers
      Cisco switches
      Ubuntu servers

      There is a fairly large repository of icon images available for you to
      use located here:

      /usr/share/nagios/htdocs/images/logos/

      These were installed by default as dependent packages of the nagios3
      package in Ubuntu. In some cases you can find model-specific icons for
      your hardware, but to make things simpler we will use the following
      icons for our hardware:

      /usr/share/nagios/htdocs/images/logos/base/debian.*
      /usr/share/nagios/htdocs/images/logos/cook/router.*
      /usr/share/nagios/htdocs/images/logos/cook/switch.*

    - The next step is to edit the file /etc/nagios3/conf.d/extinfo_nagios2.cfg
      and tell Nagios what image you would like to use to represent your devices.

    # editor /etc/nagios3/conf.d/extinfo_nagios2.cfg

      Here is what an entry for your routers looks like (there is already an entry
      for debian-servers that will work as is). Note that the router model (3600)
      is not all that important. The image used represents a router in general.

define hostextinfo {
        hostgroup_name   routers
        icon_image       cook/router.png
        icon_image_alt   Cisco Routers (3600)
        vrml_image       router.png
        statusmap_image  cook/router.gd2
}

      Now add an entry for your switches. Once you are done check your
      work and restart Nagios. Take a look at the Status Map in the web interface.
      It should be much nicer, with real icons instead of question marks.


PART VII
Create Service Groups
-----------------------------------------------------------------------------

1. Create service groups for ssh and http for each set of pcs.

   - The idea here is to split the classroom PCs into groups and to create
     one service group for each group of PCs. We want to see these PCs grouped
     together, including the status of their ssh and http services. To do this
     create and edit the file:

   # joe /etc/nagios3/conf.d/servicegroups.cfg

     Here is a sample of the service group for group 1:

define servicegroup {
        servicegroup_name       group1-servers
        alias                   group 1 servers
        members                 pc1,SSH,pc1,HTTP,pc2,SSH,pc2,HTTP,pc3,SSH,pc3,HTTP,pc4,SSH,pc4,HTTP
        }

        - Note that the members line is a list of host,service pairs. It should
          wrap and not be on two lines.

        - Note that "SSH" and "HTTP" need to be uppercase as this is how the service_description is
          written in the file /etc/nagios3/conf.d/services_nagios2.cfg

        - You should create an entry for the other groups of servers too.

    - Save your changes, verify your work and restart Nagios. Now if you click on
      the Servicegroup menu items in the Nagios web interface you should see
      this information grouped together.
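
The members line for the other groups follows the same host,service pattern,
so it can be generated. This is a sketch under the assumption that the next
group covers pc5 through pc8. sg_members is a hypothetical helper name.

```shell
# sg_members FIRST LAST
# Print "pcN,SSH,pcN,HTTP" pairs for pcFIRST through pcLAST as one line.
sg_members() {
    i=$1 last=$2 list=
    while [ "$i" -le "$last" ]; do
        list="${list:+$list,}pc$i,SSH,pc$i,HTTP"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Example: members for a group made up of pc5 to pc8.
sg_members 5 8
```

If your own PC falls inside the range, replace its name with "localhost" in
the output, since that is the name under which Nagios knows it.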



PART VIII
Configure Guest Access to the Nagios Web Interface
-----------------------------------------------------------------------------

1. Edit /etc/nagios3/cgi.cfg to give read-only guest user access to the Nagios
   web interface.

    - By default Nagios is configured to give full r/w access via the Nagios
      web interface to the user nagiosadmin. You can change the name of this
      user, add other users, change how you authenticate users, what users
      have access to what resources and more via the cgi.cfg file.

    - First, let's create a "guest" user and password in the htpasswd.users
      file.

    # htpasswd /etc/nagios3/htpasswd.users guest

      You can use any password you want (or none). A password of "guest" is
      not a bad choice.

    - Next, edit the file /etc/nagios3/cgi.cfg and look for what type of access
      has been given to the nagiosadmin user. By default you will see the following
      directives (note, there are comments between each directive):

      authorized_for_system_information=nagiosadmin
      authorized_for_configuration_information=nagiosadmin
      authorized_for_system_commands=nagiosadmin
      authorized_for_all_services=nagiosadmin
      authorized_for_all_hosts=nagiosadmin
      authorized_for_all_service_commands=nagiosadmin
      authorized_for_all_host_commands=nagiosadmin

      Now let's tell Nagios to allow the "guest" user some access to
      information via the web interface. You can choose whatever you would
      like, but what is pretty typical is this:

      authorized_for_system_information=nagiosadmin,guest
      authorized_for_configuration_information=nagiosadmin,guest
      authorized_for_system_commands=nagiosadmin
      authorized_for_all_services=nagiosadmin,guest
      authorized_for_all_hosts=nagiosadmin,guest
      authorized_for_all_service_commands=nagiosadmin
      authorized_for_all_host_commands=nagiosadmin

    - Once you make the changes, save the file cgi.cfg, verify your
      work and restart Nagios.

    - To see if you can log in as the "guest" user you may need to clear
      the cookies in your web browser. You will not notice any difference
      in the web interface. The difference is that a number of items that
      are available via the web interface (forcing a service/host check,
      scheduling checks, comments, etc.) will not work for the guest
      user.
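
The four read-only directives can also be changed with sed rather than by
hand. This is a sketch, not a required step: add_guest is a hypothetical
helper, and it prints the modified file to standard output so that you can
inspect the result before replacing cgi.cfg.

```shell
# add_guest FILE
# Print FILE with ",guest" appended to the four read-only directives,
# unless "guest" is already present on that line.
add_guest() {
    sed -E \
        -e '/^authorized_for_(system_information|configuration_information|all_services|all_hosts)=/{' \
        -e '/guest/!s/$/,guest/' \
        -e '}' "$1"
}
```

For example, "add_guest /etc/nagios3/cgi.cfg > /tmp/cgi.cfg.new" followed by
"diff /etc/nagios3/cgi.cfg /tmp/cgi.cfg.new" shows exactly which lines would
change. The command directives are deliberately left alone, so guest stays
read-only.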


OPTIONAL
--------

* Check that SNMP is running on the classroom NOC

    - First you will need to add in the appropriate service check for SNMP in the file
      /etc/nagios3/conf.d/services_nagios2.cfg. This is where Nagios is impressive. There
      are hundreds, if not thousands, of service checks available via the various Nagios
      sites on the web. You can see what plugins are installed by Ubuntu in the nagios3
      package that we've installed by looking in the following directory:

    # ls /usr/lib/nagios/plugins

      As you'll see there is already a check_snmp plugin available to us. If you are
      interested in the options the plugin takes you can execute the plugin from the
      command line by typing:

    # /usr/lib/nagios/plugins/check_snmp
    # /usr/lib/nagios/plugins/check_snmp --help

      to see what options are available, etc. You can use the check_snmp plugin and
      Nagios to create very complex or specific system checks.

    - Now to see all the various service/host checks that have been created using the
      check_snmp plugin you can look in /etc/nagios-plugins/config/snmp.cfg. You will
      see that there are a lot of preconfigured checks using snmp, including:

      snmp_load
      snmp_cpustats
      snmp_procname
      snmp_disk
      snmp_mem
      snmp_swap
      snmp_procs
      snmp_users
      snmp_mem2
      snmp_swap2
      snmp_mem3
      snmp_swap3
      snmp_disk2
      snmp_tcpopen
      snmp_tcpstats
      snmp_bgpstate
      check_netapp_uptime
      check_netapp_cpuload
      check_netapp_numdisks
      check_compaq_thermalCondition

      And, even better, you can create additional service checks quite easily.
      For the case of verifying that snmpd (the SNMP service on Linux) is running we
      need to ask SNMP a question. If we don't get an answer, then Nagios can assume
      that the SNMP service is down on that host. When you use service checks such as
      check_http, check_ssh and check_telnet this is what they are doing as well.

    - In our case, let's create a new service check and call it "check_system". This
      service check will connect to the specified host, use the private community
      string we have defined in class and ask a question of SNMP on that host. In this
      case we'll ask about the System Description, or the OID "sysDescr.0".

    - To do this start by editing the file /etc/nagios-plugins/config/snmp.cfg:

    # joe /etc/nagios-plugins/config/snmp.cfg

      At the top (or the bottom, your choice) add the following entry to the file:

# 'check_system' command definition
define command{
       command_name    check_system
       command_line    /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -C '$ARG1$' -o sysDescr.0
        }

      You may wish to copy and paste this rather than typing it out.

          Note that "command_line" is a single line. If you copy and paste in joe the line
          may not wrap properly and you may have to manually add the part:

                        '$ARG1$' -o sysDescr.0

          to the end of the line.

    - Now edit the file /etc/nagios3/conf.d/services_nagios2.cfg and add
      in this service check. We'll run this check against the hosts in a new
      hostgroup, "snmp-servers", which we will create in the next step.

    # joe /etc/nagios3/conf.d/services_nagios2.cfg

      At the bottom of the file add the following definition:

# check that snmp is up on all servers
define service {
        hostgroup_name                  snmp-servers
        service_description             SNMP
        check_command                   check_system!xxxxxx
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}

      The "xxxxxx" is the community string previously (or to be) defined in class.

      Note that we have included our private community string here rather than
      hard-coding it in the snmp.cfg file earlier. You must change the "xxxxxx" to
      the SNMP community string given in class or this check will not work.

    - Now we must create the "snmp-servers" group in our hostgroups_nagios2.cfg file.
      Edit the file /etc/nagios3/conf.d/hostgroups_nagios2.cfg and go to the end of the
      file. Add in the following hostgroup definition:

# A list of snmp-enabled devices on which we wish to run the snmp service check
define hostgroup {
        hostgroup_name  snmp-servers
        alias           snmp servers
        members         noc
        }

        - Note that for "members" you could also add in the switches and routers for
          group 1 and 2. But the particular item (OID) we are checking for, "sysDescr.0",
          may not be available on the switches and/or routers, so the check would then fail.

    - Now verify that your changes are correct and restart Nagios.

    - If you click on the Service Detail menu choice in the web interface you should see
      the SNMP check appear for the noc host.

    - After we do the SNMP presentation and exercises in class, you could come
      back to this exercise and add all the classroom PCs to the members list of the
      snmp-servers hostgroup definition in the hostgroups_nagios2.cfg file. Remember
      to list your PC as "localhost".
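
Before relying on the new check in Nagios, you can run the plugin by hand
and look at its exit status. Nagios plugins report their result through the
exit status: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The
sketch below wraps that lookup. run_plugin is a hypothetical helper, and the
host name and community string in the commented example are placeholders.

```shell
# run_plugin COMMAND [ARGS...]
# Run a plugin, then print the state name that matches its exit status.
run_plugin() {
    status=0
    "$@" || status=$?
    case $status in
        0) echo "state: OK" ;;
        1) echo "state: WARNING" ;;
        2) echo "state: CRITICAL" ;;
        3) echo "state: UNKNOWN" ;;
        *) echo "state: unexpected exit status $status" ;;
    esac
}

# Placeholder invocation; replace xxxxxx with the class community string:
# run_plugin /usr/lib/nagios/plugins/check_snmp -H noc -C xxxxxx -o sysDescr.0
```

The plugin's own output line is printed first, followed by the state line.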