File exercises-nagios-rt.txt, 7.9 KB (added by hervey, 9 years ago)

Nagios+RT and Mailgate Exercises - TEXT

Registry Operations Curriculum
Nagios and Request Tracker Ticket Creation

Notes:
------
* Commands preceded with "$" imply that you should execute the command as
  a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>")
  imply that you are executing commands on remote equipment, or within
  another program.

Exercises
---------

Configuring RT and Nagios so that alerts from Nagios automatically
create tickets requires a few steps:

* Create a proper contact entry for Nagios in
  /etc/nagios3/conf.d/contacts_nagios2.cfg
* Create the proper command in Nagios to use the rt-mailgate
  interface. The command is defined in /etc/nagios3/commands.cfg

These next two items should already be done in RT if you have
finished the RT exercises.

* Install the rt-mailgate software and configure it properly
  in your /etc/aliases file for your MTA in use.
* Configure the appropriate queues in RT to receive emails
  passed to it from Nagios via the rt-mailgate software.
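
One quick way to confirm the rt-mailgate prerequisite is to look for the
alias in /etc/aliases. The sketch below assumes the alias is named "net"
(the address used later in these exercises). The helper name show_alias is
made up for illustration, and the entry shown in the comment is only an
example of what such a line typically looks like, so yours may differ.

```shell
# Print the alias entry for a given name, or fail if there is none.
# Usage: show_alias NAME [FILE]   (FILE defaults to /etc/aliases)
show_alias() {
    name="$1"
    file="${2:-/etc/aliases}"
    grep "^${name}:" "$file"
}

# A typical rt-mailgate entry might look like this (example only):
#   net: "|/usr/bin/rt-mailgate --queue net --action correspond --url http://localhost/rt/"
show_alias net || echo "no alias for net found"
```

If no entry is found, go back to the RT exercises before continuing here.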

Exercises
---------

0.) Log in to your PC or open a terminal window as the tldadmin user.


1.) Configure a Contact in Nagios
---------------------------------

   - Edit the file /etc/nagios3/conf.d/contacts_nagios2.cfg

   # vi /etc/nagios3/conf.d/contacts_nagios2.cfg

   - In this file we will first add a new contact name under
     the default root contact entry. The new contact should
     look like this:

define contact{
        contact_name                    net
        alias                           RT Alert Queue
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    c
        host_notification_options       d
        service_notification_commands   notify-service-ticket-by-email
        host_notification_commands      notify-host-ticket-by-email
        email                           net@localhost
        }


   - The service_notification_options value of "c" means notify only when a
     service is considered "critical" by Nagios (i.e. down). The
     host_notification_options value of "d" means down. Specifying only "c"
     and "d" means that notifications will not be sent for other
     states.

   - Note the email address in use, "net@localhost" - this is important
     as this was previously defined for RT.

   - Now we must create a Contact Group that contains this contact.
     We will call this group "tickets". Do this at the end of the file:

define contactgroup{
        contactgroup_name       tickets
        alias                   email to ticket system for RT
        members                 net,root
        }

   - You could leave off "root" as a member, but we've left this on to
     have another user that receives email to help us troubleshoot if
     there are issues.

   - Now that your contact has been created you need to create the commands
     that were referenced in the initial contact creation above. These are
     "notify-service-ticket-by-email" and "notify-host-ticket-by-email".


2.) Update Nagios Commands
--------------------------

   - To create the notify-service-ticket-by-email and notify-host-ticket-by-email
     commands we need to edit the file /etc/nagios3/commands.cfg.

   # vi /etc/nagios3/commands.cfg

   - In this file you should have two command definitions at the top of the file
     called notify-host-by-email and notify-service-by-email. Now we need to add
     in our new ticket notification commands below these two commands. We suggest
     you copy and paste the following two command definitions. Do this below the
     notify-service-by-email command definition. Note that the command_line
     entry is very long and should not contain any line breaks:

################################################################
# Additional commands created for network management workshop #
################################################################

# 'notify-host-ticket-by-email' command definition
define command{
        command_name    notify-host-ticket-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
        }

# 'notify-service-ticket-by-email' command definition
define command{
        command_name    notify-service-ticket-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
        }


   - As you can see these are a bit complicated ;-) The key is that these define the format
     of the email that will be sent. In Nagios you've indicated that the contact named "net"
     will use these commands, and that this contact sends email to "net@localhost". There
     should already be an alias entry in /etc/aliases for the user "net" that points to the
     rt-mailgate definition. This means that email formatted as shown above goes to
     net@localhost and is passed to rt-mailgate, which in turn passes it to RT, which has
     the proper queue set up for it.
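
To see what the resulting email looks like without waiting for a real alert,
the host command can be approximated in a shell. This is a minimal sketch:
the shell variables stand in for the Nagios macros, and their values (pc1,
10.10.0.1 and so on) are invented sample data, not output from a real check.

```shell
# Sample values standing in for the Nagios macros (made up for illustration).
notification_type="PROBLEM"
host_name="pc1"
host_state="DOWN"
host_address="10.10.0.1"
host_output="CRITICAL - Host Unreachable"
long_date_time="$(date)"

# Same layout as the notify-host-ticket-by-email command_line.
subject="** $notification_type Host Alert: $host_name is $host_state **"
body=$(printf "%b" "***** Nagios *****\n\nNotification Type: $notification_type\nHost: $host_name\nState: $host_state\nAddress: $host_address\nInfo: $host_output\n\nDate/Time: $long_date_time\n")

# Show the message on the terminal.
printf 'Subject: %s\n\n%s\n' "$subject" "$body"

# To push the sample through rt-mailgate and create a test ticket, pipe the
# body to mail in the same way the Nagios command does:
#   printf '%s\n' "$body" | /usr/bin/mail -s "$subject" net@localhost
```

If the commented mail line is used, a ticket with the sample subject should
appear in RT, which confirms the mail path independently of Nagios.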


3.) Choose a Service to Monitor with RT Tickets
-----------------------------------------------

   - The final step is to tell Nagios that you wish to notify the contact group "tickets"
     for a particular service. If you look in /etc/nagios3/conf.d/generic-service_nagios2.cfg
     the default contact_groups is "admins". To override this for a service, edit the file
     /etc/nagios3/conf.d/services_nagios2.cfg and add a contact_groups entry to one of the
     service definitions. For example, to generate tickets in RT if SSH goes down on a box
     you would edit the SSH service check so that it looks like this:

# check that ssh services are running
define service {
        hostgroup_name                  ssh-servers
        service_description             SSH
        check_command                   check_ssh
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
        contact_groups                  tickets
}

     Note the additional item that we now have, "contact_groups". You can do this for other
     entries as well if you wish.

   - When you are done, save the file and exit.

   - Now restart Nagios to verify that your changes are correct.

   # /etc/init.d/nagios3 stop
   # /etc/init.d/nagios3 start
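
Before stopping Nagios it can be worth checking the configuration first,
since a syntax error would otherwise leave Nagios stopped. A minimal sketch,
assuming the Debian/Ubuntu nagios3 package, where the binary is nagios3 and
the main configuration file is /etc/nagios3/nagios.cfg. The function name
check_nagios_config is made up for illustration. Run it as root.

```shell
# Usage: check_nagios_config [FILE]   (FILE defaults to /etc/nagios3/nagios.cfg)
check_nagios_config() {
    cfg="${1:-/etc/nagios3/nagios.cfg}"
    if ! command -v nagios3 >/dev/null 2>&1; then
        echo "nagios3 not found in PATH" >&2
        return 2
    fi
    # -v makes Nagios parse the configuration and report errors without starting.
    nagios3 -v "$cfg"
}

if check_nagios_config; then
    echo "configuration OK, safe to restart"
else
    echo "configuration check failed, fix the errors before restarting"
fi
```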


4.) Generate RT Tickets for Hosts
---------------------------------

   - To do this you must either specify "contact_groups tickets" for individual host
     definitions, or you must update the template file for all hosts and change the
     default contact_groups entry to tickets. This file is generic-host_nagios2.cfg.

   - If you wish to do this go ahead. Tickets will be generated if a host goes down
     and you have specified the contact_groups for that host as being "tickets".
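
To check which contact groups are currently set, for hosts as well as
services, you can search the configuration directory. A minimal sketch: the
function name list_contact_groups is invented for illustration, and the
directory defaults to the /etc/nagios3/conf.d path used above.

```shell
# Usage: list_contact_groups [DIR]   (DIR defaults to /etc/nagios3/conf.d)
list_contact_groups() {
    dir="${1:-/etc/nagios3/conf.d}"
    # -H prints the file name, -n the line number of each match.
    grep -Hn '^[[:space:]]*contact_groups' "$dir"/*.cfg
}

list_contact_groups || echo "no contact_groups entries found"
```

Any definition that should create tickets needs "tickets" on its
contact_groups line, either directly or through the template it uses.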

5.) See Nagios Tickets in RT
----------------------------

   - To verify that your changes have worked you will need to stop the ssh service on
     your machine or another machine.

   # /etc/init.d/ssh stop

   - It will take a while (up to 10 minutes) for Nagios to report that SSH is
     "critical", but once that happens a new ticket generated by Nagios should
     appear in the net queue of your RT instance.

   - To see this, go to http://localhost/rt/ and log in as Username "tldadmin"
     with the password you chose when you created the RT tldadmin account. The new
     ticket should appear in the "10 newest unowned tickets" box on the main RT
     page after you log in.
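
If no ticket appears, the Nagios log shows whether a notification was sent
at all. A minimal sketch, assuming the log is at /var/log/nagios3/nagios.log
(the usual location for the Debian/Ubuntu nagios3 package, so check your own
installation). The function name notifications_for is made up for
illustration.

```shell
# Usage: notifications_for CONTACT [LOGFILE]
notifications_for() {
    contact="$1"
    log="${2:-/var/log/nagios3/nagios.log}"
    # Nagios logs each notification as
    #   [time] SERVICE NOTIFICATION: contact;host;service;state;command;output
    # (HOST NOTIFICATION lines have no service field).
    grep -E "(SERVICE|HOST) NOTIFICATION: ${contact};" "$log"
}

notifications_for net || echo "no notifications logged for net yet"
```

A logged notification with no matching ticket points to the mail path
(/etc/aliases, rt-mailgate or the RT queue) and not to Nagios. When you have
finished testing, start SSH again with "/etc/init.d/ssh start".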