| 1 | SANOG 16 - Nagios Installation and Configuration |
|---|
| 2 | |
|---|
| 3 | Notes: |
|---|
| 4 | ------ |
|---|
| 5 | * Commands preceded with "$" imply that you should execute the command as |
|---|
| 6 | a general user - not as root. |
|---|
| 7 | * Commands preceded with "#" imply that you should be working as root. |
|---|
| 8 | * Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>") |
|---|
| 9 | imply that you are executing commands on remote equipment, or within |
|---|
| 10 | another program. |
|---|
| 11 | |
|---|
| 12 | Exercises |
|---|
| 13 | --------- |
|---|
| 14 | |
|---|
| 15 | PART I |
|---|
| 16 | ------ |
|---|
| 17 | |
|---|
| 18 | 1. Start by enabling Nagios |
|---|
| 19 | |
|---|
| 20 | * in /etc/rc.conf add the following line: |
|---|
| 21 | |
|---|
| 22 | nagios_enable="YES" |
|---|
| 23 | |
|---|
| 24 | Configuration templates are available in /usr/local/etc/nagios as |
|---|
| 25 | *.cfg-sample files. Copy them to *.cfg files where required and |
|---|
| 26 | edit to suit your needs. Documentation is available in HTML form |
|---|
| 27 | in /usr/local/www/nagios/docs. |
|---|
| 28 | |
|---|
| 29 | Before we can use Nagios, we need to configure it -- we will do |
|---|
| 30 | this in Part II |
|---|
| 31 | |
|---|
| 32 | 2. Configure Apache for Nagios |
|---|
| 33 | |
|---|
| 34 | * create /usr/local/etc/apache22/Includes/nagios.conf |
|---|
| 35 | |
|---|
| 36 | In the file, add: |
|---|
| 37 | |
|---|
| 38 | |
|---|
| 39 | <Directory /usr/local/www/nagios> |
|---|
| 40 | Order deny,allow |
|---|
| 41 | Allow from all |
|---|
| 42 | </Directory> |
|---|
| 43 | |
|---|
| 44 | <Directory /usr/local/www/nagios/cgi-bin> |
|---|
| 45 | Options ExecCGI |
|---|
| 46 | </Directory> |
|---|
| 47 | |
|---|
| 48 | <Directory /usr/local/www/nagios> |
|---|
| 49 | AllowOverride AuthConfig |
|---|
| 50 | </Directory> |
|---|
| 51 | |
|---|
| 52 | ScriptAlias /nagios/cgi-bin/ /usr/local/www/nagios/cgi-bin/ |
|---|
| 53 | Alias /nagios/ /usr/local/www/nagios/ |
|---|
| 54 | |
|---|
| 55 | |
|---|
| 56 | |
|---|
| 57 | Save this file and exit. |
|---|
| 58 | |
|---|
| 59 | 3. Create the Web user password file: |
|---|
| 60 | |
|---|
| 61 | # htpasswd -c /usr/local/etc/nagios/htpasswd.users nagiosadmin |
|---|
| 62 | |
|---|
| 63 | New password: |
|---|
| 64 | Re-type new password: |
|---|
| 65 | |
|---|
| 66 | We suggest you use the same user password you have been using in class. |
|---|
| 67 | |
|---|
| 68 | Now, create a .htaccess file to ask for a password when opening |
|---|
| 69 | the Nagios page: |
|---|
| 70 | |
|---|
| 71 | # vi /usr/local/www/nagios/.htaccess |
|---|
| 72 | |
|---|
| 73 | In the file, add: |
|---|
| 74 | |
|---|
| 75 | |
|---|
| 76 | AuthName "Nagios Access" |
|---|
| 77 | AuthType Basic |
|---|
| 78 | AuthUserFile /usr/local/etc/nagios/htpasswd.users |
|---|
| 79 | require valid-user |
|---|
| 80 | |
|---|
| 81 | |
|---|
| 82 | Save the file, and exit. |
|---|
| 83 | |
|---|
| 84 | 4. The web interface of Nagios should be ready at this point, but |
|---|
| 85 | most views won't work since Nagios has not been configured yet! |
|---|
| 86 | |
|---|
| 87 | - Open a browser, and go to |
|---|
| 88 | |
|---|
| 89 | http://wsXX.ws3.conference.sanog.org/nagios/ |
|---|
| 90 | |
|---|
| 91 | - At the login prompt, log in as: |
|---|
| 92 | |
|---|
| 93 | user: nagiosadmin |
|---|
| 94 | pass: |
|---|
| 95 | |
|---|
| 96 | |
|---|
| 97 | Now we need to configure Nagios |
|---|
| 98 | |
|---|
| 99 | PART II |
|---|
| 100 | Configuring Equipment |
|---|
| 101 | ----------------------------------------------------------------------------- |
|---|
| 102 | |
|---|
| 103 | 1. Let's configure Nagios |
|---|
| 104 | |
|---|
| 105 | |
|---|
| 106 | # cd /usr/local/etc/nagios/ |
|---|
| 107 | # cp cgi.cfg-sample cgi.cfg <- Web module config |
|---|
| 108 | # cp resource.cfg-sample resource.cfg <- Nagios internal config |
|---|
| 109 | # cp nagios.cfg-sample nagios.cfg <- Nagios main config |
|---|
| 110 | |
|---|
| 111 | # cd /usr/local/etc/nagios/objects/ |
|---|
| 112 | # cp commands.cfg-sample commands.cfg <- plugin configuration |
|---|
| 113 | # cp contacts.cfg-sample contacts.cfg <- contact people |
|---|
| 114 | # cp templates.cfg-sample templates.cfg <- predefined objects |
|---|
| 115 | # cp timeperiods.cfg-sample timeperiods.cfg <- timeperiods |
|---|
| 116 | |
|---|
| 117 | # cp localhost.cfg-sample localhost.cfg <- a sample config |
|---|
| 118 | |
|---|
| 119 | This is the most basic and minimal configuration -- we have taken |
|---|
| 120 | all the default configuration files and enabled them. |
|---|
| 121 | |
|---|
| 122 | The last file, "localhost.cfg", defines a monitoring configuration |
|---|
| 123 | for your own PC (wsXX). |
|---|
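The copy commands above can also be wrapped in a small helper. This is only a sketch: the function name enable_samples is made up for illustration, and unlike a plain "cp" it deliberately skips any .cfg file that already exists, so that rerunning it does not overwrite your edits. Only the files you name are copied, which matters later because other *.cfg-sample files in the objects directory should stay disabled.

```shell
# enable_samples DIR NAME...
# Copy DIR/NAME.cfg-sample to DIR/NAME.cfg for each NAME given.
# Existing .cfg files are left alone so local edits are not overwritten.
enable_samples() {
    dir=$1
    shift
    for name in "$@"; do
        src="$dir/$name.cfg-sample"
        dst="$dir/$name.cfg"
        if [ ! -f "$src" ]; then
            echo "missing sample: $src" >&2
            return 1
        fi
        if [ -f "$dst" ]; then
            echo "keeping existing $dst"
        else
            cp "$src" "$dst" && echo "created $dst"
        fi
    done
}

# Intended usage as root (paths are the ones used in this exercise):
#   enable_samples /usr/local/etc/nagios cgi resource nagios
#   enable_samples /usr/local/etc/nagios/objects \
#       commands contacts templates timeperiods localhost
```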
| 124 | |
|---|
| 125 | 2. Let's verify that nagios is happy with the configuration: |
|---|
| 126 | |
|---|
| 127 | # cd /usr/local/etc/nagios |
|---|
| 128 | # nagios -v nagios.cfg |
|---|
| 129 | |
|---|
| 130 | |
|---|
| 131 | You should see: |
|---|
| 132 | |
|---|
| 133 | ... output ... |
|---|
| 134 | |
|---|
| 135 | Total Warnings: 0 |
|---|
| 136 | Total Errors: 0 |
|---|
| 137 | |
|---|
| 138 | Things look okay - No serious problems were detected during the pre-flight check |
|---|
| 139 | |
|---|
| 140 | We are ready to start nagios! |
|---|
| 141 | |
|---|
| 142 | # /usr/local/etc/rc.d/nagios start |
|---|
| 143 | |
|---|
| 144 | |
|---|
| 145 | 3. You can now go back to the web interface |
|---|
| 146 | |
|---|
| 147 | http://wsXX.ws3.conference.sanog.org/nagios/ |
|---|
| 148 | |
|---|
| 149 | ... over the next few minutes, Nagios will update the status |
|---|
| 150 | for the services on the localhost (wsXX). |
|---|
| 151 | |
|---|
| 152 | You can check out "Hostgroup grid" and "Hostgroup overview" |
|---|
| 153 | options in the left menu, then click on "localhost" to get |
|---|
| 154 | the details. |
|---|
| 155 | |
|---|
| 156 | Look at the file: |
|---|
| 157 | |
|---|
| 158 | /usr/local/etc/nagios/objects/localhost.cfg |
|---|
| 159 | |
|---|
| 160 | ... and try to understand what is being monitored, by comparing |
|---|
| 161 | what you see in the "localhost" view in the Nagios web interface, |
|---|
| 162 | and the .cfg file. |
|---|
| 163 | |
|---|
| 164 | We need to make a small change to the file: |
|---|
| 165 | |
|---|
| 166 | /usr/local/etc/nagios/nagios.cfg |
|---|
| 167 | |
|---|
| 168 | ... so it will automatically read all .cfg files from the |
|---|
| 169 | /usr/local/etc/nagios/objects directory, and we don't have |
|---|
| 170 | to always edit nagios.cfg to add them. |
|---|
| 171 | |
|---|
| 172 | So edit /usr/local/etc/nagios/nagios.cfg, and make the |
|---|
| 173 | following changes: |
|---|
| 174 | |
|---|
| 175 | COMMENT the 4 lines like this: |
|---|
| 176 | |
|---|
| 177 | cfg_file=/usr/local/etc/nagios/objects/commands.cfg |
|---|
| 178 | cfg_file=/usr/local/etc/nagios/objects/contacts.cfg |
|---|
| 179 | cfg_file=/usr/local/etc/nagios/objects/timeperiods.cfg |
|---|
| 180 | cfg_file=/usr/local/etc/nagios/objects/templates.cfg |
|---|
| 181 | |
|---|
| 182 | Comment = add '#' at the beginning, so they look |
|---|
| 183 | like this: |
|---|
| 184 | |
|---|
| 185 | # cfg_file=/usr/local/etc/nagios/objects/commands.cfg |
|---|
| 186 | # cfg_file=/usr/local/etc/nagios/objects/contacts.cfg |
|---|
| 187 | # cfg_file=/usr/local/etc/nagios/objects/timeperiods.cfg |
|---|
| 188 | # cfg_file=/usr/local/etc/nagios/objects/templates.cfg |
|---|
| 189 | |
|---|
| 190 | Do the same for |
|---|
| 191 | |
|---|
| 192 | cfg_file=/usr/local/etc/nagios/objects/localhost.cfg |
|---|
| 193 | |
|---|
| 194 | ... so it becomes: |
|---|
| 195 | |
|---|
| 196 | # cfg_file=/usr/local/etc/nagios/objects/localhost.cfg |
|---|
| 197 | |
|---|
| 198 | ... and add another line: |
|---|
| 199 | |
|---|
| 200 | cfg_dir=/usr/local/etc/nagios/objects |
|---|
| 201 | |
|---|
| 202 | ... Now save the file and exit. |
|---|
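If you would rather not make these edits by hand, the same change can be sketched with sed. The helper name use_cfg_dir is hypothetical. It prints the modified configuration to standard output instead of editing the file in place, so you can review the result first. It assumes the active lines start exactly with "cfg_file=" followed by the objects directory, as in the sample nagios.cfg.

```shell
# use_cfg_dir FILE OBJDIR
# Print FILE with every active "cfg_file=OBJDIR/..." line commented out.
# Append "cfg_dir=OBJDIR" unless FILE already contains that exact line.
# FILE itself is not modified.
use_cfg_dir() {
    file=$1
    objdir=$2
    sed "s|^cfg_file=$objdir/|# cfg_file=$objdir/|" "$file"
    grep -q "^cfg_dir=$objdir\$" "$file" || echo "cfg_dir=$objdir"
}

# Intended usage as root:
#   cd /usr/local/etc/nagios
#   use_cfg_dir nagios.cfg /usr/local/etc/nagios/objects > nagios.cfg.new
#   diff nagios.cfg nagios.cfg.new    # review, then: mv nagios.cfg.new nagios.cfg
```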
| 203 | |
|---|
| 204 | One last change: |
|---|
| 205 | |
|---|
| 206 | # cd /usr/local/etc/nagios/objects/ |
|---|
| 207 | # mv localhost.cfg main.cfg |
|---|
| 208 | |
|---|
| 209 | main.cfg is a nicer name than localhost, since we're going to |
|---|
| 210 | be adding new hosts and parameters to it. |
|---|
| 211 | |
|---|
| 212 | Test that nagios is happy: |
|---|
| 213 | |
|---|
| 214 | # nagios -v nagios.cfg |
|---|
| 215 | |
|---|
| 216 | If Nagios complains, double check your changes, and if |
|---|
| 217 | it is still a problem, ask one of the instructors for help. |
|---|
| 218 | |
|---|
| 219 | Finally, restart Nagios: |
|---|
| 220 | |
|---|
| 221 | # /usr/local/etc/rc.d/nagios restart |
|---|
| 222 | |
|---|
| 223 | |
|---|
| 224 | 4. Let's start monitoring another computer in our classroom: |
|---|
| 225 | |
|---|
| 226 | - Pick any other WS in the class, which you will monitor. |
|---|
| 227 | |
|---|
| 228 | # cd /usr/local/etc/nagios/objects |
|---|
| 229 | |
|---|
| 230 | # vi ws-all.cfg |
|---|
| 231 | |
|---|
| 232 | |
|---|
| 233 | define host { |
|---|
| 234 | use freebsd-server |
|---|
| 235 | host_name wsYY |
|---|
| 236 | alias WS YY in WS3 |
|---|
| 237 | address _______________ [wsYY's IP address here] |
|---|
| 238 | } |
|---|
| 239 | |
|---|
| 240 | |
|---|
| 241 | Note: YY is *another* machine in the class, not your own. |
|---|
| 242 | |
|---|
| 243 | ... Save and quit |
|---|
| 244 | |
|---|
| 245 | 5. Let's create a new hostgroup for the occasion, and add our host |
|---|
| 246 | to it |
|---|
| 247 | |
|---|
| 248 | - Let's add the hostgroup to the "main.cfg" file: |
|---|
| 249 | |
|---|
| 250 | # cd /usr/local/etc/nagios/objects/ |
|---|
| 251 | # vi main.cfg |
|---|
| 252 | |
|---|
| 253 | Find the section called "Define an optional hostgroup for FreeBSD machines", |
|---|
| 254 | and just under it, add: |
|---|
| 255 | |
|---|
| 256 | |
|---|
| 257 | |
|---|
| 258 | define hostgroup { |
|---|
| 259 | hostgroup_name classroom |
|---|
| 260 | alias All WS in the class |
|---|
| 261 | members wsYY |
|---|
| 262 | } |
|---|
| 263 | |
|---|
| 264 | |
|---|
| 265 | 6. Now let's associate some services to that host |
|---|
| 266 | |
|---|
| 267 | Still in the file "main.cfg", find the section called: |
|---|
| 268 | "Define a service to check SSH on the local machine" |
|---|
| 269 | and change the line: |
|---|
| 270 | |
|---|
| 271 | |
|---|
| 272 | host_name localhost |
|---|
| 273 | |
|---|
| 274 | to |
|---|
| 275 | |
|---|
| 276 | hostgroup_name freebsd-servers, classroom |
|---|
| 277 | |
|---|
| 278 | |
|---|
| 279 | Save the file and exit |
|---|
| 280 | |
|---|
| 281 | |
|---|
| 282 | 7. Verify that your configuration file is OK: |
|---|
| 283 | |
|---|
| 284 | # nagios -v /usr/local/etc/nagios/nagios.cfg |
|---|
| 285 | |
|---|
| 286 | ... You should get: |
|---|
| 287 | |
|---|
| 288 | Total Warnings: 0 |
|---|
| 289 | Total Errors: 0 |
|---|
| 290 | |
|---|
| 291 | Things look okay - No serious problems were detected during the pre-flight check |
|---|
| 292 | |
|---|
| 293 | |
|---|
| 294 | 8. Reload/Restart Nagios |
|---|
| 295 | |
|---|
| 296 | # /usr/local/etc/rc.d/nagios restart |
|---|
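Since you will repeat "verify, then restart" many times in these exercises, the two steps can be chained so that Nagios is only restarted when the check passes. The sketch below uses a hypothetical helper named restart_if_valid. It assumes that "nagios -v" exits with a non-zero status when it finds errors; confirm this on your system before relying on it.

```shell
# restart_if_valid CHECK_CMD RESTART_CMD
# Run CHECK_CMD; only if it exits with status 0, run RESTART_CMD.
# The output of CHECK_CMD is shown so that any errors can be read.
restart_if_valid() {
    if sh -c "$1"; then
        sh -c "$2"
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}

# Intended usage as root:
#   restart_if_valid 'nagios -v /usr/local/etc/nagios/nagios.cfg' \
#                    '/usr/local/etc/rc.d/nagios restart'
```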
| 297 | |
|---|
| 298 | 9. Go to the web interface (http://wsXX.ws3.conference.sanog.org/nagios) |
|---|
| 299 | and check the host you just added. |
|---|
| 300 | |
|---|
| 301 | |
|---|
| 302 | 10. Add ALL the PCs (WS1 - WS15) in your classroom. |
|---|
| 303 | |
|---|
| 304 | - Remember to verify the configuration file! |
|---|
| 305 | |
|---|
| 306 | - I suggest that you create a single config file called "ws-all.cfg" |
|---|
| 307 | to do this, and put all the hosts in it. |
|---|
| 308 | |
|---|
| 309 | - You will repeat step 4 for each machine. |
|---|
| 310 | |
|---|
| 311 | - When finished, remember to add all the hosts into the "classroom" group |
|---|
| 312 | in the file main.cfg. The format of the members statement is: |
|---|
| 313 | |
|---|
| 314 | members wsXX,wsYY,wsZZ,... |
|---|
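Typing fifteen nearly identical host definitions is error-prone, so here is one possible way to generate them. This is a sketch under assumptions: the function names gen_hosts and gen_members are made up, the host names are assumed to be ws1 to ws15 without zero padding, and the "address" field uses the DNS name pattern from the workshop URL rather than an IP address. Replace it with each machine's IP address if the names do not resolve.

```shell
# gen_hosts FIRST LAST
# Print one "define host" block per workstation wsFIRST..wsLAST.
gen_hosts() {
    i=$1
    while [ "$i" -le "$2" ]; do
        cat <<EOF
define host {
    use         freebsd-server
    host_name   ws$i
    alias       WS $i in WS3
    address     ws$i.ws3.conference.sanog.org
}

EOF
        i=$((i + 1))
    done
}

# gen_members FIRST LAST
# Print the comma-separated list for the hostgroup "members" line.
gen_members() {
    i=$1
    list=""
    while [ "$i" -le "$2" ]; do
        list="${list:+$list,}ws$i"
        i=$((i + 1))
    done
    echo "$list"
}

# Intended usage:
#   gen_hosts 1 15 > /usr/local/etc/nagios/objects/ws-all.cfg
#   gen_members 1 15      # paste the result after "members" in main.cfg
```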
| 315 | |
|---|
| 316 | 11. Reload/Restart Nagios |
|---|
| 317 | |
|---|
| 318 | # /usr/local/etc/rc.d/nagios restart |
|---|
| 319 | |
|---|
| 320 | - Take a look at http://wsXX.ws3.conference.sanog.org/nagios to see your changes. |
|---|
| 321 | |
|---|
| 322 | - Click on the "Status Map" link to see how things look. |
|---|
| 323 | |
|---|
| 324 | |
|---|
| 325 | PART III |
|---|
| 326 | Adding Services |
|---|
| 327 | ----------------------------------------------------------------------------- |
|---|
| 328 | |
|---|
| 329 | 1. Determine what services to add for what devices |
|---|
| 330 | |
|---|
| 331 | - This is core to how you use Nagios and network monitoring tools in |
|---|
| 332 | general. So far we are simply checking SSH to see if the machines |
|---|
| 333 | are up on our network. The next step is to decide what services you |
|---|
| 334 | wish to monitor for each host. |
|---|
| 335 | |
|---|
| 336 | - In this particular class we have: |
|---|
| 337 | |
|---|
| 338 | pcs: All wsXX are running ssh, http and imap/pop |
|---|
| 339 | All student pcs are running an snmp daemon |
|---|
| 340 | |
|---|
| 341 | So, let's configure Nagios to check for all of these services for these |
|---|
| 342 | devices. |
|---|
| 343 | |
|---|
| 344 | 2. Verify that HTTP is running on the classroom PCs |
|---|
| 345 | |
|---|
| 346 | - In the file main.cfg there is already an entry for the HTTP |
|---|
| 347 | service check, so you do not need to create one. |
|---|
| 348 | |
|---|
| 349 | Instead, you simply need to change "host_name localhost" for |
|---|
| 350 | that service, to use a "hostgroup_name classroom", just |
|---|
| 351 | like we did in step 6 in part II. |
|---|
| 352 | |
|---|
| 353 | So make this change in the "main.cfg" file -- find the section |
|---|
| 354 | "Define a service to check HTTP on the local machine", and update |
|---|
| 355 | the line: |
|---|
| 356 | |
|---|
| 357 | host_name localhost |
|---|
| 358 | |
|---|
| 359 | to |
|---|
| 360 | |
|---|
| 361 | hostgroup_name classroom |
|---|
| 362 | |
|---|
| 363 | |
|---|
| 364 | And save the file. This tells Nagios that the HTTP service is not only |
|---|
| 365 | running on the single host "localhost", but on ALL hosts in the hostgroup |
|---|
| 366 | "classroom". |
|---|
| 367 | |
|---|
| 368 | - Once you are done, run the pre-flight check: |
|---|
| 369 | |
|---|
| 370 | # nagios -v /usr/local/etc/nagios/nagios.cfg |
|---|
| 371 | |
|---|
| 372 | If everything looks good, then restart Nagios and see your changes in the |
|---|
| 373 | Nagios web interface. |
|---|
| 374 | |
|---|
| 375 | 3. Check that all hosts respond to ping. |
|---|
| 376 | |
|---|
| 377 | - As with HTTP, a check_ping service is already defined, and it automatically |
|---|
| 378 | applies to the freebsd-servers group. (Note that you can add additional groups |
|---|
| 379 | of hosts to any service check if you wish.) So, update the "PING" service |
|---|
| 380 | definition in the main.cfg file to use the hostgroup_name classroom, |
|---|
| 381 | as we did in the previous step. |
|---|
| 382 | |
|---|
| 383 | - See the previous exercise and make the appropriate change to do this. If you have |
|---|
| 384 | any questions, ask your instructor for help. |
|---|
| 385 | |
|---|
| 386 | |
|---|
| 387 | 4. Let's add IMAP and POP monitoring |
|---|
| 388 | |
|---|
| 389 | The *commands* for check_pop and check_imap are already configured, |
|---|
| 390 | so all we need to do is create *services* for them. |
|---|
| 391 | |
|---|
| 392 | |
|---|
| 393 | First, edit the main.cfg file, and at the end, add the service definition |
|---|
| 394 | for check_pop: |
|---|
| 395 | |
|---|
| 396 | |
|---|
| 397 | define service { |
|---|
| 398 | use local-service |
|---|
| 399 | hostgroup_name classroom |
|---|
| 400 | service_description POP |
|---|
| 401 | check_command check_pop |
|---|
| 402 | } |
|---|
| 403 | |
|---|
| 404 | |
|---|
| 405 | Check the configuration (nagios -v ...) and restart Nagios, |
|---|
| 406 | then go to the web interface. |
|---|
| 407 | |
|---|
| 408 | |
|---|
| 409 | Now add an IMAP check in the same fashion: create a service |
|---|
| 410 | definition that uses the check_command "check_imap". Don't forget to |
|---|
| 411 | update the service_description! |
|---|
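The POP and IMAP service definitions differ only in their description and check command, so one way to avoid copy-and-paste mistakes is to print them from a tiny helper. The name gen_service is hypothetical. The template and hostgroup are the ones used in the POP example above.

```shell
# gen_service DESCRIPTION COMMAND
# Print a service definition for the "classroom" hostgroup.
gen_service() {
    cat <<EOF
define service {
    use                  local-service
    hostgroup_name       classroom
    service_description  $1
    check_command        $2
}
EOF
}

# Intended usage as root:
#   gen_service IMAP check_imap >> /usr/local/etc/nagios/objects/main.cfg
```

As always, run the pre-flight check before restarting Nagios.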
| 412 | |
|---|
| 413 | 5. One last change... |
|---|
| 414 | |
|---|
| 415 | In the file main.cfg, REMOVE all the lines like: |
|---|
| 416 | |
|---|
| 417 | notifications_enabled 0 |
|---|
| 418 | |
|---|
| 419 | (there should be 2 lines like this -- delete them) |
|---|
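The same deletion can be sketched with sed. The helper name strip_notifications_off is made up. It prints the result to standard output and leaves the file untouched, so you can compare before replacing main.cfg. It only removes lines that set the value to 0, optionally followed by a ";" comment.

```shell
# strip_notifications_off FILE
# Print FILE without any "notifications_enabled 0" lines.
# Lines that set notifications_enabled to another value are kept.
strip_notifications_off() {
    sed -e '/^[[:space:]]*notifications_enabled[[:space:]][[:space:]]*0[[:space:]]*$/d' \
        -e '/^[[:space:]]*notifications_enabled[[:space:]][[:space:]]*0[[:space:]]*;/d' \
        "$1"
}

# Intended usage as root:
#   cd /usr/local/etc/nagios/objects
#   strip_notifications_off main.cfg > main.cfg.new
#   diff main.cfg main.cfg.new    # review, then: mv main.cfg.new main.cfg
```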