Using Nagios XI as the platform for Monitoring as a Service
Bryan Heden
Introduction and Agenda
I’m Bryan Heden, Director of Systems at Agile Networks, headquartered in Canton, Ohio
• Who we are and what we do
• Some customers of ours
• Last year’s presentation recap
• The major problems we were faced with
• Solving hardware issues
• Automating user management and multi-tenancy
• Further configuration wizard and component customizations
• MRTG overloads and other issues
• Remote MRTG bandwidth polling and Nagios checks
• Empowering standard users
• Geospatial information system integration
• Conclusion
Who we are and what we do
Agile Networks
We engineer and operate The Agile Network, a general purpose
backhaul network with Last-Mile Agility™
We provide world class connectivity to:
• The public sector (Public Safety)
• Tier 1 Carriers
• The Oil and Gas industry
• Underserved communities
• Business and Residential customers
• Wireless Internet Service Providers
Some customers of ours
Last year’s presentation recap
10,000 Services (and growing!) Across the State of Ohio
Choosing Nagios XI and ModGearman
• Easy to use and understandable front-end interface
• ModGearman’s distributed checks
Customizing configuration wizards and components
• Specialized config wizards for our networking equipment
• NOC Overview map to provide geospatially based status info
• ModGearman management, Smokeping component and portal
Offloading MRTG, MySQL, Smokeping and IO improvements
• Upgraded hardware several times to keep up
• Offloaded MRTG, split the processes up
• Offloaded MySQL
• Installed and then immediately offloaded Smokeping
The major problems we were faced with
Midnight alerts should power cycle my coffee maker
• IOwait was continuing to grow and memory was limited
• Engineers need to see backhaul, sales needs to see customer equipment
• Our configuration wizards’ defects became glaringly obvious
• MRTG graphs were starting to sawtooth again (checks completing once every 10-15 minutes)
• The need arose to segment some bandwidth polling and Nagios checks entirely away from our backhaul network
• Network or Sales Engineers should not have to be Nagios Administrators to remove devices from monitoring
• The basic NOC Overview map was fast approaching end-of-life. We needed a better way to manage geospatial data that could be
utilized by more than one team of engineers
Solving hardware issues
IOwait was continuing to grow and memory was limited
• We had already migrated hardware several times
• Latest migration was to a 24-SSD DAS array (12 Gb/s SAS) attached to a 3-node VMware 6 cluster
• XI VM has 24 cores, 24GB RAM
• MRTG and MySQL VMs are similar
• Several RAMDisks are in use
• Famous Last Words:
“I haven’t seen IOwait over 1% in a long long time now!”
Automating user management and multi-tenancy
Engineers need to see backhaul, sales needs to see customer equipment
• We needed to limit the views of company department users (network
engineers, network operations, sales engineers, operations) and
telecommunication customer users (public safety, oil and gas, wireless
resellers)
• Automating this process was on the roadmap for far too long before it was
developed. Manual maintenance was a nightmare!
• We built an intermediary database that manages user groups, e.g., Agile
Networks Engineering
• This database links those user groups with contactgroups and default
hostgroups
• We have a component/portal that populates the database upon user group
creation, and uses that data to create users in Nagios and assign them to the
proper contactgroup upon creation
• The default hostgroup is useful for ModGearman and also for keeping track of
who is tracking what
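A minimal sketch of how such an intermediary database could be laid out, using SQLite for illustration. The table names, column names, and the `cg_engineering` / `hg_engineering` values are hypothetical; the talk does not describe the actual schema or database engine.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from the talk.
SCHEMA = """
CREATE TABLE user_groups (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    contactgroup TEXT NOT NULL,
    default_hostgroup TEXT NOT NULL
);
CREATE TABLE users (
    username TEXT PRIMARY KEY,
    user_group_id INTEGER NOT NULL REFERENCES user_groups(id)
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def create_user_group(conn, name, contactgroup, default_hostgroup):
    """Record a user group with its linked contactgroup and default hostgroup."""
    conn.execute(
        "INSERT INTO user_groups (name, contactgroup, default_hostgroup) "
        "VALUES (?, ?, ?)",
        (name, contactgroup, default_hostgroup),
    )


def create_user(conn, username, group_name):
    """Attach a new user to an existing user group."""
    row = conn.execute(
        "SELECT id FROM user_groups WHERE name = ?", (group_name,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown user group: {group_name}")
    conn.execute(
        "INSERT INTO users (username, user_group_id) VALUES (?, ?)",
        (username, row[0]),
    )


def assignments_for(conn, username):
    """Return (contactgroup, default_hostgroup) for a user, or None."""
    return conn.execute(
        "SELECT g.contactgroup, g.default_hostgroup FROM users u "
        "JOIN user_groups g ON g.id = u.user_group_id WHERE u.username = ?",
        (username,),
    ).fetchone()
```

With a layout like this, a user-creation component only needs the user group name to find the contactgroup and default hostgroup to assign.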
Further configuration wizard and component customizations
Our configuration wizards’ defects became glaringly obvious
• The old configuration wizards were specific to a device type. The service
checks were added to that host specifically, which became a problem if we
ever introduced a new OID to monitor, or needed to get rid of one!
• We wrote a script that creates the configuration wizards based on a generic
script. While it is creating the configwizard, it is also creating the device
hostgroup that any device created with this wizard will be added to.
• Now, we assign service checks to those device hostgroups (Satellites
Tracked, Temperature, SysUpTime). If we ever need to make a change, we
make it at one place, and it is applied to all devices.
• We still track interface information via the cfgmaker command, and allow the
user to decide which ports they want checks performed on.
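The hostgroup-based approach can be sketched with Nagios object definitions, where a service is attached to a hostgroup through `hostgroup_name` instead of to each host. The helper functions, the `dev_gps_clock` hostgroup, the `check_snmp_oid` command name, and the `generic-service` template are assumptions for illustration only, not the generator script from the talk.

```python
def render_hostgroup(name, alias):
    """Render a Nagios hostgroup definition for one device type."""
    return (
        "define hostgroup {\n"
        f"    hostgroup_name  {name}\n"
        f"    alias           {alias}\n"
        "}\n"
    )


def render_group_service(hostgroup, description, check_command,
                         template="generic-service"):
    """Render one service that applies to every host in the hostgroup."""
    return (
        "define service {\n"
        f"    use                  {template}\n"
        f"    hostgroup_name       {hostgroup}\n"
        f"    service_description  {description}\n"
        f"    check_command        {check_command}\n"
        "}\n"
    )


if __name__ == "__main__":
    # Hypothetical device type and check command, for illustration only.
    print(render_hostgroup("dev_gps_clock", "GPS Clocks"))
    print(render_group_service("dev_gps_clock", "SysUpTime",
                               "check_snmp_oid!sysUpTime.0"))
```

Adding or retiring an OID then means editing one service definition, and every device in that hostgroup picks up the change.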
MRTG overloads and other issues
MRTG graphs were starting to sawtooth again (checks completing once every 10-15 minutes)
• MRTG was already split manually into 8 separate processes
• ~12k checks every 5 minutes
• If part of the network became unavailable overnight and an error on a
single interface stopped that process from completing successfully,
we suddenly had no bandwidth data for ~1,500 ports. Unacceptable!
• We created a database synchronization tool (MRTGQL?), and converted our configuration wizards to write directly to tables
• Now we can handle duplicate checking in a sane manner!
• We split our MRTG processes based on information in a config array present in the synchronization script, which updates our crontab
file and rewrites all of the individual config files
• We also monitor the log file directory for errors, and send out alerts based on these findings – no more bandwidthless nights
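A rough sketch of the three ideas above: de-duplicating targets, splitting them across MRTG processes, and scanning logs for errors. The round-robin split, the file paths, the 5-minute cron schedule, and the "error" substring match are assumptions; the real synchronization script and its config array are not shown in the talk.

```python
def split_targets(targets, process_count):
    """De-duplicate targets, then spread them round-robin across processes."""
    buckets = [[] for _ in range(process_count)]
    for index, target in enumerate(sorted(set(targets))):
        buckets[index % process_count].append(target)
    return buckets


def crontab_lines(process_count, cfg_dir="/etc/mrtg/split",
                  mrtg_bin="/usr/bin/mrtg"):
    """One cron entry per MRTG process; paths are hypothetical."""
    return [
        f"*/5 * * * * env LANG=C {mrtg_bin} {cfg_dir}/mrtg_{n}.cfg"
        for n in range(process_count)
    ]


def find_errors(log_lines):
    """Return log lines that mention an error (assumed log convention)."""
    return [line for line in log_lines if "error" in line.lower()]


if __name__ == "__main__":
    ports = ["rtr1_eth0", "rtr1_eth1", "rtr2_eth0", "rtr1_eth0"]
    for n, bucket in enumerate(split_targets(ports, 2)):
        print(n, bucket)
    print("\n".join(crontab_lines(2)))
```

Keeping each process small limits how many ports lose data when a single run fails, and the log scan turns a silent failure into an alert.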
Remote MRTG bandwidth polling and Nagios checks
The need arose to segment some bandwidth polling and Nagios checks onto logical networks
• We have all kinds of customers, and support calls are expensive
• Let’s give them access to their own monitoring solution!
• It will be fun and easy, they said!
• Executing remote Nagios checks is as easy as ensuring that each different customer’s device has the
default hostgroup appropriately added. Their remote ModGearman takes care of the rest!
• We changed the configuration wizards to hide all of the default hostgroups from the user’s selectable
listbox, and to assign only the one that the user is linked to in our intermediary management
database
• But what about remote bandwidth polling?
• We changed the configuration wizards to execute cfgmaker on their remote ModGearman box. Once the
user selects which ports to monitor, these are all stored in our mrtg database with the appropriate
remote information so that our database sync occurs on the proper server
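One way the remote flow could look, as a sketch only: build a command that runs cfgmaker on the remote poller, then group the stored port selections by poller so each server syncs only its own. Running cfgmaker over ssh, the `nagios` ssh user, and the row field names (`poller`, `device`, `port`) are assumptions; the talk does not say how the remote execution is transported or how the mrtg database is structured.

```python
import shlex


def remote_cfgmaker_command(poller_host, device_address, community,
                            ssh_user="nagios"):
    """Build an argv list that runs cfgmaker on a remote poller over ssh."""
    remote = " ".join(
        shlex.quote(part)
        for part in ["cfgmaker", f"{community}@{device_address}"]
    )
    return ["ssh", f"{ssh_user}@{poller_host}", remote]


def ports_by_poller(rows):
    """Group selected ports by the poller that should graph them."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["poller"], []).append(
            (row["device"], row["port"])
        )
    return grouped


if __name__ == "__main__":
    # Example addresses are from documentation ranges, not real devices.
    print(remote_cfgmaker_command("poller1.example.net", "192.0.2.10", "public"))
    rows = [
        {"poller": "poller1.example.net", "device": "192.0.2.10", "port": 1},
        {"poller": "poller2.example.net", "device": "192.0.2.20", "port": 3},
    ]
    print(ports_by_poller(rows))
```

The argv list can be handed to `subprocess.run` without a shell, and quoting the remote part guards against unusual community strings.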
Empowering standard users
Network or Sales Engineers should not have to be Nagios Administrators to remove devices from monitoring
• We had to train people on Core Config Manager if they ever
planned on creating or removing hostgroups, removing hosts
or services, renaming anything, removing hosts from
hostgroups, etc.
• So we figured out exactly what the most commonly used
features of Core Config Manager were internally (Hint: it is all
the ones I listed in the last bullet point)
• Then we built a component that does all of those things via direct calls to the NagiosQL DB
and the filesystem
• Now each user can remove only the objects they have permission to remove. Network
Engineers can’t remove customer equipment, and Sales can’t remove backhaul routers!
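The permission rule can be sketched as a simple membership check: a user may remove a host only if it belongs to a hostgroup linked to that user. The function names and the in-memory dictionaries are hypothetical; the actual component works against the NagiosQL database and the filesystem, which is not reproduced here.

```python
def removable_hosts(user_hostgroups, hostgroup_members):
    """Collect every host that belongs to one of the user's hostgroups."""
    allowed = set()
    for hostgroup in user_hostgroups:
        allowed |= set(hostgroup_members.get(hostgroup, ()))
    return allowed


def can_remove(host, user_hostgroups, hostgroup_members):
    """True only if the host sits in a hostgroup the user is linked to."""
    return host in removable_hosts(user_hostgroups, hostgroup_members)


if __name__ == "__main__":
    # Hypothetical hostgroups and hosts, for illustration only.
    members = {
        "hg_backhaul": ["core-rtr-1", "core-rtr-2"],
        "hg_customer_a": ["cust-a-cpe-1"],
    }
    print(can_remove("core-rtr-1", ["hg_customer_a"], members))    # False
    print(can_remove("cust-a-cpe-1", ["hg_customer_a"], members))  # True
```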
Geospatial information system integration
The basic NOC Overview map was fast approaching end-of-life.
• The original map was extended based on the Google map component
• It had a decent interface that tied existing hostgroups to lat/lng coordinates
(locations) and displayed them on the map based on that hostgroup’s hosts’ statuses
• We tied locations together by linking a specific host and service at a location to a
specific host and service at another location (relationship)
• We displayed relationships as lines between locations
• We also built an animated radar layer and overlaid it on top of the map
• This is all fine and good, but that data only existed inside of this portal inside of our
Nagios XI instance
• We needed to export that data to a true geospatial information system (PostGIS,
GeoServer)
Geospatial information system integration (continued)
We needed a better way to manage geospatial data that could be utilized by more than one team of engineers
• We stood up a GeoServer instance, and built a component that pulls its WMS layers into OpenLayers
• We created multiple datastores for each particular customer we service, with multiple layers in each (locations, wireless relationships, fiber
relationships, etc.)
• We built an awesome interface for that portal that allows any user of our Nagios XI instance to add locations and relationships with ease
• We built an application that parses status data and rebuilds all of the WMS layers with the proper styling (red for down, green for up, etc.)
• Now we can log in to the GeoServer via separate credentials and view the relevant data
• This is useful for our GIS and Project Management departments
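The status-to-styling step can be illustrated with a small sketch that rolls host states up into a location status and emits locations as points and relationships as lines. GeoJSON is used here only as a neutral illustration format; how the data is actually loaded into PostGIS and GeoServer is not described in the talk. The roll-up rule, the "unknown"/gray state, and all names and coordinates are assumptions.

```python
import json

# Colors for up/down come from the slide; "unknown"/gray is an assumption.
STATUS_COLORS = {"up": "green", "down": "red", "unknown": "gray"}


def location_status(host_states):
    """Assumed roll-up: any host not UP (Nagios state 0) marks the location down."""
    if not host_states:
        return "unknown"
    return "up" if all(state == 0 for state in host_states) else "down"


def location_feature(name, lat, lng, host_states):
    """Build a point feature styled by the location's rolled-up status."""
    status = location_status(host_states)
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lng, lat]},
        "properties": {"name": name, "status": status,
                       "color": STATUS_COLORS[status]},
    }


def relationship_feature(kind, start, end, service_state):
    """Line between two location features; service state 0 means OK in Nagios."""
    status = "up" if service_state == 0 else "down"
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            "coordinates": [start["geometry"]["coordinates"],
                            end["geometry"]["coordinates"]],
        },
        "properties": {"kind": kind, "status": status,
                       "color": STATUS_COLORS[status]},
    }


def layer(features):
    """Serialize a list of features as one GeoJSON FeatureCollection."""
    return json.dumps({"type": "FeatureCollection", "features": features})
```

Rebuilding a layer then amounts to regenerating these features from current status data and republishing them with the matching style.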
Conclusion
• We (seriously) beefed up the hardware to accommodate almost doubling hosts and services
• We automated everything we could possibly automate
• We built a layer on top of MRTG to manage configuration files and remote workers
• We refactored our configuration wizards to be extremely efficient, and tie in directly to our MRTG sync tool
• We built our map functionality on top of a real GeoServer
What’s next?
• Automating the deployment of XI instances based on growth and location
• Tying the password change component into LDAP
• Automatic interference detection in the frequency map (with alerting!)
• Receive signal threshold alarming based on propagation prediction
• Alerting based on average values over time and a percentage change in those values
• Deeper geospatial integration (propagation/coverage maps)
Contact and Questions
• bheden@agilenetworks.com
• Any questions?
PROIDEA
 
Deployability
DeployabilityDeployability
Deployability
Len Bass
 
Visualizing Your Network Health - Know your Network
Visualizing Your Network Health - Know your NetworkVisualizing Your Network Health - Know your Network
Visualizing Your Network Health - Know your Network
DellNMS
 
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...
Ambassador Labs
 
Navigator Systems ltd HireTrack NX questions
Navigator Systems ltd   HireTrack NX questionsNavigator Systems ltd   HireTrack NX questions
Navigator Systems ltd HireTrack NX questions
David Rose
 
Barbri barbri's journey from on-prem to cloud, featuring auto-remediation wi...
Barbri  barbri's journey from on-prem to cloud, featuring auto-remediation wi...Barbri  barbri's journey from on-prem to cloud, featuring auto-remediation wi...
Barbri barbri's journey from on-prem to cloud, featuring auto-remediation wi...
Laura Stack
 
How Pulsar Enables Netdata to Offer Unlimited Infrastructure Monitoring for F...
How Pulsar Enables Netdata to Offer Unlimited Infrastructure Monitoring for F...How Pulsar Enables Netdata to Offer Unlimited Infrastructure Monitoring for F...
How Pulsar Enables Netdata to Offer Unlimited Infrastructure Monitoring for F...
StreamNative
 
Challenges in Cloud Computing – VM Migration
Challenges in Cloud Computing – VM MigrationChallenges in Cloud Computing – VM Migration
Challenges in Cloud Computing – VM Migration
Sarmad Makhdoom
 
SDN Demystified, by Dean Pemberton [APNIC 38]
SDN Demystified, by Dean Pemberton [APNIC 38]SDN Demystified, by Dean Pemberton [APNIC 38]
SDN Demystified, by Dean Pemberton [APNIC 38]
APNIC
 
Ad

Recently uploaded (18)

The Mettle of Honor 05.11.2025.pptx
The  Mettle  of  Honor   05.11.2025.pptxThe  Mettle  of  Honor   05.11.2025.pptx
The Mettle of Honor 05.11.2025.pptx
FamilyWorshipCenterD
 
Cross-Cultural-Communication-and-Adaptation.pdf
Cross-Cultural-Communication-and-Adaptation.pdfCross-Cultural-Communication-and-Adaptation.pdf
Cross-Cultural-Communication-and-Adaptation.pdf
rash64487
 
Guiding the Behavior of Young Children.ppt
Guiding the Behavior of Young Children.pptGuiding the Behavior of Young Children.ppt
Guiding the Behavior of Young Children.ppt
FelixOlalekanBabalol
 
Hurricane Milton powerpoint Andrea Giuliano Nacuzi.pdf
Hurricane Milton powerpoint Andrea Giuliano Nacuzi.pdfHurricane Milton powerpoint Andrea Giuliano Nacuzi.pdf
Hurricane Milton powerpoint Andrea Giuliano Nacuzi.pdf
wolfryx99
 
stackconf 2025 | Operator All the (stateful) Things by Jannik Clausen.pdf
stackconf 2025 | Operator All the (stateful) Things by Jannik Clausen.pdfstackconf 2025 | Operator All the (stateful) Things by Jannik Clausen.pdf
stackconf 2025 | Operator All the (stateful) Things by Jannik Clausen.pdf
NETWAYS
 
ICST/SBFT Tool Competition 2025 - UAV Testing Track
ICST/SBFT Tool Competition 2025 - UAV Testing TrackICST/SBFT Tool Competition 2025 - UAV Testing Track
ICST/SBFT Tool Competition 2025 - UAV Testing Track
Sebastiano Panichella
 
NL-based Software Engineering (NLBSE) '25
NL-based Software Engineering (NLBSE) '25NL-based Software Engineering (NLBSE) '25
NL-based Software Engineering (NLBSE) '25
Sebastiano Panichella
 
All_India_Situation_Presentation. by Dr Jesmina Khatun
All_India_Situation_Presentation. by Dr Jesmina KhatunAll_India_Situation_Presentation. by Dr Jesmina Khatun
All_India_Situation_Presentation. by Dr Jesmina Khatun
DRJESMINAKHATUN
 
Mastering Public Speaking: Key Skills for Confident Communication
Mastering Public Speaking: Key Skills for Confident CommunicationMastering Public Speaking: Key Skills for Confident Communication
Mastering Public Speaking: Key Skills for Confident Communication
karthikeyans20012004
 
stackconf 2025 | Building a Hyperconverged Proxmox VE Cluster with Ceph by Jo...
stackconf 2025 | Building a Hyperconverged Proxmox VE Cluster with Ceph by Jo...stackconf 2025 | Building a Hyperconverged Proxmox VE Cluster with Ceph by Jo...
stackconf 2025 | Building a Hyperconverged Proxmox VE Cluster with Ceph by Jo...
NETWAYS
 
criminal law kajsgdasn cakjsbciaYSVC aschaios
criminal law kajsgdasn cakjsbciaYSVC aschaioscriminal law kajsgdasn cakjsbciaYSVC aschaios
criminal law kajsgdasn cakjsbciaYSVC aschaios
eleazaranghel023
 
Modernization of Parliaments: The Way Forward
Modernization of Parliaments: The Way ForwardModernization of Parliaments: The Way Forward
Modernization of Parliaments: The Way Forward
Dr. Fotios Fitsilis
 
Navigating the Digital Asset Landscape-From Blockchain Foundations to Future ...
Navigating the Digital Asset Landscape-From Blockchain Foundations to Future ...Navigating the Digital Asset Landscape-From Blockchain Foundations to Future ...
Navigating the Digital Asset Landscape-From Blockchain Foundations to Future ...
BobPesakovic
 
stackconf 2025 | Building high-performance apps & controlling costs with CNCF...
stackconf 2025 | Building high-performance apps & controlling costs with CNCF...stackconf 2025 | Building high-performance apps & controlling costs with CNCF...
stackconf 2025 | Building high-performance apps & controlling costs with CNCF...
NETWAYS
 
We Are The World-USA for Africa : Written By Lionel Richie And Michael Jackso...
We Are The World-USA for Africa : Written By Lionel Richie And Michael Jackso...We Are The World-USA for Africa : Written By Lionel Richie And Michael Jackso...
We Are The World-USA for Africa : Written By Lionel Richie And Michael Jackso...
hershtara1
 
The history of Human Rights powerpoint Andrea Giuliano Nacuzi.pdf
The history of Human Rights powerpoint Andrea Giuliano Nacuzi.pdfThe history of Human Rights powerpoint Andrea Giuliano Nacuzi.pdf
The history of Human Rights powerpoint Andrea Giuliano Nacuzi.pdf
wolfryx99
 
A Brief Introduction About John Smith
A Brief Introduction About John SmithA Brief Introduction About John Smith
A Brief Introduction About John Smith
John Smith
 
stackconf 2025 | 2025: I Don’t Know K8S and at This Point, I’m Too Afraid To ...
stackconf 2025 | 2025: I Don’t Know K8S and at This Point, I’m Too Afraid To ...stackconf 2025 | 2025: I Don’t Know K8S and at This Point, I’m Too Afraid To ...
stackconf 2025 | 2025: I Don’t Know K8S and at This Point, I’m Too Afraid To ...
NETWAYS
 
The Mettle of Honor 05.11.2025.pptx
The  Mettle  of  Honor   05.11.2025.pptxThe  Mettle  of  Honor   05.11.2025.pptx
The Mettle of Honor 05.11.2025.pptx
FamilyWorshipCenterD
 
Cross-Cultural-Communication-and-Adaptation.pdf
Cross-Cultural-Communication-and-Adaptation.pdfCross-Cultural-Communication-and-Adaptation.pdf
Cross-Cultural-Communication-and-Adaptation.pdf
rash64487
 
Guiding the Behavior of Young Children.ppt
Guiding the Behavior of Young Children.pptGuiding the Behavior of Young Children.ppt
Guiding the Behavior of Young Children.ppt
FelixOlalekanBabalol
 
Hurricane Milton powerpoint Andrea Giuliano Nacuzi.pdf
Hurricane Milton powerpoint Andrea Giuliano Nacuzi.pdfHurricane Milton powerpoint Andrea Giuliano Nacuzi.pdf
Hurricane Milton powerpoint Andrea Giuliano Nacuzi.pdf
wolfryx99
 
stackconf 2025 | Operator All the (stateful) Things by Jannik Clausen.pdf
stackconf 2025 | Operator All the (stateful) Things by Jannik Clausen.pdfstackconf 2025 | Operator All the (stateful) Things by Jannik Clausen.pdf
stackconf 2025 | Operator All the (stateful) Things by Jannik Clausen.pdf
NETWAYS
 
ICST/SBFT Tool Competition 2025 - UAV Testing Track
ICST/SBFT Tool Competition 2025 - UAV Testing TrackICST/SBFT Tool Competition 2025 - UAV Testing Track
ICST/SBFT Tool Competition 2025 - UAV Testing Track
Sebastiano Panichella
 
NL-based Software Engineering (NLBSE) '25
NL-based Software Engineering (NLBSE) '25NL-based Software Engineering (NLBSE) '25
NL-based Software Engineering (NLBSE) '25
Sebastiano Panichella
 
All_India_Situation_Presentation. by Dr Jesmina Khatun
All_India_Situation_Presentation. by Dr Jesmina KhatunAll_India_Situation_Presentation. by Dr Jesmina Khatun
All_India_Situation_Presentation. by Dr Jesmina Khatun
DRJESMINAKHATUN
 
Mastering Public Speaking: Key Skills for Confident Communication
Mastering Public Speaking: Key Skills for Confident CommunicationMastering Public Speaking: Key Skills for Confident Communication
Mastering Public Speaking: Key Skills for Confident Communication
karthikeyans20012004
 
stackconf 2025 | Building a Hyperconverged Proxmox VE Cluster with Ceph by Jo...
stackconf 2025 | Building a Hyperconverged Proxmox VE Cluster with Ceph by Jo...stackconf 2025 | Building a Hyperconverged Proxmox VE Cluster with Ceph by Jo...
stackconf 2025 | Building a Hyperconverged Proxmox VE Cluster with Ceph by Jo...
NETWAYS
 
criminal law kajsgdasn cakjsbciaYSVC aschaios
criminal law kajsgdasn cakjsbciaYSVC aschaioscriminal law kajsgdasn cakjsbciaYSVC aschaios
criminal law kajsgdasn cakjsbciaYSVC aschaios
eleazaranghel023
 
Modernization of Parliaments: The Way Forward
Modernization of Parliaments: The Way ForwardModernization of Parliaments: The Way Forward
Modernization of Parliaments: The Way Forward
Dr. Fotios Fitsilis
 
Navigating the Digital Asset Landscape-From Blockchain Foundations to Future ...
Navigating the Digital Asset Landscape-From Blockchain Foundations to Future ...Navigating the Digital Asset Landscape-From Blockchain Foundations to Future ...
Navigating the Digital Asset Landscape-From Blockchain Foundations to Future ...
BobPesakovic
 
stackconf 2025 | Building high-performance apps & controlling costs with CNCF...
stackconf 2025 | Building high-performance apps & controlling costs with CNCF...stackconf 2025 | Building high-performance apps & controlling costs with CNCF...
stackconf 2025 | Building high-performance apps & controlling costs with CNCF...
NETWAYS
 
We Are The World-USA for Africa : Written By Lionel Richie And Michael Jackso...
We Are The World-USA for Africa : Written By Lionel Richie And Michael Jackso...We Are The World-USA for Africa : Written By Lionel Richie And Michael Jackso...
We Are The World-USA for Africa : Written By Lionel Richie And Michael Jackso...
hershtara1
 
The history of Human Rights powerpoint Andrea Giuliano Nacuzi.pdf
The history of Human Rights powerpoint Andrea Giuliano Nacuzi.pdfThe history of Human Rights powerpoint Andrea Giuliano Nacuzi.pdf
The history of Human Rights powerpoint Andrea Giuliano Nacuzi.pdf
wolfryx99
 
A Brief Introduction About John Smith
A Brief Introduction About John SmithA Brief Introduction About John Smith
A Brief Introduction About John Smith
John Smith
 
stackconf 2025 | 2025: I Don’t Know K8S and at This Point, I’m Too Afraid To ...
stackconf 2025 | 2025: I Don’t Know K8S and at This Point, I’m Too Afraid To ...stackconf 2025 | 2025: I Don’t Know K8S and at This Point, I’m Too Afraid To ...
stackconf 2025 | 2025: I Don’t Know K8S and at This Point, I’m Too Afraid To ...
NETWAYS
 

Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring as a Service

  • 1. Using Nagios XI as the platform for Monitoring as a Service Bryan Heden
  • 2. Introduction and Agenda I’m Bryan Heden, Director of Systems at Agile Networks headquartered in Canton, Ohio • Who we are and what we do • Some customers of ours • Last year’s presentation recap • The major problems we were faced with • Solving hardware issues • Automating user management and multi-tenancy • Further configuration wizard and component customizations • MRTG overloads and other issues • Remote MRTG bandwidth polling and Nagios checks • Empowering standard users • Geospatial information system integration • Conclusion 2
  • 3. Who we are and what we do Agile Networks We engineer and operate The Agile Network, a general purpose backhaul network with Last-Mile AgilityTM We provide world class connectivity to: • The public sector (Public Safety) • Tier 1 Carriers • The Oil and Gas industry • Underserved communities • Business and Residential customers • Wireless Internet Service Providers 3
  • 5. Last year’s presentation recap 10,000 Services (and growing!) Across the State of Ohio Choosing Nagios XI and ModGearman • Easy to use and understandable front-end interface • ModGearman’s distributed checks Customizing configuration wizards and components • Specialized config wizards for our networking equipment • NOC Overview map to provide geospatially based status info • ModGearman management, Smokeping component and portal Offloading MRTG, MySQL, Smokeping and IO improvements • Upgraded hardware several times to keep up • Offloaded MRTG, split the processes up • Offloaded MySQL • Installed and then immediately offloaded Smokeping 5
  • 6. The major problems we were faced with Midnight alerts should power cycle my coffee maker • IOwait was continuing to grow and memory was limited • Engineers need to see backhaul, sales needs to see customer equipment • Our configuration wizards’ defects became glaringly obvious • MRTG graphs were starting to saw tooth again (checks completing once every 10-15 minutes) • The need arose to segment some bandwidth polling and Nagios checks entirely away from our backhaul network • Network or Sales Engineers should not have to be Nagios Administrators to remove devices from monitoring • The basic NOC Overview map was fast approaching end-of-life. We needed a better way to manage geospatial data that could be utilized by more than one team of engineers 6
  • 7. Solving hardware issues IOwait was continuing to grow and memory was limited • We had already migrated hardware several times • Latest migration was to a 24 SSD DAS (12Gb/s SAS) array attached to a 3-node VMware 6 cluster • XI VM has 24 cores, 24GB RAM • MRTG and MySQL VMs are similar • Several RAMDisks are in use • Famous Last Words: “I haven’t seen IOwait over 1% in a long, long time now!” 7
  • 8. Automating user management and multi-tenancy Engineers need to see backhaul, sales needs to see customer equipment • We needed to limit the views of company department users (network engineers, network operations, sales engineers, operations) and telecommunication customer users (public safety, oil and gas, wireless resellers) • Automating this process was on the roadmap for far too long before it was developed. Manual maintenance was a nightmare! • We built an intermediary database that manages user groups, i.e.: Agile Networks Engineering 8
  • 9. Automating user management and multi-tenancy Engineers need to see backhaul, sales needs to see customer equipment • This database links those user groups with contactgroups and default hostgroups • We have a component/portal that populates the database upon user group creation, and uses that data to create users in Nagios and assign them to the proper contactgroup upon creation • The default hostgroup is useful for ModGearman and also for keeping track of who is tracking what 9
  • 10. Further configuration wizard and component customizations Our configuration wizards’ defects became glaringly obvious • The old configuration wizards were specific to a device type. The service checks were added to that host specifically, which became a problem if we ever introduced a new OID to monitor, or needed to get rid of one! • We wrote a script that creates the configuration wizards based on a generic script. While it is creating the configwizard, it is also creating the device hostgroup that any device created with this wizard will be added to. 10
  • 11. Further configuration wizard and component customizations Our configuration wizards’ defects became glaringly obvious • Now, we assign service checks to those device hostgroups (Satellites Tracked, Temperature, SysUpTime). If we ever need to make a change, we make it at one place, and it is applied to all devices. • We still track interface information via cfgmaker command, and allow the user to decide which ports they want checks performed on. 11
  • 12. MRTG overloads and other issues MRTG graphs were starting to saw tooth again (checks completing once every 10-15 minutes) • MRTG was already split manually into 8 separate processes • ~12k checks every 5 minutes • If some part of the network became unavailable overnight, and an error became present on 1 interface that stopped that process from completing successfully, we all of a sudden didn’t have bandwidth for ~1500 ports. Unacceptable! 12
  • 13. MRTG overloads and other issues MRTG graphs were starting to saw tooth again (checks completing once every 10-15 minutes) • We created a database synchronization tool (MRTGQL?), and converted our configuration wizards to write directly to tables • Now we can handle duplicate checking in a sane manner! • We split our MRTG processes based on information in a config array present in the synchronization script, which updates our crontab file and rewrites all of the individual config files • We also monitor the log file directory for errors, and send out alerts based on these findings – no more bandwidthless nights 13
  • 14. Remote MRTG bandwidth polling and Nagios checks The need arose to segment some bandwidth polling and Nagios checks onto logical networks • We have all kinds of customers, and support calls are expensive • Let’s give them access to their own monitoring solution! • It will be fun and easy, they said! 14
  • 15. Remote MRTG bandwidth polling and Nagios checks The need arose to segment some bandwidth polling and Nagios checks entirely away from our backhaul network • Executing remote Nagios checks is as easy as ensuring that each different customer’s device has the default hostgroup appropriately added. Their remote ModGearman takes care of the rest! • We changed the configuration wizards to hide all of the default hostgroups from the user’s selectable listbox, and only assign the one that that user is linked up with in our intermediary management database • But what about remote bandwidth polling? • We changed the configuration wizards to execute cfgmaker on their remote ModGearman box. Once the user selects which ports to monitor, these are all stored in our mrtg database with the appropriate remote information so that our database sync occurs on the proper server 15
  • 16. Remote MRTG bandwidth polling and Nagios checks 16
  • 17. Empowering standard users Network or Sales Engineers should not have to be Nagios Administrators to remove devices from monitoring • We had to train people on Core Config Manager if they ever planned on creating or removing hostgroups, removing hosts or services, renaming anything, removing hosts from hostgroups, etc. • So we figured out exactly what the most commonly used features of Core Config Manager were internally (Hint: it is all the ones I listed in the last bullet point) 17
  • 18. Empowering standard users Network or Sales Engineers should not have to be Nagios Administrators to remove devices from monitoring • Then we built a component that does all of those things via direct calls to the NagiosQL DB and the filesystem • Now all of my users can only remove the objects that they have permissions to. Network Engineers can’t remove customer equipment, and Sales can’t remove backhaul routers! 18
  • 19. Geospatial information system integration The basic NOC Overview map was fast approaching end-of-life. • The original map was extended based on the Google map component • It had a decent interface that tied existing hostgroups to lat/lng coordinates (locations) and displayed them on the map based on that hostgroup’s hosts’ statuses • We tied locations together by linking a specific host and service at a location to a specific host and service at another location (relationship) • We displayed relationships as lines between locations 19
  • 20. Geospatial information system integration The basic NOC Overview map was fast approaching end-of-life. • We also built an animated radar layer and overlaid it on top of the map • This is all fine and good, but that data only existed inside of this portal inside of our Nagios XI instance • We needed to export that data to a true geospatial information system (PostGIS, GeoServer) 20
  • 21. Geospatial information system integration (continued) We needed a better way to manage geospatial data that could be utilized by more than one team of engineers • We built a GeoServer, and built a component that pulls WMS in OpenLayers • We created multiple datastores for each particular customer we service, with multiple layers in each (locations, wireless relationships, fiber relationships, etc.) • We built an awesome interface for that portal that allows any user of our Nagios XI instance to add locations and relationships with ease • We built an application that parses status data and rebuilds all of the WMS layers with the proper styling (red for down, green for up, etc.) • Now we can log in to the GeoServer via separate credentials and view the relevant data • This is useful for our GIS and Project Management departments 21
  • 22. Geospatial information system integration (continued) 22
  • 23. Conclusion • We (seriously) beefed up the hardware to accommodate almost doubling hosts and services • We automated everything we could possibly automate • We built a layer on top of MRTG to manage configuration files and remote workers • We refactored our configuration wizards to be extremely efficient, and tie in directly to our MRTG sync tool • We built our map functionality on top of a real GeoServer What’s next? • Automating the deployment of XI instances based on growth and location • Tying password change component into LDAP • Automatic interference detection in the frequency map (with alerting!) • Receive signal threshold alarming based on propagation prediction • Alerting based on average values over time and a percentage change in those values • Deeper geospatial integration (propagation/coverage maps) Contact and Questions • bheden@agilenetworks.com • Any questions? 23
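Slides 8 and 9 describe an intermediary database that links user groups with contactgroups and default hostgroups. The slides do not show its schema, so the following is only a minimal sketch of how such a mapping might look. The table names, column names, and the example values `cg_engineering` and `hg_backhaul` are all assumptions made for illustration, and SQLite stands in for whatever database was actually used.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE user_groups (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL,
            contactgroup TEXT NOT NULL,
            default_hostgroup TEXT NOT NULL
        );
        CREATE TABLE users (
            username TEXT PRIMARY KEY,
            user_group_id INTEGER NOT NULL REFERENCES user_groups(id)
        );
        """
    )
    return db


def add_user_group(db, name, contactgroup, default_hostgroup):
    """Register a user group together with its contactgroup and default hostgroup."""
    db.execute(
        "INSERT INTO user_groups (name, contactgroup, default_hostgroup) "
        "VALUES (?, ?, ?)",
        (name, contactgroup, default_hostgroup),
    )


def add_user(db, username, group_name):
    """Attach a user to an existing user group; fail loudly if the group is unknown."""
    row = db.execute(
        "SELECT id FROM user_groups WHERE name = ?", (group_name,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown user group: {group_name}")
    db.execute("INSERT INTO users VALUES (?, ?)", (username, row[0]))


def assignments_for(db, username):
    """Return (contactgroup, default_hostgroup) for a user, or None if not found."""
    return db.execute(
        "SELECT g.contactgroup, g.default_hostgroup "
        "FROM users u JOIN user_groups g ON g.id = u.user_group_id "
        "WHERE u.username = ?",
        (username,),
    ).fetchone()
```

A provisioning component could call `assignments_for` when creating a user in Nagios to decide which contactgroup to assign, and a configuration wizard could use the same lookup to pick the single default hostgroup for that user instead of offering a list.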
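Slides 12 and 13 describe splitting MRTG polling across several processes, regenerating the crontab and per-process config files from a synchronization script, and watching the log directory for errors. The sketch below illustrates that idea under stated assumptions: the round-robin split, the `/etc/mrtg/conf.d` path, the `mrtg-N.cfg` file naming, and the `ERROR` log marker are all hypothetical choices, not details taken from the presentation.

```python
def split_targets(targets, n_procs):
    """Distribute polling targets round-robin over n_procs buckets.

    Sorting first keeps the assignment stable between runs.
    """
    buckets = [[] for _ in range(n_procs)]
    for i, target in enumerate(sorted(targets)):
        buckets[i % n_procs].append(target)
    return buckets


def crontab_lines(n_procs, cfg_dir="/etc/mrtg/conf.d", interval_min=5):
    """Build one crontab entry per MRTG process (paths are assumed)."""
    return [
        f"*/{interval_min} * * * * mrtg {cfg_dir}/mrtg-{i}.cfg"
        for i in range(n_procs)
    ]


def processes_with_errors(logs, markers=("ERROR",)):
    """Return the indexes of processes whose log lines contain an error marker.

    `logs` maps a process index to the list of lines read from its log file.
    """
    return sorted(
        idx
        for idx, lines in logs.items()
        if any(marker in line for line in lines for marker in markers)
    )
```

Because each process reads its own config file, a failure on one interface would affect only the targets in that bucket, and `processes_with_errors` gives an alerting script a short list of processes to report on.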
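The "What's next?" list on slide 23 mentions alerting based on average values over time and a percentage change in those values. The presentation gives no formula for this, so the following is one possible reading, sketched with hypothetical function names: compare the mean of a recent window against the mean of a longer baseline window and alert when the relative change crosses a threshold.

```python
from statistics import mean


def percent_change(baseline, current):
    """Relative change from baseline to current, in percent."""
    if baseline == 0:
        raise ValueError("baseline average is zero; percent change is undefined")
    return (current - baseline) / abs(baseline) * 100.0


def should_alert(history, recent, threshold_pct):
    """Alert when the recent average deviates from the historical average
    by at least threshold_pct percent, in either direction."""
    change = percent_change(mean(history), mean(recent))
    return abs(change) >= threshold_pct
```

Using averages rather than single samples should make the check less sensitive to one-off spikes, at the cost of reacting more slowly to a sudden change.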