The concept of a single pane of glass is easy to grasp… build a tool that can handle multiple feeds of data, compile and summarize that data into logical and simple nuggets of wisdom, and as a result… have a single place anyone can go to see the data they need to make the decisions they need to make. Sounds good, but how does that apply to APM?
First, APM means two things in this industry… Application Performance Monitoring - looking at the end user experience and the components making up that experience in a production environment; and Application Performance Management - the holistic view of the IT infrastructure supporting the business, including Monitoring, Performance Testing, Capacity Planning, and Analytics. For the purposes of this series I’ll be referring to Application Performance Management as IT Performance Management (ITPM) and to the Monitoring side of the house as APM. In almost all of this series I’ll be talking about ITPM. With that piece of housekeeping out of the way, let’s dive right in.
So how does all this apply to ITPM? It allows you, the user of such a product, to make decisions about how to keep your IT services running without waiting for end users to call and complain about something being down or slow. It simplifies the process of identifying problems and drilling down to the root cause because all the data is together on a single screen. For an executive, it provides a high-level view of the entire IT infrastructure so, at a glance, they can tell how well IT is serving the business.
Awesome, right? Well, the industry never quite figured out how to make such a utopia a reality. We here at InsightETE took a close look at why that is, and we’ve come up with a few key tenets that must be followed to make a single pane of glass (SPoG) effective and valuable.
Drop the data-bias, become data agnostic
Here’s a quick story, names have all been changed to protect the innocent and guilty… A bank (we’ll call them Pursue Bank) needed to import their event aggregation data from one company (let’s say: CTR) into a new dashboard created by a different company (we’ll call them Tungsten.) Well, $10,000,000 and six months later, Tungsten came back, hat in hand, and told Pursue that they just couldn’t get CTR’s data in their dashboard. They left, kept the money, and Pursue fired the executive that brought in Tungsten.
So what happened?
According to Tungsten, CTR’s data was just too incomprehensible to be understood by anyone other than CTR. This simply wasn’t true. Indeed, the people at Pursue built a custom dashboard that pulled in data from CTR, Tungsten, and 30+ other data sources and put it all together to be viewed by anyone who wanted to see it. If a bank had the development chops to pull it off, how could a software company supposedly specializing in such programming be stumped? The answer is simple: they didn’t WANT to succeed. They wanted Pursue to buy the entire Tungsten solution, including the pieces of software Tungsten produces that could have replaced the CTR products Pursue was using.
At the end of the day, this part of the recipe is the easiest thing to do. All we have to do is keep our word: that we want to help the customer get to the best solution for them, and to ensure that the solution they choose works as well as it possibly can. The APM market is $4bn and growing. There are plenty of problems to solve and money to go around. Now if only the other big players would get on board with that philosophy…
That said, I’d really like to give a shout-out to Splunk for adopting at least this part of the formula. As best I can tell, they are the only major player in the APM market to do so (we’re still too small to be considered major at this point.)
Here’s how we implemented this tenet… take a look at a screenshot of our dashboard:
The design is intentionally unassuming. Each of the columns (End User Performance, Transaction Volume, etc.) is generated by a different module that plugs into the dashboard. Those modules pull from a data store that is populated by many different tools. For instance:
- End User Performance is populated by 6 data feeds (only 2 of which are ours)
- Transaction Volume is populated by the same 6 data feeds
- Service Availability is populated from our own availability data, along with an integration to a ticketing system that determines planned outages
- Events is populated by 8 data sources (only 2 of which are ours)
- CPU/Disk/Memory is populated by 4 data sources, none of which come from our data collectors
- Ticketing is populated by 1 data source, but it isn’t from us either
It is important to note that any of those columns could be populated by any number of tools. If you’ve standardized to one, great! If you have a dozen tools that collect the same type of data, we can handle that too! Also, we can add, modify, or remove modules for each user or group so they see the information they want to see.
Simply put, the point of this tool is not to promote our own data collection (yes, it is among the best in the world for what it does, but that is another story entirely!) but to analyze information. We (and by extension our tool) don’t care where the data comes from, so long as it is accurate and current. That is what being data agnostic is all about.
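To make the idea of being data agnostic concrete, here is a minimal sketch of how a feed-adapter pattern like the one described above could work. This is illustrative only, not InsightETE’s actual implementation; all names (`Metric`, `ADAPTERS`, the `vendor_a`/`vendor_b` sources and their field names) are hypothetical. The key point it demonstrates is that each tool’s quirks live in one small adapter, and everything downstream sees a single common record format:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List

# Common record every feed normalizes into, regardless of source tool.
@dataclass
class Metric:
    source: str         # which tool produced the data point
    service: str        # business service the point belongs to
    kind: str           # e.g. "response_time_ms"
    value: float
    timestamp: datetime

# Registry of feed adapters: each adapter turns one tool's raw rows
# into Metric records. Supporting a new tool means adding one adapter.
ADAPTERS: Dict[str, Callable[[dict], Metric]] = {}

def adapter(source: str):
    """Decorator that registers a normalization function for a source."""
    def register(fn):
        ADAPTERS[source] = fn
        return fn
    return register

@adapter("vendor_a")
def from_vendor_a(row: dict) -> Metric:
    # Hypothetical Vendor A reports latency in seconds under "rt".
    return Metric("vendor_a", row["app"], "response_time_ms",
                  row["rt"] * 1000.0,
                  datetime.fromtimestamp(row["ts"], tz=timezone.utc))

@adapter("vendor_b")
def from_vendor_b(row: dict) -> Metric:
    # Hypothetical Vendor B already reports milliseconds under "latency_ms".
    return Metric("vendor_b", row["service"], "response_time_ms",
                  float(row["latency_ms"]),
                  datetime.fromisoformat(row["time"]))

def ingest(source: str, rows: Iterable[dict]) -> List[Metric]:
    """Normalize a batch of raw rows from one source into common Metrics."""
    normalize = ADAPTERS[source]
    return [normalize(r) for r in rows]
```

A dashboard module built against `Metric` never needs to know which vendor the data came from: one feed or a dozen, the column renders the same.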
The next piece of the recipe… Changing your perspective from the bottom up to a top down approach. What’s that mean? Tune in next week to find out!
ABOUT THE AUTHOR
Matthew Bradford has been in the I.T. Performance Business for 15 years and has been critical to the success of many Fortune 500 Performance Management groups. He is currently the CTO of InsightETE, an I.T. Performance Management company specializing in passive monitoring and big data analytics with a focus on real business metrics.