Introduction

In Part 1 (link here) I discussed the first ingredient we used in our recipe for designing an effective Single Pane of Glass.  A quick recap:

  • For the purposes of this series, APM refers to Application Performance Monitoring, a segment of IT Performance Management (ITPM).  Most of the principles we’ll be discussing here relate more to ITPM, though we’ll touch on a lot of APM topics as well.
  • The first ingredient of the four-part recipe is being “Data Agnostic,” meaning we needed to design our solution to be able to import data from any ITPM-centric datasource.

In Part 2 (link here) I discussed an important philosophy for problem solving: The Top-Down Approach.  A quick recap:

  • IT Solutions are too complex and interconnected to focus on a single component, unless you already know that component is the source of an overall problem.
  • A Top-Down approach allows a user to look at the entire IT landscape and drill in where a potential problem is indicated, saving time and enhancing problem correlation, even between seemingly unrelated systems.

So without further ado, the third ingredient in our recipe is…

Get rid of the clutter!

The most disturbing trend I’ve found in this industry is the need to push as much detailed information as possible onto each and every screen.  Putting the user directly into the weeds is just bad problem solving.  It is like trying to find a needle in a haystack… where do you begin to look?  Why would people want this and why would vendors enable it?  Well, here’s one theory:

Vendors want to ensure that their products provide business value, but they also want those products to be just complicated enough that their clients will need to engage them for services.  Why do the clients accept this paradigm?  Frankly, it looks impressive to have a thousand little dials everywhere.

Ever step into the cockpit of even a small plane?  It is a very impressive sight indeed.  Now, I’m not a pilot, and therefore could tell you absolutely nothing useful about what that plane is doing… but it looks amazing.  Sure, you could make the argument that to the trained eye, the cockpit makes perfect sense and, for the most part, you’d be right.  There are, however, some key differences in how a cockpit is implemented.  Chief among them: warning lights and beeping alarms.  The alarm lets you know you need to look at the dashboard of the cockpit.  The light draws the eye to the problem at hand.  Even the best pilots can’t watch the entire dashboard and fly the plane without some sort of aid to show them where the interesting information is.  There is just too much there.  So it is with the vast majority of the APM dashboards out there.  Take a look at a few…

[Image: dashboards]

Impressive, right?  Except these aren’t telling you much of anything.  Even the better examples here are focused on a single application and its tiers.  Not a single one of these is suitable for an enterprise view.  All of them are acceptable as drill-down screens, but they are presented as top-level dashboards.  In contrast:

[Image: UIC]

Our approach is easy to understand for anyone looking at it for the first time.  Even if the green/yellow/red system is new to you, there’s a clear dashboard key to tell you what each bulb color means.  The grid shows overall systems and their tiers on the Y axis, and the types of information being tracked on the X axis.  This view gives you exactly enough information to know, at a high level, whether there are any problems anywhere in the application stack.  If there are problems, all you have to do is click on the color that is interesting (red in the screenshot above).
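For readers who think in code, here is a minimal sketch of how a grid like that could be modeled.  This is not our product’s implementation; the system names, metric names, and warning/critical thresholds below are all hypothetical and exist only to illustrate the idea of one bulb per system tier and metric type.

```python
from enum import IntEnum


class Status(IntEnum):
    """Bulb colors, ordered so that a higher value means a worse state."""
    GREEN = 0
    YELLOW = 1
    RED = 2


def bulb(value, warn, crit):
    """Map a metric value to a bulb color (thresholds are assumptions)."""
    if value >= crit:
        return Status.RED
    if value >= warn:
        return Status.YELLOW
    return Status.GREEN


def build_grid(readings, thresholds):
    """Build {row: {metric: Status}} from {(row, metric): value}.

    Rows are systems/tiers (Y axis); metrics are the tracked
    information types (X axis).
    """
    grid = {}
    for (row, metric), value in readings.items():
        warn, crit = thresholds[metric]
        grid.setdefault(row, {})[metric] = bulb(value, warn, crit)
    return grid


def needs_attention(grid):
    """List the (row, metric) cells that are not green."""
    return [
        (row, metric)
        for row, cells in grid.items()
        for metric, status in cells.items()
        if status is not Status.GREEN
    ]


# Hypothetical sample data: response time in seconds, error rate as a fraction.
thresholds = {"Response Time": (2.0, 4.0), "Errors": (0.01, 0.05)}
readings = {
    ("Billing / Web", "Response Time"): 1.2,
    ("Billing / Web", "Errors"): 0.0,
    ("Billing / DB", "Response Time"): 4.5,
    ("Billing / DB", "Errors"): 0.02,
}

grid = build_grid(readings, thresholds)
for cell in needs_attention(grid):
    print(cell, grid[cell[0]][cell[1]].name)
```

The point of the sketch is that the top-level view only carries a color per cell; the detail stays behind the click.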

Once clicked, you have an instant graph showing the performance of the overall system over the past hour.  The specific business transaction that is having performance problems is selected by default.  Additionally, the shaded area tells you the normal average for that hour on that day of the week, and the normal level of variance the system sees during that time (the standard deviation).  You can open as many of those bulbs as you want (and have each of them auto-update with new data) or you can keep them all closed.  The power is completely in your hands.  See as much, or as little, information as you need.
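To make the shaded area concrete, here is a rough sketch of one way a baseline like that could be computed: group historical samples by day of week and hour, then take the mean and standard deviation of each group.  The function names, the one-standard-deviation band width, and the sample values are assumptions for illustration, not a description of how our tooling calculates it.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def build_baseline(samples):
    """Return {(weekday, hour): (average, standard deviation)}.

    samples is an iterable of (datetime, value) pairs.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[(ts.weekday(), ts.hour)].append(value)
    return {key: (mean(vals), pstdev(vals)) for key, vals in buckets.items()}


def normal_band(baseline, ts, width=1.0):
    """Shaded area for the hour containing ts: average +/- width * stdev."""
    avg, sd = baseline[(ts.weekday(), ts.hour)]
    return avg - width * sd, avg + width * sd


def is_outside(baseline, ts, value, width=1.0):
    """True when a value falls outside the normal band for its hour."""
    low, high = normal_band(baseline, ts, width)
    return value < low or value > high


# Hypothetical history: response times (ms) on four Mondays during the 9:00 hour.
history = [
    (datetime(2024, 1, 1, 9, 15), 100),
    (datetime(2024, 1, 8, 9, 15), 110),
    (datetime(2024, 1, 15, 9, 15), 90),
    (datetime(2024, 1, 22, 9, 15), 100),
]
baseline = build_baseline(history)

now = datetime(2024, 1, 29, 9, 30)  # the following Monday, same hour
print(normal_band(baseline, now))
print(is_outside(baseline, now, 130))
```

Comparing against the same hour on the same weekday is what keeps a busy Monday morning from looking like a problem just because it is busier than Sunday night.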

Also, if you click within that window again, it opens a very detailed historical charting tool so you can drill down and gain more historical context.  In there you can view alert overlays, trending data, service level agreement data, and even a performance distribution histogram, along with the standard deviation and baseline data already present.  It is important to note that this works with data that we collect… and, most importantly, data we did NOT collect.  (A throwback to the first ingredient, being Data Agnostic.)  That detail screen might look something like this:

[Image: UIC Detail]

While that screenshot has a lot going on, the point behind it is that all the same detail everyone else covets is still there.  With our approach, however, we allow the user to turn different analysis tools on and off.  A trend line may or may not be important to the task you’re trying to complete.  You may or may not care about an event overlay at a certain time, and so on.

It all boils down to this: Present simplicity, and allow the user to introduce complicated data layer by layer.

Up Next…

The final ingredient… Make a dashboard that tells on itself when it doesn’t follow the two cardinal rules of dashboards… What are those two cardinal rules?  Find out next week!

ABOUT THE AUTHOR
Matthew Bradford has been in the I.T. Performance Business for 15 years and has been critical to the success of many Fortune 500 Performance Management groups. He is currently the CTO of InsightETE, an I.T. Performance Management company specializing in passive monitoring and big data analytics with a focus on real business metrics.