Introduction

Site down again? You've got to be kidding me!

Maybe he won’t notice your site is down… he noticed.

Let’s face it, customers expect the world. They want the best deal, great service and rapid delivery. Should a company fail to deliver, customers will turn to social media for justice! Just look to platforms like Twitter, Yelp or Facebook for countless examples of customers voicing their dissatisfaction with the quality of service they’ve received.

With this being so important to buyers, why do companies continue to suffer tech glitches that impact the customer experience? Well, it’s very difficult to develop, test and support enterprise applications across complex systems. Doing so requires time, planning, coordination and support from the business side, things that are sometimes considered a luxury to IT departments. So what can companies do? Shifting IT culture towards serving the business is key. IT should function as an organization that develops and supports business objectives, including the goal of providing a superior customer experience.

How can this be done? Immediately implementing Quality Assurance and Performance Testing efforts is a great start. QA and Performance Testing can eliminate many of the tech glitches that plague users today, helping IT take the first step towards becoming a supporting unit to the business. Let’s take a look at 6 key types of tests that every company needs in order to protect the customer experience.

6 Types of Quality Assurance and Performance Testing

QA Testing

Quality Assurance (QA) testing focuses on ensuring that the code behind the service performs as expected. This type of testing relies on simulating the actions that real end-users will take when using your applications or website. It can be used to test changes to existing services, upgrades, revisions, or new offerings. Whether launching a new online banking feature to customers, or revising a single process flow on a website, it is important to test the quality of these services ahead of time. Two types of QA testing are highlighted in this article: Functional Testing and Parallel Testing.

  1. Functional Testing – Often performed by the development team, this type of QA testing ensures that the code behind the application or website functions as expected. Functional testing is limited to examining only how the code performs under the most ideal of circumstances.
  2. Parallel Testing – This type of QA testing is performed to compare the quality of a service between a preexisting system and a new one. When upgrading from a legacy system to a modern one, or moving to virtualized or cloud services, it is important to compare the quality of your service on both systems. To do so, production data from the old system is run through the new system to create a before-and-after comparison and quality check.
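
The comparison at the heart of Parallel Testing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `legacy_fee` and `new_fee` are hypothetical stand-ins for the old and new systems, and `sample_records` stands in for production data captured from the old system.

```python
from typing import Callable, Iterable, List, Tuple


def legacy_fee(amount_cents: int) -> int:
    """Hypothetical legacy rule: 2% fee, rounded down, minimum 50 cents."""
    return max(50, amount_cents * 2 // 100)


def new_fee(amount_cents: int) -> int:
    """Hypothetical rewrite that rounds half up instead of down."""
    return max(50, (amount_cents * 2 + 50) // 100)


def parallel_test(
    old: Callable[[int], int],
    new: Callable[[int], int],
    records: Iterable[int],
) -> List[Tuple[int, int, int]]:
    """Run every record through both systems and collect the mismatches."""
    mismatches = []
    for record in records:
        old_result, new_result = old(record), new(record)
        if old_result != new_result:
            mismatches.append((record, old_result, new_result))
    return mismatches


# Stand-in for production data taken from the old system (amounts in cents).
sample_records = [0, 999, 2500, 2575, 10_000, 123_456]

for record, old_result, new_result in parallel_test(legacy_fee, new_fee, sample_records):
    print(f"input={record}: legacy={old_result}, new={new_result}")
```

In this made-up example, only the 2575 record differs between the two systems, which is exactly the kind of quiet behavioral change a before-and-after comparison is meant to surface.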

Performance Testing

Once QA testing has ensured that applications and websites are performing as expected under ideal circumstances, performance testing should follow in order to determine how non-ideal circumstances affect the service. Examples of non-ideal circumstances include instances when many end-users access the system simultaneously, or a component of the system (such as a server in a cluster) has failed. There are four types of Performance Tests described below:

  1. Smoke Testing – This type of performance testing aims to apply just enough load to the system in order to exercise all the individual components that make up the service. This is used to ensure each component of the system is functioning as expected.
  2. Peak Hour Testing – This performance test applies the amount of load expected during the busiest hour of the day, referred to as peak hour, and runs that volume on the system for 60 continuous minutes. Simulating this worst-case hour allows IT to determine if the system can withstand that volume of traffic without any major issues emerging in the short term. Typically, IT will test the peak hour load (x), plus 1.5x and ideally 2x or 3x to adequately measure short-term performance. With Peak Hour Testing, IT is also looking to determine if, in a load-balanced environment, a server or other component failure will affect the quality of the service.
  3. Soak Testing – The soak test builds upon the Peak Hour Test by taking the highest peak hour load at which your system still ran comfortably and applying it continuously for 8-12 hours. Just as the Peak Hour Test examines performance in the short term, the Soak Test looks at the performance of the system over the long term, checking for memory leaks or other major issues that will develop and worsen over time. Extending the Soak Test to a full 24 hours lets IT see how the system will perform after one month’s worth of transaction data.
  4. Stress Testing – Stress Testing increasingly turns up the load applied to the system to find the breaking point. This is helpful for capacity planning purposes, since it tells IT when they should next be worried about exceeding the capabilities of existing systems. It also gives production support an excellent baseline to work with in determining the weakest link within a system and how the service will behave when over-stressed.
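
To make the load levels above more concrete, here is a minimal Python sketch. It does not drive a real load tool. `simulated_response_seconds` is a hypothetical stand-in for a measurement taken against a real system, and the 2,000-user capacity, the 2-second limit and the memory samples are invented purely for illustration.

```python
from typing import Callable, List, Optional


def peak_hour_plan(peak_load: int) -> List[int]:
    """Load levels for Peak Hour Testing: x, 1.5x, 2x and 3x."""
    return [int(peak_load * multiplier) for multiplier in (1, 1.5, 2, 3)]


def simulated_response_seconds(users: int) -> float:
    """Hypothetical system: flat 0.5s responses up to 2,000 users, then slower."""
    capacity = 2000
    if users <= capacity:
        return 0.5
    return 0.5 + (users - capacity) * 0.002


def find_breaking_point(
    measure: Callable[[int], float],
    start: int,
    step: int,
    limit_seconds: float,
    max_users: int,
) -> Optional[int]:
    """Stress test ramp: raise the load until the response time exceeds the limit."""
    users = start
    while users <= max_users:
        if measure(users) > limit_seconds:
            return users
        users += step
    return None  # no breaking point found within the tested range


def leak_suspected(memory_mb: List[float], tolerance_mb: float) -> bool:
    """Soak test heuristic: did memory grow more than the tolerance over the run?"""
    return memory_mb[-1] - memory_mb[0] > tolerance_mb


print(peak_hour_plan(1000))
print(find_breaking_point(simulated_response_seconds, 1000, 500, 2.0, 10_000))
print(leak_suspected([512, 530, 551, 575, 590], tolerance_mb=50))
```

In practice the measurements would come from a load testing tool and production monitoring rather than a formula, but the decision logic (planned multiples of peak load, a ramp to the breaking point, and a growth check over a long run) stays the same.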

Do We Really Need All This Testing?

Yes, yes you do! Just ask Apple, NASDAQ, Sears, Kohl’s, Kmart or those responsible for healthcare.gov.

Just this year, Apple launched a new payment system, but things did not go off without a hitch. Apple mistakenly double charged Bank of America customers using Apple Pay [1]. Had Apple effectively performed functional and performance testing prior to the launch, they could have caught and eliminated this error before it impacted customers. Most likely, this would have helped Apple to avoid much of the reluctance from the marketplace that they’ve since faced.

In the case of NASDAQ, the exchange suffered an outage last year that was attributed to glitches in communication between two merged systems [2]. In this incident, parallel testing could have examined how applications would perform in the converged IT system and identified points of weakness. In fact, with a change such as this, all the types of QA and Performance Testing discussed above should have been performed. Doing so would have caught the memory leaks also experienced by NASDAQ, and could have given IT a better understanding of how a single server failure would impact the overall service.

There’s also the Black Friday crisis for Sears, Kohl’s and Kmart in 2012. These companies experienced website outages for nearly 11 hours on the busiest shopping day of the year [3]. It was perfectly clear that these retailers did not adequately prepare for the volume of traffic visiting their sites. Modified Peak Hour Testing, using load data from estimated Black Friday traffic, and Soak Testing would have helped them to determine at which point their systems would become overwhelmed. Armed with this data, these retailers could have worked with IT to modify existing infrastructure to support anticipated volumes of traffic.

Lastly, there’s the example of healthcare.gov, which infamously failed to apply any of the QA and Performance Testing categories discussed throughout this blog. The website’s launch was a disaster, with frequent outages and illogical process flows. Without functional testing, the project managers and developers of the website were left completely unaware of coding errors that resulted in incomplete end-user transactions. Because they did not adequately perform peak, soak, smoke or stress testing, healthcare.gov was not prepared to handle the volume of end-users attempting to register at the same time. In fact, pre-launch stress tests examined the impact of up to 60,000 simultaneous users and found that even 1,100 users accessing the site at the same time would slow performance [4]. In reality, 8.1 million users visited the site in 4 days, with up to 250,000 users accessing the site at once [4]. Only after these tests were finally performed was a development team able to correct the issues plaguing the site, and quell the savage paparazzi magnifying the scandal.

Summary

It is crucial for both business and IT to understand the impact that technology can have on the bottom line. While rushed deadlines, tight budgets, staffing constraints and other challenges make it tempting to skip QA and Performance Testing, the case studies examined in this article, along with countless other instances, highlight the damaging cost of doing so. Without QA and Performance Testing, companies remain in the dark about their future customer experiences and continue to be trapped in a reactive state. Reacting to lost revenue, poor customer experiences, and bad publicity is far less profitable than executing proper testing to prevent these instances from occurring.

About the Authors

Matthew Bradford

Matthew Bradford has been in the I.T. Performance Business for 15 years and has been critical to the success of many Fortune 500 Performance Management groups. He is currently the CTO of InsightETE, an I.T. Performance Management company specializing in passive monitoring and big data analytics with a focus on real business metrics.

Tara Sharif

Tara Sharif has extensive experience in IT recruiting and sales. She is consistently a top performer at every organization that employs her. She is currently working as an Executive Account Manager at InsightETE, where she partners with enterprise organizations to provide APM consulting services. InsightETE is an Application Performance Management firm offering a full line of solutions to Fortune 1000 and larger companies. InsightETE offers APM consulting, IT staffing, and a proprietary and patented method to perform root cause analysis in 15 minutes or less. InsightETE’s software gives its clients the ability to measure and troubleshoot IT system performance on a granular level. Additionally, InsightETE clients can measure true response time, track service levels, and reduce outages as they root out problems from their verified source. What’s more, they see an increase in their customer service satisfaction by eliminating service level disagreements.

Footnotes

[1] Retrieved 10 October 2014 from http://www.bloomberg.com/news/2014-10-22/bank-of-america-customers-double-charged-in-apple-pay-snafu.html

[2] Retrieved 1 October 2014 from http://www.bloomberg.com/news/2013-08-26/nasdaq-three-hour-halt-highlights-vulnerability-in-market.html

[3] Retrieved 5 October 2014 from http://www.joinfora.com/black-friday-website-outages-could-have-been-prevented/

[4] Retrieved 5 October 2014 from http://blogs.forrester.com/lauren_nelson/13-12-23-overview_of_healthcaregovs_tech_challenges and http://www.forbes.com/sites/lorenthompson/2013/12/03/healthcare-gov-diagnosis-the-government-broke-every-rule-of-project-management/