I read a great post by Aleksis Tulonen Brainstorming Test Ideas With Developers and wanted to share it with you.

The ideas in the post provide a very efficient way to:

  1. increase the robustness and thoroughness of testing,
  2. prioritize among different testing ideas (both from a relative importance standpoint and from a relative timing perspective)
  3. surface new testing ideas that would not otherwise be considered
  4. provide some useful risk mitigation protection against the reality that "all models are wrong; some models are useful" problem

I'd suggest also scheduling a debrief with the same testers and developers a few weeks or months after the code was deployed.

At that time take a look at photos of the pre-testing Post It notes and a list of defects that slipped past testing and ask

  1. "what extra tests would we need to have added to have found these defects?"
  2. "what specific inputs did we forget to include?",
  3. would techniques such as pairwise testing have been beneficial?
  4. what areas did we spent a lot of effort on that did not result in learning much?
  5. what lessons learned should we incorporate for next time?

The idea of delibrately examining your software development and testing practices will be familar to those using agile retrospectives. The power of continually improving the development practices used withing the organization is hard to appreciate but it is immense. The gains compound over time so the initial benefits are only a glimpse of what can be achieve by continuing to iterate and improve.

Related: Pairwise and Combinatorial Software Testing in Agile Projects - Applying Toyota Kata to Agile Retrospectives - Create a Risk-based Testing Plan With Extra Coverage on Higher Priority Areas

By: John Hunter and Justin Hunter on Dec 15, 2016

Categories: Agile, Software Testing, Testing Strategies

Performance testing examines how the software performs (normally "how fast") in various situations.

Performance testing does not just result in one value. You normally performance test various aspects of the software in differing conditions to learn about the overall performance characteristics. It can well be that certain changes will improve the performance results for some conditions (say a powerful laptop with a fiber connection) and greatly degrade the performance for other use cases. And often the software can be coded to attempt to provide different solutions under different conditions.

All this makes performance testing complex. But trying to over-simplify performance testing removes much of its value.

Another form of performance testing is done on sub components of a system to determine what solutions may be best. These often are server based issues. These likely don't depend on individual user conditions but can be impacted by other things. Such as under normal usage option 1 provides great performance but under larger load option 1 slows down a great deal and option 2 is better.

Focusing on these tests of sub components run the risk of sub-optimization, where optimizing individual sub-components result in a less than optimal overall performance. Performance testing sub-components is important but it is most important is testing the performance of the overall system. Performance testing should always place a priority on overall system performance and not fall into the trap of creating a system with components that perform well individually but when combined do not work well together.

Load testing, stress testing and configuration testing are all part of performance testing.

Continue reading about performance testing in the Hexawise software testing glossary.

By: John Hunter on Sep 27, 2016

Categories: User Experience, Software Testing Testing Strategies, Software Testing

Software testing concepts help us compartmentalize the complexity that we face in testing software. Breaking the testing domain into various areas (such as usability testing, performance testing, functional testing, etc.) helps us organize and focus our efforts.

But those concepts are constructs that often have fuzzy boundries. What matters isn't where we should place certain software testing efforts. What matters is helping create software that users find worthwhile and hopeful enjoyable.

One of the frustration I have faced in using internet based software in the last few years is that it often seems to be tested without considering that some users will not have fiber connections (and might have high latency connections). I am not certain latency (combined maybe with lower bandwidth) is the issue but I have often found websites either actually physically unusable or mentally unusable (it is way too frustrating to use).

It might be the user experience I face (on the poorly performing sites) is as bad for all users, but my guess is it is a decent user experience on the fiber connections that the managers have when they decide this is an OK solution. It is a usbility issue but it is also a performance issue in my opinion.

It is certainly possible to test performance results on powerful laptops with great internet connections and get good performance results for web applications that will provide bad performance results on smart phones via wifi or less than ideal cell connections. This failure to understand the real user conditions is a significant problem and an area of testing that should be improved.

I consider this an interaction between performance testing and user-experience testing (I use "user-experience" to distinguish it from "usability testing", since I can test aspects of the user experience without users testing the software). The page may load in under 1 second on a laptop with a fiber connection but that isn't the only measure of performance. What about your users that are connecting via a wifi connection with high latency? What if the performance in that case is that it takes 8 seconds to load and your various interactive features either barely work or won't work at all given the high latency.

In some cases ignoring the performance for some users may be OK. But if you care about a system that delivers fast load times to users you need to consider the performance not just for a subset of users but consider how it performs for users overall. The extent you will prioritize various use cases will depend on your specific situation.

I have a large bias for keeping the basic experience very good for all users. If I add fancy features that are useful I do not like to accept meaningful degradation to any user's experience - graceful degradation is very important to me. That is less important to many of sites that I use, unfortunately. What priority you place on it is a decision that impacts your software development and software testing process.

Hexawise attempts to add features that are useful while at the same time paying close attention to making sure we don't make things worse for users that don't care about the new feature. Making sure the interface remains clear and easy to use is very important to us. It is also a challenge when you have powerful and fairly complex software to keep the usability high. It is very easy to slip and degrade the users experience. Sean Johnson does a great job making sure we avoid doing that.

Maintaining the responsiveness of Hexawise is a huge effort on our part given the heavy computation required in generating tests in large test case scenarios.

You also have to realize where you cannot be all things to all people. Using Hexawise on a smart phone is just not going to be a great experience. Hexawise is just not suited to that use case at all and therefore we wouldn't test such a use case.

For important performance characteristics it may well be that you should create a separate Hexawise test plan to test the performance under server different conditions (relating to latency, bandwidth and perhaps phone operating system). It could be done within a test plan it just seems to me more likely separate test plans would be more effective most of the time. It may well be that you have the primary test plan to cover many functional aspects and have a much smaller test plan just to check that several things work fine in a high latency and smart phone use case).

Within that plan you may well want to test out various parameter values for certain parameters operating system iOS Android 7 Android 6 Android 5

latency ...

Of course, what should be tested depends on the software being tested. If none of the items above matter in your case they shouldn't be used. If you are concerned about a large user base you may well be concerned about performance on various Android versions since the upgrade cycle to new versions is so slow (while most iOS users are on the latest version fairly quickly).

If latency has a big impact on performance then including a parameter on latency would be worthwhile and testing various parameter values for it could be sensible (maybe high, medium and low). And the same with testing various levels of bandwidth (again, depending on your situation).

My view is always very user focused so the way I naturally think is relating pretty much everything I do to how it impacts the end user's experience.

Related: 10 Questions to Ask When Designing Software Tests - Don’t Do What Your Users Say - Software Testers Are Test Pilots

By: John Hunter on Jul 26, 2016

Categories: User Experience, Testing Strategies, Software Testing

Usability testing is the practice of having actual users try the software. Outcomes include the data of the tasks given to the user to complete (successful completion, time to complete, etc.), comments the users make and expert evaluation of their use of the software (noticing for example that none of the users follow the intended path to complete the task, or that many users looked for a different way to complete a task but failing to find it eventually found a way to succeed).

Usability testing involves systemic evaluation of real people using the software. This can be done in a testing lab where an expert can watch the user but this is expensive. Remote monitoring (watching the screen of the user; communication via voice by the user and expert; and viewing a webcam showing the user) is also commonly used.

In these setting the user will be given specific tasks to complete and the testing expert will watch what the user does. The expert will also ask the user questions about what they found difficult and confusing (in addition to what they liked) about the software.

The concept of usability testing is to have feedback from real users. In the event you can't test with the real users of a system it is important to consider if you are fairly accuratately representing that population with your usability testers. If the users of the system of fairly unsophisticated users if you use usability testers that are very computer savy they may well not provide good feedback (as their use of the software may be very different from the actually users).

"Usability testing" does not encompass experts evaluating the software based on known usability best practices and common problems. This form of expert knowledge of wise usability practices is important but it is not considered part of "usability testing."

Find more exploration of software testing terms in the Hexawise software testing glossary.

Related: Usability Testing Demystified - Why You Only Need to Test with 5 Users (this is not a complete answer, it does provide insite into the value of quick testing to run during the development of the software) - Streamlining Usability Testing by Avoiding the Lab - Quick and Dirty Remote User Testing - 4 ways to combat usability testing avoidance

By: John Hunter on Jul 4, 2016

Categories: User Experience, User Interface, Testing Strategies, Software Testing

At Hexawise we aim to improve the way software is tested. Achieving that aim requires not only providing our clients with a wonderful software tool (which our customers say we’re succeeding at) but also a commitment from the users of our tool to adopt new ways of thinking about software testing.

We have written previously about our focus on the importance of the values Bill Hunter (our founder's father) to Hexawise. That has led us to constantly focus on how maximize the benefits our customers gain using Hexawise. This focus has led us to realize that our customers that take advantage of the high-touch training services and ongoing expert test design support on demand that we offer often realize unusually large benefits and roll out usage of Hexawise more quickly and broadly than our customers who acquire licenses to Hexawise and try to “get the tool and make it available to the team.”

We are now looking for someone to take on the challenge of helping our clients succeed. The principles behind our decision to put so much focus on helping our customers succeed are obvious to those that understand the thinking of Bill Hunter, W. Edwards Deming, Russel Ackoff etc. but they may seem a bit odd to others. The focus of this senior-level position really is to help our customers improve their software testing results. It isn't just a happy sounding title that has no bearing on what the job actually entails.

The person holding this position will report to the CEO and work with other executives at Hexawise who all share a commitment to delighting our customers and improving the practice of software testing.

Hexawise is an innovative SaaS firm focused on helping large companies use smarter approaches to test their enterprise software systems. Teams using Hexawise get to market faster with higher quality products. We are the world’s leading firm in our niche market and have a growing client base of highly satisfied customers. Since we launched in 2009, we have grown both revenues and profits every year. Hexawise is changing the way that large companies test software. More than 100 Fortune 500 companies and hundreds of other smaller firms use our industry leading software.

Join our journey to transform how companies test their software systems.

Hexawise office

Description: VP of Customer Success

In the Weeks Prior to a Sale Closing

  • Partner with sales representatives to conduct virtual technical presentations and demonstrations of our Hexawise test design solution.

  • Clearly explain the benefits and limitations of combinatorial test design to potential customers using language and concepts relevant to their context by drawing upon your own “been there, done that” experiences of having successfully introduced combinatorial test design methods in multiple similar situations.

  • Identify and assess business and technical requirements, and position Hexawise solutions accordingly.

Immediately Upon a New Sale Closing

  • Assess a new client’s existing testing-related processes, tools, and methods (as well as their organizational structure) in order to provide the client with customized, actionable recommendations about how they can best incorporate Hexawise.

  • Collaborate with client stakeholders to proactively identify potential barriers to successful adoption and put plans in place to mitigate / overcome such barriers.

  • Provide remote, instructor-led training sessions via webinars.

  • Provide multi-day onsite instructor-led training sessions that: cover basic software test design concepts (such as Equivalence Class Partitioning, the definition of Pairwise-Testing coverage, etc.) as well as how to use the specific features of Hexawise.

  • Include industry-specific and customer-specific customized training modules and hands-on test design exercises to help make the sessions relevant to the testers and BA’s who attend the training sessions.

  • Collaborate with new users and help them iterate, improve, and finalize their first few sets of Hexawise-generated software tests.

  • Set rollout and adoption success criteria with clients and put plans in place to help them achieve their goals.

Months After a New Sale Closing

  • Continue to engage with customers onsite and virtually to understand their needs, answer their test design questions, and help them achieve large benefits from test optimization.

  • Monitor usage statistics of Hexawise clients and proactively reach out to clients, as appropriate, to provide proactive assistance at the first sign that they might be facing any potential adoption/rollout challenges.

  • Collaborate with stakeholders and end users at our clients to identify opportunities to improve the features and capabilities of Hexawise and then collaborate with our development team to share that feedback and implement improvements.

Required Skills and Experience

We are looking for a highly-experienced combinatorial test design expert with outstanding analytical and communication skills to provide these high touch on-boarding services and partner with our sales team with prospective clients.

Education and Experience

  • Bachelor’s or technical university degree.

  • Deep experience successfully introducing combinatorial test design methods on multiple different kinds of projects to several different groups of testers.

  • Set rollout and adoption success criteria with multiple teams and put plans in place to achieve them.

  • Minimum 5 years in software testing, preferably at a IT consulting firm or large financial services firm.

Knowledge and Skills

  • Ability to present and demonstrate capabilities of the Hexawise tool, and the additional services we provides to our clients beyond our tool.
  • Exhibit excellent communication and presentation skills, including questioning techniques.
  • Demonstrate passion regarding consulting with customers.
  • Understand how IT and enterprise software is used to address the business and technical needs of customers.
  • Demonstrate hands-on level skills with relevant and/or related software technology domains.
  • Communicate the value of products and solutions in terms of financial return and impact on customer business goals.
  • Possess a solid level of industry acumen; keeping current with software testing trends and able to converse with customers at a detailed level on pertinent issues and challenges.
  • Represents Hexawise knowledgeably, based on a solid understanding of Hexawise’s business direction, portfolio and capabilities
  • Understand the competitive landscape for Hexawise and position Hexawise effectively.
  • A cover letter that describes who you are, what you've done, and why you want to join Hexawise.
  • Ability to work and learn independently and as part of a team
  • Desire to work in a fast-paced, challenging start-up environment

Why join Hexawise?

salary + bonus; medical and dental, 401(k) plans; free parking and very slick Chapel Hill office! Opportunity to experience work with a fast-growing, innovative technology company that is changing the way software is tested.

Key Benefits:

Salary: Negotiable, but minimum of $100,000 + Commissions based upon client license renewals Benefits: Health, dental included, 401k plan Travel: Average of no more than 2-3 days onsite per week Location: Chapel Hill, NC*

*Working from our offices would be highly preferable. We might consider remote working arrangements for an exceptional candidate based in the US.

Apply for the VP of Customer Success position at Hexawise.

By: John Hunter on May 12, 2016

Categories: Hexawise, Career, Software Testing, Lean, Customer Success, Agile

It has been quite a long time since we last posted a roundup of great content on software testing from around the web.

  • Mistakes We Make in Testing by Michael Sowers - "Not being involved in development at the beginning and reviewing and providing feedback on the artifacts, such as user stories, requirements, and so forth. Not collaborating and developing a positive relationship with the development team members..."
  • Changing the conversation about change in testing by Katrina Clokie - "I'm planning to stop talking about bugs, time and money. Instead I'm going to start framing my reasoning in terms of corporate image, increasing quality awareness among all disciplines, and improving end-user satisfaction."
  • How to Pack More Coverage Into Fewer Software Tests by Justin Hunter - "There are simply too many interactions for a tester to keep track of on their own. As a result, manually-selected test sets almost always fail to test for a rather large number of potentially important interactions."
  • Building Quality by Alan Page - "your best shot at creating a product that customers like crave is to get quantitative and qualitative feedback directly from your users... Get it in their hands, listen to them, and make it better."
  • Dr. StrangeCareer or: How I Learned to Stop Worrying and Love the Software Testing Industry by Keith Klain - "Testing is hard. Doing it right is very hard. An ambiguous, unchartered journey into a sea of bias and experimentation, but as the line from the movie goes, 'the hard is what makes it great'."
  • Exploratory Testing 3.0 by James Bach and Michael Bolton - "Because we now define all testing as exploratory. Our definition of testing is now this: 'Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes: questioning, study, modeling, observation and inference, output checking, etc.'"
  • Coverage Is Not Strongly Correlated With Test Suite Effectiveness by Laura Inozemtseva and Reid Holmes - "Our results suggest that coverage, while useful for identifying under-tested parts of a program, should not be used as a quality target because it is not a good indicator of test suite effectiveness. "
  • How Software Testers Can Teach, Share and Play with Others by Justin Rohrman - Software testers "bring a varied skill set to software development organizations — math, experiment design, modeling, and critical thinking."
  • Who Killed My Battery: Analyzing Mobile Browser Energy Consumption - " dynamic Javascript requests can greatly increase the cost of rendering the page... by modifying scripts on the Wikipedia mobile site we reduced by 30% the energy needed to download and render Wikipedia pages with no change to the user experience."

By: John Hunter on Apr 27, 2016

Categories: Roundup, Software Testing

When you outsource testing responsibilities to a third party, there are several key points to consider beyond the three most frequently asked questions of:

  • What is the cost per hour of your services?
  • What evidence can you provide of your firm’s level of expertise in our industry and/or the technical skills we feel you’ll need? and
  • What is your plan for increasing testing automation? In particular, it is worth developing a thorough understanding how the vendor will identify what kinds of things should be tested, how those things should be tested, and how that information will be communicated to you.
  1. How do you ensure testing is performed thoroughly - beyond just “happy path”/ confirmatory tests?
  2. Do you use Exploratory Testing broadly? Would you recommend we use it? Exploratory Testing is a well-established testing approach.
    • Champions of exploratory testing include James Bach and Michael Bolton.
    • If your vendor hasn’t heard of exploratory testing and starts to explain the disadvantages of undisciplined “ad hoc testing” it’s a warning sign that they might have rigid tunnel vision about what good testing entails. Ad hoc testing is not what exploratory testing is but those that haven't learned about exploratory testing may get the two ideas confused.
  3. How do you communicate the diminishing marginal returns that exist when you execute sets of test scripts?
    • In Hexawise, analyze tests screen is how we help show this concept. That screen shows both pecentage of parameter value pairs tested and the coverage matrix showing the parameter values pairs tested and untested. This view can seen for any specific number of tests along the test plan (using the adjustable bar at the top of the image). This image shows 80% of the paired values have been tested after 10 test cases. Using the slider this view lets you see exactly which parameter value pairs are untested at each point. In this example after 15 tests there is 96% pairwise coverage and 19 tests covers all pairwise combinations.
    • Be conscious that your vendor has a strong conflict of interest here. 80% 2-way coverage might be sufficient for your needs on a given project but testing to that level of thoroughness (instead of the complete set of tests) could represent 50% of their billable hours in which case they may go to 100% 2-way coverage (to increase billable hours). Of course, 100% 2-way coverage may be called for, these decisions have to be made with an understanding of the software being tested.
    • Implications: ideally, you would like to have a testing vendor that is happy to openly share and discuss the tradeoffs that exist and there strategy to get you the best results based on your situation.
  4. Do your testers use a structured test design approach such as pairwise test design?
    • If so, how many of your testers regularly use pairwise test design or similar methods to design tests?
    • During the process of selling you their testing services, vendors have a strong incentive to highlight their efficiency-enhancing capabilities. As soon as the ink dries on the contract though they have an incentive to keep billable hours high.
  5. Why (and in what types of testing) do they use it?
    • Pairwise and combinatorial test design approaches can be widely used in virtually all kinds of software testing.
    • If your vendor indicates that they use these approaches only in narrow areas (e.g., most likely functional testing or configuration testing), that is a large red flag that they don’t understand these test design approaches very well at all.
    • If your vendor does not understand how to apply these test design methods broadly (including in systems integration testing, performance testing, etc.) then you can safely assume that there will be considerable wasteful redundancy in any areas where these test design approaches are not used.
  6. If we gave you 50 test scripts we already have, could you generate a set of pairwise tests for us that could test the same system more thoroughly in significantly fewer tests?
    • A knowledgeable vendor should be able to analyze your tests and provide a draft set for your review within 24 hours.
    • If the vendor takes multiple days and they need to search far and wide for an available expert to create a set of optimized set of tests for you, that indicates that the actual number of testers capable of performing this kind of test design is probably pretty low.
    • Another practical way to double-check vendor claims is to search sites like LinkedIn for keywords such as pairwise testing, orthogonal array testing, and/or Hexawise.

Making sure your software testing process is staying current with the best ideas in software testing is an important factor in creating great software solutions that your customers love. Often companies understand the need to stay current on software coding practices that are successful but fewer organizations pay attention to good practices in software testing. This often means there is a good deal to be gained by spending some time to examine and improve your software testing practices.

By: John Hunter and Justin Hunter on Mar 29, 2016

Categories: Testing Strategies, Software Testing

When thinking about how to test software these questions can help you think of ideas you might otherwise miss. These tips are useful for anyone testing software; we do integrate tips that are specfically relevant to those using the Hexawise software testing application.

  • What additional things could I add that probably won't matter (but might?)
  • What are the two parameters in the plan with the most values? And should I shrink the number of parameter values for either case?). If you have a parameter or two with many more parameter values than other parameters have that can greatly increase the number of test cases that must be run to explore every pairwise (or higher) combination. If those are critical to test, then they should be, but often a high number of parameter values is an indication that they really could be reduced. Using value expantions may well be a wise choice.
  • Do the tests cover all the high priority scenarios that stakeholders want covered? Once you generate the tests think about high priority scenarios and if they are missing add them as required test cases. It may well be worth adding tests for the most common scenarios. Often this can be done without requiring extra tests when using Hexawise. Hexawise will just consider those and then create test cases to match your criteria which often won't require an extra test. Sometimes it will add a test or a few tests to the total, in which case you can decide the benefit is worth the added cost of those tests.
  • If you're testing a process, what are the if/then points where the process diverges? Make sure the alternative processes are being tested.
  • If the plan is heavily constrained, would it make sense to split it into multiple separate plans?
  • Might it make a difference to how the system operates if you were to include additional variation ideas related to: who, what, when, where, why, how and how many? Those questions are often helpful in identifying parameters that should be tested (and occassionally in identifying parameter values to test).
  • Have you accidentally included values in your plan that everyone is confident won't impact the system? You want to test everything that is important but if you test things that are not going to impact whether the software works or not it adds cost with no benefit.
  • Have you accidentally included "script killing" negative tests in your plan? See details on this point in our training file, different types of negative tests need to be handled differently when using Hexawise
  • Have you clearly thought about the implications of the distinction between impossible to test for combinations vs combinations that will lead the system to display error messages? Impossible to test combinations can be avoided using the invalid pair feature. But situations where users can enter values that should results in error messages should be tested to validate those error messages appear.
  • Have you considered the cost/benefit trade-off of executing, say, 50% of the Hexawise generated test set vs 75% vs 100%? It is best to think about what is the most sensible business decision instead of just automatically going for some specfic percentage. Depending on the softare being tested and the impact on your business and customers different levels of testing will make sense.

Too often software testing fails to emphasis the importance of experienced software testers using their brains, insight, experience and knowledge to make judgements about what is the best course of action. Hexawise is a wonderful tool to help software testers but the tool needs a smart and thoughtful person to wield it and produce the results we all want to see.

Related: Software Testers Are Test Pilots - Create a Risk-based Testing Plan With Extra Coverage on Higher Priority Areas - How Much Detail to Include in Test Cases?

By: John Hunter and Justin Hunter on Mar 14, 2016

Categories: Testing Strategies, Software Testing, Hexawise test case generating tool, Combinatorial Software Testing, Hexawise tips

Michael Bolton is one of the software testing industry's deep thinkers. He has an impressive ability to logically analyze testing problems and clearly explain complex issues.

I like how Michael summarizes what people often really mean when they say "it works"*

It works really means...

There's a lot of truth in those words, isn't there? I've shared these words from Michael in test design trainings I've done recently and found that they immediately resonate with quite a few testers. It seems that anyone who has been in testing for more than a while has seen teams of testers test a feature or function a bit, declare that "it works," only to discover later that the feature/function works in some situations but does not work in other situations.

What's a tester to do? We recommend testers use two deliberate strategies: use a rich oracle and cover critical interactions.

First, use a "rich oracle" to enage your brain more actively and train your eye to better recognize potential issues. Imagine the following scenario. 3 people are in a room. The first person, a guy plucked from the street outside at random, is given a set of 10 written test scripts to execute and told to follow the test scripts, step-by-step in return for a six pack of beer. Being a fan of beer, and endowed with the dual-abilities of being able to both read instructions and follow instructions, he performs what is asked of him dutifully.

In the room are two testers who are allowed to observe the tests being executed but who are not allowed to communicate.

  • The first tester has a list of ten numbers, each with three boxes for checkmarks. He operates in a world of black and white where if the documented "Expected Result" is consistent with what they observe, they write a green check mark. If the "Expected Result" is inconsistent, they write a green "X."
  • The second tester operates differently. She goes beyond. She goes deeper. She notices subtle things along the way that look unexpected, or not quite right. She makes notes of those things. In doing so, she thinks of new test ideas that have not been executed yet. She documents those test ideas to explore further later, provided there is time.

I think you see where I'm going with this. My point is that the more curious approach adopted by the second tester is a far more valuable one to people who care about software quality. Why is this? Here too, Michael Bolton has words of wisdom to share that resonate well:

If you insist you need written requirements to find bugs

Second, testers should adopt test case design approaches in order to avoid the "under some conditions... once" risk. One of the most important benefits of using our Hexawise test case design tool is that, even with very basic pairwise test sets, every feature or function you test will automatically be tested multiple times. And under as many different conditions as possible in the time you have available.

After close to 10 years of introducing new groups of software testers to these types of test design approaches, people have a hard time believing how big efficiency and thoroughness improvements in this area are. That's why we always strongly encourage teams using our optimized test case selection approach to do apples-to-apples comparisons of coverage and defect-finding effectiveness. We work with teams to compare the coverage gaps of their existing "business as usual" test sets to how thorough they are when Hexawise is used to generate an optimized set of tests. Images of one recent coverage analysis is shown below. Data on defect-finding effectiveness and defect finding efficiency improvements resulting from optimized test case selection can also be found here.

Testing Coverage Analysis Hexawise Pairwise Combinatorial Testing OATS

 

*With thanks to Jon Bach for sharing this on his blog.

By: Justin Hunter on Sep 25, 2014

Categories: Pairwise Software Testing, Pairwise Testing, Combinatorial Testing, Software Testing

Begin with the “Goldilocks Rule” in mind to identify how much detail is appropriate.

unnamed

If your tests cover a large scope, as in a set of end-to-end tests of a process, you focus first on entering only the most important elements of that process into Hexawise. If your tests cover a small scope, as in tests that focus on a few items on a single screen, the amount of detail you will want to include in Hexawise is higher. Learn more about the Goldilocks Rule.

 

Imagine explaining your System Under Test to someone’s mother. Start with 5 things that may change in each test

Suggestion: do not start this process with detailed Requirements or Technical Specifications. Instead, start with your basic common sense description of some things that would change from test to test.

  • If you were explaining the application to someone’s mother, how would you explain what it does in 2 minutes?
  • What kinds of things would be important to vary from test to test?
    • Hardware and software configurations?
    • User types?
    • Different actions that a user might take?
  • Identify 5 things that change from test to test and turn those 5 things into your first Parameters.
    • How might those things change? Add one or more Values for each Parameter.
    • At this point general descriptions might be fine; (e.g., SUV’s or Economy cars vs. Toyota Corolla).
    • Remember that, where possible, you should avoid creating long lists of values.

 

Create a draft set tests, assess obvious big gaps in the tests, and start filling them.

What obvious types of scenarios are missing? Add parameters and values as necessary to fill those big, obvious holes.

 

Create tests again, assess whether you’re covering necessary business rules and requirements.

If your tests are not yet testing a business rule or Requirement that you want to test:

  • Consider adding a Parameter or Value
  • Consider adding a specific combination of values to be tested together in the requirements tab

 

Reduce scope if the test plan is too complex

  • Consider cutting the scope of the plan (e.g., create two different plans with largely similar parameters – one for regular users and one for special users – instead of one big plan which tries to “do it all.”)
  • Consider changing the way you are describing “hard coded” values. Instead, of “iPhone 4S with International roaming” (which might not be a valid option after the first part of the draft test case suggests a transaction for a phone for corporate customers from a Northern location responding to the special holiday offer…) consider using descriptive parameters and values along the lines of “from the available options of phones available at this point in the test case, select an option that meets as many of these conditions as you can…

 

Consider adding additional details into your plan from other sources.

Other sources for test ideas could be:

J Michael Hunter’s “You Are Not Done Yet”
Elisabeth Hendrikson’s Test Heuristics Cheat Sheet
Hans Buwalda’s Soap Opera Testing

By: John Hunter on Jun 17, 2014

Categories: Combinatorial Software Testing, Software Testing

The Hexawise Software Testing blog carnival focuses on sharing interesting and useful blogposts related to software testing.

 

  • The Zen of Application Test Suites by Curtis “Ovid” Poe – “This document is about testing applications — it’s not about how to write tests. Application test suites require a different, more disciplined approach than library test suites. I describe common misfeatures experienced in large application test suites and follow with recommendations on best practices.”

  • The Software Tester’s Easter Egg Hunt by Ben Austin – “The testing industry will benefit greatly if more people follow Holland’s example and prioritize critical thinking over the ability to write test scripts. That said, it is equally important to recognize that scripting tools, when used in the situations they’re intended for, can be great time-savers that can enable more thoughtful, context-driven exploratory testing.

  • “Grapefruit Juice Bugs” – A New Term for a Surprisingly Common Type of Surprising Bugs by Justin Hunter – “Like grapefruit juice’s impact on prescription drugs, software testing involves critical interactions between different parts of the system. And risks exist when these different parts interact with one another.”

  • Testing at Airbnb by Lou Kosak – “Building good habits around testing is hard, especially at an established company. One of the biggest lessons I’ve learned this year is that as long as you have a team that’s open to the idea, it’s always possible to start testing (even if you’ve got a six year old monolithic Rails app to contend with). Once you have a decent test suite in place, you can actually start refactoring your legacy code, and from there anything is possible. The trick is just to get started.”

  • Disruptive Testing – James Bach Interviewed by Rodney Urquhart – “the future I am helping to build is about systematically training up skilled testers, some of whom but not all with coding skills, so that a small number of testers can do– or coordinate to be done– all the testing that a large project might need. A good future for testing would be one with a lot fewer “testers” but each one of those testers being passionate about his craft.”

  • The role of a Test Manager in agile by Katrina Clokie – “In a cross-skilled team, the agile tester must ensure that the quality of testing is good regardless of who in the team is completing it. The tester becomes the spokesperson for collaborative testing practices, and provides coaching via peer reviews or workshops to those without a testing background.”

  • Speaking to Your Business Using Measurements by Justin Rohrman – “In my experience, no one measure did a great job of telling the story about my software ecosystem. I’ve been deceived by groups of measures, too, because I misunderstood their weaknesses. If we are so easily deceived by measurements, imagine what happens when we send them off to others who need quick, high-level information.”

 

unnamed

Photo in Ranakpur India by Justin Hunter.

 

  • Shine a light by Rob Lambert – “Tours, personas and a variety of other test ideas give you a way of re-shining your light. Use these ideas to see the product in different ways, but never forget that it’s often time that is against you. And time is one of the hardest commodities to argue for during your testing phase.”

  • Using mind-mapping software as a visual test management tool by Aaron Hodder – “I like to employ a style of software testing that emphasises the personal freedom and responsibility of the individual tester to continually optimise the quality of his/her work by treating test-related learning, test design, test execution and test result interpretation as mutually supportive activities that run in parallel throughout the project. When performed by a skilled tester, this approach yields valuable and consistent results.”

  • Helpful Tips for Hiring Better Testers by Isaac Howard – “I looked at the good testers around me and tried to identify the “whys” of their success. All of them were driven to learn and capable of adapting to change. If they didn’t know a tool or a tech, they learned it. Because under the hood, testing is learning and relearning software everyday. The following are seven changes I made to my interviewing process.”

By: John Hunter on Jun 10, 2014

Categories: Software Testing

Justin posted this to the Hexawise Twitter feed

cave-question

It sparked some interesting and sometimes humorous discussion.

cave-exploration

The parallels to software testing are easy to see. In order to learn useful information we need to understand what the purpose of the visit to the cave is (or what we are seeking to learn in software testing). Normally the most insight can be gained by adjusting what you seek to learn based on what you find. As George Box said

Always remember the process of discovery is iterative. The results of each stage of investigation generating new questions to answered during the next.

Some "software testing" amounts to software checking or confirming that a specific known result is as we expect. This is important for confirming that the software continues to work as expected as we make changes to the code (due to the complexity of software unintended consequences of changes could lead to problems if we didn't check). This can, as described in the tweet, take a form such as 0:00 Step 1- Turn on flashlight with pre-determined steps all the way to 4:59 Step N- Exit But checking is only a portion of what needs to be done. Exploratory testing relies on the brain, experience and insight of a software tester to learn about the state of the software; exploratory testing seeks to go beyond what pre-defined checking can illuminate. In order to explore you need to understand the aim of the mission (what is important to learn) and you need to be flexible as you learn to adjust based on your understanding of the mission and the details you learn as you take your journey through the cave or through the software. Exploratory software testing will begin with ideas of what areas you wish to gain an understanding of, but it will provide quite a bit of flexibility for the path that learning takes. The explorer will adjust based on what they learn and may well find themselves following paths they had not thought of prior to starting their exploration.

 

Related: Maximizing Software Tester Value by Letting Them Spend More Time Thinking - Rapid Software Testing Overview - Software Testers Are Test Pilots - What is Exploratory Testing? by James Bach

By: John Hunter on Apr 8, 2014

Categories: Exploratory Testing, Scripted Software Testing, Software Testing, Testing Checklists

Coining a New Term

I'm coining a new term today, "grapefruit juice bugs."

My inspiration for this term is a blog post in the New York Times that David Pogue wrote. I was fascinated by the post and it got me to thinking about a particular kind of bugs in software that are more common than most people may realize. You could say that these bugs are surprisingly common. In fact, if you wanted to be more precise, you could even say that this term applies to a specific type of "surprisingly common type of surprising bugs." Let me explain.

There's something about the chemical makeup of grapefruit juice that makes it interact with our biology and a large number of different drugs in ways which result in dangerous conditions. For example, certain drugs lose their effectiveness dramatically when interacting with grapefruit juice which can have life-threatening consequences. Other times, the interactions with grapefruit juice can dramatically increase a drug's potency. This can result in "safe doses" becoming very unsafe.

Grapefruit Is a Culprit in More Drug Reactions

The 42-year-old was barely responding when her husband brought her to the emergency room. Her heart rate was slowing, and her blood pressure was falling. Doctors had to insert a breathing tube, and then a pacemaker, to revive her.

They were mystified: The patient’s husband said she suffered from migraines and was taking a blood pressure drug called verapamil to help prevent the headaches. But blood tests showed she had an alarming amount of the drug in her system, five times the safe level.

Did she overdose? Was she trying to commit suicide? It was only after she recovered that doctors were able to piece the story together.

“The culprit was grapefruit juice,” said Dr. Unni Pillai, a nephrologist in St. Louis, Mo. ...

The previous week, she had been subsisting mainly on grapefruit juice. Then she took verapamil, one of dozens of drugs whose potency is dramatically increased if taken with grapefruit. In her case, the interaction was life-threatening.

Last month, Dr. David Bailey, a Canadian researcher who first described this interaction more than two decades ago, released an updated list of medications affected by grapefruit. There are now 85 such drugs on the market, he noted, including common cholesterol-lowering drugs, new anticancer agents, and some synthetic opiates and psychiatric drugs, as well as certain immunosuppressant medications taken by organ transplant patients, some AIDS medications, and some birth control pills and estrogen treatments. ... Under normal circumstances, the drugs are metabolized in the gastrointestinal tract, and relatively little is absorbed, because an enzyme in the gut called CYP3A4 deactivates them. But grapefruit contains natural chemicals called furanocoumarins, that inhibit the enzyme, and without it the gut absorbs much more of a drug and blood levels rise dramatically.

For example, someone taking simvastatin (brand name Zocor) who also drinks a small 200-milliliter, or 6.7 ounces, glass of grapefruit juice once a day for three days could see blood levels of the drug triple, increasing the risk for rhabdomyolysis, a breakdown of muscle that can cause kidney damage.

 

So what do interactions between grapefruit juice and drugs have to do with software testing?

Like grapefruit juice's impact on prescription drugs, software testing involves critical interactions between different parts of the system. And risks exist when these different parts interact with one another. This is true whether you're talking about "large parts" interact in System Testing or "small parts" interact in Unit Testing.

Interactions between things are a very rich source of bugs in software. As anyone who has heard the infernal phrase "works on my machine" can tell you, software features and functions often work perfectly fine in many usage scenarios, hardware and software configurations , etc. - only to fail to work in ever-so-slightly different situations.

 

The difference between plain old every-day "Dual-Mode Faults" and "Grapefruit Juice Bugs"

A dual-mode fault occurs whenever two test inputs must both be present to trigger a defect. Most software testers start encountering them quite frequently within days of starting their jobs. Some examples:

  • This "buy" button works fine. Except when the customer is a "new user." (First, action = "click on the buy button" and Second, customer = "new user")

  • Transaction prices for share purchases are calculated correctly. Except when denominated in Japanese Yen. (First, Action = "sell shares" and Second, Currency = "Japanese Yen")

Like grapefruit juice's impact on prescription drugs, software testing involves critical interactions between different parts of the system. And risks exist when these different parts interact with one another. This is true whether you're talking about "large parts" interact in System Testing or "small parts" interact in Unit Testing.

While all grapefruit juice bugs are dual-mode faults, not all dual-mode faults are Grapefruit Juice Bugs:

  • Grapefruit juice bugs have got to have a little of the element of surprise in them. When you explain them to a developer, their first reaction should be "Huh? How is that even possible?" or at least "Hmmm... That's odd. Let me investigate."

  • Anything along the lines of "This feature usually works, except in IE6, when..." is almost definitely not a grapefruit juice bug. Problematic interactions with IE6 are an incredibly common type of dual-mode fault, not a surprising one.

Whenever you hear "works on my machine" replies to your bug reports, and it takes a while for the issue to be replicated, odds are pretty good that a grapefruit juice bug might be involved.

Here's an example of an especially surprising grapefruit juice bug. This excerpt from Apple's online help files that the company posted after users of the original iPad complained about problems with Wi-Fi connectivity. Certain screen brightness settings were causing problems with the Wi-Fi signals. I'm not even to begin to guess how one would have anything to do with the other.

Auto-Scripting-Exercises-at-1.30.13-PM1

How to identify grapefruit juice bugs during your testing?

What is a tester to do when faced with more possible potential grapefruit juice bugs than he can handle using traditional methods?

If you're a software tester trying to do your best to determine whether a feature or function in your System Under Test will work "on everyone's machine," you've got a nightmare on your hands . Really nasty combinatorial explosions arise when you consider all of the possible combinations that would be required to test multiple hardware options, multiple software options, multiple usage scenarios, multiple test data inputs (and multiple combinations of the test data itself), multiple ways in which users enter data, and all of the rest of the "stuff that could vary" when people use your application. If you take the time to think expansively about the possible variations in a medium-sized applications, Quadrillions of possible tests often result.

While not eating grapefruit and not drinking grapefruit juice might be wise if you are taking drugs, there is rarely, if ever, such an easy method for eliminating the possibility of negative results due to software interactions. Refusing to support IE 6 in order to avoid the disproportionate number of grapefruit juice-like problematic interactions associated with IE6 would be as close as you could come in the world of software.

Design of Experiments-based test design methods can help testers come to grips with this challenge. Orthogonal array software testing (often referred to as OATS or simply OA testing) is a test design strategy that allows us to efficiently detect bugs created by interactions within the system. Orthogonal array software testing is based on the principles of multifactor designed experiments as first explored by Sir RA Fisher.

Design of Experiments-based test design methods are very-closely related to pairwise testing (AKA allpairs testing, all pairs testing, and pairwise-testing). Any of these test design strategies will allow a software tester to quickly generate a set of tests that includes tests for every single pair of test inputs.

This approach to test design often has multiple advantages, including faster test creation, more varied test scenarios, 100% coverage of all potential dual-mode faults (including hard-to-predict grape-fruit juice bugs), and often a smaller resulting set of tests that will be quicker to execute. Having said that, it is by no means a magical silver bullet. This approach to test design requires test designers with above average analytical abilities to identify the appropriate Parameters and Values for their system under test; this is sometimes easier said than done because it requires a new mindset from test designers.

Software testers can take solace that the challenges of software testing, while significant, are simple when compared to trying to understand the effects of drug interactions in people.

Combinatorial testing can look at bugs created by the interaction between multiple (3, 4, 5, 6...) variables. So if there was a bug that didn't get triggered just by using Chrome on Windows but it would get triggered if you also tried to replace an existing photo in your profile with a new profile photo into your profile (test idea number 3), then pairwise testing might not catch it. Pairwise test design would create a set of tests that would include at least one test for each of these pairs:

  • Chrome & Windows and

  • Chrome & replace photo and

  • Windows & replace photo, but...

A set of pairwise might not fail to test for the specific combination of all three of those test inputs in the same test. With the use of combinatorial test design approaches, you could create test plans with 100% coverage for 3 way interactions and be sure that all 3-way interactions or 4-way interactions are covered. When you create sets of 3-way tests, 4-way tests, 5-way tests, and 6-way tests though, you'll quickly discover that the number of tests required starts to balloon.

Hexawise allows you to create test plans with the coverage interactions you desire. This allows you to create sets of tests from 2-way up all the way up to phenomenally-thorough 6-way sets of tests. In fact, it even lets you generate clever sets of risk-based tests that will, say, prioritize comprehensive 4-way coverage on 4 sets of Parameter Values while ensuring only pairwise coverage of the other, lower-priority, interactions in your system under tests. Hexawise also lets you create mixed strength test plans so if you have certain factors that you are very concerned about and want to provide coverage for more possible interactions you can set the interaction levels for those at a higher level.

 

Related: Hexawise Tip: Using Value Expansions and Value Pairs to Handle Dependent Values - Maximize Test Coverage Efficiency And Minimize the Number of Tests Needed - How to Model and Test CRUD Functionality - 25 Great Quotes for Software Testers

By: Justin Hunter on Feb 11, 2014

Categories: Bugs, Combinatorial Testing, Design of Experiments, Multi-variate Testing, Multi-variate Testing, Pairwise Software Testing, Software Testing, Testing Strategies

The process used to hire employees is inefficient in general and even more inefficient for knowledge work. Justin Hunter, Hexawise CEO, posted the following tweet:

The labor market is highly inefficient for software testers. Many excellent testers are undervalued relative to average testers. Agree?

 

The tweet sparked quite a few responses:

inefficient-job-market-tweet

I think there are several reasons for why the job market is inefficient in general, and for why it is even more inefficient for software testing than for most jobs.

 

  • Often, how companies go about hiring people is less about finding the best people for the organization and more about following a process that the organization has created. Without intending to, people can become more more concerned about following procedural rules than in finding the best people.

  • The hiring process is often created much like software checking, a bunch of simple things to check - not because doing so is actually useful but because simple procedural checks are easy to verify. So organizations require a college degree (and maybe even require a specific major). And they will use keywords to select or reject applicants. Or require certification or experience with a specific tool. Often the checklist used to disqualify people contains items that might be useful but shouldn't be used as barriers but it is really easy for people that don't understand the work to apply the rules in the checklist to filter the list of applicants.

  • It is very hard to hire software testers well when those doing the hiring don't understand the role software testing should play. Most organizations don't understand, so they hire for software checkers. They, of course, don't value people that could provide much more value (software testers that go far beyond checking). The weakness of hiring without understanding the work is common for knowledge work positions and likely even more problematic for software testing due to the even worse understanding of what they should be doing compared to most knowledge workers.

 

And there are plenty more reasons for the inefficient market.

Here are few ideas that can help improve the process:

  • Spend time to understand and document what your organization seeks to gain from new hires.

  • Deemphasize HR's role in the talent evaluation process and eliminate dysfunctional processes that HR may have instituted. Talent evaluation should be done by people that understand the work that needs to get done. HR can be useful in providing guidance on legal and company-decided policies for hiring. Don't have people that can't evaluate the difference between great testers and good testers decide who should be hired or what salary is acceptable. Incidentally, years of experience, certifications, degrees, past salary and most anything else HR departments routinely use are often not correlated to the value a potential employee brings.

  • A wonderful idea, though a bit of a challenge in most organizations, is to use job auditions. Have the people actually do the job to figure out if they can do what you need or not (work on a project for a couple weeks, for example). This has become more common in the last 10 years but is still rare.

  • I also believe you are better off hiring for somewhat loose job descriptions, if possible, and then adjusting the job to who you hire. That way you can maximize the benefit to the organization based on the people you have. At Hexawise, for example, most of the people we hire have strengths in more than one "job description" area. Developers with strong UI skills, for instance, are encouraged to make regular contributions in both areas.

  • Creating a rewarding work environment helps (this is a long term process). One of the challenges in getting great people is they are often not interested in working for dysfunctional organizations. If you build up a strong testing reputation great testers will seek out opportunities to work for you and when you approach great testers they will be more likely to listen. This also reduces turnover and while that may not seem to relate to the hiring process is does (one reason we hire so poorly is we don't have time to do it right, which is partly because we have to do so much of it).

  • Having employees participate in user groups and attending conferences can help your organization network in the testing community. And this can help when you need to hire. But if your organization isn't a great one for testers to work in, they may well leave for more attractive organizations. The "solution" to this risk is not to stunt the development of your staff, but to improve the work environment so testers want to work for your organization.

 

Great quote from Dee Hock, founder of Visa:

Hire and promote first on the basis of integrity; second, motivation; third, capacity; fourth, understanding; fifth, knowledge; and last and least, experience. Without integrity, motivation is dangerous; without motivation, capacity is impotent; without capacity, understanding is limited; without understanding, knowledge is meaningless; without knowledge, experience is blind. Experience is easy to provide and quickly put to good use by people with all the other qualities.

Please share your thoughts and suggestions on how to improve the hiring process.

 

Related: Finding, and Keeping, Good IT People - Improving the Recruitment Process - Six Tips for Your Software Testing Career - Understanding How to Manage Geeks - People: Team Members or Costs - Scores of Workers at Amazon are Deputized to Vet Job Candidates and Ensure Cultural Fit

By: John Hunter on Jan 14, 2014

Categories: Checklists, Software Testing, Career

The Hexawise Software Testing blog carnival focuses on sharing interesting and useful blog posts related to software testing.

 

  • Using mind-mapping software as a visual test management tool by Aaron Hodder - "I want to be able to give and receive as much information as I can in the limited amount of time I have and communicate it in a way that is respectful of others' time and resources. These are my values and what I think constitutes responsible testing."

  • Healthcare.gov and the Tyranny of the Innocents by James Bach - "Management created the conditions whereby this project was 'delivered' in a non-working state. Not like the Dreamliner. The 787 had some serious glitches, and Boeing needs to shape that up. What I’m talking about is boarding an aircraft for a long trip only to be told by the captain 'Well, folks it looks like we will be stuck here at the gate for a little while. Maintenance needs to install our wings and engines. I don’t know much about aircraft building, but I promise we will be flying by November 30th. Have some pretzels while you wait.'"

 

jungle-bridge

Rope bridge in the jungle by Justin Hunter

 

  • Software Testers Are Test Pilots by John Hunter - "Software testers should be test pilots. Too many people think software testing is the pre-flight checklist an airline pilot uses."

  • Where to begin? by Katrina Clokie - "Then you need to grab your Product Owner and anyone else with an interest in testing (perhaps architect, project manager or business analyst, dependent on your team). I'm not sure what your environment is like, usually I'd book an hour meeting to do this, print out my mind map on an A3 page and take it in to a meeting room with sticky notes and pens. First tackle anything that you've left a question mark next to, so that you've fleshed out the entire model, then get them to prioritise their top 5 things that they want you to test based on everything that you could do."

  • Being a Software Tester in Scrum by Dave McNulla - "Pairing on development and testing strengthens both team members. With people crossing disciplines, they improve understanding of the product, the code, and what other stakeholders find important."

  • Stop Writing Code You Can’t Yet Test by Dennis Stevens - "The goal is not to write code faster. The goal is to produce valuable, working, testing, remediated code faster. The most expensive thing developers can do is write code that doesn’t produce something needed by anyone (product, learning, etc). The second most expensive thing developers can do is write code that can’t be tested right away."

  • Is Healthcare.gov security now fixed? by Ben Simo - "I am very happy that the most egregious issue was immediately fixed. Others issues remain. The vulnerabilities I've listed above are defects that should not make it to production. It doesn't take a security expert or “super hacker” to exploit these vulnerabilities. This is basic web security. Most of these are the kinds of issues that competent web developers try to avoid; and in the rare case that they are created, are usually found by competent testers."

  • Embracing Chaos Testing Helps Create Near-Perfect Clouds - "Chaos Monkey works on the simple premise that if we need to design for high availability, we should design for failure. To design for failure, there should be ways to simulate failures as they would happen in real-world situations. This is exactly what a Chaos Monkey helps achieve in a cloud setup.
    Netflix recently made the source code of Chaos Monkey (and other Simian Army services) open source and announced that more such monkeys will be made available to the community."

  • Bugs in UK Post Office System had Dire Consequences - "A vocal minority of sub-postmasters have claimed for years that they were wrongly accused of theft after their Post Office computers apparently notified them of shortages that sometimes amounted to tens of thousands of pounds. They were forced to pay in the missing amounts themselves, lost their contracts and in some cases went to jail. Second Sight said the Post Office's initial investigation failed at first to identify the root cause of the problems. The report says more help should have been given to sub-postmasters, who had no way of defending themselves."

  • Traceability Matrix: Myth and Tricks by Adam Howard - "And this is where we get to the crux of the problem with traceability matrices. They are too simplistic a representation of an impossibly complex thing. They reduce testing to a series of one to one relationships between intangible ideas. They allow you to place a number against testing. A percentage complete figure. What they do not do is convey the story of the testing."

  • Six Tips for Your Software Testing Career by John Hunter - "Read what software testing experts have written. It’s surprising how few software testers have read books and articles about software testing.Here are some authors (of books, articles and blogs) that I've found particularly useful..."

By: John Hunter on Dec 16, 2013

Categories: Software Testing

Since creating Hexawise, I've worked with executives at companies around the world who have found themselves convinced in the value of pairwise testing. And then they need to convince their organization of the value.

They often follow the following path: first thinking "pairwise testing is a nice method in theory, but not applicable in our case" then "pairwise is nice in theory and might be applicable in our case" to "pairwise is applicable in our case" and finally "how do I convince my organization."

In this post I review my history helping convince organizations to try and then adopt pairwise, and combinatorial, software testing methods.

About 8 years ago, I was working at a large systems integration firm and was asked to help the firm differentiate its testing offerings from the testing services provided by other firms.

While I admittedly did not know much about software testing then but by happy coincidence, my father was a leading expert in the field of Design of Experiments. Design of Experiments is a field that has applicability in many areas (including agriculture, advertising, manufacturing, etc.) The purpose of Design of Experiments is to provide people with tools and approaches to help people learn as much actionable information as possible in as few tests as possible.

I Googled "Design of Experiments Software Testing." That search led me to Dr. Madhav Phadke (who, by coincidence, had been a former student of my father). More than 20 years ago now, Dr. Phadke and his colleagues at ATT Bell Labs had asked the question you're asking now. They did an experiment using pairwise test design / orthogonal array test design to identify a subset of tests for ATT's StarMail system. The results of that testing effort were extraordinarily successful and well-documented.

Shortly after doing that, while working at that systems integration firm, I began to advocate to anyone and everyone who would listen that designing approach to designing tests promised to be both (a) more thorough and (b) require (in most but not all cases) significantly fewer tests. Results from 17 straight projects confirmed that both of these statements were true. Consistently.

 

Repeatable Steps to Confirm Whether This Approach Delivers Efficiency and Thoroughness Improvement (and/or document a business case/ROI calculation)

How did we demonstrate that this test design approach led to both more thorough testing and much more efficient testing? We followed these steps:

  1. Take an existing set of 30 - 100 existing tests that had already been created, reviewed, and approved for testing (but which had not yet been executed).

  2. Using the test ideas included in those existing tests, design a set of pairwise tests (often approximately half as many tests as were in the original set). When putting your tests together, if there are particular, known, high-priority scenarios that stakeholders believe are important to test, it is important to make sure that that you "force" your pairwise test generator to include such high-priority scenarios.

  3. Have two different testers execute both sets of tests at the same time (e.g., before developers start fixing any defects that are uncovered by testers executing either set of tests)

Document the following:

  • How long did it take to execute each set of tests?

  • How many unique defects were identified by each set of tests?

  • How long did it take to create and document each set of tests?*

 

*This third measurement was usually an estimate because a significant number of teams had not tracked the amount of time it took to create the original set of tests.

The results in 17 different pairwise testing "bake-off" projects conducted at my old firm included:

  • Defects found per tester hour during test execution: when pairwise tests were used, more than twice as many defects were found per tester hour

  • Total defects found: pairwise tests as many or more defects in every single project (despite the fact that in almost every case there were significantly more tests in the each original set of tests)

  • Defects found by pairwise tests but missed by traditional tests: a lot (I forget the exact number)

  • Defects found by traditional tests but missed by pairwise tests: zero

  • Amount of time to select and document tests: much less time required when a pairwise test generator was used (As mentioned above, precise measurements were difficult to gather here)

 

More recent project benefits have included these:

bcbs1

bcbs2

Those experiences - combined with the realization that many Fortune 500 firms were starting to try to implement smarter test design methods to achieve these kinds of benefits but were struggling to find a test design tool that was user-friendly and would integrate into their other tools - led me to the decision to create Hexawise.

 

Additional Advice and Lessons Learned Based on My Experiences

Once the testing the value of pairwise software testing at a specific organization it is very common to find the proponent of taking advantage of pairwise testing advantages to find themselves saying:

I have already elaborated some test plans that would save us up to 50% effort with that method. But now my boss and other colleagues are asking me for a proof that these pairwise test cases suffice to make sure our software is running well.

In that case, my advice is three-fold:

First, appreciate how your own thinking has evolved and understand that other people will need to follow a similar journey (and that others won't have as much time to devote as you have had to experience learnings first-hand).

When I was creating Hexawise, George Box, a Design of Experiments expert with decades of experience explaining to skeptical executives how Design of Experiments could deliver transformational improvements to their organizations' efficiency and effectiveness, told me "Justin, you'll experience three phases of acceptance and acceptance will happen more gradually than you would expect it to happen. First, people will tell you 'It won't work.' Next, they'll say "It won't work here." Eventually, he said with a smile, they'll tell you 'Of course this works. I thought of it first!'

When people hear that you can double their test execution productivity with a new approach, they won't initially believe you. They'll be skeptical. Most people you're explaining this approach to will start with the thought that "it is nice in theory but not applicable to what I'm doing." It will take some time and experience for people to understand and appreciate how dramatic the benefits are.

Second, people will instinctively be dismissive of pairwise testing case study after case study after case study that show how effective this approach has been for testers in virtually all types of industries and all types and phases of testing. George Box was right when he predicted that people will often respond with 'It won't work here.' Sometimes it is hard not to smile when people take that position.

Case in point: I will be talking to a senior executive at a large capital markets firm soon about how our tool can help them transform the efficiency and effectiveness of their testing group. And I can introduce them to a client of ours that is using our test design tool extensively in every single one of their most important IT projects. Will that executive take me up on my offer? I hope so, but based on past experience, I suspect odds are good that he'll instead react with 'Yes, yes, sure, if companies were people, that company would be our company's identical twin, but still... It won't work here.'

Third, at the end of the day, the most effective approach I have found to address that understandable skepticism and to secure organizational-level buy-in and commitment is through gathering hard, indisputable evidence on multiple projects that the approach works at the company itself through a bake-off approach (e.g., following those four steps outlined above. A few words of advice though.

My proposed approach isn't for the faint of heart. If you're working at a large company with established approaches, you'll need patience and persistence.

Even after you gather evidence that this approach works in Business Unit A, and B and C, someone from Business Unit D will be unconvinced with the compelling and irrefutable evidence you have gathered and tell you 'It won't work here. Business Unit D is unique.' The same objections may likely arise with results from "Type of Testing" A, B, and C. As powerful and widely-applicable as this test design approach is, always remember (and be clear with stakeholders) that it is not a magical silver bullet.

James Bach raises several valid limitations with using this approach. In particular, this approach won't work unless you have testers who have relatively strong analytical skills driving the test design process. Since pairwise test case generating tools are dependent upon thoughtful test designers to identify appropriate test inputs to vary, this approach (like all test design approaches) is subject to a "garbage in / garbage out" risk.

Project leads will resist "duplicating effort." But unless you do an actual bake-off stakeholders won't appreciate how broken their existing process is. There's inevitably far more wasteful repetition hidden away in standard tests than people realize. When you start reporting a doubling of tester productivity on several projects, smart managers will take notice and want to get involved. At that point - hopefully - your perseverance should be rewarded.

 

Some benefits data and case studies that you might find useful:

 

If you can't change your company, consider changing companies

Lastly, remember that your new-found skills are in high demand whether or not they're valued at your current company. And know that, despite your best efforts and intentions, your efforts might not convince skeptics. Some people inevitably won't be willing to take the time to understand. If you find yourself in a situation where you want to use this test design approach (because you know that these approaches are powerful, practical, and widely-applicable) but that you don't have management buy-in, then consider whether or not it would be worth leaving your current employer to join a company that will let you use your new-found skills.

Most of our clients, for example, are actively looking for software test designers with well developed pairwise and combinatorial test design skills. And they're even willing to pay a salary premium for highly analytical test designers who are able to design sets of powerful tests. (We publicize such job openings in the LinkedIn Hexawise Guru group for testers who have achieved "Guru" level status in the self-paced computer-based-training modules in the tool).

 

Related: Looking at the Empirical Evidence for Using Pairwise and Combinatorial Software Testing - Systematic Approaches to Selection of Test Data - Getting Known Good Ideas Adopted

By: Justin Hunter on Nov 21, 2013

Categories: Pairwise Software Testing, Software Testing, Software Testing Efficiency, Testing Case Studies, Testing Strategies

Software testers should be test pilots. Too many people think software testing is the pre-flight checklist an airline pilot uses.

 

The checklists airline pilots use before each flight are critical. Checklists are extremely valuable tools that help assure steps in a process are followed. Checklists are valuable in many professions. The Checklist – If something so simple can transform intensive care, what else can it do? by Atul Gawande

 

Sick people are phenomenally more various than airplanes. A study of forty-one thousand trauma patients—just trauma patients—found that they had 1,224 different injury-related diagnoses in 32,261 unique combinations for teams to attend to. That’s like having 32,261 kinds of airplane to land. Mapping out the proper steps for each is not possible, and physicians have been skeptical that a piece of paper with a bunch of little boxes would improve matters much. In 2001, though, a critical-care specialist at Johns Hopkins Hospital named Peter Pronovost decided to give it a try. … Pronovost and his colleagues monitored what happened for a year afterward. The results were so dramatic that they weren’t sure whether to believe them: the ten-day line-infection rate went from eleven per cent to zero. So they followed patients for fifteen more months. Only two line infections occurred during the entire period. They calculated that, in this one hospital, the checklist had prevented forty-three infections and eight deaths, and saved two million dollars in costs.

 

Checklists are extremely useful in software development. And using checklist-type automated tests is a valuable part of maintaining and developing software. But those pass-fail tests are equivalent to checklists - they provide a standardized way to check that planned checks pass. They are not equivalent to thoughtful testing by a software testing professional.

I have been learning about software testing for the last few years. This distinction between testing and checking software was not one I had before. Reading experts in the field, especially James Bach and Michael Bolton is where I learned about this idea.

 

Testing is the process of evaluating a product by learning about it through experimentation, which includes to some degree: questioning, study, modeling, observation and inference.

(A test is an instance of testing.)

Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.

 

I think this is a valuable distinction to understand when looking to produce reliable and useful software. Both are necessary. Both are done too little in practice. But testing (as defined above) is especially underused - in the last 5 years checking has been increasing significantly, which is good. But now we really need to focus on software testing - thoughtful experimenting.

 

Related: Mistake Proofing the Deployment of Software Code - Improving Software Development with Automated Tests - Rapid Software Testing Overview Webcast by James Bach - Checklists in Software Development

By: John Hunter on Nov 13, 2013

Categories: Checklists, Software Testing, Testing Checklists, Testing Strategies

Recently, the following defect made the news and was one of the most widely-shared articles on the New York Times web site. Here's what the article, Computer Snag Limits Insurance Penalties on Smokers said:

A computer glitch involving the new health care law may mean that some smokers won’t bear the full brunt of tobacco-user penalties that would have made their premiums much higher — at least, not for next year.

The Obama administration has quietly notified insurers that a computer system problem will limit penalties that the law says the companies may charge smokers, The Associated Press reported Tuesday. A fix will take at least a year.

 

Tip of the Iceberg

This defect was entirely avoidable and predictable. Its safe to expect that hundreds (if not thousands) of similar defects related to Obamacare IT projects will emerge in the weeks and months to come. Had testers used straightforward software test design prioritization techniques, bugs like these would have been easily found. Let me explain.

 

There's no Way to Test Everything

If the developers and/or testers were asked how could this bug could sneak past testing, they might at first say something defensive, along the lines of: "We can't test everything! Do you know how many possible combinations there are?" If you include 40 variables (demographic information, pre-existing conditions, etc.) in the scope of this software application, there would be:

41,231,686,041,600,000

possible scenarios to test. That's not a typo: 41 QUADRILLION possible combinations. As in it would take 13 million years to execute those tests if we could execute 100 tests every second. There's no way we can test all possible combinations. So bugs like these are inevitably going to sneak through testing undetected.

 

The Wrong Question

When the developers and testers of a system say there is no way they could realistically test all the possible scenarios, they're addressing the wrong challenge. "How long would it take to execute every test we can think of?" is the wrong question. It is interesting but ultimately irrelevant that it would take 13 million years to execute those tests.

 

The Right Question

A much more important question is "Given the limited time and resources we have available for testing, how can we test this system as thoroughly as possible?" Most teams of developers and software testers are extremely bad at addressing this question. And they don't realize nearly how bad they are. The Dunning Kruger effect often prevents people from understanding the extent of their incompetence; that's a different post for a different day. After documenting a few thousand tests designed to cover all of the significant business rules and requirements they can think of, testers will run out of ideas, shrug their shoulders in the face of the overwhelming number of total possible scenarios and declare their testing strategy to be sufficiently comprehensive. Whenever you're talking about hundreds or thousands of tests, that test selection strategy is a recipe for incredibly inefficient testing that both misses large numbers of easily avoidable defects and wastes time by testing certain things again and again. There's a better way.

 

The Straightforward, Effective Solution to this Common Testing Challenge: Testers Should Use Intelligent Test Prioritization Strategies

If you create a well-designed test plan using scientific prioritization approaches, you can reduce the number of individual tests to test tremendously. It comes down to testing the system as thoroughly as possible in the time that's available for testing. There are well-proven methods for doing just that.

 

There are Two Kinds of Software Bugs in the World

Bugs that don't get found by testers sneak into production for one of two main reasons, namely:

  • "We never thought about testing that" - An example that illustrates this type of defect is one James Bach told me about. Faulty calculations were being caused by an overheated server that got that way because of a blocked vent. You can't really blame a tester who doesn't think of including a test involving a scenario with a blocked vent.

  • "We tested A; it worked. We tested B; it worked too.... But we never tested A and B together." This type of bug sneaks by testers all too often. Bugs like this should not sneak past testers. They are often very quick and easy to find. And they're so common as to be highly predictable.

 

Let's revisit the high-profile bug Obamacare bug that will impact millions of people and take more than a year to fix. Here's all that would have been required to find it:

  • Include an applicant with a relatively high (pre-Medicare) age. Oh, and they smoke.

 

Was the system tested with a scenario involving an applicant who had a relatively high age? I'm assuming it must have been.

Was the system tested with a scenario involving an applicant who smoked? Again, I'm assuming it must have been.

Was the system tested with a scenario involving an applicant who had a relatively high age who also smoked? That's what triggers this important bug; apparently it wasn't found during testing (or found early enough).

 

If You Have Limited Time, Test All Pairs

Let's revisit the claim of "we can't execute all 13 million-years-worth of tests. Combinations like these are bound to sneak through, untested. How could we be expected to test all 13 million-years-worth of tests?" The second two sentences are preposterous.

  • "Combinations like these are bound to sneak through, untested." Nonsense. In a system like this, at a minimum, every pair of test inputs should be tested together. Why? The vast majority of defects in production today would be found simply by testing every possible pair of test inputs together at least once.

  • "How could we be expected to test all 13 million-years-worth of tests?" Wrong question. Start by testing all possible pairs of test inputs you've identified. Time-wise, that's easily achievable; its also a proven way to cover a system quite thoroughly in a very limited amount of time.

 

Design of Experiments is an Established Field that was Created to Solve Problems Exactly Like This; Testers are Crazy Not to Use Design of Experiments-Based Prioritization Approaches

The almost 100 year-old field of Design of Experiments is focused on finding out as much actionable information as possible in as few experiments as possible. These prioritization approaches have been very widely used with great success in many industries, including advertising, manufacturing, drug development, agriculture, and many more. While Design of Experiments test design techniques (such as pairwise testing and orthogonal array testing / OA testing) are increasingly becoming used by software testing teams but far more teams could benefit from using these smart test prioritization approaches. We've written posts about how Design of Experiments methods are highly applicable to software testing here and here, and put an "Intro to Pairwise Testing" video here. Perhaps the reason this powerful and practical test prioritization strategy remains woefully underutilized by the software testing industry at large is that there are too few real-world examples explaining "this is what inevitably happens when this approach is not used... And here's how easy it would be to avoid this from happening to you in your next project." Hopefully this post helps raise awareness.

 

Let's Imagine We've Got One Second for Testing, Not 13 Million Years; Which Tests Should We Execute?

Remember how we said it would take 13 million years to execute all of the 41 quadrillion possible tests? That calculation assumed we could execute 100 tests a second. Let's assume we only have one second to execute tests from those 13 million years worth of tests. How should we use that second? Which 100 tests should we execute if our goal is to find as many defects as possible?

If you have a Hexawise account, you can to your Hexawise account to view the test plan details and follow along in this worked example. To create a new account in a few seconds for free, go to hexawise.com/free.

By setting the 40 different parameter values intelligently, we can maximize the testing coverage achieved in a very small number of tests. In fact, in our example, you would only need to execute only 90 tests to cover every single pairwise combination.

The number of total possible combinations (or "tests") that are generated will depend on how many parameters (items/factors) and how many options (parameter values) there are for each parameter. In this case, the number of total possible combinations of parameters and values equal 41 quadrillion.

 

insurance-bug-1

This screen shot shows a portion of the test conditions that would be included the first 4 tests of the 90 tests that are needed to provide full pairwise coverage. Sometimes people are not clear about what "test every pair" means. To make this more concrete, by way of a few specific examples, pairs of values tested together in the first part of test number 1 include:

  • Plan Type = A tested together with Deductible Amount = High

  • Plan Type = tested together with Gender = Male

  • Plan Type = A tested together with Spouse = Yes

  • Gender = Male tested together with State = California

  • Spouse = Yes tested together with Yes (and over 5 years)

  • And lots of other pairs not listed here

 

insurance-bug-2

This screen shot shows a portion of the later tests. You'll notice that the values are shown in purple italics. Those values listed in purple italics are not providing new pairwise coverage. You will note in the first tests every single parameter value is providing new pairwise coverage value, toward the end few parameter value settings are providing new pairwise coverage. Once a specific pair has been tested, retesting it doesn't provide additional pairwise coverage. Sets of Hexawise tests are "front loaded for coverage." In other words, if you need to stop testing at any point before the end of the complete set of tests, you will have achieved as much coverage as possible in the limited time you have to execute your tests (whether that is 10 tests or 30 tests or 83). The pairwise coverage chart below makes this point visually; the decreasing number of newly tested pairs of values that appear in each test accounts for the diminishing marginal returns per test.

 

You Can Even Prioritize Your First "Half Second" of Tests To Cover As Much As Possible!

insurance-bug-3

This graph shows how Hexawise orders the test plan to provide the greatest coverage quickly. So if you get through 37 of the 90 tests needed for full pairwise coverage you have already tested over 90% of all the pairwise test coverage. The implication? Even if just 37 tests were covered, there would be a 90% chance that any given pair of values that you might select at random would be tested together in the same test case by that point.

 

Was Missing This Serious Defect an Understandable Oversight (Because of Quadrillions of Possible Combinations Exist) or was it Negligent (Because Only 90 Intelligently Selected Tests Would Have Detected it)?

A generous interpretation of this situation would be that it was "unwise" for testers to fail to execute the 90 tests that would have uncovered this defect.

A less generous interpretation would be that it was idiotic not to conduct this kind of testing.

The health care reform act will introduce many such changes as this. At an absolute minimum, health insurance firms should be conducting pairwise tests of their systems. Given the defect finding effectiveness of pairwise testing coverage, testing systems any less thoroughly is patently irresponsible. And for health insurance software testing it is often wiser to expand to test all triples or all quadruples given the interaction between many variables in health insurance software.

Incidentally, to get full 3 way test coverage (using the same example as above) would require 2,090 tests.

 

Related: Getting Started with a Test Plan When Faced with a Combinatorial Explosion - How Not to Design Pairwise Software Tests - Efficient and Effective Test Design

By: Justin Hunter on Sep 26, 2013

Categories: Combinatorial Software Testing, Hexawise test case generating tool, Multi-variate Testing, Pairwise Software Testing, Software Testing, Testing Strategies

Software testing is becoming more critical as the proportion of our economic activity that is dependent on software (and often very complex software) continues to grow. There seems to be a slowly growing understanding of the importance of software testing (though it still remains an under-appreciated area). My belief is that skilled software testers will be appreciated more with time and that the opportunities for expert software testers will expand.

Here are a few career tips for software testers.

 

  • Read what software testing experts have written. It's surprising how few software testers have read books and articles about software testing.Here are some authors (of books, articles and blogs) that I've found particularly useful. Feel free to suggest other software testing authors in the comments.

James Bach

Cem Kaner

Lisa Crispin

Michael Bolton

Lanette Creamer

Jerry Weinberg

Keith Klain

James Whittaker - posts while at Google and his posts with his new employer, Microsoft

Pradeep Soundararajan

Elisabeth Hendrickson

Ajay Balamurugadas

Matt Heusser

our own, Justin Hunter

 

  • Join the community You'll learn a lot as a lurker, and even more if you interact with others. Software Testing Club is one good option. Again, following those experts (listed above) is useful; engaging with them on their blogs and on Twitter is better. The software testing community is open and welcoming. In addition, see: 29 Testers to Follow on Twitter and Top 20 Software Testing Tweeps. Interacting with other testers who truly care about experimenting, learning, and sharing lessons learned is energizing.

 

  • Develop your communication skills Communication is critical to a career in software testing as it is to most professional careers. Your success will be determined by how well you communicate with others. Four critical relationship are with: software developers, your supervisors, users of the software and other software testers.Unfortunately in many organizations, managers and corporate structures restrict your communication with some of these groups. You may have to work with what you are allowed, but if you don't have frequent, direct communication with those four groups, you won't be able to be as effective.
    Work on developing your communication skills. Given the nature of software testing two particular types of communication - describing problems ("what is it?") and explaining how significant problem is ("why should we fix it?") - are much more common than in many other fields. Learning how to communicate these things clearly and without making people defensive is of extra importance for software testers.
    A great deal of your communication will be in writing and developing your ability to communicate clearly will prove valuable to your continued growth in your career.
    Writing your own blog is a great way to further you career. You will learn a great deal by writing about software testing. You will also develop your writing ability merely by writing more. And you will create a personal brand that will grown your network of contacts. It also provides a way for those possibly interested in hiring you to learn more. We all know hiring is often done outside the job announcement - job application process. By making a name for yourself in the software testing community you will increase the chances of being recruited for jobs.

 

  • Do what you love You career will be much more rewarding if you find something you love to do. You will most likely have parts of your job you could do without but finding something you have a passion for makes all the difference in the world. If you don't have a passion for software testing, you are likely better off finding something you are passionate about and pursuing a career in that field.

Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. If you haven’t found it yet, keep looking. Don’t settle.

Steve Jobs

 

  • Practice and learn new techniques and ideas. James Bach provides such opportunities. Also the Hexawise tool, lets you easily and quickly try out different scenarios and learn by doing. We also provide guided training, help and webcasts explaining software testing concepts and how to apply them using Hexawise.We include a pathway that guides you through the process of learning about software testing, with learning modules, practical experience and tests along the way to make sure you learn what is important. And once you reach the highest level and become a Hexawise guru we have offer a community experience to allow all those reaching that level to share their experience and learn from each other.

 

  • Go to good software testing conferences.I've heard great things about CAST (by the Association for Software Testing), Test Bash, and Let's Test in particular. Going to conferences is energizing because while you're there, you're surrounded by the minority of people in this industry who are passionate about software testing. Justin, Hexawise founder and CEO, will be presenting at CAST next week, August 26-28 in Madison, Wisconsin.

 

Related: Software Testing and Career Advice by Keith Klain - Looking at the Empirical Evidence for Using Pairwise and Combinatorial Software Testing - Maximizing Software Tester Value by Letting Them Spend More Time Thinking

By: John Hunter on Aug 22, 2013

Categories: Software Testing

The Hexawise Software Testing blog carnival focuses on sharing interesting and useful blog posts related to software testing.

 

  • T-Shaped Testers and their role in a team by Rob Lambert - "I believe that testers, actually – anyone, can contribute a lot more to the business than their standard role traditionally dictates. The tester’s critical and skeptical thinking can be used earlier in the process. Their other skills can be used to solve other problems within the business. Their role can stretch to include other aspects that intrigue them and keep them interested."

  • Testing triangles, pyramids and circles, and UAT by Allan Kelly - "Thus: UAT and Beta testing can only truly be performed by USERS (yes I am shouting). If you have professional testers performing it then it is in effect a form of System Testing.
    This also means UAT/Beta cannot be automated because it is about getting real life users to user the software and get their feedback. If users delegate the task to a machine then it is some other form of testing."

 

flower-justin

Photo by Justin Hunter, taken in South Africa.

 

  • Software Testing in Distributed Teams by Lisa Crispin - "Remote pairing is a great way to promote constant communication among multiple offices and to keep people working from home 'in the loop'.
    I often remote pair with a developer to specify test scenarios, do exploratory testing, or write automated test scripts. We use screen-sharing software that allows either person to control the mouse and keyboard. Video chat is best, but if bandwidth is a problem, audio will do. Make sure everyone has good quality headphones and microphone, and camera if possible."

  • Seven Kinds of Testers by James Bach - "I propose that there are at least seven different types of testers: administrative tester, technical tester, analytical tester, social tester, empathic tester, user, and developer. As I explain each type, I want you to understand this: These types are patterns, not prisons. They are clusters of heuristics; or in some cases, roles. Your style or situation may fit more than one of these patterns."

  • Which is Better, Orthogonal Array or Pairwise Software Testing? by John Hunter and Justin Hunter - "After more study I have concluded that: Pairwise is more efficient and effective than orthogonal arrays for software testing. Orthogonal Arrays are more efficient and effective for manufacturing, and agriculture, and advertising, and many other settings."

  • Experience Report: Pairing with a Programmer by Erik Brickarp - "We have different investigation methods. The programmer did low level investigations really well adding debug printouts, investigating code etc. while I did high level investigations really well checking general patterns, running additional scenarios etc. Not only did this make us avoid getting stuck by changing 'method' but also, my high level investigations benefited from his low level additions and vice versa."

  • I decided to evolve a faster test case by Ben Tilly - "I first wrote a script to run the program twice, and report how many things it found wrong on the second run. I wrote a second script that would take a record set, sample half of it randomly, and then run the first script.
    I wrote a third script that would take all records, run 4 copies of my test program, and then save the smaller record set that best demonstrated the bug. Wash, rinse, and repeat..."

  • Tacit and Explicit Knowledge and Exploratory Testing by John Stevenson - "It is time we started to recognise that testing is a tacit activity and requires testers to think both creativity and critically."

  • Tear Down the Wall by Alan Page - "There will always be a place for people who know testing to be valuable contributors to software development – but perhaps it’s time for all testing titles to go away?"

By: John Hunter on Jul 10, 2013

Categories: Software Testing

Here is a wonderful webcast that provides a very quick, and informative, overview of rapid software testing.

Software testing is when a person is winding around a space searching that space for important information.

 

James Bach starts by providing a definition of software testing to set the proper thinking for the overview.

Rapid software testing is a set of heuristics [and a set of skills]. Heuristics live at the border of explicit and tacit knowledge... Heuristics solve problems when they are under the control of a skilled human... It takes skill to use the heuristics effectively - to solve the problems of testing. Rapid software testing focuses on the tester... Tacit skills are developed through practice.

 

Automated software tests are useful but limited. In the context of rapid software testing only a human tester can do software testing (automated checks are defined as "software checking"). See his blog post: Testing and Checking Refined.

 

Related: People are better at figuring out interesting ideas to test. Generating a highly efficient, maximally varied, minimally repetitive set of tests based on a given set of test inputs is something computer algorithms are more effective at than a person. - Hexawise Lets Software Testers Spend More Time Focused on Important Testing Issues - 3 Strategies to Maximize Effectiveness of Your Tests

By: John Hunter on Jul 2, 2013

Categories: Software Testing, Testing Strategies

We have created the Hexawise Software Testing Glossary. The purpose is to provide some background for both general software testing tool and some Hexawise specific details.

While we continue to develop, expand and improve the Hexawise software itself we also realize that users success with using Hexawise rests not only on the tool itself but on the knowledge of those using the tool. The software testing glossary adds to the list of offerings we provide to help users. Other sources of help include sample test plans, training material (on software testing in general and Hexawise in particular), detailed help on specific topics, webcast tutorials on using Hexawise and this blog.

In Hexawise 2.0 we integrated the idea of building your expertise directly into the Hexawise application with specific actions, reading and practicums that guide users from novice to practitioner to expert to guru.

achievements

View for a Hexawise user that is currently a practitioner and is on their way to becoming an expert.

 

We aim to provide software services, and educational material, that help software testers focus their energy, insite, knowledge and time where it is most valuable while Hexawise software automates some tasks and does other tasks - that are essentially impossible for people to do manually (such as creating the most effective test plan, with pairwise, or better, coverage, given the parameters and values determined by the tester).

Our help, training and webcast resources have been found to be very useful by Hexawise users. We hope the software testing glossary will also prove to be of value to Hexawise users.

 

Related: Maximizing Software Tester Value by Letting Them Spend More Time Thinking - Hexawise Tip: Using Value Expansions and Value Pairs to Handle Dependent Values - Maximize Test Coverage Efficiency And Minimize the Number of Tests Needed

By: John Hunter on Jun 24, 2013

Categories: Hexawise test case generating tool, Hexawise tips, Software Testing

Many teams are trying to generate unusually powerful and varied sets of software tests by using Design of Experiments-based methods to generate many or most of their tests. The two most popular software test design methods are orthogonal array testing and pairwise testing. This article describes how these two approaches are similar but different and suggests that in most cases, pairwise testing is preferable.

Before advancing, it may be worth pointing out that Orthogonal Array Testing is also known as OA or OATS. Similarly, pairwise testing is sometimes referred to as all pairs testing, allpairs testing, pair testing, pair-wise testing, or simply 2-way testing. The difference between these two very similar approaches of pairwise vs. orthogonal array is that orthogonal array-based solutions require the same coverage goal that pairwise solutions do (e.g., that every pair of inputs is tested at least once) plus an additional hurdle/characteristic, that there be a uniform distribution throughout the domain.

I have studied the question of how can software testing inputs be combined most efficiently and effectively pretty steadily for the last 7 years. I started by searching the web for "Design of Experiments" and "software testing" and found references to Dr. Madhav Phadke (who, by coincidence, turns out was a former student of my father).

  • I discovered that Dr. Phadke had designed RDExpert which, although it had been primarily created to help with Research & Design projects in manufacturing settings, could also be used to select small sets of powerful test sets in software testing projects, using the Orthogonal Array-based test selection criteria.

  • I used RDExpert to create test sets (and compared those test sets against sets of tests that had been selected manually by software testers)

  • I gathered results by asking one tester to execute the manually selected tests and another tester to execute the the Orthogonal Array-based tests; the OA-based tests dramatically outperformed the manually-selected ones in terms of defects found per tester hour and defexts found overall.

So, in short, I had confirmed to my satisfaction that an OA-based test data combination strategy was far more effective than manually selecting combinations for the kinds of projects I was working on, but I was curious if other techniques worked better.

 

After more study I have concluded that:

  • Pairwise is more efficient and effective than orthogonal arrays for software testing.

  • Orthogonal Arrays are more efficient and effective for manufacturing, and agriculture, and advertising, and many other settings.

 

And we have built Hexawise as a software tool to help software producers test their software, based on what I have learned from my experience. We take full advantage of the greatly increased efficiency and effectiveness of letting testers to determine what needs to be tested and software algorithms to quickly create comprehensive test plans that provide more coverage with dramatically fewer tests.

But we also go well beyond this to create a software as a service solution that aids the software testing team with many huge advantages such as: automatically generating Expected Results in test scripts, automated importing of data from Excel or mind maps, exporting tests into other tools, preventing impossible to test for values from appearing together, and much more.

 

Why is a pairwise testing strategy better than an orthogonal array strategy?

  • Pairwise testing almost always requires fewer tests than orthogonal array-based solutions (it is possible, in some situations, for them to have an equal number of tests).

  • Remember, the reason that orthogonal array-based solutions require more tests than a pairwise solution to reach the coverage goal of testing all pairs of test conditions together in at least one test is the additional hurdle/characteristic that orthogonal array testing has, e.g., that there be a uniform distribution throughout the domain.

  • The "cost" of the extra tests (AKA experiments) is worth paying in many settings outside of the software testing industry because the results are non-binary in those tests. Someone seeking a desired darkness and gloss and luminosity and luster for a particular shade of green in the processing of film, for example, would benefit from with the information obtained from the added information gathered from orthogonal arrays.

  • In software testing, however, the added costs imposed by the the extra tests are not worth it. You're generally not seeking some ideal point in a continuum; you're looking to see what two specific pieces of data will trigger a defect when they appear in the same transaction. To identify that binary approach most efficiently and effectively, what you want is a pairwise solution (with fewer tests), not a longer list of orthogonal array-based tests.

 

Let me also add these points.

  • First, unlike some of my other views on combinatorial test design, my opinion on this narrow subject is not based on multiple empirical studies; it is based on (a) the reasoning I laid out above, and (b) a dozen or so conversations I've had with PhD's who specialize in the intersection of "Design of Experiments" and software test design, and (c) anecdotal evidence from using both methods.

  • Secondly, to my knowledge,very few, if any, studies have gathered empirical data showing benefits of pairwise solutions vs. orthogonal array-based solutions in software testing scenarios.

  • Thirdly, I strongly suspect that if you asked Dr. Phadke, he would give you his reasons for why orthogonal array-based solutions are appropriate (and even preferable) to pairwise test case selection methods for certain kinds of software projects. I have a huge amount of respect for both him and his son.

 

Time doesn't allow me to get into this last point much now, but "mixed strength" tests are another even more powerful test design approach for you to be aware of as well. With mixed strength testing solutions, the test designer is able to select a default coverage strength for the entire plan (e.g., pairwise / AKA 2-way coverage) and, in the same set of tests, select certain high priority values to receive higher coverage strength (e.g., 4-way coverage strength selected for each "Credit Rating" and "Income" and "Loan Amount" and "Loan to Value Ratio" would give you a palm that achieved pairwise coverage for everything in the plan plus comprehensive coverage for every imaginable combination of values from those four high priority parameters. This approach allows you to focus on risk-based testing considerations.

 

Sorry if I got a bit long-winded. It's a topic I'm passionate about.

Originally posted on Stack Exchange, Additional note added after the first 3 comments were submitted:

@Hannibal, @Peter K., and @MichaelF, Thanks for your comments! If you'd like to read more about this stuff, I recommend the multiple links available through this "bundle of links" about pairwise testing and combinatorial testing. In particular, Michael Bolton's article on pairwise testing is directly relevant and very clearly written. It is one of the few introductory articles around that accurately describes the difference between orthogonal array-based solutions and pairwise solutions. If I remember correctly though, the example Michael uses is a rare exception to the rule; the OA solution has the same number of tests as an optimal pairwise solution does.

Related: The Empirical Evidence for Using Pairwise and Combinatorial Software Testing - 3 Strategies to Maximize Effectiveness of Your Tests - Hexawise TV

More than 100 Fortune 500 firms use Hexawise to design their software tests. While large companies pay six figures per year for enterprise licenses, Hexawise is available for free to schools, open source projects, other non-profits, and teams of up to 5 users from any kind of company. Sign up for your Hexawise account.

By: John Hunter and Justin Hunter on Jun 11, 2013

Categories: Combinatorial Testing, Design of Experiments, Efficiency, Multi-variate Testing, Pairwise Software Testing, Software Testing, Testing Strategies, Experimenting

The Expected Result feature in Hexawise provides 3 benefits in the software testing process. Specifically, it...

  • Saves you time documenting test scripts.

  • Makes it less likely for you to make a mistake when documenting Expected Results for your tests.

  • Makes maintaining sets of tests easy from release to release and/or in the face of requirements changes.

There are two different places you can insert auto-generated Expected Results into your tests: the "Auto-script" screen, and the "Requirements" screen. So which place should you use?

It may depend upon how many values are required to trigger the Expected Result you're documenting.

If you want to create an Expected Result that can be triggered by 0, 1, or 2 values, you can safely add your Expected Result using the Auto-script screen. If you're creating an Expected Result that can only be triggered when 3 or more specific values appear in the same test case, then - at least if you're creating a set of 2-way tests - you will probably want to add that particular Expected Result to the Requirements screen, not on the Auto-script screen. Why is that?

It's because if you use the Expected Result feature on the Auto-script screen, it is like telling the test generation engine "if you happen to see a test that matches the setting I provided then show this Expected Result text to the tester." This automates the process so the tester will be provided clear instructions on what to look for in cases that might otherwise be less clear to the tester. If the only way

Using the requirements feature is similar but a bit different. Using the requirements feature, in contrast, is like telling the test generation engine "make sure that the specific combination of values I'm specifying here definitely appear together in the set of tests at least one time (and, if I tell you that there's an expected result associated with that scenario, please be sure to include it)." The requirements feature is used when you not only want to provide an expected value for a specific test case situation but to require that the specific test case scenario be included in the generated test plan.

See our help page for more details: How to save test documentation time by automatically generating Expected Results in test scripts.

 

Related: How to Model and Test CRUD Functionality - 3 Strategies to Maximize Effectiveness of Your Tests

By: John Hunter on Jun 4, 2013

Categories: Hexawise test case generating tool, Hexawise tips, Software Testing

The test cases in a given test plan should be sufficiently similar and should not have wildly divergent paths depending on the value of parameters in a test case.

When you do find your test case flow diverges too much you often want to either break your test plan down into a few different test plans, so that you have a plan for each different kind of pass through the system.

Another similar approach is that you may want to decrease the scope of your test plan a bit so that you end up with test cases that are all similar in the plan.

Lastly, let's say the flows aren't wildly divergent, but only slightly so. As a silly example let's say you were testing a recipe that varied based on the fruit selected.

Fruit: Apple, Grape, Orange, Banana

And then you wanted a step for how the peeling was done.

Peeling: By hand, By manual peeling tool, By automated peeler Peeler type: Hand crank, Battery powered, AC powered

Now... our testing flow here has some divergence. Grapes and Apples don't get peeled in this recipe, so they never enter that flow. And Bananas are always peeled by hand so they only get a part of that flow. If this was just the tip of the iceberg of the divergence, we should create a test plan for Grapes and Apples and a different one for Oranges and Bananas.

But if this is the entire extent of the divergent flow, then we want to take advantage of N/A values and married and invalid pairs.

We update our parameter values for the peeling flow to include N/A.

Peeling: By hand, By manual peeling tool, By automated peeler, N/A Peeler type: Hand crank, Battery powered, AC powered, N/A

We marry Grape and Orange (uni-directionally) to the two N/A's so they don't participate in the peeling flow. We marry Banana (unidirectionally) to "By hand" and the 2nd N/A so it has a partial and circumscribed pass through the peeling flow.

Lastly we don't allow Orange to be paired with either N/A with an invalid pair.

That's how a slight flow variation can be accommodated. Please comment with any questions about any of these approaches to your problem.

 

Related: 3 Strategies to Maximize Effectiveness of Your Tests - How Not to Design Pairwise Software Tests - How to Model and Test CRUD Functionality

By: Sean Johnson on May 21, 2013

Categories: Hexawise tips, Scripted Software Testing, Software Testing, Testing Strategies

A client informed us that they had created (and used) approximately 3,500 test cases to test the search functionality of their application. They had a strong suspicions that (a) they should be able to test the search functionality of their application with fewer tests, (b) the tests they had accidentally omitted tests of many hundreds of plausible combinations of values that would be useful to test for (but did not know how to precisely identify were those gaps were without a huge amount of work), and that (c) many of these tests were quite inefficient in that they repeated many steps that had already been tested in other tests in the plan (even if they were not 100% duplicative of any other single test in the plan).

This client should have spoken to Lanette Creamer before they got into that situation. Lanette is a testing expert and blogger with ideas worth paying attention to. For example, her paper, Reducing Test Case Bloat, is well worth reading as is her blog.

There are times when what you cut may not be bloat. There are some situations where the decisions are the equivalent of “Do we cut off the arm or the head?” Well, a person can live without an arm. If you are in a situation where you are so time constrained that critical areas will be untested, you can still communicate the risk, be transparent and use a strategy to test the most important areas first. It is possible to plan for and do testing for a very time constrained project.

 

Of course avoiding this situation is best. Improving testing processes to use the best thoughts and tools is a better option. Cutting the bloat can allow resources to be applied to those areas they are really needed. Often though, people are scared of trying new ideas and cling to old methods, even if that results in the organization having to take increased risks by failing to test critical areas sufficiently. They are just more scared of trying new ideas than of getting away with saying we need more funding if you want more testing.

Lanette's article provides 8 specific suggestions for process improvements to reduce bloat. The first suggestion is to use combinatorial testing tools to greatly improve coverage while reducing workload. Another suggestion is to run the bloat reduction ideas by the stakeholders.

As part of your plan to reduce bloat, it can be helpful to state your assumptions about who is important and where you are placing testing priority and why. When reducing test case bloat you are taking a calculated risk. You are weighing the risk of being unable to test new features by insisting on testing every legacy case against the risk of purposefully not running some tests. When you share your starting assumptions with your stakeholders you offer them the chance to counter with their own assumptions and often you can clarify the boundaries of your testing this way to avoid gaps in testing or duplication.

 

See the full article for more good ideas on how to get better results for the existing testing resources available to your organization.

 

Related: Design of Experiments is about Learning ASAP and, in Software Testing, Finding Bugs ASAP - Pairwise and Combinatorial Software Testing in Agile Projects - Cem Kaner: Testing Checklists = Good / Testing Scripts = Bad?

By: John Hunter and Justin Hunter on Apr 24, 2013

Categories: Combinatorial Software Testing, Efficiency, Multi-variate Testing, Software Testing, Testing Strategies

The Hexawise Software Testing carnival focuses on sharing interesting and useful blog posts related to software testing.

 

  • Testing and Checking Refined by James Bach and Michael Bolton - "a robust role for tools in testing must be embraced. As we work toward a future of skilled, powerful, and efficient testing, this requires a careful attention to both the human side and the mechanical side of the testing equation. Tools can help us in many ways far beyond the automation of checks. But in this, they necessarily play a supporting role to skilled humans; and the unskilled use of tools may have terrible consequences."

  • Bugs Spread Disease by Elisabeth Hendrickson - "Cancel all the bug triage meetings; spend the time instead preventing and fixing defects. Test early and often to find the bugs earlier. Fix bugs as soon as they’re identified."

 

haija-sophia-istanbul

Haija Sophia in Istanbul, Turkey by Justin Hunter

 

  • Becoming a World-Class Tester by Ilari Henrik Aegerter - "World-class testing means following the urge to get to the bottom of things, not giving up until you have experienced enough value for yourself, thinking more expansively about the role of a tester, and thinking non-traditionally about what skills are required to thrive in the role."

  • Test Coverage Triage by Parimala Hariprasad - "My experience shows that a mind map based test design works great at this stage. Business teams will be thrilled to have a Visual Walkthrough of tests and provide inputs quickly. As a participant observer on several dozen IT projects, I have found out that testers’ personally walking them through the tests works really well."

  • What Does it Take to Change the Software Testing Industry? Courage! by Keith Klain - "According to Mark Twain, courage is not the absence of fear – but the mastery of it. There are people working in software testing all over the globe who are questioning long standing ways of working - some for the first time. Get yourself energized and get involved."

  • Explaining Exploratory Testing Relies On Good Notes. by Rob Lambert - "Being able to do good exploratory testing is one thing, being able to explain this testing (and the insights it brings) to the people that matter is a different set of skills. I believe many testers are ignoring and neglecting the communication side of their skills, and it’s a shame because it may be directly affecting the opportunities they have to do exploratory testing in the first place."

  • Human-Computer Cooperation by John Hunter - "people are better at figuring out interesting ideas to test. Once those are identified, those test conditions and other test ideas need to be combined together and put into tests. Generating a highly efficient, maximally varied, minimally repetitive set of tests based on a given set of test inputs is something computer algorithms are more effective at than a person."

  • Software testing can be sexy, too by Ole Lensmar - "What Klain, who serves on the board of the AST, and the others would like to do is not only bring those skills back home but increase the availability and accessibility of this kind of training and job opportunity. Ideally, colleges and universities would start offering majors in Software Testing so we can set young people on a path toward testing as a career."

  • Yet another future of testing post (YAFOTP) by Alan Page - "I think we’ll always need people to look at big end-to-end scenarios and determine how non-functional attributes (e.g. performance, privacy, usability, reliability, etc.) contribute to the user experience. Some of this evaluation will come from manually walking through scenarios), but there will be plenty of need for programmatic measurement and analysis as well..."

By: John Hunter on Apr 18, 2013

Categories: Software Testing

Attempting to assess the relative benefits of more than 200 software development practices is not for the faint of heart. Context-specific considerations run the risk of confounding the conclusions at every turn. Even so, Capers Jones, a software development expert with dozens of years of experience and nearly twenty books related to software development to his credit, recently attempted the task. He's literally devoted decades of his career to assessing such things for clients. We're quite pleased with how using Hexawise fared in the analysis.

Scoring and Evaluating Software Methods, Practices and Results by Capers Jones (Vice President and CTO, Namcook Analytics) provides some great idea on software project management. The article is based on the Software Engineering Best Practices with some new data is taken from The Economics of Software Quality (two of the books Capers Jones has authored).

Software development, maintenance, and software management have dozens of methodologies and hundreds of tools available that are beneficial. In addition, there are quite a few methods and practices that have been shown to be harmful, based on depositions and court documents in litigation for software project failures.

In order to evaluate the effectiveness or harm of these numerous and disparate factors, a simple scoring method has been developed. The scoring method runs from +10 for maximum benefits to -10 for maximum harm.

The scoring method is based on quality and productivity improvements or losses compared to a mid-point. The mid point is traditional waterfall development carried out by projects at about level 1 on the Software Engineering Institute capability maturity model (CMMI) using low-level programming languages. Methods and practices that improve on this mid point are assigned positive scores, while methods and practices that show declines are assigned negative scores.

The data for the scoring comes from observations among about 150 Fortune 500 companies, some 50 smaller companies, and 30 government organizations. Negative scores also include data from 15 lawsuits.

 

The article provides guidance, based on the results achieved by many, and varied, organizations with respect to software projects.

finding and fixing bugs is overall the most expensive activity in software development. Quality leads and productivity follows. Attempts to improve productivity without improving quality first are not effective.

 

This is an extremely important point for business managers to understand. Those involved in software development professionally don't find this surprising. But business people often greatly underestimate the costs of maintaining and updating software. The costs of bugs introduced by fairly minor feature requests to a system that doesn't have good software test coverage or test plans often create far more trouble than business managers expect.

This is especially true because there is a high correlation between software applications that have poor software testing processes (including poor test coverage and poor or completely missing test plans) and those application that were designed without long term maintenance in mind. Both deficiencies result of decisions made to minimize initial development costs and time. They both show a lack of appreciation for wise software engineering practices and software application project management.

The article discusses a complicating factor for accessing the most effective software development practices: the extremely wide differences in software engineering scope. Projects range from simple applications one software developer can create in a short period of time to massive application requiring thousands of developer-years or effort.

In order to be considered a “best practice” a method or tool has to have some quantitative proof that it actually provides value in terms of quality improvement, productivity improvement, maintainability improvement, or some other tangible factors.

Looking at the situation from the other end, there are also methods, practices, and social issues have demonstrated that they are harmful and should always be avoided. ... Although the author’s book Software Engineering Best Practices dealt with methods and practices by size and by type, it might be of interest to show the complete range of factors ranked in descending order, with the ones having the widest and most convincing proof of usefulness at the top of the list. Table 2 lists a total of 220 methodologies, practices, and social issues that have an impact on software applications and projects.

The average scores shown in table 2 are actually based on the composite average of six separate evaluations:

  1. Small applications < 1000 function points

  2. Medium applications between 1000 and 10,000 function points

  3. Large applications > 10,000 function points

  4. Information technology and web applications

  5. Commercial, systems, and embedded applications

  6. Government and military applications

The data for the scoring comes from observations among about 150 Fortune 500 companies, some 50 smaller companies, and 30 government organizations and around 13,000 total projects. Negative scores also include data from 15 lawsuits.

The scoring method does not have high precision and the placement is somewhat subjective.

Top 10 tools and practices listed in the article:

Practice Score
1. Reusability (> 85% zero-defect materials) 9.65
2. Requirements patterns - InteGreat 9.50
3. Defect potentials < 3.00 per function point 9.35
4. Requirements modeling (T-VEC) 9.33
5. Defect removal efficiency > 95% 9.32
6. Personal Software Process (PSP) 9.25
7. Team Software Process (TSP) 9.18
8. Automated static analysis - code 9.17
8. Mathematical test case design (Hexawise) 9.17
10. Inspections (code) 9.15

 

We are obviously thrilled that Hexawise is listed. We have seen the value our customers have achieved using mathematical based combinatorial software test plans (see several Hexawise case studies). It is great to see that value recognized in comparison to other software development practices and judged to be of such high value to software development projects.

The article makes it clear the importance of the results is not "the precision of the rankings, which are somewhat subjective, but in the ability of the simple scoring method to show the overall sweep of many disparate topics using a single scale."

The methodology behind the results shown in the article can be used to evaluate your organization's software development practice and determine opportunities for improvement. But, as stated above, software projects cover a huge range of scopes. The specific software project needs will drive which practices are most critical to achieving success for a specific project. The list in the article, of what practices have provided huge value and what practices have resulted great harm, is a very helpful resource but project managers and software developers and testers need to apply their judgement to the information the article provides in order to achieve success.

A leading company will deploy methods that, when summed, total to more than 250 and average more than 5.5. Lagging organizations and lagging projects will sum to less than 100 and average below 4.0.

 

The use of Hexawise has been growing; that has helped increase the number of software projects using best practices (that score 9, or higher), however as the article states there is quite a need for improvement.

From data and observations on the usage patterns of software methods and practices, it is distressing to note that practices in the harmful or worst set are actually found on about 65% of U.S. Software projects as noted when doing assessments. Conversely, best practices that score 9 or higher have only been noted on about 14% of U.S. Software projects. It is no wonder that failures far outnumber successes for large software applications!

 

A score of 9 to 10 for a practice means that practice results 20-30% improvement in quality and productivity of software projects.

Conclusion: while your individual mileage may vary, this report provides further evidence that using Hexawise really does lead to large, measurable improvements in efficiency and effectiveness.

We are very proud of the success of Hexawise thus far; as a new year starts we see huge potential to help many organizations improve their software development efforts.

The article includes a list of references and suggested readings that is valuable. Included in that list are:

DeMarco, Tom; Controlling Software Projects, 1986, 296 pages.

Gilb, Tom and Graham, Dorothy; Software Inspections, 1994, 496 pages.

Jones, Capers; Applied Software Measurement, 3rd edition, 2008, 662 pages.

McConnell, Code Complete, (I'm linking to the 2nd edition the article references the 1st edition) 2004, 960 pages.

 

Related: Maximizing Software Tester Value by Letting Them Spend More Time Thinking - A Powerful Software Test Design Approach - 3 Strategies to Maximize Effectiveness of Your Tests

By: John Hunter on Mar 18, 2013

Categories: Software Testing, Software Testing Efficiency, Testing Case Studies, Testing Strategies

I strongly agree with Cem Kaner's statement (in Inefficiency and Ineffectiveness of Software Testing: A Key Problem in Software Engineering) that: sometimes tests uncover defects that don’t fit within any coverage model because they are side effects of the tests rather than explicitly planned foci of the tests."

My experience indicates that an effective way to increase the likelihood that you will trigger such defects (without explicitly looking for them) is to try to maximize the variation between each test case you execute.

A case in point: when I sat down to dinner with James Bach a year or so ago in Boston at a testing conference, he gave me a quick testing challenge (as he is fond of doing with testers he meets for the fist time to see how we think). He asked how I would test a very simple calendar entry application that allowed users to record the start and end times of diary events. Key inputs to use as test conditions for these tests included start times and end times.

I proposed a set of times to try that were designed to provide as much variety as possible from one test case to the next. As inputs into the start and end times, I used a small number of different times spread throughout the morning, afternoon, and evening as well. The strategy I used quickly identified the testing defect the puzzle was designed to uncover in a small handful of tests. What was most memorable about the experience from my perspective was not that I "succeeded" in triggering the bug but that the tests I created triggered a type of bug that was, in Kaner's words, a "side effect of the tests rather than explicitly planned foci of the tests."

The business logic in the calendar application that should have identified invalid beginning and end time combinations was coded incorrectly. Instead of using numbers in the business logic, the business logic was ordering the numbers alphabetically. I was not consciously looking to identify that kind of a flaw in the business logic, but by maximizing the variation from test case to test case, I maximized my odds of finding it.

Efficiently achieving structured variation is difficult because it is hard for a human brain to remember whether dozens of different test conditions have been tested together (or we're accidentally repeating ourselves). This is where pairwise and combinatorial test case generating tools like our Hexawise tool come in. They are designed to achieve as much variation from test case to test case as possible. One of the relatively unsung benefits of this approach is that doing so will help find bugs, like these, that you aren't even consciously looking for.

 

Related: 3 Strategies to Maximize Effectiveness of Your Tests - Getting Started with a Test Plan When Faced with a Combinatorial Explosion - Book Review of “Explore It!” Elisabeth Hendrickson’s Excellent New Book on Software Testing

By: Justin Hunter on Feb 26, 2013

Categories: Software Testing, Testing Strategies

Malcolm Gladwell's three best-selling books are filled with fascinating examples of all sorts of things to underscore the interesting points he makes about a wide variety of topics. I'm currently readying his book "Blink," about how people make process information and make decisions in the 'blink of an eye."

The example he shares about an overcrowded hospital Emergency Room's decision making process really resonated with me. Brendan Reilly, named the chairman of Medicine at Cook's County Hospital in Chicago in 1996, was in dire need of a way to improve the conditions of the hospital when he showed up there. The Emergency Room faced extreme amounts of resource pressure in particular; there were too many patients needing care and too few doctors and beds to serve them all promptly. One of the most resource-sapping processes was the treatment of patients who entered the ER complaining of chest pains.

"From the beginning, the question of how to deal with heart attacks was front and center." About 30 people a day came into the ER "were worried that they were having a heart attack. And those thirty used more than their share of beds and nurses and doctors and stayed around a lot longer than other patients."

"Reilly's first act was to turn to the work of a cardiologist named Lee Goldman. In the 1970's, Goldman...p. 133

He... announced that he was holding a bake-off. For the first few months, the staff would use their own judgment in evaluating chest pain, the way they always had. Then they would use Goldman's algorithm, and the diagnosis and outcome of every patient treated under the two systems would be compared.

For two years, data were collected, and in the end, the result wasn't even close. Goldman's rule won hands down in two directions; it was a whopping 70 percent better than the old method at recognizing patients who weren't actually having a heart attack. at the same time, it was safer. the whole point of chest pain prediction is to make sure that patients end up having major complications are assigned right away to the coronary and intermediate units. Left to their own devices, the doctors guessed right on the most serious patients somewhere between 75 and 89 percent of the time. The algorithm guessed right more than 95 percent of the time.... He went to the ED and changed the rules."

 

I suspect that if you asked 100 cardiologists whether they would perform more accurate assessments that the 3 question Goldman algorithm would, you would find that all 100 of those cardiologists would feel confident in their ability to outperform the 3 question algorithm. After all, the algorithm didn't allow for subject matter expertise.

 

How is this Relevant to Software Testing?

When many business leaders think of software testing what they want is something that catches bugs before the product is released to the customer. To some extent this can be automated. There are many previous bugs (and experience with bugs in other software) that allow us to create specific tests which will identify known bugs.

So if we have a form on the web application, we can test various scenarios and make sure the form is processed properly: submitting the form when it should or displaying the proper error message when it should do that. It is wise to have test plans to check for these bugs.

Creating simple easy to follow guidelines to address known issues is a wise action. This is true for creating a simple checklist to use in screening for heart attacks and for creating test plans that cover the known issues.

While those are wise they are by no means sufficient. Physicians bring a great deal of expertise to evaluating patients for a huge number of symptoms. And they can use their knowledge and experience to judge where they need more evidence (for example, a diagnostic x-ray or an analysis of a blood sample) to make a judgement. And to judge is software will work as users wish simply following a test plan robotically is not sufficient. The experience and expertise of a software tester allows them to probe the software applications and spot not only bugs but weaknesses that should be addressed so users of of the software have the best experience.

The appropriate methods depend upon the need. And sometimes the expert is not as effective as simply using tools or a check sheet that has been developed specifically for that purpose. It is important to design processes in your organization to find the most effective strategies given the local circumstances.

Experimenting, by using evidence based management practices, is needed to determine what is most effective. Assuming that the best solution is always relying on the experts is not accurate (as the example Gladwell used shows). A great tool to aid in determining what practices work best is the Plan-Do-Study-Act cycle made famous by Dr. W. Edwards Deming.

 

Related: Maximizing Software Tester Value by Letting Them Spend More Time Thinking - Mistake Proofing Strategies for Software Code - Checklists in Software Development

By: John Hunter and Justin Hunter on Feb 18, 2013

Categories: Software Testing, Testing Strategies

I am passionate about pairwise software testing techniques. I have helped dozens of teams, for example, carefully measure the benefits that can be created when teams of testers adopt pairwise and related combinatorial testing approaches to identify the test cases they will execute (as compared to manual test case identification methods). What usually happens is that tester productivity doubles. (See Combinatorial Software Testing - pdf download).

I believe these approaches will be much more widely adopted in a few years than they are now for the simple reason that they consistently deliver dramatic benefits to both the speed of software test design and the efficiency and thoroughness of software test execution. As more teams try these methods for themselves, and measures the benefits they achieve with them, broader adoption seems highly likely to me.*

I see three main barriers to broader adoption by the testing community at large:

  1. The first barrier is that testers will not make an attempt to apply this method to their testing projects so they will never find out how effective it is. The second is that ill-informed testers will try to apply the approach but do such a poor job at implementation that they do not generate benefits.

  2. The second barrier is that even testers who use the approach effectively a few times, will not realize how much more effective it is making them. A dismissive thought process guilty of this might sound something like this: "Those 11 bugs I just found? Yeah. I found them because I'm a good tester; the fact that I happened to use pairwise tests just now? That's largely irrelevant. I'm sure I would have found them regardless.")

  3. The third barrier is that testers unfamiliar with the basics of pairwise testing principles will design test cases without thinking about what they are doing, and achieve "garbage in / garbage out" results. The benefits that would have been so easily achieved in the testing project - like Lindsey Jacobellis' opportunity to win a gold medal for Snow Boarding - disappear in a groan-worthy moment of bone-headed stupidity.

 

 

This blog post addresses this third barrier. When testers sabotage their own test plans with a poor choice of inputs, they may well blame the test design strategy rather than themselves, which would be unfortunate. Here's one common problem I see (exaggerated a bit in this example to make my point).

 

Objective: create a set of tests that will check to see if the underwriting engine for a car insurance firm is calculating premium estimates correctly.

Our aspiring pairwise test designer enters stage left and identifies a set of parameters:

First Name, Last Name, Age of Primary Driver, Credit Score, Number of Cars, Number of Accidents, Number of Speeding Tickets, and Number of Additional Drivers

So far so good. We now have the initial ingredients for a thing of beauty; we have a set of parameters that could quickly result in a combinatorial explosion of possibilities and, ready to save the day, we have a test designer who has correctly identified this as an opportunity to achieve efficiency and thoroughness benefits through the application of pairwise testing methods. Our potential hero is a couple minutes away from creating a concise set of tests that will confirm not only confirm that each of the data points in the plan work as they should but that they work as they should in combination with each of the other data points in the test plan.

In other words, the plan will not only confirm that "Number of Accidents = 3" will impact premiums as it should on its own, but also that "Number of Accidents = 3" will work as it should when tested in combination with the other relevant inputs in the application, e.g.,: 3 accidents with every relevant input for "Age of Primary Driver," 3 accidents with every relevant input for "Credit Score," 3 accidents with every relevant input for "Number of Cars," 3 accidents with every relevant combination for "Number of Speeding Tickets," and 3 accidents with every relevant input for "Number of Drivers."

He's seen the Promised Land of improved efficiency and effectiveness and he's ready to enter. Unfortunately, with his next move, he demonstrates he's a doofus. Entry to Promised Land denied. Check out the values he chose to enter for each of his parameters.

 

hexawise-screenshot

Notice anything wrong here?

 

Just for fun, let's take a close up look at Lindey's disastrous Snow Boarding maneuver here.

 

... and let's break down our shame-faced test designer's bone-headed move here. Can you notice what is wrong in with his choices of values?

There are nine different parameters in the mix here. Of those, two ("First Name" and "Second Name"), are the least important to our current objective of looking for problems in the underwriting engine calculations. And yet...

He's added ten values to each of them. Oops! Whenever you are putting together a pairwise (or 2-way) test plan, the number of tests required will never be lower than the product of the number of parameter values from the two parameters that have the highest number of values. In plain English, that high-falutin' previous sentence means: when you have a plan with 7 parameters that have a maximum of 4 values each, "10 largely irrelevant values X 10 largely irrelevant values = you're a big fat idiot" because you'll create a test plan that has 100 test cases (as compared to a test plan that could have covered the System Under Test more effectively with fewer than a quarter of the tests you've just created).

 

For more information on pairwise and combinatorial testing, I would recommend the following sources:

 

If you are attempting to use pairwise and/or combinatorial testing methods and running into questions, I'd sincerely like to help. Please consider one or more of the following:

 

Thank you,

Justin Hunter

 

*The manufacturing industry followed a similar pattern of adoption to similar methods that consistently delivered dramatic efficiency and effectiveness benefits. It took decades before multi-variate Design of Experiments methods were widely adopted by manufacturers even long after the benefits were proven to be dramatic and repeatable to anyone who would look at the clear, unambiguous, objectively-measurable evidence. Today, it is impossible to find a Fortune 500 manufacturing firm that does not regularly use multi-variate Design of Experiments in their manufacturing processes. One day it will be the same for Fortune 500 firms with respect to their adoption of multi-variate Design of Experiments methods of software testing.

By: Justin Hunter on Jan 29, 2013

Categories: Combinatorial Testing, Pairwise Software Testing, Software Testing, Software Testing Efficiency, Testing Strategies

This is the second edition of our carnival that focuses on finding interesting and useful blog posts related to software testing.

 

  • Testing != test execution by Jeff Fry - "We often talk about testing as if it’s only test execution, yet often the most interesting, challenging, skill-intensive aspects of testing are in creating a mental model that helps us understand the problem space, designing tests to quickly and effectively answer key questions, analyzing what specifically the problem is, and communicating it effectively."

  • Test Mercenaries by Mike Bland - "good testing practice goes a long way towards finding and killing a lot of bugs before they can grow more expensive, and possibly multiply. The bugs that manage to pass through a healthy testing regimen are usually only the really interesting ones. Few things are less worthy of our intellectual prowess than debugging a production crash to find a head-slappingly stupid bug that a straightforward unit or integration test could’ve caught."

 

yellow cameleon

Yellow Cameleon in South Africa by Justin Hunter

 

  • The Oracle Problem and the Teaching of Software Testing by Cem Kaner - "I’ve been emphasizing the oracle problem in my testing courses for about a dozen years. I see this as one of the defining problems of software testing: one of the reasons that skilled testing is a complex cognitive activity rather than a routine activity. Most of the time, I start my courses with a survey of the fundamental challenges of software testing, including an extended discussion of the oracle problem."

  • The new V-Model by Kim Ming Leung - "We design by specifying the measurements before coding but not writing the test before coding. We write code and invite user feedback before writing automated testing for these measurements. Code quality is still guaranteed because first, measurement is the design and second, we code with testing in mind (i.e. write testable code)."

  • Maximizing Software Tester Value by Letting Them Spend More Time Thinking by John Hunter - "Hexawise also lets the software tester easily tune the test coverage based on what is most important. Certain factors can be emphasized, others can be de-emphasised. Knowledge is needed to decide what factors are most important, but after that designing a test plan based on that knowledge shouldn’t take up staff time, good software can take care of that time consuming and difficult task."

  • Improving the State of your Testing Team: Part One – Values by Keith Klain - "Typically, the first thing out of my test teams mouths when asked “how can we improve the state of testing here”, usually relates to something that other people should do. Very few people or teams take an introspective based approach to improvement, or state their management values, but the ones that do, typically have great success.

  • Interview and Book Review: How Google Tests Software by Craig Smith - "stop treating development and test as separate disciplines. Testing and development go hand in hand. Code a little and test what you built. Then code some more and test some more. Test isn’t a separate practice; it’s part and parcel of the development process itself. Quality is not equal to test. Quality is achieved by putting development and testing into a blender and mixing them until one is indistinguishable from the other."

  • Introduction to test strategy (review of Rikard Edgren's presentation) by Mauri Edo - "Rikard referred to a test strategy as the outcome of the gathered and shared information plus our own thinking processes, encouraging us to find strategies for our context, learning to understand what is important for us, in our situation."

  • Experience report EuroSTAR Testlab 2012 by Martin Jansson - "The testlab is very much alike our every day life as testers. We prepare and plan for many things, but when reality hits us our preparations might be in vain. Therefore it is important in being prepared for the unknown and the unexpected. Working with the testlab is a great way of practising that."

 

Related: Software Testing Carnival #1 - Hexawise TV blog

By: John Hunter on Jan 21, 2013

Categories: Software Testing

Screen-Shot-2013-01-15-at-7.28.42-PM-250x300

 

Elisabeth Hendrickson's new book, Explore It!, will begin shipping from Amazon in a week. If you're interested in software testing, I highly recommend it without reservation. It's outstanding. It is currently available for sale on Prag Prog and for pre-order on Amazon. The paper version will be published on January 22nd. Since Amazon apparently doesn't allow people to review books until they officially go on sale, I can't yet post my review on Amazon, but here, one week early, is my glowing review:

Explore It! is one of the very best software testing books ever written. It is packed with great ideas and Elisabeth Hendrickson's writing style makes it very enjoyable to read. Elisabeth Hendrickson has a well-deserved global reputation in the software testing community as someone who has the enviable ability to clearly communicate highly-practical, well-thought-out ideas. Tens of thousands of software testers who have already read her "Test Heuristics Cheat Sheet" no doubt already appreciate her uncanny ability to clearly convey an impressive number of actionable ideas with a minimal use of ink and paper. A pdf download of the cheat sheet is available here. If you're impressed by how much useful stuff Hendrickson can pack into one double-sided sheet of paper, you should see what she can do with 160 pages.

Testers at all levels of experience will benefit from this book. Like the best TED talks, Explore It! contains advanced ideas, yet those ideas are presented in way that is both interesting and accessible to a broad audience. Beginning testers will benefit from learning about the fundamentals of Exploratory Testing (an important and incredibly useful approach to software testing that is increasingly getting the respect it deserves). Experienced testers will benefit from practical insights, frameworks for thinking about challenges that bedevil all of us, and Hendrickson's unmatched ability to clearly explain important aspects of testing (including her superb explanations of test design principles).

Chapter 4 "Find Interesting Variations" in itself is worth far more than the price of the book. It is my favorite chapter in any software testing book I have ever read. A large part of the reason I have so much appreciation for this chapter is that I have personally been teaching software testers how to create interesting variations in their testing efforts for the last six years and know from experience that it can be a challenging topic to explain. I was excited to see how thoroughly Hendrickson covered this important topic because relatively few software testing books address it. I was humbled by how effortlessly Hendrickson seemed to make this complex topic easy to understand.

Buy it. You won't regret it. I'm buying multiple copies to give to developers and testers at my company as well as multiple copies to give to our clients.

  • Justin Hunter

By: Justin Hunter on Jan 15, 2013

Categories: Book Review, Exploratory Testing, Software Testing, Testing Checklists, Testing Strategies, Training

A combinatorial explosion is when the configuration settings and user actions and data entered etc. makes it impossible to test everything. The number of tests required to individually test every single possibility is many thousands of times greater than could realistically be tested.

When faced with taking over an existing software applications without a good test suite (or any test plan) often is daunting. And the problems of creating an unfathomable number of tests face you due to combinatorial explosion. Hexawise is a software as a service that aids in dealing with this dilemma for software testers. Software test plans are created that provide far better coverage than is seen in practice with a tiny fraction of the test required for complete combinatorial coverage (that is testing every possible combination [pairwise or 3, 4, 5... way] individually).

The Google Maps test plan provides a good example of combinatorial explosion faced by the testers (in this case, those who tested Google Maps). Take a look at the Google Maps test plan by login to your Hexawise account (creating a demo account is free and simple). The Google Maps test plan is one of 9 samples currently provided in Hexawise.

For creating your own test plan, while you are exploring the software application and testing it out to find "where the weak points are," you will probably find it useful to vary things as much as possible, repeat your actions as little as possible. Those points are true whether you're doing relatively informal lightly documented exploratory testing or more heavily documented test scripts. It addition, since a large percentage of defects can be triggered by the interaction of just two test inputs, it would be nice, if you had time, to test every single possible combination involving two test inputs; that's the rationale behind allpairs, pairwise and orthogonal array-based test case prioritization methods.

To recreate a similar - very early draft - plan for yourself, I'd suggest going through the following steps to put together a relatively small number of highly informative end-to-end-ish tests:

Ask what can change as users go through the system? Think about configuration settings, user actions, data formats, data ranges, etc. even throw in more "creative" ideas like user personas. Let your creativity and common sense guide you. Enter those in as parameters.

Ask how those parameters can change? (for the parameter "Browser" enter IE7, IE8, FF, etc.) Put those in as values under each parameter (entering constraints as required)

Ask does that variation matter? When possible (when it doesn't matter as much) use equivalence classes and be biased towards fewer values - at least for your early draft tests.

Ask what special paths thorough the system do you want to be sure to include? (Most common happy path, paths to trigger certain business rules, etc.)

Click the Create Tests button in Hexawise and you'll instantly get a very nice draft starter set of highly varied tests. If they look like they're relatively interesting and don't miss hugely important things, start informally executing them and you'll be sure to learn some more things as you do about the system's weak points that would result in you going back to those draft tests and iterating them to make them stronger and cover more.

To get a bit more on using this approach see our case studies. Hexawise TV provides narrated videos online showing how to make your life easier as a software tester.

 

Related: Hexawise Tip: Using Value Expansions and Value Pairs to Handle Dependent Values - 3 Strategies to Maximize Effectiveness of Your Tests - Pairwise and Combinatorial Software Testing in Agile Projects

By: John Hunter and Justin Hunter on Jan 2, 2013

Categories: Combinatorial Software Testing, Exploratory Testing, Multi-variate Testing, Pairwise Software Testing, Software Testing, Testing Strategies

Creating test plans for create, read, update and delete (CRUD) functionality is a very common requirement. There are a few different ways to model it. Let's review the simplest solution first then move to a few optional extra ideas that you could use in addition.

Option 1: Only one C, R, U, D action tested in each of your tests

Have one Parameter called "CRUD action" - have 4 different values: Create, Read, Update, Delete

Option 2: Include two CRUD actions in (most of) your tests

Create 2 Parameters, each with the following 4 and 3 Values:

First CRUD action: Create, Read, Update, Delete
Second CRUD action: Read, Update, Delete

You may want to add an invalid pair with the first CRUD action = Delete and each of the three values in the second CRUD action (in which case you'd need to add a 4th Value called N/A and leave it as the only available option for scenarios that deleted a record in the first test.

Option 3: Up to 4 CRUD actions in each of your tests

  1. Create a new record: create, don't create
  2. Read a record: read, don't read
  3. Update a record: update, don't updat
  4. Delete a record: delete, don't delete

That's the most basic modeling decision. (e.g., one CRUD action / test, two CRUD actions / test, or potentially 4 CRUD actions / test).

What other things might be interesting to vary?

You might want to think about "newspaper reporter questions" - e.g., (who, what, when, where, why? how? how much?)

 

Who?

Which user type (or which system) is performing the CRUD ACTION? - Are certain types of users or systems not allowed to perform certain actions? Include test inputs to make sure they can't. Are others allowed to? Include test inputs in your model to make sure they can.

 

What?

What kind of CRUD actions are being made? Valid ones? Invalid ones? How many invalid types of actions can you think of? What is supposed to happen when special characters are entered? What is supposed to happen when blanks are entered? What is supposed to happen when extremely long names are entered (e.g., are there concatenation rules?)

 

When?

When are the updates being made? Are there any system downtimes? What happens to CRUD actions that are attempted then? What happens if the time stamp on a CRUD action is from the future? From the past? Originates in a different time zone?

 

When? ("version 2")

When are the CRUD actions tested in relation to one another? e.g., Update a record and then try to read it or read it and then try to update it? If you wanted to "mix up the order" of the Option 3 actions, for example, you could add a parameter called perform CRUD actions in what order? with values such as normal create then read then update then delete order, the reverse of the normal order as much as possible, start with read or update. This would not give you a "perfect solution" to cover every timing combination but it would lead to additional variability within your tests if you wanted them.

 

Why?

Why is a CRUD action being conducted? Do any interesting testing ideas come to mind when you think about this angle? How about bad or malicious motivations?

 

How?

How are the actions being undertaken? By a keyboard? By a mouse? By a batch processing system?

Lastly, if you have an account you created on Hexawise.com, check out sample test plan "08. Modifications to a Database" for some additional ideas.

Bottom line: when test designers create tests (whether they use Hexawise to generate their tests or whether they select and document their tests by hand), they will have a lot of discretion over what kinds ideas to include / exclude from their tests. Using Hexawise offers these advantages over selecting tests by hand:

  1. You will maximize variation in your tests
  2. You will minimize wasteful repetition in your tests
  3. You will achieve 100% coverage of the combinations you tell the tool to include in your solution (e.g., pairwise testing coverage, three-way coverage, risk based testing coverage, etc.)
  4. You will be able to select those combinations and documented tester instructions faster and with fewer errors in the documented tests.

This post is based on a user question. See the Hexawise user forum page for some additional details related to the users follow up questions.

 

Related: Efficient and Effective Test Design - 3 Strategies to Maximize Effectiveness of Your Tests - Maximize Test Coverage Efficiency And Minimize the Number of Tests Needed

By: John Hunter and Justin Hunter on Dec 18, 2012

Categories: Pairwise Software Testing, Software Testing, Testing Strategies

I have started focusing on the Hexawise blog recently. For a good part of this year we have not had much activity on the blog, but we plan to make this blog more active going forward. As part of that effort we are starting a Software Testing Blog Carnival to highlight some post on software testing that I find interesting and think others will too. Enjoy,

 

 

water buffalo south africa

Water buffalo, Timbavati Game Preserve, South Africa by Justin Hunter

 

  • Bugs Spread Disease by Elisabeth Hendrickson - "It wasn’t the bugs that killed us directly. Rather, the bugs became a pervasive infection. They hampered us, slowing down our productivity. Eventually we were paralyzed, unable to make even tiny changes safely or quickly."

  • A Sticky Situation by Michael Bolton - on using agile, kanban style, system for managing the software testing workload: "The whiteboard was the center of an active discussion between programmers and project managers about the project status. After the meeting, the whiteboard and the notes on it remained as a kind of information radiator. 'I suddenly realized that if they could do that, I could too,' Paul said. He began by dividing the whiteboard into three columns: To Be Done, Work in Progress, and Done."

  • A Quick Testing Challenge by Alan Page and a response, Angry Weasel Challenge, a presentation by to Figure out why TheApp.exe won't load and cause it to load by solving the problem.

  • 3 Strategies to Maximize Effectiveness of Your Tests by John Hunter - "Use the MECE principle. The MECE principles states you should define your values in a way that makes each of them “Mutually Exclusive” from the others in the list (no subsets should represent any other subsets, no overlaps) and “Collectively Exhaustive” as a group (the set of all subsets, taken together, should fully encompass all items, no gaps)."

  • Musings on Test Design by Alan Page - "The point is, and this is worth repeating so often, is that thinking of test automation and “human” testing as separate activities is one of the biggest sins I see in software testing today. It’s not only ineffective, but I think this sort of thinking is a detriment to the advancement of software testing."

  • Creating a Shared Understanding of Testing Culture on a Social Coding Site by Leif Singer - "those learning testing in this environment write tests to make sure that a program behaves a certain way, and forget that they might also need to test how it should not behave (e.g. on invalid input)."

  • Testing in Scrum with a Waterfall Interaction by Davide Noaro - "Testing each user story separately is, for me, the basis of the Agile process, even in an interaction with a Waterfall-at-end scenario like the one described. Integrating testing into the process itself is something we should do for any software development process, not only in Agile or Scrum."

By: John Hunter on Oct 31, 2012

Categories: Software Testing

Hexawise includes an array of sample plans when a new user account is created. These provide concrete examples of how to categorize items when creating a combinatorial test plans (also called pairwise test plans, orthogonal array test plans, etc.). Once you [sign in to your Hexawise account](http://hexawise.com/ (or setup a new, free, account) looking at this [sample test plan](https://app.hexawise.com/share/HT3UG7M8 (which is similar to the situation raised in the question that follows), might be useful.

Within your Hexawise account you can copy the sample test plans that you are provided with and then make adjustments to them. This lets you quickly see what effects changes you make have on real test plans. And it also lets you see how easy it is to adjust as changes in priorities are made, or gaps are found in the existing test plan.

 

A Hexawise user sent us the following question.

What is the recommended approach to configuring parameter with one or more values?

I have two parameters which are related.

If Parameter 1 = Yes, Parameter 2 allows the user to select one or more values out of a list of 25 - most of which are not equivalent.

For Parameter 2, is the recommended approach to handle this to create separate parameters each with a yes/no value? i.e. create one parameter for each non-equivalent value, and one parameter for the equivalent values. Then link each of these as a married pair to Parameter 1.

I'm open to suggestions as to alternatives.

Here's the screen in question. Parameter 1 = "Pilot", Parameter 2 = checkboxes for types of plans.

aviation question inline

Great question.

I would recommend that you use different parameters for each option (e.g., "Scheduled Commercial" as a parameter with "Selected, Not Selected" as your Values associated with it).

Also, I'd recommend following these 3 strategies to maximize the effectiveness of your tests.

First, consider using adjusted weightings. You may find it useful to weight certain values multiple times, e.g., have 4 values such as "Select, Do Not Select, Do Not Select, Do Not Select" to create 3 times as many tests with "Do Not Select" as "Select."

Second, use the MECE principle. The MECE principles states you should define your Values in a way that makes each of them "Mutually Exclusive" from the others in the list (no subsets should represent any other subsets, no overlaps) and "Collectively Exhaustive" as a group (the set of all subsets, taken together, should fully encompass all items, no gaps)

Third, avoid "ands" in your value names. As a general rule it is unwise to define values like "Old and Male" or "Young and Female", etc. A better strategy is to break those ideas into two separate Parameters, like so:

First Parameter = "Age" --- Values for "Age" = Old / Young

Second Parameter = "Gender" --- Values for "Gender" = "Male / Female"

 

Related: Efficient and Effective Test Design - Context-Driven Usability Considerations, and Wireframing - Why isn't Software Testing Performed as Efficiently and Effecively as it could be?

By: John Hunter on Oct 25, 2012

Categories: Efficiency, Hexawise tips, Pairwise Software Testing, Software Testing, Software Testing Efficiency, Testing Strategies

When we met with Hexawise users in India, we noticed that the page load response times for Hexawise users there were sometimes significantly slower than in the United States due to sporadic internet connectivity issues. One area that troubled us in particular was the extra few seconds it could take to enter test inputs into plans.

We are constantly looking for ways to make our software test case generating tool better and came up with a solution.

Hexawise Bulk Add Instructions 1 inline

Hexawise Bulk Add Instructions 2 inline

 

Even with a great internet connection this feature lets you be more productive, so if you have not tried out the bulk add feature, give it a try today.

Hexawise: More Coverage. Fewer Tests

 

Related: Hexawise Tip: Using Value Expansions and Value Pairs to Handle Dependent Values - Maximize Test Coverage Efficiency And Minimize the Number of Tests Needed - 13 Great Questions To Ask Software Testing Tool Vendors

By: John Hunter on Oct 23, 2012

Categories: Hexawise test case generating tool, Hexawise tips, Software Testing

We have thousands of users and we're growing by hundreds of new users a month. We don't have a way to accurately breakdown our users into Agile vs Waterfall users but I know from conversations with Hexawise users that many users are using Hexawise with great results on Agile projects.

Agile projects involve short sprints in which new features are added to existing functionality. These sprints are often a couple weeks long. In such situations, it is important to be able to create tests quickly. Hexawise is extremely good at accomplishing exactly that.

Let me explain by way of an example. A Hexawise user named Sandeep asked how Hexawise could be used to create tests for an insurance ratings engine. I was able to create a sample set of 37 2-way tests within 20 minutes.

Now lets imagine that a sprint creates 2 new features that we'd like to test in combination with the existing test inputs in the existing plan. Let's assume "Feature 1" is a new ability of the application to handle input coming in from three sources: (1) web, (2) agent, and (3) call center. Let's assume "Feature 2" is a new ability of the application to handle Group Discounts. The Group Discount parameter will have 3 categories of values as well: (N/A) for people who aren't able to claim them, "Private company," and "Government / Public."

Added the new features, and created another, completely different set of tests that achieve 100% 2-way coverage. Each value in the new feature is tested in at least one test with every other value in the plan. It took less than 4 minutes to create this new set of tests incorporating the new features. It is an extremely attractive benefit of using Hexawise that is excellent for both Waterfall projects (which inevitably have late-breaking requirements changes) as well as Agile projects (where late-breaking feature additions are expected to happen as part of the process).

You want to keep testing documentation light, you want to minimize the amount of time you spend in selecting and documenting tests, and it is helpful to achieve as much coverage in as little time as possible and minimize the risk that you'll accidentally forget to test things that you meant to test (which can be easier to do if you have light test cases with enough detail in them to maximize your odds of testing the important stuff without accidentally omitting tests). Hexawise helps you achieve all of thise objectives. The example shows, as clearly and concisely as I'm able to explain it, how you can use Hexawise to create a small set of powerful tests within four minutes (from 1.7 trillion possible tests) that are well suited to test new features developed in an Agile project iteration.

 

Related: What is Agile? What is not Agile? - Why isn't Software Testing Performed as Efficiently and Effecively as it could be? - Cem Kaner: Testing Checklists = Good / Testing Scripts = Bad?

By: Justin Hunter on Oct 19, 2012

Categories: Agile, Hexawise tips, Pairwise Software Testing, Software Testing

I came across Elisabeth Hendrickson's "Test Heuristics Cheat Sheet" yesterday and developed some pairwise testing (AKA 2-way combinatorial) test cases using many of the good ideas contained in it. I would highly recommend it, I'd recommend you send it (or email a link to this blog post) to everyone on your QA team.

As an indication that the Hendrikson's Test Heuristics Cheat Sheet works well to uncover defects,

  • I wanted to create a set of pairwise tests that could be broadly applicable to test thousands of different applications, I incorporated many ideas from the Test Heuristics Cheat Sheet.

  • I intend to use those inputs to test our test design tool, Hexawise.

  • The way Hexawise works is that users enter "things they want to test" into Hexawise on the first of three screens, the "Define Inputs" screen, then click on "Create Tests." Hexawise then uses a scientific approach to maximizing coverage of the combinations of all the "stuff to be tested" in the fewest possible number of tests. This scientific approach is based on the >40 years of Design of Experiments lessons and includes both pairwise / AllPairs methods as well as more thorough 3-way, 4-way, 5-way and 6-way tests (as well

  • Ironically, even before starting to execute the test conditions suggested by Hexawise, I discovered that the special characters that I had input into the "Define Inputs" screen (which I took from the Test Heuristics Cheat Sheet) triggered a previously unidentified defect in Hexawise itself.

  • The fact that it was triggered so quickly in an application that has been live for a year and used thousands of times is a strong indication that using checklists and cheat sheets can be a great way to efficiently find defects.

Why is using checklists to guide your testing often such an efficient and effective way of finding defects? Here's my top ten list:

  1. The "bad ideas" have already been weeded out.
    1. The ideas on the list have found enough defects to make the author of the checklist think there is value in testing the particular idea.
    2. If you've got a checklist or "cheat sheet" put together by someone as thoughtful and experienced as the Bachs, Bolton, and Hendrickson, you're getting a highly-condensed executive summary version of many of their valuable insights.
    3. All testers go through many, many, "I wonder what would happen if we did this or considered that?" scenarios.
    4. The checklists referenced above represent expertise culled from thousands of testing projects.

  2. Checklists are directly actionable. You can apply them in almost no time at all.
  3. They work well. See Cem Kaner's slides on the Value of Checklists (11 Mb pdf file).
  4. They can easily evolve into some of your most powerful test artifacts.
    • Start with the lists above. See if each of the ideas for tests trigger defects in your Systems Under Test.
    • Find a lot of defects from certain test ideas? Create your own checklist of ideas that worked and iterate them over time.. Consider expanding upon the checklist items and concepts that do bear fruit.
    • Don't ever find defects from certain of the test ideas? Consider dropping those items from the checklists if they don't bear fruit for you (or put tests for those ideas at the back of your lists and only include tests for them if you have extra time).

  5. Checklists include useful, defect-triggering ideas that you may not have thought of on your own.
  6. They're free.
    • No software or books to buy.
    • No courses or conferences to attend.

  7. Using checklists mitigate the risk that you will forget to test for things that you know you should be testing for (but could well forget to test for in any specific instance).
    • As humans, we're naturally forgetful as a species despite our best efforts.
    • Checklists are widely used with good results by doctors, lawyers, pilots, software testers, and people going to grocery stores to minimize the effects of these shortcomings.

  8. Software testing checklists are an efficient way to communicate actionable information.
  9. Software testing checklists are widely applicable to all kinds of software testing.
    • Checklists can be used in creating Unit Tests, Assembly Tests, Product Tests, System Tests, Functional Tests, Load Tests, Performance Tests, User Acceptance Tests, etc.
    • Checklists can be used by Exploratory Testers and "script-everything-in-advance" test-case-centric testers.
    • Checklists can be used in Agile projects as well as Waterfall projects.

  10. Software testing checklists can be easily used in pairwise and combinatorial testing.
    • Using elements from the checklists in a pairwise test will have the added benefit that not only will you test for every one of the testing ideas on the checklist (e.g., XXX) but also, you can easily test for every idea on the checklist **in combination with** every other test idea on the checklist in at least one test case.

    By: Justin Hunter on May 3, 2012

    Categories: Checklists, Software Testing, Testing Checklists, Uncategorized

It's common to have a test plan where the possible values of one parameter depend on the value of another parameter. There are many options for how you can represent this scenario in Hexawise, some options that involve using value expansions (when there is equivalence) and other options that do not use value expansions (when there is not equivalence).

Using Value Expansions in Hexawise

The general rule of thumb for value expansions is that they are for setting up equivalence classes. The key there being the equivalence. The expectations of the system should be the same for every value listed in that particular value expansion.

Let's consider a real world example involving a classification parameter with a value that is dependent on the value of a role parameter:

Inputs
Role: Student, Staff
Classification: Freshman, Sophomore, Junior, Senior, Adjunct, Assistant, Professor, Administrator

So if the Role parameter has a value of Student, then the Classification parameter must have a value of Freshman, Sophomore, Junior or Senior, but if the Role parameter has a value of Staff, then the Classification parameter must have a value of Adjunct, Assistant, Professor or Administrator.

Using value expansions in this case might be a good option. You could setup your inputs, value expansions and value pairs this way:

Inputs
Role: Student, Staff
Classification: Student Classification, Staff Classification

Value Expansion
Student Classification: Freshman, Sophomore, Junior, Senior
Staff Classification: Adjunct, Assistant, Professor, Administrator

Value Pairs
When Role=Student Always Classification=Student Classification
When Role=Staff Always Classification=Staff Classification

You would use this approach if there were no important differences in the business logic or expected behavior of the system when the different expansions of the value were used. If Freshman versus Sophomore is an important label for the users to be able to enter and see, but the system under test doesn't change its behavior based on which value is selected, then those expansions of the value are equivalent and don't need to be tested individually for how they might interact with other parts of the system and create bugs. If this equivalence scenario is true, then you will greatly simplify things for yourself and create fewer tests that are just as powerful by using value expansions.

In the scenario that would support using value expansions, the system might have different behavior for a Junior versus an Adjunct Professor, but not for a Freshman versus a Senior. A Freshman and a Senior are always equivalent in the system, so they can be combined in a value expansion.

However, if the expectations are not the same, then a value expansion should not be used. For example, let's suppose this hypothetical system has business logic giving priority class scheduling to Seniors and only last available scheduling priority to Administrators. In this case, using value expansions as described above would probably be a mistake. Why? Because a Sophomore and a Senior aren't treated the same way by the system, yet Hexawise considers all the expansions of the Student Classification value as equivalent. As long as you've got a test that has paired a value expansion of the Student Classification value with the Overbooked value of the Class Status parameter, then Hexawise won't insist on pairing all the other value expansions for the Student Classification value with Class Status = Overbooked in other tests. You could therefore miss a bug that only occurs when a Senior signs up for an overbooked class.

"One to many" or "multi-valued" married pair model

If the system under test does not consider the values to be equivalent and has requirements and business logic to behave differently, then by using value expansions to signal equivalency to Hexawise when there isn't equivalency is probably a mistake.

So what would you do in that case?

We've decided that it might be nice to be able to set up your inputs and value pairs like this:

Inputs
Role: Student, Staff
Classification: Freshman, Sophomore, Junior, Senior, Adjunct, Assistant, Professor, Administrator

Value Pairs
When Role=Student Always Classification=Freshman, Sophomore, Junior, or Senior When Role=Staff Always Classification=Adjunct, Assistant, Professor, or Administrator

Unfortunately, this kind of a "one to many" or "multi-valued" value pair is something we've only recently realized would be very helpful, and is something we have on the drawing board for Hexawise in the intermediate future, but is not a feature of Hexawise today. In the meantime, you could model it with three parameters:

Inputs
Role: Student, Staff
Student Classification: Freshman, Sophomore, Junior, Senior, N/A Staff Classification: Adjunct, Assistant, Professor, Administrator, N/A

Value Pairs
When Role=Student Always Staff Classification=N/A
When Role=Staff Always Student Classification=N/A

Another modeling option to consider, if there is only special logic for Administrator and for Seniors, but the rest of the values we've been discussing are equivalent, is to use value expansions for just the equivalent values:

Inputs
Role: Student, Staff
Classification: Underclassman, Senior, Professor, Administrator

Value Expansions
Underclassman: Freshman, Sophomore, Junior Professor: Adjunct, Assistant, Full

Value Pairs
When Role=Student Never Classification=Professor When Role=Student Never Classification=Administrator When Role=Staff Never Classification=Underclassman When Role=Staff Never Classification=Senior

I hope this helps you understand the role of value expansions in Hexawise, when to use them (in cases of equivalency), and when to avoid them, and how value pairs and value expansions can be used together to handle cases of dependent parameter values. Value Expansions are a powerful tool to help you decrease the number of tests you need to execute, so take advantage of them, and if you have any questions, just let us know!

By: John Hunter on Apr 26, 2012

Categories: Combinatorial Software Testing, Combinatorial Testing, Hexawise tips, Pairwise Testing, Software Testing, Testing Strategies

84 percent coverage in 20 tests

Hexawise test coverage graph showing 83.9% coverage in just 20 tests

 

Among the many benefits Hexawise provides is creating a test plan that maximizes test coverage with each new scenario tested. The graph above shows that after just 20 test 83.9% of the test combinations have been tested. Read more about this in our case study of a mortgage application software test plan. Just 48 test combinations are needed to test for every valid pair (3.7 million possible tests combinations exist in this case). If you are lost now, this video may help.

The coverage achieved by the first few tests in the plan will be quite high (and the graph line will point up sharply) then the slope will decrease in the middle of the plan (because each new test will tend to test fewer net new pairs of values for the first time) and then at the end of the plan the line will flatten out quite a lot (because by the end, relatively few pairs of values will be tested together for the first time).

One of the benefits Hexawise provides is making that slope as steep as possible. The steeper the slope the more efficient your test plan is. If you repeat the same tests of pairs and triples and... while not taking advantage of the chance to test, untested pairs and triples you will have to create and run far more test than if you intelligently create a test plan. With many interactions to test it is far too complex to manually derive an intelligent test plan. A combinatorial testing tool, like Hexawise, that maximizes test plan efficiency is needed.

For any set of test inputs, there is a finite number of pairs of values that could be tested together (that can be quite a large number). The coverage chart answers, after each tests, what percentage of the total number of pairs (or triples, etc.) that could be tested together have been tested together so far?

The Hexawise algorithms achieve the following objectives that help testers find as many defects as possible in as few tests as possible. In each and every step of each and every test case, the algorithm chooses a test condition that will maximize the number of pairs that can be covered for the first time in the test case. (Or, the maximum number of triplets or quadruplets, etc. based on the thoroughness setting defined by the user). Allpairs (AKA pairwise) is a well known and easy to understand test design strategy. Hexawise lets users create pairwise sets of tests that will test not only every pair but it also allows test designers to generate far more thorough sets of tests (3-way to 6-way coverage). This allows users to "turn up the coverage dial" and generate tests that cover every single possible triplet of test inputs together at least once (or every 4-way combination or 5-way combination or 6-way combination).

Note that the coverage ratio Hexawise shows is based on the factors entered as items to be tested: not a code coverage percentage. Hexawise sorts the test plan to front load the coverage of the tuple pairs, not the coverage of the code paths. Coverage of code paths ultimately depends on how good a job the test designer did at extracting the relevant parameters and values of the system under test. You would expect there to be some loose correlation between coverage of identified tuple pairs and coverage of code paths in most typical systems.

If you want to learn more about these concepts, I would recommend Scott's Scott Sehlhorst articles on pairwise and combinatorial test design. They are some of the clearest introductory articles about pairwise and combinatorial testing that I have seen. They also contain some interesting data points related to the correlation between 2-way / allpairs / pairwise / n-way coverage (in Hexawise) and the white box metrics of branch coverage, block coverage and code coverage (not measurable by Hexawise).

In Software testing series: Pairwise testing, for example, Scott includes these data points:

 

  • We measured the coverage of combinatorial design test sets for 10 Unix commands: basename, cb, comm, crypt, sleep, sort, touch, tty, uniq, and wc... The pairwise tests gave over 90 percent block coverage.

 

  • Our initial trial of this was on a subset Nortel’s internal e-mail system where we able cover 97% of branches with less than 100 valid and invalid testcases, as opposed to 27 trillion exhaustive testcases.

 

  • A set of 29 pair-wise... tests gave 90% block coverage for the UNIX sort command. We also compared pair-wise testing with random input testing and found that pair-wise testing gave better coverage.

 

Related: Why isn't Software Testing Performed as Efficiently and Effecively as it could be? - Video Highlight Reel of Hexawise – a pairwise testing tool and combinatorial testing tool - Combinatorial Testing, The Quadrant of Massive Efficiency Gains

Specific guidance on how to view the percentage of coverage graph for the test plan in Hexawise:

 

When working on your test plan in Hexawise, to get the checklist to be visible, click on the two downward arrow keys located shown in the image:

How-To Progress Checklists-2 inline

Then you'll want to open up the "Advanced" list. So you might need to click here:

Advanced How-To Progress Checklist inline

Then the detailed explanation will begin when you click on "Analyze Tests"

Decreasing Marginal Returns inline

 

This post is adapted (and some new content added) from comments posted by Justin Hunter and Sean Johnson.

By: John Hunter on Feb 3, 2012

Categories: Combinatorial Software Testing, Combinatorial Testing, Efficiency, Multi-variate Testing, Pairwise Software Testing, Pairwise Testing, Scripted Software Testing, Software Testing, Software Testing Efficiency

jobgraph

 

This chart shows that the number of job listings posted for testers has remained essentially flat for testers over the last six years. During this same period, the number of job listings for developers has increased almost 50%. If your company's hiring has mirrored this chart's trends, and your software testers are increasingly feeling that there's not enough time in the day to test everything that's being thrown at them, your testers might have a point.

By: Justin Hunter on Feb 8, 2011

Categories: Software Testing

I can be too verbose with some of my posts. This will be quick.

I recommend that you read this. It is the Exploratory Testing Dynamics document, a tightly condensed list of useful testing heuristics authored by three of the most thoughtful and experienced software testers alive today: James Bach, Michael Bolton, and Jon Bach. Using it will help you improve your software testing capabilities. http://www.satisfice.com/blog/wp-content/uploads/2009/10/et-dynamics22.pdf

Told you it'd be brief. Go. Now. Read.

By: Justin Hunter on Jul 13, 2010

Categories: Exploratory Testing, Software Testing

A friend passed me this set of recent tweets from Wil Shipley, a Mac developer with 11,743 followers on Twitter as of today. Wil recently encountered the familiar problem of what to do when you've got more software tests to run than you can realistically execute.

 

20100623-nixnbwu9urxaufu6hjt2143j3h

I love that. Who can't relate?

Now if only there were a good, quick way to reduce the number of tests from over a billion to a smaller, much more manageable set of tests that were "Altoid-like" in their curious strength. :) I rarely use this blog for shameless plugs of our test case generating tool, but I can't help myself here. The opening is just too inviting. So here goes:

 

"Wil,

There's an app for that... See www.hexawise.com for Hexawise, a "pairwise software test case generating tool on steroids." It eats problems like the one you encountered for breakfast. Hexawise winnows bazillions of possible test cases down in the blink of an eye to small, manageable sets of test cases that are carefully constructed to maximize coverage in the smallest amount of tests, with flexibility to adjust the solutions based upon the execution time you have available. In addition to generating pairwise testing solutions, Hexawise also generates more thorough applied statistics-based "combinatorial software testing" solutions that include tests for, say, all possible 6-way combinations of test inputs.

Where your Mac cops an attitude and tells you "Bitch, I ain't even allocating 1 billion integers to hold your results" and showers you with taunting derisive sneers, head-waggling and snaps all carefully choreographed to let you know where you stand, Hexawise, in contrast, would helpfully tell you: "Only 1 billion total possibilities to select tests from? Pfft! Child's play. Want to start testing the 100 or so most powerful tests? Want to execute an extremely thorough set of 10,000 tests? Want to select a thoroughness setting in the middle? Your wish is my command, sir. You tell me approximately how many tests you want to run and the test inputs you want to include, and I'll calculate the most powerful set of tests you can execute (based on proven applied statistics-based Design of Experiments methods) before you can say "I'm Wil Shipley and I like my TED Conference swag."

More info at:
http://hexawise.tv/intro/
or
https://hexawise.com/Hexawise_Introduction.pdf
free trials at:
http://hexawise.com/signup

By: Justin Hunter on Jun 23, 2010

Categories: Combinatorial Software Testing, Combinatorial Testing, Interesting People , Pairwise Software Testing, Pairwise Testing, Recommended Tool, Software Testing

There are good reasons James Bach is so well known among the testing community and constantly invited to give keynote presentations around the globe at software testing conferences. He's passionate about testing and educating testers; he's a gifted, energetic, and entertaining speaker with a great sense of humor; and he takes joy in rattling his saber and attacking well-established institutions and schools of thought that he disagrees with. He doesn't take kindly to people who make inflated claims of benefits that would materialize "if only you'd perform testing in XYZ way or with ABC tool" given that (a) he can always seem to find exceptions to such claims, (b) he doesn't shy away from confrontation, and (c) he (rightly, in my view) thinks that such benefits statements tend to discount the importance of critical thinking skills being used by testers and other important context-specific considerations.

Leave it up to James to create a list of 13 questions that would be great to ask the next software testing tool vendor who shows up to pitch his problem-solving product. In his blog post titled "The Essence of Heuristics," he posed this exact set of questions in a slightly different context, but as a software testing tool vendor myself, they really hit home. They are:

 

  1. Do they teach you how to tell if it’s working?
  2. Do they teach you how to tell if it’s going wrong?
  3. Do they teach you heuristics for stopping?
  4. Do they teach you heuristics for knowing when to apply it?
  5. Do they compare it to alternative heuristics?
  6. Do they show you why it works?
  7. Do they help you understand when it probably works best?
  8. Do they help you know how to re-design it, if needed?
  9. Do they let you own it?
  10. Do they ask you to practice it?
  11. Do they tell stories about how it has failed?
  12. Do they listen to you when you question or challenge it?
  13. Do they praise you for questioning and challenging it?

 

[Side note: Apparently I wasn't the only one who thought of Hexawise and pairwise / combinatorial test design approaches when they saw these 13 questions. I was amused that after I drafted this post, I saw Jared Quinert's / @xflibble's tweet just now:]

20100601-br4ud66pcc7f79q1ywgbat74jw

Where do I come down on each of James' 13 questions with respect to people I talk to about our test design tool, Hexawise, and the types of benefits and the size of benefits it typically delivers? Quite simply, "Yes" to all 13. I enjoy talking about exactly the kinds of questions that James raised in his list. In fact, when I sought out James to ask him questions at a conference in Boston earlier this year, it was because I wanted his perspective on many of the points above, particularly #11: (hearing stories about how James has seen pairwise and combinatorial approaches to test design fail), and #7 (hearing his views on where it works best and where it would be difficult to apply it). I'll save my specific answers to another post, but I am serious about wanting to share my thoughts on them; time constraints are holding me back today. I gave a speech at the ASQ World Conference on Quality Improvement in St. Louis last week though that addressed many, but not all, of James' questions.

I'm not your typical software tool vendor. Basically, my natural instincts are all wrong for sales. I agree with the premise that "a fool with a tool is still a fool"; when talking to target clients and/or potential partners, I'm inclined to point out deficiencies, limitations, and various things that could go wrong; I'm more of an introvert than an extrovert, etc. Not exactly the typical characteristics of a successful salesman... Having said that, I believe that we've built a very good tool that helps enable dramatic efficiency and thoroughness benefits in many testing situations but our tool, along with the pairwise and combinatorial test design approaches that Hexawise enables both have their limitations. It is primarily by talking to software testers about their positive and negative experiences that our company is able to improve our tool, enhance our training, and provide honest, pragmatic guidance to users about where and how to use our tool (and where and how not to).

Tool vendors who defend their tools (and/or the approaches by which their tools helps users solve problems) as magical, silver bullet solutions are being both foolish and dishonest. Tool vendors who choose not to engage in serious, honest and open discussions with users about the challenges that users have when applying their tools in different situations are being short-sighted. From my own experiences, I can say that talking about the 13 topics raised by James have been invaluable.

By: Justin Hunter on Jun 1, 2010

Categories: Combinatorial Testing, Design of Experiments, Hexawise test case generating tool, Pairwise Testing, Software Testing, Software Testing Efficiency, Uncategorized

Luis Fernández, an Associate professor at Universidad de Alcala is conducting a survey of software testers to gather data relating to, e.g., "Why isn't software testing conducted as efficiently and effectively as it should be?" and "What factors lead to software testing being 'under-appreciated' as a potential career path?"

His survey (as of March, 2010) is listed [here].(http://www.cc.uah.es/encuestas/index.php?sid=28392&lang=en)

Personally, I agree that the following two issues (identified in his survey) are significant causes of inefficiency in software testing:

1) "People tend to execute testing in an uncontrolled manner until the total expenditure of resources in the belief that if we test a lot, in the end, we will cover or control all the system." (Or, at least, given the relatively undisciplined test case selection methods prevalent in the industry, my experience in analyzing manually selected test scenarios is that testers generally believe (a) they are covering a higher proportion of an application's possible combinations than they actually are and (b) they underestimate the amount of time that is spent during test execution unproductively repeating steps that they have previously tested)

2) "Many managers did not receive appropriate training on software testing so they do not appreciate its interest or potential for efficiency and quality."

It is unfortunate, but true, that many testing managers do not have any background whatsoever in combinatorial testing methods that (a) dramatically reduce the amount of time it takes to select and document test cases, and (b) will simultaneously improve test execution efficiency when applied correctly. See, for example, this

See also: This slideshow on efficient and effective test design

Please consider taking Fernández's short survey. It takes only 5-10 minutes to complete.

By: Justin Hunter on Mar 1, 2010

Categories: Combinatorial Testing, Efficiency, Software Testing, Software Testing Efficiency

20100127-ht4mjknjnmwce46fp7m2jst7q

 

All the quotes below are from the inside cover of Statistics for Experimenters written by George Box, Stuart Hunter, and William G. Hunter (my late father). The Design of Experiments methods expressed in the book (namely, the science of finding out as much information as possible in as few experiments as possible), were the inspiration behind our software test case generating tool. In paging through the book again today, I found it striking (but not surprising) how many of these quotes are directly relevant to efficient and effective software testing (and efficient and effective test case design strategies in particular):

  • "Discovering the unexpected is more important than confirming the known." - George Box

  • "All models are wrong; some models are useful." - George Box

  • "Don't fall in love with a model."

  • How, with a minimum of effort, can you discover what does what to what? Which factors do what to which responses?

  • "Anyone who has never made a mistake has never tried anything new." - Albert Einstein

  • "Seek computer programs that allow you to do the thinking."

  • "A computer should make both calculations and graphs. Both sorts of output should be studied; each will contribute to understanding." - F. J. Anscombe

  • "The best time to plan an experiment is after you've done it." - R. A. Fisher

  • "Sometimes the only thing you can do with a poorly designed experiment is to try to find out what it died of." - R. A. Fisher

  • The experimenter who believes that only one factor at a time should be varied, is amply provided for by using a factorial experiment.

  • Only in exceptional circumstances do you need or should you attempt to answer all the questions with one experiment.

  • "The business of life is to endeavor to find out what you don't know from what you do; that's what I called 'guessing what was on the other side of the hill.'" - Duke of Wellington

  • "To find out what happens when you change something, it is necessary to change it."

  • "An engineer who does not know experimental design is not an engineer." - Comment made by to one of the authors by an executive of the Toyota Motor Company

  • "Among those factors to be considered there will usually be the vital few and the trivial many." - J. M. Juran

  • "The most exciting phrase to hear in science, the one that heralds discoveries, is not 'Eureka!' but 'Now that's funny...'" - Isaac Asimov

  • "Not everything that can be counted counts and not everything that counts can be counted." - Albert Einstein

  • "You can see a lot by just looking." - Yogi Berra

  • "Few things are less common than common sense."

  • "Criteria must be reconsidered at every stage of an investigation."

  • "With sequential assembly, designs can be built up so that the complexity of the design matches that of the problem."

  • "A factorial design makes every observation do double (multiple) duty." - Jack Couden

Where the quotes are not attributed, I'm assuming the quote is from one of the authors. The most well known of the quotes not attributed, above, "All models are wrong; some models are useful." is widely attributed to George Box in particular, which is accurate. Although I forgot to confirm that suspicion with him when I saw him over Christmas break, I suspect most of them are from George (as opposed to from Stu or my dad); George is 90 now and still off-the-charts smart, funny, and is probably the best story teller I've met in my life. If he were younger and on Twitter, he'd be one of those guys who churned out highly retweetable chestnuts again and again. [Update - George Box died in 2013]

 

Related thoughts

As you know if you've read my blog before, I am a strong proponent of using the Design of Experiments principles laid out in this book and applying them in field of software testing to improve the efficiency and effectiveness of software test case design (e.g., by using pairwise software testing, orthogonal array software testing, and/or combinatorial software testing techniques). In fact, I decided to create my company's test case generating tool, called Hexawise, after using Design of Experiments-based test design methods during my time at Accenture in a couple dozen projects and measuring dramatic improvements in tester productivity (as well as dramatic reductions in the amount of time it took to identify and document test cases). We saw these improvements in every single pilot project when we used these methods to identify tests.

My goal, in continuing to improve our Hexawise test case generating tool, is to help make the efficiency-enhancing Design of Experiments methods embodied in the book, accessible to "regular" software testers, and more more broadly adopted throughout the software testing field. Some days, it feels like a shame that the approaches from the Design of Experiments field (extremely well-known and broadly used in manufacturing industries across the globe, in research and development labs of all kinds, in product development projects in chemicals, pharmaceuticals, and a wide variety of other fields), have not made much of an inroad into software testing. The irony is, it is hard to think of a field in which it is easier, quicker, or immediately obvious to prove that dramatic benefits result from adopting Design of Experiments methods than software testing. All it takes is for a testing team to decide to do a simple proof of concept pilot. It could be for as little as a half-day's testing activity for one tester. Create a set of pairwise tests with Hexawise or another t00l like James Bach's AllPairs tool. Have one tester execute the tests suggested by the test case generating tool. Have the other tester(s) test the same application in parallel. Measure four things:

  1. How long did it take to create the pairwise / DoE-based test cases?

  2. How many defects were found per hour by the tester(s) who executed the "business as usual" test cases?

  3. How many defects were found per hour by the tester who executed the pairwise / DoE-based tests?

  4. How many defects were identified overall by each plan's tests?

These four simple measurements will typically demonstrate dramatic improvements in:

  • Speed of test case identification and documentation

  • Efficiency in defects found per hour

As well as consistent improvements to:

  • Overall thoroughness of testing.

 

A Suggestion: Experiment / Learn / Get the Data / Let the Efficiency and Effectiveness Findings Guide You

I would be thrilled if this blog post gave you the motivation to explore this testing approach and measure the results. Whether you've used similar-sounding techniques before or never heard of DoE-based software testing methods before, whether you're a software testing newbie or a grizzled veteran, I suspect the experience of running a structured proof of concept pilot (and seeing the dramatic benefits I'm confident you'll see) could be a watershed moment in your testing career. Try it! If you're interested in conducting a pilot, I'd be happy to help get you started and if you'd be willing to share the results of your pilot publicly, I'd be able to provide ongoing advice and test plan review. Send me an email or leave a comment.

To the grizzled and skeptical veterans, (and yes, Mr, Shrini Kulkarni / @shrinik who tweeted "@Hexawise With all due respect. I can't credit any technique the superpower of 2X defect finding capability. sumthng else must be goingon" before you actually conducted a proof of concept using Design of Experiments-based testing methods and analyzed your findings, I'm lookin' at you), I would (re)quote Sophocles: "One must try by doing the thing; for though you think you know it, you have no certainty until you try." For newer testers, eager to expand your testing knowledge (and perhaps gain an enormous amount of credibility by taking the initiative, while you're at it), I'd (re)quote Cole Porter: "Experiment and you'll see!"

I'd welcome your comments and questions. If you're feeling, "Sounds too good to be true, but heck, I can secure a tester for half a day to run some of these DoE-based / pairwise tests and gather some data to see whether or not it leads to a step-change improvement in efficiency and effectiveness of our testing" and you're wondering how you'd get started, I'd be happy to help you out and do so at no cost to you. All I'd ask is that you share your findings with the world (e.g., in your blog or let me use your data as the firms did with their findings in the "Combinatorial Software Testing" article below).

 

Related:

By: Justin Hunter on Jan 27, 2010

Categories: Combinatorial Testing, Design of Experiments, Hexawise test case generating tool, Multi-variate Testing, Software Testing

I responded to a recent blog post written by Gareth Bowles today and was struck - again - that a defect that must have been seen >10 million times by now has still not been corrected. When anyone responds to a blog post on Blogger.com, the stat counter says "1 comments" instead of correctly stating "1 comment." What's up with that?

 

Lands End-Wikipediathefreeencyclope

 

The clothing company Lands' End (with the apostrophe erroneously after the s instead of before it) has a bizarre but somewhat logical explanation for why they have printed their grammatical-mistake-laden brand name on millions of pieces of clothing. According to one version of the story I have heard, they printed their first brochures with the typo and couldn't afford to get it changed. I also remember reading a more detailed explanation in a catalog in the late 80's to the effect that by the time the company management realized their mistake and tried to get trademark protection on "Land's End" they discovered that another firm already had trademarked rights to that name. Quick internet searches can't verify that so perhaps my memory is just playing tricks on me. But I digress. Here's the defect I wanted to highlight with this post:

 

OptimalOperations ExchangingAnswers

 

For Blogger.com to leave the extra "s" in has me stumped for several reasons. First, this defect has been seen by a ton of people as Blogger.com is, according to Alexa's site tracking, the world's 7th most popular site. Second, Blogger.com is owned by Google (among the most competent, quality-oriented IT wizards on the planet) and no trademark protection is preventing the correction. Third, it would seem to be such an easy thing to fix. Fourth, other sites (like wordpress) don't make the same mistake. Fifth, it doesn't seem like a "style preference" issue (like spelling traveled with one "L" or two); it seems to me like a pretty clear case of a mistake. It would be a mistake to say "one cars," "one computers," or "one pedantic grammarians"; similarly, it is a mistake to say "one comments". What gives? Anyone have any ideas?

For anyone wondering where the ">10 million times" figure came from, it is pure conjecture on my part. If anyone has a reasoned way to refute or confirm it or propose a better estimate, I'm all ears.

By: Justin Hunter on Dec 9, 2009

Categories: Inexplicable Mysteries, Software Testing

Dave Whalen posted a good piece here asserting that software testing, done well, requires a blend of "Science" and "Art". I recommend it. (He also has a good post about testing databases here).

 

20091124-n96r4rh2t5qufjm3mq8eytwu51

 

He includes the statement below which I agree with. If you are a software tester and any doubts about whether all of these methods work (pairwise software testing in particular), I would encourage you to conduct a pilot project on your own and measure the results achieved with and without the technique applied.

 

From the scientific side, testing can include a number of proven techniques such as equivalency class testing, boundary value analysis, pair-wise testing, etc. These techniques, if used properly, can reduce test times and focus on finding the bugs where they tend to hang out - much like a porch light on a summer night.

 

My response to Dave's post, included below, is not especially profound or even well-written, but, hey, I'm in a hurry in the pre-Thanksgiving rush and the topic hit close to home so I couldn't resist jotting a little something. Enjoy. Please let me know your thoughts / reactions if you have any.

 

Dave,

Very well said!

I wholeheartedly, enthusiastically agree with your premise. I also wish that more people saw things the same way.

My father co-wrote Statistics for Experimenters which describes the “art and science” within the Design of Experiments ("DoE") field of applied statistics. Well-run manufacturing companies use DoE techniques in their manufacturing processes. Many companies, such as Toyota see them as an absolutely fundamental part of their processes (yet unfortunately, software testers, who could use DoE techniques such as pairwise and other forms of combinatorial testing, are often ignorant about how to use them properly and the software testing industry as a whole dramatically under-utilizes such techniques…. but I digress).

I brought the book up because it opens up with a good example relevant to the points you made. To win at the game of 20 questions, it is useful to know “the science” of game theory and DoE; choose questions so that there is a 50/50 chance that the answer will be Yes. Someone who knows this technique, all else being equal, will be win more because of their “scientific” approach than someone who doesn’t know the technique. And yet… other stuff (whether the subject matter expertise in this example, or subject matter expertise and “artistic” Exploratory Testing in your example) is indispensable as well.

You can’t truly excel at either 20 Questions or software testing unless you have a good mix of “science” (governed by mathematical principles, proven methods of DoE, etc.) and and “art” (governed by experience, instincts, and subject matter expertise).

By: Justin Hunter on Nov 24, 2009

Categories: Combinatorial Testing, Efficiency, Pairwise Testing, Software Testing, Uncategorized

On October 6th, I informally launched testing.stackexchange.com as "the stackoverflow.com for Software Testing" without much hoopla. So far, less than a month later, with no advertising other than word of mouth, the initial results are very promising. We've had approximately:

  • 70 new users join as members and contributors
  • 50 software testing questions
  • 160 answers to those questions
  • 2,200 views of the questions and answers

The most important development is not reflected in the numbers above. More important, by far, than the number of the participants have joined is the quality of people who are contributing. Members of the forum include some prominent experts including: Jason Huggins (creator of Selenium and cofounder of Sauce Labs, Alan Page and Bj Rollison (of "How we Test Software at Microsoft" fame), Michael Bolton (the testing expert, not the singer), Fred Beringer, Elisabeth Hendrickson, Joe Strazzere, Adam Goucher, Simon Morley, Rob Lambert, Scott Sehlhorst, etc. etc.). Given the high quality people the site has attracted, the quality of the answers delivered has been quite high. Perhaps the quality is also above average because people answering know that their answers will be analyzed by thoughtful testers and voted up (or down) based on how good they are. In short, testers are asking good questions and getting them answered which is why I created the site in the first place. I'm cautiously optimistic about the future: if the site

Members so far include:

 

1Users-TestingStackExchange

 

The most viewed questions so far include:

 

1TestingStackExchange-2

 

The most recent questions being asked and answered are:

 

1TestingStackExchangerecentqs-1

 

I'd like to extend special thanks to Alan Page (who likes the idea so much that has volunteered to join me as a co-manager/Moderator of the site), to Shmuel Gershon, Jason from NC, and Joe Strazzere for being particularly active and to Alan Page, Corey Goldberg, Shmel Gershon, and Konstantin for helping to get the word out about the form through their blog posts telling the world about testing.stackexchange.com. Without their combined help, we'd be nowhere. With their help and support, we're building a place where software testers can seek and receive high-quality, peer-reviewed answers to their testing questions.

Please help us succeed by spreading the word, asking a few questions, answering a few, and voting on the best answers.

Thanks everyone!

By: Justin Hunter on Nov 2, 2009

Categories: Interesting People , Software Testing

There are some phrases in English that, as often as not, come off sounding obligatory and/or insincere. The phrase "I'm honored..." comes to mind (particularly if someone is accepting an award in front of a room full of people).

Be that as it may, I genuinely felt really honored last night and again today by a couple comments James Bach has said about me, including these:

 

TwitterHexawiseresults-Oct232009

 

Here's the quick background: (1) James knows much more about software testing than I do and I respect his views a lot. (2) He has a reputation for not suffering fools gladly and pretty bluntly telling people he doesn't respect them if he doesn't respect the content of their views. (3) in addition to his extremely broad expertise on "testing in general" James, like Michael Bolton, knows a lot about pairwise and combinatorial testing methods and how to use them. (4) I firmly (and passionately) believe that pairwise and combinatorial testing methods are (a) dramatically under-appreciated, and (b) dramatically under-utilized. (5) James has published a very good and well-reasoned article about some of the limitations of pairwise testing methods that I wanted to talk to him about. (6) I co-wrote an article that IEEE Computer recently published about Combinatorial Testing that I wanted to discuss with him. (7) James and I have been at the STP Conference in Boston over the past few days. (8) I reached out to him and asked to meet at the conference to talk about pairwise and combinatorial testing methods and share with him my findings that - in the dozens of projects I've been involved with that have compared testers efficiency and effectiveness - I've routinely seen defects found per tester hour more than double. (9) I was interested in getting his insights into where are these methods most applicable? Least applicable? What have his experiences been in teaching combinatorial testing methods to students, etc.

In short, frankly, my goals in meeting with him were to: (a) meet someone new, interesting and knowledgeable and learn as much as could and try to understand from his experiences, his impressive critical thinking and his questioning nature, and (b) avoid tripping up with sloppy reasoning (when unapologetically expressing the reasons I feel combinatorial testing methods are dramatically under-appreciated by the software testing community) in front of someone who (i) can smell BS a mile away, and (ii) doesn't suffer fools gladly.

I learned a lot, heard some fantastic war stories and heard his excellent counter-examples that disproved a couple of the generalizations I was making (but didn't dampen my unshaken assertions that combinatorial testing methods are wildly under-utilized by the software testing community). I thoroughly enjoyed the experience. Moving forward, as a result of our meeting, I will go through an exercise which will make me more effective (namely carefully thinking through and enumerating all of the assumptions behind my statements like: "I've measured the effectiveness of testers dozens of times - trying to control external variables as much as reasonably possible - and I'm consistently seeing more than twice as many defects per tester hour when testers adopt pairwise/combinatorial testing methods."

His complement last night was private so I won't share it but it ranks up there in my all time favorite complements I've ever received. I'm honored. Thanks James.

By: Justin Hunter on Oct 23, 2009

Categories: Combinatorial Testing, Design of Experiments, Efficiency, Interesting People , Pairwise Testing, Software Testing, Software Testing Efficiency, Testing Case Studies, Uncategorized

matt-heusser

 

Matthew Heusser, an accomplished tester, frequent blogger, and insightful contributor in the Context Driven Testing mailing list, and a testing expert whose opinion I respect a lot, has just published a very thought-provoking blog post that highlights an important issue surrounding "PowerPointy" consultants in the testing industry who have relatively weak real world testing chops. It's called "[The Fishing Maturity Model](http://blogs.stpcollaborative.com/matt/2009/10/08/the-fishing-maturity-model/."

Matthew argues that testers are well-advised to be skeptical of self-described testing experts who claim to "have the answer" - particularly when such "experts" haven't actually rolled their sleeves up and done software testing themselves. In reading his article, I found it quite thought-provoking, particularly because it hit close to home: while I'm by no means a testing expert in the broader sense of the term, I do consider myself to know enough about combinatorial test design strategies applicable to software testing to be able to help most testing teams become demonstrably more efficient and effective... and yet, my actual hands-on testing experiences are admittedly quite limited. If I'm not one of the guys he's (justifiably) skewering with his funny and well-reasoned post (and he assures me I'm not; see below), a tester could certainly be forgiven for mistaking me for one based on my past experiences.

Matthew's Five Levels of the Fishing Maturity Model (based, not so loosely, of course on the Testing Maturity Model, not to mention CMM, and CMMi)...

 

The five levels of the fishing maturity model: 1 – Ad-hoc. Fishing is an improvised process. 2 – Planned. The location and timing of our ships is planned. With a knowledge of how we did for the past two weeks, knowing we will go to the same places, we can predict our shrimp intake. 3 – Managed. If we can take the shrimp fishing process and create standard processes – how fast to drive the boat, and how deep to let out the nets, how quickly, etc, we can improve our estimates over time, more importantly. 4 – Measured. We track our results over time – to know exactly how many pounds of shrimp are delivered at what time with what processes. 5 – Optimizing. At level 5, we experiment with different techniques; to see what gathers more shrimp and what does not. This leads us to continual improvement.

 

Sounds good, right? Why, with a little work, this would make a decent 1-hour conference presentation. We could write a little book, create a certification, start running conferences …

 

And the rub...

The problem: I’ve never fished with nets in my entire life. In fact, the last time I fished with a pole, I was ten years old at Webelo’s camp.

I posted the following response, based on my personal experiences: Words in [brackets] are Matthew's response to me.

Matthew,

Excellent post, as usual. [I'm glad you like it. Thank you.]

You raise very good points. Testers (and other IT executives) should be leery of snake oil salesmen and use their judgment about “experts” who lack practical hands-on experience. While I completely agree with this point, I offer up my own experiences as a “counter-example” to the problem you pointed out here.

3-4 years ago, while I was working at a management consulting and IT company, (with a personal background as an entrepreneur, lawyer, and management consultant – and not in software testing), I began to recommend to any software testers who would listen, that they start using a different approach to how they designed their test cases. Specifically, I was recommending that testers should begin using applied statisitics-based methods* designed to maximize coverage in a small number of tests rather than continuing to manually select test cases and rely on SME’s to identify combinations of values that should be tested together. You could say, I was recommending that they adopt what I consider to be (in many contexts) a “more mature” test design process.

The reaction I got from many teams was, as you say “this whole thing smells fishy to me” (or some more polite version of the rebuttal “Why in the world should I, with my years of experience in software testing, listen to you – a non-software tester?”) Here’s the thing: when teams did use the applied statistics-based testing methods I recommended, they consistently saw large time reductions in how long it took them to identify and document tests (often 30-40%) and they often saw huge leaps in productivity (e.g., often finding more than twice as many defects per tester hour). In each proof of concept pilot, we measured these carefully by having two separate teams – one using “business as usual” methods, the other using pairwise or orthogonal array-based test design strategies – test the same application. Those dramatic results led to my decision to create [Hexawise](http://www.hexawise.com/users/new, a software test design tool. [Point Taken ...]

My closing thoughts related to your post boils down to:

  1. I agree with your comment – “There are a lot of bogus ideas in software development.”

  2. I agree that testers shouldn’t accept fancy PowerPointed ideas like “this new, improved method/model/tool will solve all your problems.”

  3. I agree that testers should be especially skeptical when the person presenting those PowerPointed slides hasn’t rolled up their sleeves for years as a software testing practitioner.

Even so…

  1. Some consultants who lack software testing experience actually are capable of making valuable recommendations to software testers about how they can improve their efficiency and effectiveness. It would be a mistake to write them off as charlatans because of their lack of software testing experience. [I agree with the sentiment that sometimes, people out of the field can provide insight. I even hinted at that with the comment that at least, Forrest should listen, then use his discernment on what to use. I'm not entirely ready to, as the expression goes, throw the baby out with the bathwater.]

  2. Following the “bogus ideas” link above takes readers to your quote that: “When someone tells you that your organization has to do something ‘to stay competitive,’ but he or she can’t provide any direct link to sales, revenue, reduced expenses, or some other kind of money, be leery.” I enthusiastically agree. In the software testing community, in my view, we do not focus enough on gathering real data** about which approaches work (or -ideally- in what contexts they work). A more data-driven management approach would help everyone understand what methods and approaches deliver real, tangible benefits in a wide variety of contexts vs. those methods and approaches that look good on paper but fall short in real-world implementations. [Hey man, you can back up your statements with evidence, and you're not afraid to roll up your sleeves and enter an argument. I may not always agree with you, but you're exactly the kind of person I want to surround myself with, to keep each other sharp. Thank you for the thoughtful and well reasoned comment.]

-Justin

 

Company – http://www.hexawise.com
Blog – http://hexawise.wordpress.com
Forum – http://testing.stackexchange.com

 

*I use the term “applied statistics-based testing” to incorporate pairwise, orthogonal array-based, and more comprehensive combinatorial test design methods such as n-wise testing (that can capture, for example, all possible valid combinations involving 6-values).

**Here is an article I co-wrote which provides some solid data that applied statistics-based testing methods can more than double the number of defects found per tester hour (and simultaneously result in finding more defects) as compared to testing that relies on "business as usual" methods during the test case identification phase.

By: Justin Hunter on Oct 12, 2009

Categories: CMM, Combinatorial Testing, Pairwise Testing, Software Testing, Software Testing Efficiency, Testing Maturity Model

Today I've released a beta version of testing.stackexchange.com which is a "stackoverflow.com for software testers." I would appreciate your help in contributing content, and/or getting the word out. Stackoverflow has become an extraordinarily useful forum for software developers to ask difficult, practical questions, and get quick, actionable, peer-reviewed responses from software developers around the globe. While there are some software testing questions on stackoverflow itself, the questions are mostly software developer-centric. There's no reason why we can't create a very similar forum geared primarily towards the software testing community. So who's with me? Please show your support by posting a question, sharing an answer or voting on existing answers at [testing.stackexchange.com](http://testing.stackexchange.com/

If you share my belief in the significant potential benefit to the software testing community that would result from a mature, well-trafficked site with a rich collection of peer-reviewed questions on software testing and you would be interested in helping out beyond posting periodic questions and/or answers to the site, please post a reply here or contact me through Linkedin. I'd love to brainstorm ideas and work with like-minded people to get this forum created for the software testing community. As of now, the odds are against testing.stackexchange from growing to obtain the critical mass it needs (particularly since I'm busy day-to-day building my software testing tool company); a small number of active collaborators would improve the odds dramatically.

I first found out about stackoverflow.com through my brother's blog here

 

Joel Spolsky's video is fantastic. He set out to crack the code on:

  • How can you get a useful exchange of information between experts that results in very good questions and answers being actively shared by participants?

  • How can the community encourage visitors to the site to actively participate and share their expertise?

  • How can the site generate a critical mass and utilize Google to drive traffic to the site to make it self-sustaining?

  • How can users (who might not otherwise be able to tell which are the best answers from among multiple answers) tell which answers are in fact the best?

 

In my view, he has succeeded on all of the above counts, which is truly impressive. We're using the identical strategies (and Spolsky's technology) at testing.stackexchange.com. The way Spolsky lays out his vision is impressive. He logically progresses through a graveyard of multiple Q & A sites that have devolved into largely useless forums where inane questions are asked and dubious answers are shared. He then shares how he and his collaborators adjusted the model for Stackoverflow to maximize the value to participants. Their self-described strategy amounts to taking the best ideas they could from multiple different sites and putting them together in stackoverflow (and "using Google as our landing page" as a way to build traffic).

Thank you in advance for helping to get the word out.

 

By: Justin Hunter on Oct 6, 2009

Categories: Software Testing

I enjoyed talking about efficient and effective combination testing strategies (and highlights of a recent empirical study) at yesterday's TISQA meeting together with Lester Bostic of Blue Cross Blue Shield North Carolina, who shared his team's experiences of adopting a combinatorial testing approach. It addresses how tools like Hexawise can help software testers quickly identify the test cases they should execute to find as many defects as possible with as few tests as possible. I wanted to share it now; once I have more time, I will comment on it and highlight some of the good questions, comments, discussion points, and tester experiences that were raised by the attendees.

The presentation focused on combinatorial testing techniques, such as pairwise testing, orthogonal array-based testing methods, and more thorough combination testing strategies (capable of identifying all defects that could be captured by, say, any possible combination of three or four "things" that you've decided to test for (regardless of whether those "things" include features configurations or equivalence class of data or type of user a mix of each).

The middle of the presentation also highlights empirical evidence that shows this method of identifying test cases often has an enormous impact on how quickly software testers are able to identify defects; citing the IEEE Computer article I co-wrote last month on Combinatorial Testing, this approach - on average - led to more than twice as many defects found per tester hour.

The final section of the presentation was delivered by Lester Bostic of Blue Cross Blue Shield and addresses his lessons learned. Lester used Hexawise to reduce 1,356,136,048,589,996,428,428,909,447,344,392,489,476,985,674,792,960 possible tests (that would have been necessary to achieve comprehensive testing of the application he was testing) to only 220 tests that proved to be extremely effective at identifying defects. .

Comments and questions are welcome.

 

By: Justin Hunter on Sep 18, 2009

Categories: Combinatorial Testing, Pairwise Testing, Software Testing, Software Testing Efficiency, Testing Case Studies

lessons from car manufacturing-20090826-1718522

 

Tony Baer from Ovum recently wrote a blog post titled: Software Development is like Manufacturing which included the following quotes.

"More recently, debate has emerged over yet another refinement of agile – Lean Development, which borrows many of the total quality improvement and continuous waste reduction principles of [lean manufacturing](http://www.lean.org/WhatsLean/. Lean is dedicated to elimination of waste, but not at all costs (like Six Sigma. Instead, it is about continuous improvement in quality, which will lead to waste reduction....

In essence, developing software is like making a durable good like a car, appliance, military transport, machine tool, or consumer electronics product.... you are building complex products that are expected to have a long service life, and which may require updates or repairs."

Here are my views: I see valid points on both sides of the debate. Rather than weigh general high-level pro's and cons, though, I would like to zero in on what I see as an important topic that is all-too-often missing from the debate. Specifically, Design of Experiments has been central to Six Sigma, Lean Manufacturing, the Toyota Production System, and Deming's quality improvement approaches, and is equally applicable to software development and testing, yet adoption of Design of Experiments methods in software design and testing remains low. This is unfortunate because significant benefits consistently result in both software development and software testing when Design of Experiments methods are properly implemented.

What are Design of Experiments Methods and Why are they Relevant?

In short, Design of Experiments methods are a proven approach to creating and managing experiments that alter variables intelligently between each test run in a structured way that allows the experimenter to learn as much as possible in as few experiments as possible. From wikipedia: “Design of experiments, or experimental design, (DoE) is the design of all information-gathering exercises where variation is present, whether under the full control of the experimenter or not. Often the experimenter is interested in the effect of some process or intervention (the “treatment”) on some objects (the “experimental units”).”

Design of Experiments methods are an important aspect of Lean Manufacturing, Six Sigma, the Toyota Production System, and other manufacturing-related quality improvement approaches/philosophies. Not only have Design of Experiments methods been very important to all of the above in manufacturing settings, they are also directly relevant to software development. By way of example, W. Edwards Deming, who was extremely influential in quality initiatives in manufacturing in Japan and the U.S. was an applied statistician. He and thousands of other highly respected quality executives in manufacturing, including Box, Juran and Taguchi (and even my dad), have regularly used Design of Experiments methods as a fundamental anchor of quality improvement and QA initiatives and yet relatively few people who write about software development seem to be aware of the existence of Design of Experiments methods.

What Benefits are Delivered in Software Development by Design of Experiments-based Tools?

Application Optimization applications, like Google’s Website Optimizer are a good example of Design of Experiments methods can deliver powerful benefits in the software development process. It allows users to easily vary multiple aspects of web pages (images, descriptions, colors, fonts, colors, logos, etc.) and capture the results of user actions to identify which combinations work the best. A recent YouTube multi-variate experiment (e.g., and experiment created using Design of Experiment methods) shows how they used the simple tool and increased sign-up rates by 15.7%. The experiment involved 1,024 variations.

What Benefits are Delivered in Software Testing by Design of Experiments-based Tools

In addition, software test design tools, like the Hexawise test design tool my company created, enable dramatically more efficient software testing by automatically varying different elements of use cases that are tested in order to achieve an optimal coverage. Users input the things in the application they want to test, push a button and, as in the Google Web Optimizer example, the tool uses DoE algorithms to identify how the tests should be run to maximize efficiency and thoroughness. A recent IEEE Computer article I contributed to, titled "Combinatorial Testing" shows, on average, over the course of 10 separate real-world projects, tester productivity (measured in defects found per tester hour) more than doubled, as compared to the control groups which continued to use their standard manual methods of test case selection: http://tinyurl.com/nhzgaf

Unfortunately, Design of Experiments methods – one of the most powerful methods in Lean Manufacturing, Six Sigma, and the Toyota Production System – are not yet widely adopted in the software development industry. This is unfortunate for two reasons, namely:

  1. Design of Experiments methods will consistently deliver measurable benefits when implemented correctly, and

  2. Sophisticated new tools designed with very straightforward user interfaces make it easier than ever for software developers and testers to begin using these helpful methods.

By: Justin Hunter on Aug 25, 2009

Categories: Agile, Design of Experiments, Efficiency, Lean, Multi-variate Testing, Software Testing, Software Testing Efficiency

hipp1

 

Jeff Fry recently linked to a fantastic webcast in Controlled Experiments To Test For Bugs In Our Mental Models. I would highly recommend it to anyone without any reservations. Ron Kohavi, of Microsoft Research does a superb job of using interesting real-world examples to explain the benefits of conducting small experiments with web site content and the advantages of making data-driven decisions. The link to the 22-minute video is here.

I firmly believe that the power of applied statistics-based experiments to improve products is dramatically under-appreciated by businesses (and, for that matter, business schools), as well as the software development and software testing communities. Google, Toyota, and Amazon.com come to mind as notable exceptions to this generalization; they "get it". Most firms though still operate, to their detriment, with their heads in the sand and place too much reliance on untested guesswork, even for fundamentally important decisions that would be relatively easy to double-check, refine, and optimize through small applied statistics-based experiments that Kohavi advocates. Few people who understand how to properly conduct such experiments are as articulate and concise as Kohavi. Admittedly, I could be accused of being biased as: (a) I am the son of a prominent applied statistician who passionately promoted broader adoption of such methods by industry and (b) I am the founder of a software testing tools company that uses applied statistics-based methods and algorithms to make our tool work.

Here is a short summary of Kohavi's presentation:

 

Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO

1:00 Amazon: in 2000, Greg Linden wanted to add recommendations in shopping carts during the check out process. The "HiPPO" (meaning the Highest Paid Person's Opinion) was against it thinking that such recommendations would confuse and/or distract people. Amazon, a company with a good culture of experimentation, decided to run a small experiment anyway, "just to get the data" – It was wildly successful and is in widespread use today at Amazon and other firms.

3:00 Dr. Footcare example: Including a coupon code above the total price to be paid had a dramatic impact on abandonment rates.

4:00 "Was this answer useful?" Dramatic differences in user response rates occur when Y/N is replaced with 5 Stars and whether an empty text box is initially shown with either (or whether it is triggered only after a user clicks to give their initial response)

6:00 Sewing machines: experimenting with a sales promotion strategy led to extremely counter-intuitive pricing choice

7:00 "We are really, really bad at understanding what is going to work with customers…"

7:30 "DATA TRUMPS INTUITION" {especially on novel ideas}. Get valuable data through quick, cheap experimentation. "The less the data, the stronger the opinions."

8:00 Overall Evaluation Criteria: "OEC" What will you measure? What are you trying to optimize? (Optimizing for the “customer lifetime value”)

9:00 Analyzing data / looking under the hood is often useful to get meaningful answers as to what really happened and why

10:30 A/B tests are good; more sophisticated multi-variate testing methods are often better

12:00 Some problems: Agreeing upon Overall Evaluation Criteria is hard culturally. People will rarely agree. If there are 10 changes per page, you will need to break things down into smaller experiments.

14:00 Many people are afraid of multiple experiments [e.g., multi-variate experiments or MVE] much more than they should be.

(A/B testing can be as simple as changing a single variable and comparing what happens when it is changed, e.g., A = "web page background = Blue" / B = "web page background = Orange." Multi-variate experiments involve changing multiple variables in each test run which means that people running the tests should be able to efficiently and effectively change the variables in order to ensure not only that each of the variables is tested but also that the each of the variables is tested in conjunction with each of the others because they might interact with one another). My views on this: before software tools made conducting multi-variate experiments (and understanding the results of the experiments) a piece of cake, this fear had some merit; you would need to be able to understand books like this to be able to competently run and analyze such experiments. Today however, many tools, such as Google's Website Optimizer (used for making web sites better at achieving their click through goals, etc.) and Hexawise (used to find defects with fewer test cases) build the complex Design of Experiments-based optimization algorithms into the tool's computation engine and provide the user of the tool with a simple user interface and user experience. In short, in 2009, you don't need a PhD in applied statistics to conduct powerful multi-variate experiments. Everyone can quickly learn how to, and almost all companies should, use these methods to improve the effectiveness of applications, products and/or production methods. Similarly, everyone can quickly learn how to, and almost all companies should, use these methods to dramatically improve the effectiveness of their software testing processes.>

16:00 People do a very bad job at understanding natural variation and are often too quick to jump to conclusions.

17:00 eBay does A/B testing and makes the control group ~1%. Ron Kohavi, the presenter, suggests starting small then quickly ramping up to 50/50 (e.g., 50% of viewers will see version A, 50% will see version B).

19:00 Beware of launching experiments than "do not hurt," there are feature maintenance costs.

20:00 Drive to a data-driven culture. "It makes a huge difference. People who have worked in a data-driven culture really, really love it… At Amazon… we built an optimization system that replaced all the debates that used to happen on Fridays about what gets on the home page with something that is automated."

21:00 Microsoft will be releasing its controlled experiments on the web platform at some point in the future, but probably not in the next year.

21:00 Summary

  1. Listen to your customers because our intuition at assessing new ideas is poor.

  2. Don't let the HiPPO drive decisions; they are likely to be wrong. Instead, let the customer data drive decisions.

  3. Experiment often create a trustworthy system to accelerate innovation.

 

Related: Statistics for Experimenters - Articles on design of experiments

By: Justin Hunter on Aug 18, 2009

Categories: Design of Experiments, Multi-variate Testing, Software Testing