I recently gave a presentation to the Qualitative Research Consultants Association on the use of eye tracking equipment in usability testing. Using a couple of demonstrations I’ve discussed in an earlier newsletter, I tried to show that people have to be careful not to jump to conclusions about the data they get from eye tracking equipment. In the discussion that followed, a broader theme emerged: we, as consultants, have to be careful with qualitative data (which almost all usability data is) and careful about assuming cause-and-effect relationships. This is particularly so when we use biometric measurements, such as eye tracking data, galvanic skin response, heart rate, or blink rate. The temptation to see cause and effect in this type of data is very strong.
As humans, we have a tendency to see cause-and-effect relationships everywhere. This is one of the basic functions of our unconscious mental system. Data that is truly random will be seen as random, but give us data that has even a hint of a pattern and we’ll find it and give it meaning. We just can’t help ourselves. Our brains have evolved over millions of years to do this, so we’re really good at it. We see a duck or a boat in a set of clouds, or the face of Jesus or the Virgin Mary in a slice of toast. (There is even a part of the brain that specializes in recognizing faces, so seeing faces where there are none is especially common.)
Consider the legend that storks bring babies. One story goes that this legend began in London, England, during either the 16th or 17th century. Poor people in London did not have much money, so they tended not to spend it heating their homes unless they had a new baby in the house. Storks liked to nest on houses that had warm roofs. Therefore, houses that had babies also tended to have storks, and people concluded that the storks must have been involved with the babies. This is known as a spurious correlation error: two things really do occur together, and we assume that one causes the other, when in fact both are tied to a third factor (here, a heated house).
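The stork story can be sketched as a small simulation. This is only an illustration under made-up assumptions: the probabilities below are invented for the example and are not historical data, and the function names (`simulate_houses`, `baby_stork_correlation`) are hypothetical. In the sketch, a baby makes a heated house more likely, and a heated house makes a stork more likely; storks never depend on babies directly.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_houses(n=10000, seed=0):
    """Return (baby, heated, stork) tuples of 0/1 flags for n hypothetical houses."""
    rng = random.Random(seed)
    houses = []
    for _ in range(n):
        # All probabilities are invented for illustration only.
        baby = int(rng.random() < 0.30)
        # Families with a baby are far more likely to heat the house.
        heated = int(rng.random() < (0.90 if baby else 0.10))
        # Storks respond only to the warm roof, never to the baby itself.
        stork = int(rng.random() < (0.60 if heated else 0.05))
        houses.append((baby, heated, stork))
    return houses


def baby_stork_correlation(houses):
    """Correlation between the baby flag and the stork flag."""
    babies = [baby for baby, _, _ in houses]
    storks = [stork for _, _, stork in houses]
    return pearson(babies, storks)


if __name__ == "__main__":
    houses = simulate_houses()
    print("all houses:", round(baby_stork_correlation(houses), 2))
    # Compare like with like: look only at heated, then only at unheated houses.
    for flag, label in ((1, "heated"), (0, "unheated")):
        subset = [h for h in houses if h[1] == flag]
        print(label, "only:", round(baby_stork_correlation(subset), 2))
```

Under these assumptions, babies and storks should be clearly correlated across all houses, yet the correlation should fall to roughly zero once we look only at heated houses or only at unheated ones. The pattern in the full data is real, but it tells us nothing about storks delivering babies.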
Today, if you look up the legend, you’ll see that there are a lot of cause-and-effect explanations (rationalizations) for why storks would be associated with babies. The question is, which came first? Are storks really a universal symbol for babies? Or did the English people invent this legend first, and then historians later found other patterns in the data that could support a cause-and-effect conclusion (e.g., storks are associated with love in Roman culture; storks are associated with good luck in Germany and the Netherlands; chimneys are a good way to deliver babies into houses)?
Consider the following example from a recent project. We were asked to perform an analysis of server log data for a large federal client. We did the analysis and presented the analyzed data. Our clients looked at it and could tell us with certainty the cause of the pattern they saw in the analyzed data. (“Oh yes,” they said. “That makes sense that data X went up in August because we did Y in July.”) But there was a problem. Through a programming glitch on their end, we were given only one-third of all of the data to analyze. Once this problem was discovered, we were provided with a full set of data and we reran our analysis. This second analysis produced an entirely different pattern.
Once again it was presented to our client and once again they were able to tell us with equal certainty why the data looked the way that it did this time. (“Oh yes,” they said. “That makes sense that data X went up in June because we did Y in May.”) What they failed to notice was that they were just as certain about the cause-and-effect relationship when looking at one-third of the data. The takeaway from this is that they shouldn’t be certain of their cause-and-effect assumptions when looking at either set of data, but that was a hard thing for them to understand. The cause-and-effect engine is too powerful.
When collecting data in a usability evaluation, these false cause-and-effect assumptions occur all too frequently. We (and more importantly, our clients) look for, see, and convince ourselves that we know the cause-and-effect relationship that generated the data we collect or the behavior we observe. We need to recognize that these “findings” are automatic and unconscious thought processes that we cannot take at face value. We need to corroborate these automatic assumptions somehow. We need to consider alternative, maybe mundane, hypotheses for this data pattern and make sure they’re not just as plausible. We need to use solid psychological constructs and past experience to support or dispute our assumptions. After all, anyone can collect data. It is our responsibility as professionals to properly interpret it.