Tuesday, August 9, 2011

Online Behavior Isn't Always Enough to Answer "Why?"

Jian Shuo Wang, who lives in Shanghai, can "recite 85 digits of pi after the decimal", and works in the technology industry, has a popular blog that I first noticed shortly before moving to Shanghai years ago.  He comments on a mix of topics, ranging from daily life in Shanghai to his experiences working in the technology domain.  In a recent post he mentions that Facebook was initially surprised that its photos application was such a success, and they desired to know why.  In reference to the value of "knowing why" he writes [note: English is not his native language and I have kept his writings "as is"]:
Soul searching means the deep trace of the reason why something worked. It is easy to be happy about a great feature, and a successful campaign, but it is way to easy to just stop tracing the deeper reason of the product. Just like the photos application of Facebook. It is a simple application without most of the features other photo sites have, but it is soon becoming the most successful photo application on the Internet. What is the driver for that? Why is that? Why, Why and Why?

... If something happens, and it is a good one, don't let it go. Push ourselves to do a deep soul searching and understand the deeper reason behind it.
I certainly agree that when a design is unexpectedly successful there is often much to be gained from understanding why and that it may not be wise to simply bask in glory.  A deeper understanding could lead to even more successful designs and/or guard against later problems.

However, I don't entirely agree with his comments on where the answers to such questions can be found [emphasis mine]:
Thank God we are in Internet space, and we have all the data needed to understand the reason. Just like Facebook can dig into the data and understand every photo change leads to 25 new page views, there must be some link between the reason and the result.
It is the section in bold that particularly concerns me.  I assume he is referencing the data that can be collected by an online service's server logs (in short, server logs can potentially provide details of many, but not all, of people's actions on a web site such as pages visited, buttons clicked, text submitted, etc).  I think it's worth addressing this claim in a blog post as I've heard people I've worked with make similar comments, and I've found it can be invaluable to provide a fuller picture of how research can best approach seeking answers to questions about people's behavior.

I'll focus on two reasons in particular for why the data found in server logs aren't always sufficient for answering why something online is or isn't a success.  I could write chapters on each and delve into some deep issues, but for now I'll keep things relatively brief and make use of some simple analogies.

1.  Observing what people do doesn't always tell you why they do it.

Imagine if you spent a week only watching people as they purchased ice cream at an ice cream store offering 10 different flavors.  At the end of the week you may be able to say with confidence that chocolate ice cream was the most popular choice during that period of time.  However, you probably couldn't say why.  Maybe people naturally preferred its taste.  Maybe there were effective TV advertisements for chocolate ice cream.  Maybe news of chocolate's health benefits had a large impact.  You could continue observing ice cream purchases in the store for a year and still not get much closer to an answer for why chocolate is purchased the most.

Similar issues can hold true for understanding online behavior.  In fact, only looking at server logs might not just mean you can't answer any deep "why's" but that you can't even be sure whether something is truly a success.  For example, server logs may indicate that people were far more likely to click the "correct" link A than the "incorrect" link B.  But the logs won't tell you if people were clueless about the purpose of link A and only clicked it because they knew link B was not what they wanted and saw no other option (yes, I've seen this happen).  In some cases, that might be good enough to be considered a "success", but in many others it won't be.
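
To make the link example concrete, here is a minimal sketch of the kind of tally a server log supports.  The records and field names are made up for illustration and don't come from any real log format:

```python
from collections import Counter

# Hypothetical click records, as they might look after parsing a server log.
# The field names ("user", "link") are assumptions for illustration only.
click_log = [
    {"user": "u1", "link": "A"},
    {"user": "u2", "link": "A"},
    {"user": "u3", "link": "B"},
    {"user": "u4", "link": "A"},
]

def click_share(records):
    """Return each link's share of all recorded clicks."""
    counts = Counter(r["link"] for r in records)
    total = sum(counts.values())
    return {link: n / total for link, n in counts.items()}

shares = click_share(click_log)
print(shares)  # {'A': 0.75, 'B': 0.25}
```

The tally shows that link A drew most of the clicks, but nothing in these records says whether a person clicked A because they understood it or only because B was clearly wrong.  Answering that would need data from outside the log.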

2.  People's online behavior isn't only driven by online factors.

The online world is just a part of people's lives.  While the online world can be wonderful and vast, the offline world still matters (really!) and plays a key role in determining how people behave online (the reverse can also be true).  Without any data regarding the offline world, one could be left in the dark about key issues impacting behavior online.  Continuing with the ice cream example, if you only observed behavior inside the store you would not likely discover potentially key information such as whether people who preferred chocolate were more likely to have seen TV advertisements for chocolate.

For the online world, consider social networking.  Imagine you're concerned that certain types of experiences people have aren't being shared with others online and you want to know why.  It could be that those experiences are in fact only being shared when people are face-to-face offline.  This insight may be key to explaining the online behavior and to innovating a new online feature/service, yet it wouldn't likely be discovered by only viewing data from a server log.

Sometimes, it's not only important to understand what is occurring in the offline world before and/or after an online experience, but during as well.  The following webcomic from xkcd helps illustrate this point:

[Webcomic: xkcd, "YouTube Parties"]

If you place your mouse pointer over the image, you may be able to see this additional commentary:
This reminds me of that video where ... no?  How have you not seen that?  Oh man, let me find it.  No, it's ok, we can go back to your video later.
There are several key issues directly or indirectly implied in the scene above such as multiple people watching the video together, different levels of interest in the video, different ideas about what should be watched next, etc.  Yet again, these issues in the offline world could be critical not only for understanding the success of various online features and how they're being used but also for providing inspiration for potential innovations (as an aside, I think there are some intriguing designs that could be based on this single webcomic).  However, if YouTube only analyzed its server logs, it might never discover the degree to which these or similar issues are occurring.

To be clear, I'm not saying that server logs aren't valuable.  Knowing how people actually behave is important and there is much that can be learned about it through proper analysis of data from server logs (even if the data don't show the full story).  With the caveats mentioned above, server logs can play a particularly valuable role in measuring the success of various features/services -- for example, which of two or more designs leads to more page views, more time spent on the site, more purchases, etc.

However, no matter how many "why's" you ask, there are limitations to what can be uncovered solely by analyzing data in server logs.  For some questions, other sources of data will be required -- whether from observations of people in their natural environment, experiments under controlled conditions measuring any of a variety of factors (such as eye movements, body language, and spoken comments), in-depth interviews, etc.  Knowing which method(s) could best answer a particular question is one of the main challenges in conducting meaningful research.  Sometimes, research methods themselves need a bit of innovation.

Someday I may discuss how factors such as business goals, the resources required to conduct certain types of research, and the current state of knowledge about human behavior/cognition (there is much we don't know) can add pragmatic constraints to which "why's" can or should be tackled.  Whether in the corporate world or the academic world, often you need to address why it's worth trying to answer "why".
