
Open-response survey questions are a great way to gather exploratory data.  They are often most useful during preliminary research when researchers don’t know what options to give respondents for a forced-choice question.  Open-response questions allow collection of rich data, i.e. detailed data that may contain nuances that are harder to obtain through more restricted question types.  But just like performers on stage for Open Mic, open-response survey questions require preparation so the survey doesn’t bomb. While open-response questions are probably the easiest to write, they can be challenging and time-consuming to analyze. Therefore, researchers should carefully weigh the costs and benefits of such questions as well as allocate sufficient resources for analysis prior to using them in their surveys.

Open-response data and analysis

Open-response data analysis will use many of the same techniques employed in previous question analyses. However, before open-response questions can be analyzed, responses must be coded.  Because respondents will use different words to explain the same things, the data must be organized in a way that allows similar answers to be grouped together.

The steps for analyzing single-themed, open-response questions are as follows:

  1. Define a coding scheme – this coding scheme can be based on results from previously conducted surveys, trial and error, academic theories, or themes in other research. The goal should be to break down responses in a way that will be valuable to your organization. It’s up to the researcher to decide how granular the scheme needs to be for the results to be meaningful. However, schemes should be as clear and descriptive as possible so that another coder, using the same information, can code the data in a similar way.

For example, the scheme may look something like this for pizza toppings:

Pepper (includes any type of pepper except cayenne: peppers, green peppers, red peppers)

Pineapple (includes all citrus toppings and fruit: citrus, pineapple, orange, fruit)

Since the schemes are just categorical group names, it’s not all that important what they are called. The important factor is how items are grouped within the categories.  Researchers should expect their schemes to evolve during the coding process.  There is no way to know a priori the scope of the answers respondents will use.
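One way to make a scheme like the pizza example explicit is to store it as a lookup table. The following is a minimal sketch, not a prescribed method: the names `CODING_SCHEME` and `code_response` are hypothetical, and exact matching on lowercased text is a simplifying assumption (it is narrower than "any type of pepper"). Real responses usually need human judgment.

```python
# Hypothetical coding scheme: code name -> terms that map to it.
# The codes and terms mirror the pizza-topping example above.
CODING_SCHEME = {
    "Pepper": {"peppers", "green peppers", "red peppers"},
    "Pineapple": {"citrus", "pineapple", "orange", "fruit"},
}


def code_response(response, scheme=CODING_SCHEME):
    """Return the code whose term set contains the response, else 'Uncoded'."""
    cleaned = response.strip().lower()
    for code, terms in scheme.items():
        if cleaned in terms:
            return code
    # Anything unmatched is flagged for manual review; a pile of these
    # may signal that the scheme needs a new or broader code.
    return "Uncoded"


print(code_response("Green Peppers"))
print(code_response("orange"))
print(code_response("cayenne"))
```

Tracking unmatched answers separately fits the point that schemes evolve during coding: the "Uncoded" pile shows where the scheme may need to grow.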

  2. Code data by defined themes – this process involves categorizing the data. One way to track codes is to organize them in a spreadsheet. Each respondent’s answer can be a row within the spreadsheet (see below).
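As a rough illustration of the one-row-per-respondent layout, the sketch below writes a small coding sheet as CSV text that any spreadsheet program can open. The respondent IDs, answers, column names, and the `build_coding_sheet` helper are all made up for illustration.

```python
import csv
import io

# Hypothetical raw answers: (respondent ID, answer text).
responses = [
    (1, "green peppers"),
    (2, "pineapple"),
    (3, "red peppers"),
]

# Codes assigned by a human coder following the scheme (entered by hand here).
assigned_codes = {1: "Pepper", 2: "Pineapple", 3: "Pepper"}


def build_coding_sheet(responses, assigned_codes):
    """Return CSV text with one row per respondent: ID, raw answer, code."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["respondent_id", "response", "code"])
    for respondent_id, text in responses:
        # An empty code cell marks an answer that has not been coded yet.
        writer.writerow([respondent_id, text, assigned_codes.get(respondent_id, "")])
    return buffer.getvalue()


sheet = build_coding_sheet(responses, assigned_codes)
print(sheet)
```

Keeping the raw answer next to its code makes it easier for a second coder to work from the same sheet later.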

  3. Multiple coders – after the first round of coding by the primary coder, a second coder should code the same data using the defined coding scheme. This will allow the researcher to evaluate how clear the themes are and how consistently the coders were able to follow the scheme. A second coder will also help the researcher understand whether the results will be replicable.
  4. Inter-rater agreement – after both coders have completed their data coding, an inter-rater agreement test can be performed. This statistic is called Cohen’s kappa and measures the agreement between two raters beyond what would be expected by chance. The higher the score, the better the agreement (McHugh, 2012). If the two coders have not reached at least a “moderate” level of agreement, they should discuss and adjust the coding scheme, and then recode the data until they reach a high level of agreement.
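Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e), where p_o is the share of responses both coders coded identically and p_e is the chance agreement implied by how often each coder used each code. The sketch below computes it in plain Python, assuming both coders coded the same responses in the same order. The function name and sample codes are illustrative only.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who coded the same responses in the same order."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of responses.")
    n = len(codes_a)
    # Observed agreement: share of responses given the same code by both coders.
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: for each code, the product of the two coders' usage rates.
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up codes for six responses.
coder_1 = ["Pepper", "Pepper", "Pineapple", "Pineapple", "Pepper", "Pineapple"]
coder_2 = ["Pepper", "Pepper", "Pineapple", "Pepper", "Pepper", "Pineapple"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

In this made-up data the coders agree on five of six responses (p_o ≈ 0.833) and chance agreement is 0.5, so kappa is about 0.667. How to label that value depends on the benchmark you adopt, such as the one in McHugh (2012).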


  5. Reconciling differences – some data will be coded differently by the two coders. A third coder is employed to settle these disagreements by deciding which code fits the data better according to the scheme description. All differences in coding need to be reconciled before moving to the analysis step.
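The reconciliation step can be tracked with a small helper that lists the disagreements and then builds the final codes from the third coder's decisions. This is only a sketch under simple assumptions: both code lists are in the same response order, and the function names and data are hypothetical.

```python
def find_disagreements(codes_a, codes_b):
    """Return the positions of responses the two coders coded differently."""
    return [i for i, (a, b) in enumerate(zip(codes_a, codes_b)) if a != b]


def reconcile(codes_a, codes_b, third_coder_decisions):
    """Keep agreed codes; otherwise use the third coder's decision for that position."""
    final = []
    for i, (a, b) in enumerate(zip(codes_a, codes_b)):
        if a == b:
            final.append(a)
        elif i in third_coder_decisions:
            final.append(third_coder_decisions[i])
        else:
            # Refuse to continue until every difference has been settled.
            raise ValueError(f"Response {i} is still unreconciled.")
    return final


# Made-up codes for three responses; the coders differ on the second one.
coder_1 = ["Pepper", "Pineapple", "Pepper"]
coder_2 = ["Pepper", "Pepper", "Pepper"]
disagreements = find_disagreements(coder_1, coder_2)
final_codes = reconcile(coder_1, coder_2, {1: "Pineapple"})
print(disagreements)
print(final_codes)
```

Raising an error on unsettled responses enforces the rule that all differences are reconciled before analysis begins.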


  6. Analyze data as above – once you have completed steps 1-5 with good inter-rater agreement, you can move on to the analysis. You can now analyze these data as you would analyze other categorical data by looking at the frequencies of each code and then performing the χ² Goodness of Fit test (see previous posts for descriptions of these analyses).
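For the final step, the sketch below tallies code frequencies and computes the χ² Goodness of Fit statistic, the sum over codes of (observed - expected)² / expected. It assumes every code is equally likely unless expected proportions are supplied. The function name, the extra "Mushroom" code, and the counts are invented for illustration.

```python
from collections import Counter


def chi_square_goodness_of_fit(codes, expected_proportions=None):
    """Return (chi-square statistic, degrees of freedom) for observed code counts.

    If expected_proportions is None, every observed code is assumed equally likely.
    """
    observed = Counter(codes)
    if expected_proportions is None:
        expected_proportions = {c: 1 / len(observed) for c in observed}
    unexpected = set(observed) - set(expected_proportions)
    if unexpected:
        raise ValueError(f"No expected proportion for codes: {sorted(unexpected)}")
    n = len(codes)
    statistic = 0.0
    for code, proportion in expected_proportions.items():
        expected_count = n * proportion
        statistic += (observed[code] - expected_count) ** 2 / expected_count
    return statistic, len(expected_proportions) - 1


# Made-up reconciled codes for 24 respondents.
final_codes = ["Pepper"] * 12 + ["Pineapple"] * 6 + ["Mushroom"] * 6
print(Counter(final_codes).most_common())
statistic, degrees_of_freedom = chi_square_goodness_of_fit(final_codes)
print(round(statistic, 3), degrees_of_freedom)
```

With 24 answers and three equally likely codes, each expected count is 8, so the statistic is (16 + 4 + 4) / 8 = 3 with 2 degrees of freedom. To get a p-value, compare the statistic against a chi-square distribution with those degrees of freedom. If SciPy is available, `scipy.stats.chisquare` does both steps.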

Unquestionably, open-response survey questions are a great way to obtain exploratory data. As demonstrated, these types of questions can take a great deal of time and other resources to analyze, but the benefits are often worth the cost. It might be practical to use smaller samples during an exploratory research phase when using open-ended questions and then use forced-choice questions (with an optional “other” response possibility) and a larger sample size to confirm findings across a more diverse sample.




Scott Hutchins

Technical Co-founder
