Surveys are expensive and time-consuming to develop, run, and analyze. After a survey has closed, little can be done about a confusing question or about data that were never collected in the form the analysis needs. The best time to consider how to analyze a survey is during development.
During the development of the survey, consider what sorts of analyses will be run and what comparisons will be made. This will help guide decisions about what questions to ask, how to ask them, and how to measure responses. For example, if the goal is to report average salaries but only salary ranges are collected, a simple mean cannot be calculated. Best practice is to pilot the survey with a small subset of participants and run a practice analysis on the pilot data to make sure everything will run smoothly when the actual survey launches.
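To make the salary example concrete, here is a minimal sketch of what remains possible when only ranges are collected. The ranges and respondent counts are hypothetical, and the midpoint method shown is only one common approximation: it assumes responses sit at the center of each range, which the survey data cannot confirm.

```python
# Hypothetical range choices (low, high) in dollars and the number of
# respondents who picked each one. No individual salaries are available.
range_counts = {
    (20000, 40000): 3,
    (40000, 60000): 5,
    (60000, 80000): 2,
}

total = sum(range_counts.values())

# Approximate the mean by treating every respondent as if they earned
# the midpoint of their range. This is an estimate, not the true mean.
estimated_mean = sum(
    (low + high) / 2 * count for (low, high), count in range_counts.items()
) / total

print(estimated_mean)  # 48000.0
```

An open-ended choice such as "80,000 or more" has no midpoint at all, which is one more reason to decide on the analysis before fixing the answer options.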
Analyses are limited by the types of data you collect. The four typical classifications of data are Nominal/Categorical, Ordinal, Interval, and Ratio. As a general rule, there is more analysis flexibility with Ratio data than with Nominal data. It is nearly always possible to transform Ratio data into Nominal data, but much more challenging, if not impossible, to convert Nominal data into Ratio data. That said, surveys are often laden with questions that collect only Nominal data, so these surveys are limited to the analyses that are appropriate for Nominal data. The data types, and the kinds of survey questions usually associated with each, are reviewed below.
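The one-way nature of that transformation can be sketched in a few lines. The salaries, bin edges, and labels below are hypothetical; the point is only that numbers collapse into categories easily, while the categories alone cannot be turned back into the original numbers.

```python
from bisect import bisect_right

# Hypothetical salary responses (Ratio data), in dollars.
salaries = [32000, 48500, 51000, 75000, 120000]

# Hypothetical bin edges and labels; choose ones that suit your survey.
edges = [40000, 80000]
labels = ["under 40k", "40k to under 80k", "80k and over"]


def to_range(salary):
    """Map a numeric salary to the label of the range it falls in."""
    return labels[bisect_right(edges, salary)]


ranges = [to_range(s) for s in salaries]
print(ranges)
# ['under 40k', '40k to under 80k', '40k to under 80k',
#  '40k to under 80k', '80k and over']

# The exact mean is available from the original numbers...
mean_salary = sum(salaries) / len(salaries)
print(mean_salary)  # 65300.0
# ...but it cannot be recovered from `ranges` alone.
```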
As the name suggests, Nominal (Categorical) data are categories of information. Nominal data have no inherent order; examples include pizza toppings (pepperoni, sausage, cheese, mushrooms, pineapple), gender (male, female), and color (red, blue, green). Single-answer multiple-choice questions are often associated with Nominal data.
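Because Nominal categories have no order, the natural summaries are counts, proportions, and the most common answer. A minimal sketch, using made-up responses to a pizza-topping question:

```python
from collections import Counter

# Hypothetical answers to a single-answer, multiple-choice question.
toppings = ["pepperoni", "cheese", "pepperoni", "mushrooms", "cheese", "pepperoni"]

counts = Counter(toppings)
total = len(toppings)

# The mode (most frequent category) is a valid summary for Nominal data.
mode, mode_count = counts.most_common(1)[0]

# Share of respondents choosing each topping.
shares = {topping: n / total for topping, n in counts.items()}

print(mode, mode_count)  # pepperoni 3
print(shares["cheese"])  # 0.3333333333333333
```

A mean or median of these answers would have no meaning, since no topping comes "before" another.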
Ordinal data do have some sort of inherent order, but do not always have a number associated with them. Another important element of Ordinal data is that, even though the categories have an inherent order, the differences between them may not be equal. For example, levels of education (elementary, middle school, high school, college), medals earned (bronze, silver, gold), and levels of satisfaction (not at all satisfied, satisfied, very satisfied) are Ordinal data. Each set can be listed in order; however, there is no implied consistency in the difference between categories. The only assumption that can be made is that one category comes before another. With medals, for example, you can say only that gold is better than silver. You cannot say how much better: the gap between gold and silver may be quite large while the gap between silver and bronze is quite small. Ordinal data can be confusing because, even when numbers are attached (1st, 2nd, 3rd place), the data must still be treated essentially as categorical in most statistical analyses. And sometimes scales that look like Interval or Ratio data are really just Ordinal data.
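One summary that relies only on order, not on the size of the gaps, is the median category. The sketch below uses hypothetical satisfaction responses; the numeric ranks are introduced purely for sorting and should not be read as measurements.

```python
from statistics import median_low

# Categories listed from lowest to highest; only the order is meaningful.
levels = ["not at all satisfied", "satisfied", "very satisfied"]
rank = {level: position for position, level in enumerate(levels)}

# Hypothetical survey responses.
responses = [
    "satisfied",
    "very satisfied",
    "not at all satisfied",
    "satisfied",
    "very satisfied",
]

# median_low always returns a value that is actually present in the data,
# so the result maps back to a real category even for an even count.
ranks = [rank[r] for r in responses]
median_response = levels[median_low(ranks)]

print(median_response)  # satisfied
```

Averaging the ranks instead would quietly assume the steps between categories are equal, which is exactly what Ordinal data do not guarantee.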
Interval data are similar to Ordinal data because they have an inherent order. However, they differ from Ordinal data in that the intervals between points are equidistant. Like Ratio data, they are often numerical and continuous. Unlike Ratio data, Interval data do not have a true zero; however, they can be treated like Ratio data in many ways. Addition and subtraction are meaningful on Interval data, although ratios between values are not (20° is not twice as warm as 10°). Means and standard deviations can therefore be calculated, and statistical tests such as ANOVAs, t-tests, and correlations can be performed. An example of Interval data is temperature in degrees Celsius or Fahrenheit. Zero degrees is not the absence of temperature. Still, the difference between 5° and 6° is the same as the difference between 6° and 7°.
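The temperature example can be checked numerically. In this small sketch (the readings are made up), equal steps in Celsius remain equal steps in Fahrenheit and the mean converts cleanly, but the ratio between two readings changes with the scale because neither scale has a true zero.

```python
from statistics import mean


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


celsius = [10.0, 20.0, 30.0]
fahrenheit = [c_to_f(c) for c in celsius]  # [50.0, 68.0, 86.0]

# Equal intervals stay equal on both scales.
steps_c = (celsius[1] - celsius[0], celsius[2] - celsius[1])              # (10.0, 10.0)
steps_f = (fahrenheit[1] - fahrenheit[0], fahrenheit[2] - fahrenheit[1])  # (18.0, 18.0)

# Means are consistent across scales.
mean_c_in_f = c_to_f(mean(celsius))  # 68.0
mean_f = mean(fahrenheit)            # 68.0

# Ratios are not: "twice as hot" depends on the arbitrary zero point.
ratio_c = celsius[1] / celsius[0]        # 2.0
ratio_f = fahrenheit[1] / fahrenheit[0]  # 1.36
```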
The final type of data to discuss is Ratio data. Ratio data are likely the easiest and most familiar to analyze. Ratio data, such as salaries, costs, or times, are continuous, numerical data that have a true zero. As with Interval data, many types of analyses can be performed on Ratio data. In surveys, Ratio data are often more difficult to collect than other types of data because salaries or costs require participants to enter their answer in a free-response text box. This means that participants may enter their answers in a variety of formats that require cleaning and reformatting after collection.
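As an illustration of that cleaning step, here is a minimal sketch that normalizes a few plausible salary entries. The function name `parse_salary` and the formats it accepts are assumptions made for this example; real survey exports will contain formats it does not cover, and those are returned as `None` so they can be reviewed by hand rather than guessed at.

```python
import re


def parse_salary(text):
    """Return a salary in dollars as a float, or None if the entry is unclear."""
    # Drop currency symbols, thousands separators, and surrounding spaces.
    cleaned = text.lower().replace("$", "").replace(",", "").strip()
    # Accept a plain number, optionally followed by "k" for thousands.
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(k?)", cleaned)
    if match is None:
        return None
    value = float(match.group(1))
    if match.group(2) == "k":
        value *= 1000
    return value


# Hypothetical free-response entries for the same salary question.
raw = ["$52,000", "52000", "52k", " 52,000.00 ", "about fifty"]
parsed = [parse_salary(entry) for entry in raw]

print(parsed)  # [52000.0, 52000.0, 52000.0, 52000.0, None]
```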
Open-response questions can collect any of the types of data previously described. Open responses can be long paragraphs or short, quick answers. Instead of asking a multiple-choice question about color choice, a free-response text box can be used. When choosing between multiple-choice and free-response questions, survey developers should think about the amount of cleaning, coding, and formatting that will be required during analysis. While open-response questions are a terrific way to collect data when first exploring a topic, recoding thousands of responses can be overwhelming. Even open-response Ratio data may require reformatting and cleaning, depending on the entry restrictions the survey platform allows.
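For short answers, recoding often amounts to mapping each response onto a codebook. The sketch below uses a tiny, hypothetical codebook for the color question; anything it does not recognize is flagged as "uncoded" for manual review instead of being forced into a category.

```python
# Hypothetical codebook mapping free-text answers to analysis categories.
codebook = {
    "red": "red",
    "crimson": "red",
    "blue": "blue",
    "navy": "blue",
    "green": "green",
}


def code_response(text):
    """Normalize a free-text answer and look it up in the codebook."""
    key = text.strip().lower()
    return codebook.get(key, "uncoded")


# Hypothetical responses typed into a free-response box.
responses = ["Red", " navy", "GREEN", "teal"]
coded = [code_response(r) for r in responses]

print(coded)  # ['red', 'blue', 'green', 'uncoded']
```

Long, paragraph-style answers rarely fit a lookup like this and usually need human coding, which is where the effort described above comes from.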
Understanding the types of data that can be collected via surveys is the first step to writing great questions that support the analyses that will be meaningful to each organization. In the coming weeks, we’ll discuss how to analyze these types of data in light of the question types used to collect them.