Step 5: Organise and analyse the data
These notes go into some detail on survey data analysis. They should be read alongside the general notes on working with qualitative and quantitative data in the School Excellence Framework evidence guide.
Preparing the data for analysis
Responses from hard copy questionnaires need to be entered into a spreadsheet prior to analysis. Manual data entry can be made easier by setting up the survey as an online form and then entering the paper responses through that form (see note in Step 2: Choose a survey format for commonly-used platforms).
Online survey data are already in database form and can be extracted into a spreadsheet or analysed within the survey software itself.
Once the responses are all entered, extract the summary data for each question, showing:
- the raw number of responses in each category
- summary statistics for numeric data (the mean, for example)
- a list of verbatim responses to the open-ended questions.
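As a minimal sketch of this summary step, assuming the responses have been exported as one record per respondent, the three outputs can be produced with Python's standard library. The question names (`q1_rating`, `q2_hours`, `q3_comment`) and the data are hypothetical and only illustrate the idea.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses: one dict per respondent, keyed by question.
responses = [
    {"q1_rating": "Good", "q2_hours": 3, "q3_comment": "More excursions please"},
    {"q1_rating": "Excellent", "q2_hours": 5, "q3_comment": ""},
    {"q1_rating": "Good", "q2_hours": 4, "q3_comment": "Homework club was helpful"},
]

# Raw number of responses in each category.
rating_counts = Counter(r["q1_rating"] for r in responses)

# Summary statistic for a numeric question.
mean_hours = mean(r["q2_hours"] for r in responses)

# Verbatim open-ended responses, skipping blanks.
comments = [r["q3_comment"] for r in responses if r["q3_comment"].strip()]

print(dict(rating_counts))
print(mean_hours)
print(comments)
```

The same outputs are available from a spreadsheet or from most survey platforms; the sketch simply makes the three kinds of summary explicit.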
Now check this output to see if anything needs correcting, and make adjustments in the source data. This may include:
- removing incomplete responses where the person started the survey but only did the first one or two questions
- removing non-serious responses, if there are any (check the open-enders)
- checking for data entry errors, if the responses have been manually entered into a spreadsheet
- correcting typographical errors in open-ended responses (ensuring that meaning is not changed)
- deleting data where respondents have answered a question that they should not have been asked. This is more common in paper surveys, where flow control relies on people following instructions about which question to go to next.
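Two of these checks can be sketched in code. The example below assumes a hypothetical survey in which `event_rating` should only be answered by people who said "Yes" to `attended`, and treats a response as incomplete if fewer than three questions were answered. Both the cut-off and the field names are assumptions made for illustration.

```python
# Hypothetical rows; None means the question was left blank.
rows = [
    {"id": 1, "attended": "Yes", "event_rating": "Good", "q3": "Agree", "q4": "Agree"},
    {"id": 2, "attended": "No", "event_rating": "Poor", "q3": "Disagree", "q4": "Agree"},
    {"id": 3, "attended": "Yes", "event_rating": None, "q3": None, "q4": None},
]
QUESTIONS = ["attended", "event_rating", "q3", "q4"]
MIN_ANSWERED = 3  # assumed cut-off for an "incomplete" response


def answered(row):
    """Count how many questions this respondent answered."""
    return sum(row[q] is not None for q in QUESTIONS)


cleaned = []
for row in rows:
    # Drop responses where the person only did the first question or two.
    if answered(row) < MIN_ANSWERED:
        continue
    row = dict(row)  # work on a copy so the original entry is kept
    # Delete an answer to a question the respondent should have skipped.
    if row["attended"] != "Yes":
        row["event_rating"] = None
    cleaned.append(row)

print([r["id"] for r in cleaned])
```

Working on a copy is a design choice in this sketch: it keeps the original entries available in case a correction needs to be reviewed later.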
Summarising the results in an annotated questionnaire
Once the data are ‘clean’ and ready to use, paste the summary results into the questionnaire itself. This document can then serve as a resource for analysis, or as an appendix to a report.
Simple descriptive analysis of quantitative data
In a report structure, start by summarising the key quantitative data using tables or charts, accompanied by a narrative. Some questions won’t need a chart or table – they can just be summarised in the text.
Where a chart or table only uses percentages, state the sample size using the notation ‘n=…’ (‘n=123’, for example).
- Where only a subset of the sample answered a question, explain this (for example, ‘n=87 students who attended the event’).
- For questions that had a ‘not sure’ response option, be intentional and explicit about whether these responses are included or excluded from the sample.
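The effect of including or excluding ‘not sure’ responses can be illustrated with a short sketch. The data and the `percentages` helper below are hypothetical; the point is that the percentages and the reported ‘n=’ both change with the choice.

```python
from collections import Counter

# Hypothetical answers to a single question.
answers = ["Yes"] * 45 + ["No"] * 30 + ["Not sure"] * 25


def percentages(values, exclude=()):
    """Return (n, {category: percent}) after dropping excluded categories."""
    kept = [v for v in values if v not in exclude]
    n = len(kept)
    if n == 0:
        return 0, {}
    counts = Counter(kept)
    return n, {category: round(100 * c / n) for category, c in counts.items()}


n_all, pct_all = percentages(answers)
n_sure, pct_sure = percentages(answers, exclude=("Not sure",))

print(f"n={n_all}", pct_all)    # 'Not sure' included in the base
print(f"n={n_sure}", pct_sure)  # 'Not sure' excluded from the base
```

In this made-up data, ‘Yes’ is 45% of all respondents but 60% of those who expressed a view, which is why the base needs to be stated alongside the chart or table.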
For a question that allowed multiple responses:
- expect the percentages to add to more than 100%
- be careful when creating sub-categories, as it is not a simple matter of addition. For example, if 20% of students say they walked to school last week and 15% say they cycled, you cannot simply add these together and say that 35% walked or cycled. This is because it is a multiple response question, and some students may have both walked and cycled on different days, so they would be counted twice.
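The point about combining categories can be checked with a small sketch. It assumes each respondent's answers are stored as a set of selected options; the ten students and their travel modes are made up for illustration.

```python
# Hypothetical multiple response data: each student's set of travel modes.
students = [
    {"walk"}, {"walk", "cycle"}, {"cycle"}, {"bus"}, {"bus"},
    {"car"}, {"car"}, {"bus"}, {"walk", "bus"}, {"car"},
]
n = len(students)


def pct_any(modes):
    """Percentage of students who selected at least one of the given modes."""
    return 100 * sum(bool(s & modes) for s in students) / n


pct_walk = pct_any({"walk"})
pct_cycle = pct_any({"cycle"})
pct_walk_or_cycle = pct_any({"walk", "cycle"})
total_of_percentages = sum(pct_any({m}) for m in ("walk", "cycle", "bus", "car"))

print(pct_walk, pct_cycle, pct_walk_or_cycle, total_of_percentages)
```

Here 30% walked and 20% cycled, but only 40% walked or cycled, because one student did both. The sub-category has to be counted from the individual responses rather than added up from the summary percentages.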
Charts should be self-explanatory and easy to interpret:
- Include the question and response options as they were asked in the questionnaire.
- For questions with lots of response options (or long response options), use a bar chart rather than a column chart.
- For an ordinal scale (‘Excellent’ to ‘Poor’, for example), show the categories in their natural order, as displayed in the questionnaire.
- For a categorical scale (a list of subjects, for example), report the categories in order of frequency (say, most to least popular).
- Use pie charts sparingly, and only when there are a small number of mutually exclusive categories.
Analysis of open-ended responses
Analysis of open-enders should identify the main themes (in your words) and illustrate these with quotes (in the words of respondents). The analysis should also identify ‘outlier’ positions – perspectives that are equally valid and important, but less common.
Working with a large volume of open-ended responses may require ‘coding’: assigning each response to one or more themes. Once open-enders have been coded, the data should behave like a multiple response question, allowing additional analysis to be undertaken (see below). See notes on analysing interview and focus group data (linked below) for more detail.
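As a rough sketch of what coded data can look like, the example below assumes each open-ended response has already been read and assigned one or more codes by a person. The responses and code names are hypothetical.

```python
from collections import defaultdict

# Hypothetical coded responses: (verbatim text, list of codes assigned by a reader).
coded = [
    ("More time for sport would be great", ["sport"]),
    ("Canteen queues are too long and lunch is short", ["canteen", "timetable"]),
    ("Longer lunch break please", ["timetable"]),
]

# Group the verbatim quotes under each theme.
quotes_by_theme = defaultdict(list)
for text, codes in coded:
    for code in codes:
        quotes_by_theme[code].append(text)

# Count how many responses mention each theme (multiple response style).
theme_counts = {theme: len(quotes) for theme, quotes in quotes_by_theme.items()}

print(theme_counts)
```

Because one response can carry several codes, the theme counts can add to more than the number of responses, just as with a multiple response question. Keeping the quotes grouped by theme also makes it easier to pick illustrative quotes for a report.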
Analysing quantitative data more deeply
Once the overall survey data have been summarised, it is natural to want to dig deeper. For example:
- That’s an interesting perspective… I wonder who that’s coming from?
- The people who are critical of X… I wonder, are they the same people who are critical of Y?
The main way to do this is to take one question of interest (the ‘dependent variable’) and then compare the responses given by different groups of people (each characteristic used to define the groups is an ‘independent variable’).
- Where the question of interest has response categories (an attitudinal scale, for example) you can create a cross-tab that shows responses from two or more groups side by side (male vs female students, for example).
- Where the question of interest is numeric (a test score, for example), compare the mean (average) for each group instead.
- Both of these operations can be done in Microsoft Excel using pivot tables. Many online survey software products also provide simple tools for this.
- Be careful about small sample sizes. Slicing and dicing the data into different sub-groups can produce differences that seem large but in fact only come down to a handful of people.
- When comparing groups like this, it is wise to run tests for statistical significance. The test to use depends on the comparison you are making. For more on this topic, see the notes on working with quantitative data in the School Excellence Framework evidence guide, linked below.
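The cross-tab and the comparison of means can also be sketched outside a spreadsheet. The example below uses hypothetical records with a year group, a rating and a test score, and flags groups below an arbitrary illustrative cut-off (`MIN_GROUP`). It does not run a significance test; it only lays out the comparison and the sample size behind each figure.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical records: group characteristic, categorical answer, numeric score.
records = [
    {"year": "Year 7", "rating": "Good", "score": 72},
    {"year": "Year 7", "rating": "Poor", "score": 58},
    {"year": "Year 7", "rating": "Good", "score": 80},
    {"year": "Year 8", "rating": "Good", "score": 65},
    {"year": "Year 8", "rating": "Poor", "score": 61},
]

crosstab = defaultdict(Counter)  # group -> counts of each rating
scores = defaultdict(list)       # group -> list of numeric scores
for r in records:
    crosstab[r["year"]][r["rating"]] += 1
    scores[r["year"]].append(r["score"])

MIN_GROUP = 30  # assumed cut-off, chosen only for illustration
for group in crosstab:
    n = len(scores[group])
    flag = " (small group, interpret with caution)" if n < MIN_GROUP else ""
    print(f"{group}: n={n}, ratings={dict(crosstab[group])}, "
          f"mean score={mean(scores[group]):.1f}{flag}")
```

Printing ‘n=’ next to each group keeps the sample size visible, which helps guard against reading too much into a difference that comes down to a handful of people.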
Beyond this, deeper analysis involves ‘multivariate analysis’: statistical modelling where you explore the interaction between several variables alongside each other. For example, multiple regression offers a set of techniques to explore the drivers of a particular opinion or behaviour (the dependent variable), looking at how much it is influenced by a range of other factors (independent variables), and which of these ‘predictors’ has more influence.
Multivariate analysis requires a fair amount of statistical knowledge, as well as statistical analysis packages.