
The Importance Of Large Survey Data Sets

Survey Data Collection

An adequate sample size is one of the most important requirements for generating reliable insights from survey data. Without a large enough sample (relative to the total population that we’re surveying), we risk collecting data that does not represent our customers. Ultimately, this can hinder our ability to draw meaningful insights to drive decision-making.

For those unfamiliar with survey methodology or statistics, there is solid math behind how we determine how representative a sample of a given population is, but there is not necessarily a firm line on how representative a sample has to be before we say, “ok, we’ve got enough.”

Reliable & Actionable Insights


Margin of Error

One measure, in particular, can help us determine how representative a sample is of a given population: margin of error. Most of you are probably familiar with this concept by way of political polling. We’ve all seen political polls reported as “candidate X’s expected percentage of the vote is 55%, plus-or-minus 5%.” This “plus-or-minus” portion is the margin of error. It means that if we were to ask the entire voting population to respond, we would expect the true value to fall somewhere between 50% and 60%. Since polling everyone would be very, very expensive, we tend to be tolerant of some uncertainty in our estimates of the true values.

The margin of error depends primarily on how many responses you collect and, for smaller populations, on the size of the population you want to survey. With a large population (say, the total population of the United States), you would need only a tiny sample relative to the entire population to achieve a low margin of error (approximately 1,200 responses for a margin of error just under 3%). For smaller population sizes in the thousands or tens of thousands, you would need something on the order of 350 to 400 responses to achieve a 5% margin of error at a 95% confidence level. The specifics behind this are a little too complicated to go into here, so if you would like to discuss this more, please feel free to contact us for a more in-depth discussion.
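For readers who want to see the arithmetic, here is a minimal sketch of one standard textbook approach: Cochran's sample-size formula with a finite population correction. It is an illustration rather than a definitive method. It assumes a simple random sample, a 95% confidence level by default, and the most conservative case of a 50/50 split in answers (`p = 0.5`). The function name `required_sample_size` and the population sizes are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(population, margin, confidence=0.95, p=0.5):
    """Responses needed for a target margin of error (simple random sample)."""
    # z-score for a two-sided interval, about 1.96 at 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size if the population were effectively unlimited
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Finite population correction: small populations need fewer responses
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# Hypothetical population sizes, for illustration only
for population in (5_000, 50_000, 330_000_000):
    print(population, required_sample_size(population, margin=0.05))
```

Under these assumptions, the required sample barely moves once the population reaches the tens of thousands, which is why a national poll can get by with a tiny fraction of the population.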

Getting the margin of error as low as possible is crucial to getting an accurate picture of what’s going on among your customer base. This is the primary reason for maximizing response rates, and why we at Satrix Solutions place so much emphasis on achieving industry-leading response rates for our clients.

The “Gold Standard”

Now that we have covered how to determine how representative a sample is, let’s discuss how small the margin of error has to be before we say, “hold it, we’ve got enough.”

We consider our “gold standard” to be a 3% margin of error. This is widely considered to be an excellent margin of error. Unfortunately, a 3% margin of error isn’t always achievable, but it is still important to get as close to that number as you can. With that said, margins of error above 3% don’t necessarily mean that we have junk data. As with all things statistical, we have to keep in mind how representative our sample is and adjust our expectations accordingly.
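To see how close a given survey comes to the 3% mark, the margin of error can be computed directly from the number of responses. The sketch below is one common way to do that, assuming a simple random sample, a 95% confidence level, and a worst-case 50/50 split in answers. The function name `margin_of_error` and the customer counts are hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(sample, population, confidence=0.95, p=0.5):
    """Margin of error for a proportion, with finite population correction."""
    # z-score for a two-sided interval, about 1.96 at 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of a proportion for a simple random sample
    standard_error = math.sqrt(p * (1 - p) / sample)
    # Correction for surveying a sizeable share of a small population
    fpc = math.sqrt((population - sample) / (population - 1))
    return z * standard_error * fpc


# Hypothetical customer base of 2,000 with different response counts
for responses in (200, 400, 700, 1000):
    moe = margin_of_error(responses, population=2_000)
    print(f"{responses} responses: +/-{moe:.1%}")
```

Each additional response narrows the margin, but with diminishing returns, which is why the last percentage point toward 3% is usually the hardest to earn.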

At Satrix Solutions, our strategy is to hear from as many customers as possible. In other words, we don’t limit ourselves to a statistically-defined ceiling – we strive to produce the maximum number of responses that we can attain and, therefore, provide the most accurate information on your customer base that is possible.

Confidence Level

Beyond population size and sample size, another factor that greatly impacts the eventual margin of error is the confidence level. In simple terms, the confidence level tells us how often our estimate can be expected to land within the margin of error. For example, imagine we were to conduct the same survey on the same population 100 times, drawing a new random sample each time. A 95% confidence level tells us that, in roughly 95 of those 100 surveys, the result would fall within the margin of error of the true population value.
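The "repeat the survey many times" idea can be checked with a small simulation. This is only a sketch under stated assumptions: respondents are drawn independently from a very large population, the true share of "yes" answers (`true_share`) is a made-up 55%, and the margin of error uses the conservative 50/50 formula. All names and numbers here are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_coverage(true_share=0.55, sample_size=400, trials=1000,
                      confidence=0.95, seed=0):
    """Fraction of simulated surveys whose estimate lands within the margin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Conservative margin of error based on the worst-case 50/50 split
    margin = z * (0.25 / sample_size) ** 0.5
    hits = 0
    for _ in range(trials):
        # Each simulated respondent answers "yes" with probability true_share
        yes = sum(rng.random() < true_share for _ in range(sample_size))
        estimate = yes / sample_size
        if abs(estimate - true_share) <= margin:
            hits += 1
    return hits / trials


print(f"Share of surveys within the margin: {simulate_coverage():.1%}")
```

The printed share should sit close to the chosen confidence level, although it will vary somewhat with the seed and the number of trials.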

We can use a different confidence level than 95%, say 90% or 99%, but a 95% confidence level is widely considered an excellent target. One thing to note – lowering the confidence level that you would like to have generally also decreases the sample size you need, while raising the confidence level would require a larger sample size. Nearly everything in life comes with some amount of uncertainty (except death and taxes, of course).
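The trade-off between confidence level and sample size can be made concrete with a short calculation. The sketch below assumes a very large population, a 5% target margin of error, and a worst-case 50/50 split. The helper name `sample_size_for` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for(confidence, margin=0.05, p=0.5):
    """Responses needed at a given confidence level (large population)."""
    # Higher confidence means a larger z-score and therefore more responses
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: {sample_size_for(level)} responses")
```

Under these assumptions, moving from 95% to 99% confidence raises the requirement far more than moving from 90% to 95%.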

Segmenting Your Data

So far, we’ve discussed how confidence levels and margins of error apply to your overall sample. What about when you want to reveal important information on certain segments of your overall customer base?

When we start segmenting our overall sample, we need to keep in mind that each segment contains fewer responses than the sample as a whole, so the margin of error has to be recalculated for every segment. Generally speaking, segmenting increases margins of error, making us less sure about the accuracy of our data. Simply put, if your sample size is too small, you won’t be able to perform very reliable segmentation. Therefore, most organizations should seek to gather the largest sample they can during the survey window. Read our tips for minimizing survey non-responders here.
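A quick calculation shows how much precision is lost when a sample is split up. This sketch assumes a 95% confidence level, a worst-case 50/50 split, and populations large enough that no finite population correction is needed. The segment names and response counts are invented for illustration.

```python
from statistics import NormalDist


def margin_of_error(sample, confidence=0.95, p=0.5):
    """Margin of error for a proportion from a simple random sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * (p * (1 - p) / sample) ** 0.5


# Hypothetical response counts per customer segment
segments = {"Enterprise": 60, "Mid-market": 180, "Small business": 360}
total = sum(segments.values())

print(f"Overall ({total} responses): +/-{margin_of_error(total):.1%}")
for name, count in segments.items():
    print(f"{name} ({count} responses): +/-{margin_of_error(count):.1%}")
```

In this made-up example the overall sample is reasonably precise, while the smallest segment carries a margin of error roughly three times as wide.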

The margin of error, influenced by confidence level, is a great tool in that we can assign a definite number to how confident we are that our sample is representative of our entire customer base. This is very important as the analysis of your survey data can help inform decisions about resource allocation, organizational changes, and process improvement initiatives.
