Sampling is an art. Getting your sampling plans right can make a huge difference to your operations. From business-as-usual quality auditing to ascertaining the root cause of problems, how you sample and the volumes you select can make your job a lot easier.
Your sample must be:
- Fair and unbiased
- Representative
- Appropriately sized
To ensure your sampling is fair and unbiased, a simple random sample is the easiest approach. Give every row in your data set a number (enumerate), generate random row numbers in a separate data set, then join the two together so that only the rows whose numbers were drawn are kept.
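As a rough illustration of the enumerate, generate, join steps, here is a minimal Python sketch using only the standard library. The function name `simple_random_sample` and its parameters are made up for this example, and it assumes your data set fits in memory as a list of rows.

```python
import random

def simple_random_sample(rows, n, seed=None):
    """Pick n rows at random, without replacement (hypothetical helper)."""
    rng = random.Random(seed)  # a seed makes the draw repeatable for audits

    # Step 1: give every row a number (enumerate).
    numbered = list(enumerate(rows))

    # Step 2: generate random row numbers as a separate data set.
    # rng.sample draws without replacement, so no row is picked twice.
    chosen = set(rng.sample(range(len(numbered)), n))

    # Step 3: join the two together, keeping only the rows that were drawn.
    return [row for number, row in numbered if number in chosen]
```

For example, `simple_random_sample(records, 50, seed=2024)` would return 50 rows from a hypothetical list called `records`, and the same seed would return the same 50 rows each time.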
Leaving aside the arguments about truly random numbers, there are several reasons why a simple random sample is not always enough. Firstly, a random sample is not guaranteed to mirror the makeup of the population: by chance, some groups can end up over- or under-represented, especially when the sample is small. You may also need your sample to be representative of the population across different categories of data (e.g. a 50:50 split between genders, or proportional coverage of longer lists of values, like age ranges or product lists).
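One common way to get a sample that is representative across categories is a stratified sample: split the data by category, then draw a random sample from each group in proportion to its size. The sketch below is one possible version, not a definitive implementation. The name `stratified_sample` and the `key` parameter are assumptions for illustration, and the rule of taking at least one row per category is a design choice you may not want.

```python
import random
from collections import defaultdict

def stratified_sample(rows, key, n, seed=None):
    """Draw roughly n rows, split across categories in proportion to
    each category's share of the data (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the rows by category, e.g. by gender or age range.
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)

    total = len(rows)
    sample = []
    for category, members in strata.items():
        # Proportional allocation. Rounding (and the minimum of one row
        # per category) means the final size can differ slightly from n.
        k = round(n * len(members) / total)
        k = min(len(members), max(1, k))
        sample.extend(rng.sample(members, k))
    return sample
```

With 300 rows in one category and 700 in another, asking for 100 rows gives 30 from the first and 70 from the second. If you need a fixed split such as 50:50 regardless of the population, you would set `k` per category by hand instead of proportionally.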
If you have any questions about sampling, leave them in the comments below.