You have a project, and you need data. You go to your metadata dictionary, search for a data source, and discover several that you could choose from. Or perhaps you have multiple measures and need to know which ones to retire. How do you make the right choice? These are my five steps to choosing the best data source for your project.
1. Classify and develop your objectives.
List all of your requirements: the data fields you need, reporting frequency, timings, transaction types, granularity and so on. Classify each one as either a 'must have' or a 'want'. When you have a full list, give each 'want' a weighting score, with the highest value being the most important.
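As a minimal sketch, a requirements list like this could be captured in Python as follows. The requirement names and weights are hypothetical, invented purely for illustration; the only rules taken from the step above are that 'must haves' are pass/fail and that every 'want' carries a weight.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Requirement:
    name: str
    must_have: bool
    weight: Optional[int] = None  # only 'wants' carry a weight

    def __post_init__(self) -> None:
        # A must-have is pass/fail, so it has no weight; a want needs one.
        if self.must_have and self.weight is not None:
            raise ValueError(f"must-have {self.name!r} should not be weighted")
        if not self.must_have and (self.weight is None or self.weight <= 0):
            raise ValueError(f"want {self.name!r} needs a positive weight")


# Hypothetical objectives for an illustrative project.
requirements = [
    Requirement("daily_refresh", must_have=True),
    Requirement("transaction_level_granularity", must_have=True),
    Requirement("includes_refund_transactions", must_have=False, weight=5),
    Requirement("available_before_0600", must_have=False, weight=3),
    Requirement("three_years_of_history", must_have=False, weight=1),
]

# Split the list so later steps can treat the two kinds differently.
must_haves = [r.name for r in requirements if r.must_have]
wants = {r.name: r.weight for r in requirements if not r.must_have}
```

Validating the weights up front means an unweighted 'want' is caught when the list is written, not when the scores are totalled.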
2. Profile the data sources.
Build and run profiles of the data in each data source. Examine the field types, volumes, dates, times, transaction types and granularity. Profiling any creation timestamps will give you an idea of when the jobs that load the data are scheduled to run.
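A basic profile does not need special tooling. The sketch below uses only the Python standard library and a handful of hypothetical sample rows (the field names and values are made up); in practice you would run the same checks against a real extract from each source.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; in practice these would be read from the source.
rows = [
    {"txn_id": 1, "txn_type": "SALE", "amount": 19.99, "created_at": "2024-03-01T02:15:00"},
    {"txn_id": 2, "txn_type": "SALE", "amount": 5.00, "created_at": "2024-03-01T02:15:07"},
    {"txn_id": 3, "txn_type": "REFUND", "amount": None, "created_at": "2024-03-02T02:14:58"},
    {"txn_id": 4, "txn_type": "SALE", "amount": 42.50, "created_at": "2024-03-02T02:15:03"},
]


def profile(rows, fields):
    """Return row count, nulls, distinct values and observed types per field."""
    result = {}
    for field in fields:
        values = [row.get(field) for row in rows]
        present = [v for v in values if v is not None]
        result[field] = {
            "rows": len(values),
            "nulls": len(values) - len(present),
            "distinct": len(set(present)),
            "types": sorted({type(v).__name__ for v in present}),
        }
    return result


def load_hours(rows, timestamp_field):
    """Count rows by the hour of day in which they were created."""
    return Counter(datetime.fromisoformat(row[timestamp_field]).hour for row in rows)


field_profile = profile(rows, ["txn_id", "txn_type", "amount", "created_at"])
hours = load_hours(rows, "created_at")

print(field_profile["amount"])
print(hours.most_common(1))  # every sample row was created in the 02:00 hour
```

In this sample every row was created within a few seconds of 02:15, which would suggest a nightly batch load at around that time.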
3. Match the attributes and profile results of the data sources against the objectives.
Based on the profiling, how well does each source satisfy the objectives? Consider the timeliness of updates and the batch windows. Do they match the schedule in your objectives? Are the data sources structurally compatible with your requirements? Does each data source provide the correct level of granularity? If a data source fails any of the 'must haves', reject it outright. For each remaining source, total a score based on how well it achieves the 'wants', using the weightings from step 1.
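The rejection and scoring logic above can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the source names, the requirement names, the 0 to 10 scale for how well a 'want' is met, and the choice of a simple weighted sum as the way to total the score.

```python
# Hypothetical weights for the 'wants' (higher = more important).
want_weights = {"includes_refunds": 5, "available_before_0600": 3, "three_years_history": 1}

# Hypothetical assessment of each source after profiling.
# must_haves: True/False per requirement.
# wants: how well each want is met, from 0 (not at all) to 10 (fully).
assessments = {
    "warehouse_sales": {
        "must_haves": {"daily_refresh": True, "transaction_level": True},
        "wants": {"includes_refunds": 10, "available_before_0600": 4, "three_years_history": 10},
    },
    "pos_extract": {
        "must_haves": {"daily_refresh": True, "transaction_level": True},
        "wants": {"includes_refunds": 6, "available_before_0600": 10, "three_years_history": 2},
    },
    "finance_summary": {
        "must_haves": {"daily_refresh": True, "transaction_level": False},
        "wants": {"includes_refunds": 10, "available_before_0600": 10, "three_years_history": 10},
    },
}


def score_sources(assessments, want_weights):
    """Reject sources failing any must-have; rank the rest by weighted score."""
    rejected, scores = [], {}
    for source, result in assessments.items():
        if not all(result["must_haves"].values()):
            rejected.append(source)
            continue
        scores[source] = sum(
            weight * result["wants"].get(want, 0)
            for want, weight in want_weights.items()
        )
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked, rejected


ranked, rejected = score_sources(assessments, want_weights)
shortlist = ranked[:2]  # the two highest scorers go forward to the risk review

print(ranked)    # [('warehouse_sales', 72), ('pos_extract', 62)]
print(rejected)  # ['finance_summary']
```

Note that finance_summary is rejected even though it would have scored highest on the 'wants': a failed 'must have' is never traded off against a high score.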
4. Identify the risks.
Take your two highest scorers and ask yourself the following questions about them:
- What future threats should we consider?
- If we choose this data source, what could go wrong?
- Is our understanding of this data source good enough?
- What are the capacity/system constraints?
5. Choose your preferred data source.
Are you willing to accept the risks carried by the best performer in order to attain the objectives?
If yes, choose it. If not, ask the same question of your second-highest scorer.
So there you have it, a rigorous approach to choosing the best data source. How much detail you go into will depend on the rigour required in your industry sector.