International Household Survey Network
IHSN
Quick Reference Guide for Data Archivists
Version 2019-04
Authors of this guide: Olivier Dupriez, Diana Marcela Sanchez Castro, Matthew Welch (The World Bank)
The production of this guide was made possible through a grant from the TFSCB - DFID funding to the World Bank P167116/TF0A7461.
Acknowledgments: Francois Fonteneau (PARIS21), Geoffrey Greenwell (PARIS21), Chris Rockmore (World Bank) and Jan Smit (ESCAP) provided valuable input to an earlier version of the document. Trevor Croft (UNICEF) provided many of the examples of good practices for completing survey metadata.
Content
- 1. Introduction
- 2. Before you start: organizing your files
- 3. Gathering and preparing the data set
- 3.1. Data files should be organized in a hierarchical format
- 3.2. Datasets with multiple units of analysis should be stored in different data files
- 3.3. Columns in a dataset should represent variables, not values
- 3.4. Each observation in every file must have a unique identifier
- 3.5. Identifying duplicate observations
- 3.6. Ensure that each individual dataset can be combined into a single database
- 3.7. Check for variables with missing values
- 3.8. Check for improper value ranges
- 3.9. Verify that the number of records in each file corresponds to what is expected
- 3.10. Datasets must contain all variables from the questionnaire and be in a logical sequence
- 3.11. Include the relevant weighting coefficients and variables identifying the stratification levels
- 3.12. Variables and codes for categorical variables must be labelled
- 3.13. Temporary, calculated or derived variables should not be disseminated
- 3.14. Check that the data types are correct
- 3.15. Datasets must not have direct identifiers
- 3.16. Compress the variables to reduce the file size
- 4. Gathering and preparing the documentation
- 5. Importing data and establishing relationships
- 6. Importing external resources
- 7. Completing metadata
- 7.1. Good practices for completing the Document Description
- 7.2. Good practices for completing the Study Description
- 7.3. Good practices for completing the File Description
- 7.4. Good practices for completing the Variables Description
- 7.5. Good practices for completing the External Resources description
- 8. Creating variable groups
- 9. Running validations and diagnostics
- 10. Generating the survey documentation in PDF
- 11. Independent quality review
- 12. Section A. Data Validations in Stata: Practical Examples