Primary data
If you work with primary data, de-identify it as early as possible:
- Consult the data protection officer before collecting or accessing primary data to ensure that data protection regulations can be met
- Check the DIME and J-PAL guidelines for de-identification of primary data
- Check the section on random variables to ensure you create reproducible, unique identifiers
- Use the ieboilsave command from the ietoolkit to perform checks on de-identification
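
The idea of reproducible, unique identifiers from the list above can be sketched as follows. This is a minimal, language-agnostic illustration in Python, not the procedure prescribed by the DIME or J-PAL guidelines and not what ieboilsave does. The function name deidentify, the column names (national_id, name, phone, income), the seed, and the sample records are all hypothetical and only serve the example.

```python
import random

# Hypothetical direct identifiers; adapt this to the actual dataset.
DIRECT_IDENTIFIERS = ("name", "phone")


def deidentify(records, key, seed):
    """Return (clean_records, crosswalk) for a list of dict records.

    `key` is the original identifying column. It is replaced by a
    randomly assigned `study_id` that is reproducible for a given seed.
    """
    keys = [r[key] for r in records]
    # The original key must uniquely identify each record.
    if len(set(keys)) != len(keys):
        raise ValueError(f"{key!r} does not uniquely identify the records")

    # Sort before shuffling so the result does not depend on row order.
    ordered = sorted(keys)
    random.Random(seed).shuffle(ordered)
    crosswalk = {orig: new for new, orig in enumerate(ordered, start=1)}

    # Drop the original key and all direct identifiers from the output.
    drop = set(DIRECT_IDENTIFIERS) | {key}
    clean = [
        {"study_id": crosswalk[r[key]],
         **{k: v for k, v in r.items() if k not in drop}}
        for r in records
    ]
    return clean, crosswalk


# Made-up example records, for illustration only.
records = [
    {"national_id": "A17", "name": "Ada", "phone": "555-0101", "income": 1200},
    {"national_id": "B42", "name": "Ben", "phone": "555-0102", "income": 900},
    {"national_id": "C03", "name": "Cem", "phone": "555-0103", "income": 1500},
]
clean, crosswalk = deidentify(records, key="national_id", seed=20240501)
```

Sorting before shuffling keeps the assignment stable even if the rows arrive in a different order, and the fixed seed makes it reproducible. The crosswalk links the new identifiers back to the original ones, so it should be stored separately from the de-identified data and with restricted access.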
With primary data, it is even more important to ensure that the data is well documented and prepared. The iefieldkit provides some helpful tools for data cleaning. See the section on interaction with the system for code that automatically produces codebooks.