Things to consider
Consider the following when combining datasets:
- Always make sure that variables in both datasets have the same units/categories/definitions
- E.g., if you have a country id with value labels, make sure the values actually denote the same countries (don’t be distracted by the value labels)
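- For labels and variables with the same name, append/merge keeps the definitions/values of the master file (unless you specify otherwise, e.g., with the update or replace options of merge)
- Always check that the append/merge worked as intended (e.g., compare the number of observations before and after, and tabulate _merge after a merge)
- Never use an m:m merge. Use joinby instead.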
- Note: Describing append as adding observations and merge as adding variables is a simplification to visualize the concepts. If the appended dataset contains additional variables, these are added to the data (and set to missing for the original observations). If the merged dataset contains additional observations, these are added as well, with missing values for the original variables that do not exist in the new dataset.
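
The points above can be illustrated with a minimal sketch. All dataset and variable names (country_id, gdp, region, city, source) are hypothetical and only serve as an illustration; the code is meant to be run as one do-file, since it relies on temporary files.

```stata
* Hypothetical master data: country-year observations
clear
input country_id year gdp
1 2000 10
1 2001 11
2 2000 20
end
tempfile master
save `master'

* Hypothetical using data: one row per country, with an extra country (3)
clear
input country_id str10 region
1 "Europe"
3 "Asia"
end
isid country_id            // the key must be unique for an m:1 merge
tempfile regions
save `regions'

* Merge and check the result instead of trusting it blindly
use `master', clear
merge m:1 country_id using `regions'
tabulate _merge            // 1 = master only, 2 = using only, 3 = matched
list if _merge == 2        // country 3: year and gdp are missing
assert _N == 4             // 2 matched + 1 master only + 1 using only

* Append: a variable that exists only in the appended data (region)
* is set to missing for the original observations
use `master', clear
append using `regions', generate(source)
tabulate source            // 0 = master, 1 = appended data
assert missing(region) if source == 0

* joinby instead of merge m:m: all pairwise combinations within country_id
clear
input country_id str10 city
1 "Paris"
1 "Berlin"
end
tempfile cities
save `cities'

use `master', clear
joinby country_id using `cities', unmatched(both)
tabulate _merge            // country 1: 2 x 2 = 4 rows; country 2: master only
```

The assert lines stop the do-file if the result deviates from what is expected, which is one simple way to make the check in the list above part of the workflow.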