Overview

In this chapter, we look at the overall structure of the data: What are the dimensions of the data? What level does each “observation” (each row) of our dataset represent? And how do we change this?

Consider the following data:

One row stands for one person-year. In other words, the combination of “id” and “year” uniquely identifies each observation. We call this the “long” format (multiple observations for each unit of interest). There are many scenarios, however, in which we want to transform the data or combine them with other data, and thereby change their dimensions:
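Whether “id” and “year” really identify each row can be checked directly. The following is a minimal sketch, assuming a small made-up person-year dataset; the variable “income” and all values are hypothetical and only serve as an illustration.

```stata
* Hypothetical toy data: two persons, each observed in two years
clear
input id year income
1 2019 30000
1 2020 32000
2 2019 41000
2 2020 40000
end

* isid stops with an error if id and year together do not
* uniquely identify the observations; it is silent otherwise
isid id year

* Alternative check: report how many id-year combinations occur more than once
duplicates report id year
```

If “isid” runs without an error message, the data are in the long format described above, with exactly one row per person-year.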

Aggregate data
Transpose data (rarely used)
Change dimensions (reshape data)
Combine data: add observations (append data) or add variables (merge data)
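As a rough preview, the operations listed above could look as follows in Stata. This is only a sketch under assumptions: the commands are standard Stata commands, but the toy data, the variables “income” and “female”, and the tempfile names are hypothetical. Because it uses tempfiles (local macros), it should be run as one block from a do-file.

```stata
* Hypothetical person-year data in long format
clear
input id year income
1 2019 30000
1 2020 32000
2 2019 41000
2 2020 40000
end
tempfile panel
save `panel'

* Aggregate: one row per person with the mean income over all years
collapse (mean) mean_income = income, by(id)

* Transpose: rows become variables and variables become rows
use `panel', clear
xpose, clear varname

* Reshape: from long (one row per person-year) to wide (one row per person)
use `panel', clear
reshape wide income, i(id) j(year)

* Append: add observations (a third, hypothetical person)
clear
input id year income
3 2019 28000
3 2020 29000
end
tempfile extra
save `extra'
use `panel', clear
append using `extra'

* Merge: add a variable from a hypothetical person-level file
clear
input id female
1 0
2 1
end
tempfile persons
save `persons'
use `panel', clear
merge m:1 id using `persons'
```

Each step starts again from the saved long-format data, so the operations do not build on one another. After “reshape wide”, the data contain one row per person with the variables “income2019” and “income2020”; after “merge”, the variable “_merge” shows which observations were matched.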

In this chapter, you will learn how to transform data by aggregating, reshaping, and combining datasets, and how to work with several datasets simultaneously.

You can download the complete presentation and do-file here: