Collapse & contract
Often, the observations in your data belong to a larger group: for example, you have state-level observations, and states belong to regions. To run analyses at this higher (aggregate) level, use collapse. It replaces the dataset in memory with one observation per group, holding summary statistics (e.g., mean, sum, max) of the variables you choose. The related command contract reduces the data to the frequency of each distinct value.
Collapse & contract data
***************************
*** collapse & contract ***
/*
"collapse" replaces the data in memory with group-level statistics (mean, sum, etc.)
*/
sysuse census, clear
// let's aggregate all population variables into means by region
collapse pop*, by(region) // mean is the default
br // the original data in memory has been replaced by the aggregated data
sysuse census, clear
// let's aggregate all population variables into maximum values by region
collapse (max) pop*, by(region)
br
sysuse census, clear
// to combine several statistics in one call, give each result its own name: (stat) newvar = oldvar
collapse (count) state_num = pop (mean) pop* (max) pop_max = pop, by(region)
br
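// optional sketch (not part of the original script): preserve/restore keep the
// state-level data available, so you do not have to reload it after a collapse
sysuse census, clear
preserve                            // take a snapshot of the data in memory
collapse (mean) pop*, by(region)
list                                // aggregated data: one row per region
restore                             // return to the snapshot
describe, short                     // the 50 state-level observations are back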
// similar command: contract (keeps one observation per distinct value and stores its frequency in _freq; percentages are optional)
sysuse census, clear
contract state
br // _freq is 1 in every row: each state occurs only once
sysuse census, clear
contract region
br // _freq is greater than 1: each region contains several states
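// optional sketch (not part of the original script): contract can also name the
// frequency variable and add a percentage column
// n_states and pct_states are made-up variable names for this illustration
sysuse census, clear
contract region, freq(n_states) percent(pct_states)
list                                // one row per region, with count and share of states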