Mata functions
Mata comes with many more functions to read and amend directories and files: help m4_io
If you work with binary/text data, check out fopen() [help mf_fopen]. If you work with Excel files, you might find xl() helpful [help mf_xl]. As always, you can also combine Mata and Stata.
In our example, we have an Excel file with data on every sheet of the file (one sheet per year). We can use Mata to read the names of the sheets, and use this information in Stata to loop over the sheets to import and combine the data (as in example 1). Alternatively, you could also loop over the sheets in Mata.
In the first example, we have a folder with datasets from different regions we would like to combine. In the second example, we have a folder with datasets with different variables. You can apply the same logic to automatically create codebooks for all .dta files in a given directory (see extension below).
mata
b = xl() // initializes xl()
b.load_book("data_3/WDI_Data") // opens Excel file
sheets = b.get_sheets() // reads sheet names into a vector
sheets = invtokens(sheets',",") // converts vector into a string, elements separated by ","
st_local("namelist",sheets) // puts string into the Stata local "namelist"
end
local N = wordcount(`"`namelist'"') // counts elements in local "namelist"
tokenize `"`namelist'"', parse(",") // puts "namelist" into locals 1, ... N
// attention!!! tokenize also creates a local for each ","
// --> sheet names are in odd locals (1,3,5,...)
// import each sheet, save it temporarily
forv n = 1(2)`N' {
import excel "data_3/WDI_Data", sheet(`"``n''"') first clear
tempfile file_`n'
save `file_`n''
}
// append all imported files
use `file_1'
forv n = 3(2)`N' {
append `file_`n''
}
// some data cleaning/preparation
destring GDPpercapitaPPPconstant20 GDPPPPconstant2017internat ///
Mortalityrateunder5female Mortalityrateunder5malep ///
Lifeexpectancyatbirthfemale Lifeexpectancyatbirthmale, replace force
save "WDI_new.dta", replace