Loading & saving data

There are many ways to load datasets into Stata. Below you will find an explanation of how to load built-in example datasets, datasets from websites, and datasets stored on your computer.

Note:

  • If no file extension is specified, Stata assumes that the file has the extension “.dta” and returns an error message if no such file is found. Consult the respective help files to see which command accepts which file extensions.
  • Always clear the workspace before loading a new dataset, either with the command “clear all” or with the option “, clear” in the respective loading command.
  • After the initial import, save your data as a Stata file in the raw data folder in your working directory using the “save” command.
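
The notes above can be sketched with a short example. This is only a minimal illustration: the file name "auto_copy" is hypothetical, and the built-in dataset auto is used as a stand-in for your own data.

```stata
* Minimal sketch: clearing memory and the default ".dta" extension
* ("auto_copy" is a hypothetical file name, written to the current working directory)
sysuse auto, clear          // the option ", clear" drops any data in memory before loading
save "auto_copy", replace   // no extension given, so Stata writes auto_copy.dta
clear all                   // alternative: empty the workspace with a separate command
use "auto_copy"             // no extension given, so Stata looks for auto_copy.dta
describe, short             // confirm that the dataset was loaded
```

Both ways of clearing should work here; the “, clear” option is simply the shorter choice when you load a new dataset right away.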

1. Loading built-in and web-based datasets: sysuse & webuse

sysuse lifeexp // load a built-in example dataset
sysuse dir // display a list of all built-in datasets
webuse lifeexp // load a dataset from the Stata website
webuse query // show the current URL, which by default points to the Stata website
webuse set [https://]url[/] // change the URL from which datasets are drawn to any other website
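
As a hedged illustration of how the webuse commands above fit together, the following sketch changes the URL, loads a dataset, and then resets the URL. The address used here is an assumption (an older version-specific folder on the Stata Press site), and the example needs an internet connection.

```stata
* Minimal sketch: changing and resetting the webuse URL
webuse query                                      // show the URL currently in use
webuse set https://www.stata-press.com/data/r17   // assumed URL, for illustration only
webuse lifeexp, clear                             // load lifeexp.dta from the URL set above
webuse set                                        // without an argument, reset the URL to the default
webuse query                                      // check that the default is back in place
```

Resetting at the end keeps later webuse calls in the same session pointing at the default location.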

2. Loading locally stored datasets: use, import excel, import delimited

  • If your data is already in Stata format (.dta): the most common command to load data from your computer is “use”.
  • If your data is stored in an Excel file: import excel
  • If you have delimited text data (for example, a CSV file): import delimited
  • If you have the option, try to get your raw data as Stata, Excel, or delimited text files. This will be possible in most cases. If it is not, check “help import” for the many options for importing data from other formats.
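
The three loading commands above can be tried out without any data of your own. The sketch below first writes the built-in dataset auto to the working directory in all three formats and then loads each file again. The file names are hypothetical, and the “firstrow” option assumes that the first row of the Excel sheet holds the variable names.

```stata
* Minimal sketch: create example files, then load them again
* (all "auto_copy" file names are hypothetical)
sysuse auto, clear
save "auto_copy.dta", replace                                      // Stata format
export excel using "auto_copy.xlsx", firstrow(variables) replace   // Excel, variable names in row 1
export delimited using "auto_copy.csv", replace                    // comma-delimited text

use "auto_copy.dta", clear                            // load the Stata file
import excel using "auto_copy.xlsx", firstrow clear   // load the Excel file, row 1 as variable names
import delimited using "auto_copy.csv", clear         // load the delimited text file
describe, short                                       // check the number of observations and variables
```

Each loading command here uses “clear”, so the dataset from the previous step is dropped before the next one is read.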

Do-file structure

Get used to saving all the steps you do in do-files. Start out with this basic structure and add details as needed:

* Informative heading as comment
cd // set the working directory for this project; add the path after “cd”
import excel using "raw data/lifeexp.xlsx", clear // clear the workspace and import the raw data in Excel format
save "raw data/lifeexp", replace // save the raw data in Stata format; don’t forget the “replace” option

/*
Any manipulations that you would like to do with the data go here; see the “Variables” chapter for possibilities.
*/


save "clean data/lifeexp_clean", replace
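
The structure above assumes that the folders "raw data" and "clean data" already exist in the working directory; if they do not, “save” stops with an error. One possible way to handle this, shown here as a sketch rather than a required step, is to create the folders at the top of the do-file. The built-in dataset lifeexp stands in for the Excel import, and the sketch assumes the working directory has already been set.

```stata
* Minimal sketch: create the project folders before saving
capture mkdir "raw data"      // "capture" suppresses the error if the folder already exists
capture mkdir "clean data"

sysuse lifeexp, clear                      // stand-in for importing your own raw data
save "raw data/lifeexp", replace           // raw data in Stata format

* ... data manipulations would go here ...

save "clean data/lifeexp_clean", replace   // cleaned data, kept separate from the raw data
```

Keeping raw and clean data in separate folders means the original import is never overwritten by later manipulations.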

Exercise

Open a new do-file in which you:

  1. Load the built-in dataset bplong
  2. Save it to the raw data folder in your working directory
  3. Comment each step and save the do-file to your working directory