Analysing the Numbers

INTRODUCTION TO STATA

Stata is a statistical software package that performs many quantitative calculations that would be nightmarish to do by hand.  All of the Windows computers in Gelman have version 13 installed, and Stata runs on both PCs and Macs.


Stata may seem intimidating if you are unused to working with similar programs.  It certainly was for me.  But never fear.  I began learning to use Stata in Cameroon while two rats I named Fred and Joe scratched away in my cupboard.  You can do it too!


This is what Stata looks like when you first open it:

Introduction to Stata: News & Resources
Screen Shot 2018-01-21 at 8.01.37 PM2.png

What Stata Looks Like:

1: Review Window: Where Stata keeps track of every command you execute (or tell Stata to “do”) in this session

2: Results Window: Where Stata displays the output of each command it executes

3: Command Window: Where you enter the commands you want Stata to execute

4: Properties Window: Where details about the data are stored

5: Variables Window: Where all the variables in your dataset and their labels are listed


You can import data into Stata in two main ways:

1: If the data is in a .dta file, which is Stata’s own file format, you can double-click the file and it will open in Stata

2: If your data is in an Excel file, click File → Import → Excel spreadsheet and select the Excel file to open it in Stata
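
Both ways of loading data can also be typed as commands. The following is a minimal sketch, not part of the original walkthrough: the file names "mydata.dta" and "mydata.xlsx" are hypothetical, and the first three lines only create those files (in the current working directory, from the "auto" example dataset that ships with Stata) so that the sketch can run on its own.

```stata
* Set-up only: create example files so this sketch is self-contained.
* "mydata.dta" and "mydata.xlsx" are hypothetical file names.
sysuse auto, clear
save "mydata.dta", replace
export excel using "mydata.xlsx", firstrow(variables) replace

* 1: Open a Stata-format (.dta) file
use "mydata.dta", clear

* 2: Import an Excel file, treating the first row as variable names
import excel using "mydata.xlsx", firstrow clear

* Check what was loaded
describe
```

With your own data, replace the file names with the path to your file and skip the set-up lines.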


Basic Commands:

The basic structure of all commands used in Stata is:

command varname(s) [if varname == value] [, options]
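
To make the structure concrete, here is a small sketch that builds the same command up piece by piece. The five observations are made-up toy values for illustration only, not the sample dataset shown in the screenshots; the variable names Age and SAT are borrowed from the examples on this page.

```stata
* Toy data for illustration only (values are made up)
clear
input Age SAT
18 1200
18 1350
19 1100
20 1400
20 1250
end

* command + varname
summarize SAT

* command + varname + if condition (a comparison uses a double equals sign)
summarize SAT if Age == 18

* command + varname + if condition + option after the comma
summarize SAT if Age == 18, detail
```

The square brackets in the template mean "optional"; you do not type them.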


Commands often have abbreviations that can be entered in lieu of the entire name.  For example, the command “summarize” can be shortened to “sum”
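
As a quick check that an abbreviation really runs the same command, the sketch below runs the full and the short form on made-up toy data (illustrative values, not the tutorial's dataset) and compares the mean that Stata stores in r(mean) after each run. The scalar names full_mean and short_mean are hypothetical.

```stata
* Toy data for illustration only (values are made up)
clear
input Age SAT
18 1200
18 1350
19 1100
20 1400
20 1250
end

* Full command name
summarize SAT
scalar full_mean = r(mean)

* Abbreviated name: runs the same command
sum SAT
scalar short_mean = r(mean)

* Displays 1 (true) because both runs give the same mean
display full_mean == short_mean
```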

If you don’t know the name of a command or how to structure it, type help and then whatever you need help with and Stata will pull up help files on that subject.

Example: help linear regression
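
A few more ways to ask for help, as a short sketch. If you already know the command name, help followed by that name opens its help file directly; search looks up a topic by keyword when you do not know the command name.

```stata
* Open the help file for a command whose name you already know
help summarize
help tabulate

* If you only know the topic, search by keyword instead
search linear regression
```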


Sample dataset viewed in data browser:



Picture2.png

Three Basic Commands:


1: Tabulate “tab” displays a frequency table for a variable: frequency, percentage, and cumulative percentage

tab varname

Example: tab Age

In the screenshot below, I have typed the command in the Command window.  When I press Enter, Stata displays the table shown above it in the Results window (I re-entered the command so you could see it)



Picture3.png

You can see the ages displayed in the first column, and then the descriptive statistics in the following columns.
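
If you want to try tab without the sample dataset, the sketch below uses five made-up observations (toy values for illustration only). After a one-way tab, Stata also stores the number of observations in r(N) and the number of distinct values (table rows) in r(r).

```stata
* Toy data for illustration only (values are made up)
clear
input Age SAT
18 1200
18 1350
19 1100
20 1400
20 1250
end

* One row per distinct Age, with frequency, percent, and cumulative percent
tabulate Age

* Number of observations counted in the table
display r(N)

* Number of distinct ages (rows in the table)
display r(r)
```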


2: Summarize “sum” displays the following summary statistics: number of observations, mean, standard deviation, minimum, and maximum.  The detail option gives you additional descriptive statistics, such as percentiles

sum varname [, detail]

Example: sum SAT



Picture4.png

Example: sum SAT, detail


Picture5.png
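
The numbers that sum prints are also stored, so you can reuse them in later commands. The sketch below assumes made-up toy data (illustrative values only, not the dataset in the screenshots) and shows a few of the stored results.

```stata
* Toy data for illustration only (values are made up)
clear
input Age SAT
18 1200
18 1350
19 1100
20 1400
20 1250
end

* Basic summary: observations, mean, standard deviation, min, max
summarize SAT
display r(N)
display r(mean)
display r(sd)
display r(min)
display r(max)

* The detail option adds percentiles, among other statistics
summarize SAT, detail

* r(p50) is the median (50th percentile)
display r(p50)
```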

3: Generate “gen” is for creating a new variable that does not currently exist in your dataset. 

gen newvarname = what your new variable should equal

Example: gen awesomeness = 100

This will generate a variable called “awesomeness” that equals 100 for everyone, because we are all awesome.  Note that Stata commands are case-sensitive, so type gen in lowercase.


Picture6.png


Screen Shot 2018-01-21 at 8.06.33 PM.png

In the above photo, you can see that Stata added “awesomeness” as a variable.  Below, you can see in the data browser that Stata added awesomeness to the dataset.

Picture8.png
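
The right-hand side of gen can be a constant, as above, or an expression built from existing variables. The sketch below uses made-up toy data (illustrative values only); SAT_hundreds is a hypothetical variable name chosen for this example.

```stata
* Toy data for illustration only (values are made up)
clear
input Age SAT
18 1200
18 1350
19 1100
20 1400
20 1250
end

* Same value for every observation, as in the example above
generate awesomeness = 100

* A new variable computed from an existing one
generate SAT_hundreds = SAT / 100

* Show the data, including the two new variables
list
```

Because gen only creates variables that do not exist yet, running gen awesomeness = 100 a second time stops with an error saying the variable is already defined.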

So those are three basic commands that you can use in Stata.  More commands will be added as the year goes on, so stay tuned for more help with Stata.  Good luck!
