Last update: April 6, 2004
Stata: Programming Class Notes
Stata: Programming
Daver C. Kahvecioglu
ITS High Performance Computing Group
UNC Chapel Hill
[email protected]
Do-files
Rather than typing commands at the keyboard, you can create a disk file containing commands and instruct Stata to execute the commands stored in that file. Such files are called do-files, since the command that causes them to be executed is do.
A do-file is a standard ASCII text file. A do-file is executed by Stata when you type do filename.
You can use any text editor to create do-files, or you can use the built-in do-file editor by typing doedit, or by clicking on the do-file editor icon on the menu bar at the top.
By default, if any line in the do-file contains an error, Stata stops immediately and does not attempt to execute the rest of the commands. If you want Stata to keep going even though something may be wrong with the do-file, add the nostop option:
do filename, nostop
Executing Stata in Background (batch) Mode in Windows
Open a DOS window and type
c:\stata\wsestata /b do myjob
assuming that Stata-SE is installed in the folder c:\stata and you have a do-file named myjob in the current folder. When the do-file completes, the Stata icon in the taskbar will flash. You can then click on it to close Stata. If you want to stop the do-file before it completes, click on the Stata icon in the taskbar, and Stata will ask you if you want to cancel the job.
/b will make Stata open an ASCII text log named myjob.log
If you do not know where Stata is installed, you can right click on the Stata icon on the desktop and click on Properties. You can also click on Start, then Programs, then right click on Stata to see the location. For example, in ATN labs we should type
J:\.isis.unc.edu\pc-pkg\stata-80\program\wsestata /b do myjob
Macros
Macros are the variables of Stata programs. A macro is a string of characters, called the macroname, that stands for another string of characters, called the macro contents.
There are two types of macros: Local and global. Local macros are private to the program in which they are defined. They cannot be accessed from outside that program. Global macros, on the other hand, are public, and are accessible in any program.
sysuse auto
local set1 " weight foreign "
global set1 " weight foreign "
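The difference between local and global macros can be seen with a small sketch like the one below. It is meant to be run as a single do-file; the program name showmacros is hypothetical and chosen only for illustration.

```stata
* Sketch: a local macro defined in a do-file is invisible inside a program,
* while a global macro with the same name is visible everywhere.
capture program drop showmacros      // hypothetical program name
program showmacros
    display "local  set1 inside the program: [`set1']"
    display "global set1 inside the program: [$set1]"
end

local  set1 "weight foreign"
global set1 "weight foreign"

showmacros
display "local  set1 in the do-file:     [`set1']"
```

Inside showmacros the local macro should come out empty, because the program has its own private set of local macros, while the global macro keeps its contents.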
Enclosing a local macro name between a left single quote (`) and a right single quote (') exposes what it contains. The content of a global macro is revealed when we prefix $ to the macroname.
regress mpg `set1'
regress mpg $set1
Try the following:
display `set1'
display $set1
display "`set1'"
display "$set1"
display `set1' resolves to display weight foreign. display displays strings and values of scalar expressions.
display "What to display?"
Let's define a scalar pi:
scalar pi = 3.14
di pi
_pi is a system variable that has the value of pi in it.
di _pi
display weight foreign treats each variable name as a scalar expression. Used this way, a variable name stands for its value in the current observation, weight[_n] and foreign[_n], and since the current observation is observation 1, Stata displays weight[1] followed by foreign[1].
If we enclose the macro in double quotes (for example, "`set1'" or "$set1"), then the content of the macro is treated as nothing but a string.
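A short sketch of the same distinction, using hypothetical macro names n and m and no data set, run from a do-file:

```stata
* Sketch: the same macro shown as an expression and as a string.
local n 2+2          // without =, the characters 2+2 are stored as they are
display `n'          // resolves to: display 2+2    and shows 4
display "`n'"        // resolves to: display "2+2"  and shows 2+2

local m = 2+2        // with =, the expression is evaluated first
display "`m'"        // shows 4
```

The choice between local n 2+2 and local m = 2+2 decides whether the macro holds the text of the expression or its value.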
Argument passing with do-files
Let's say you want to write a do-file that gets a data set name from the user (technically, we are passing an argument, datasetname, to the do-file) and produces the means of all the variables in that data set. So, the user will type something like
do dofilename datasetname
and the result will be the summary statistics for the variables in datasetname.
Our do-file should be something like this:
use `1'
summarize
Or,
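args varname
use `varname'
summarize
1 and varname are the names of local macros. Stata stores the first argument in the local macro 1, and args copies it into a local macro with a name of your choice, here varname.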
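The idea extends to more than one argument. The following is a minimal sketch of such a do-file, assuming it is saved under the hypothetical name meanof.do; the macro names dataset and var are also illustrative.

```stata
* meanof.do (hypothetical file name)
* Usage:  do meanof datasetname [varname]
args dataset var

* Stop with a message if no data set name was given.
if "`dataset'" == "" {
    display as error "usage: do meanof datasetname [varname]"
    exit 198
}

use "`dataset'", clear
summarize `var'      // with no varname, all variables are summarized
```

For example, do meanof mydata income would summarize only the variable income, assuming a data set mydata.dta with that variable exists in the current directory.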
Programs
When you type something, Stata first checks whether it is a built-in command. If it is, Stata executes it. Otherwise, Stata checks whether it is a defined program. If it is, the program is executed. Otherwise, Stata looks in certain directories (type sysdir to see their names) for a file that has the name you typed and the extension .ado. If that search is not successful, we get the "unrecognized command" error. This section briefly discusses programs, the second thing Stata checks for. The next section briefly discusses ado-files, the last thing Stata checks for.
Here is a sample program you define interactively:
program hello
Now, you will start typing the first line of your program named hello:
1. display "Hello World"
2. end
When you type end on line 2, line 2 becomes the last line of the program and thus the program declaration is ended.
Now program hello is loaded into memory and it can be executed by typing
hello
If you type your program in an editor and save it as a do-file, then you can load it by "do"ing your do-file. If you want to run your program by "do"ing it, add hello as the last line of your do-file.
A program is not much different than a do-file.
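Like a do-file, a program can take arguments. Here is a minimal sketch, saved and run as a do-file; the program name hello2 and the macro name who are hypothetical.

```stata
* Sketch: a program that takes one argument.
capture program drop hello2     // drop an earlier copy, if any, before redefining
program hello2
    args who
    display "Hello `who'"
end

hello2 World                    // last line: run the program just loaded
```

The capture program drop line is there because Stata refuses to define a program whose name is already loaded in memory, which would otherwise happen the second time the do-file is run.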
Ado-files
To run a program you have to load it first. If you save your program with the .ado extension and put it in certain directories, you do not have to load it. Stata will treat your program as a Stata command once it finds it in one of those designated directories.
Some of Stata's own commands are written as ado-files. The rest of Stata's commands are built-in commands. You can tell whether a command is built-in or not by typing which command.
. which table
C:\PROGRAM FILES\STATA8-SE\ado\base\t\table.ado
*! version 5.3.0  09oct2001
. which tabulate
built-in command:  tabulate
You can read the contents of table.ado by opening it in a text editor (ado-files are ASCII text files), or by typing the following. The double quotes are needed because the path contains a space.
type "C:\PROGRAM FILES\STATA8-SE\ado\base\t\table.ado"
You can add your own commands by creating your own ado-files. An ado-file defines a Stata command, even though there are some Stata commands which are not ado-files.
Stata looks for ado-files in seven places, which fall into three categories:
I. the official ado-directories, meaning
1. (UPDATES), the official updates directory
2. (BASE), the official base directory
II. your personal ado-directories, meaning
3. (SITE), the directory for ado-files your site might have installed,
4. (PLUS), the directory for ado-files you personally might have installed,
5. (PERSONAL), the directory for ado-files you personally might have written, and
6. (OLDPLACE), the directory where Stata users used to save their personally written ado-files; and
III. the current directory, meaning
7. (.), the ado-files you have written just this instant or for just this project.
. sysdir
STATA: /usr/local/stata8/
UPDATES: /usr/local/stata8/ado/updates/
BASE: /usr/local/stata8/ado/base/
SITE: /usr/local/ado/
PLUS: ~/ado/plus/
PERSONAL: ~/ado/personal/
OLDPLACE: ~/ado/
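To inspect these locations on your own installation, a few commands along the following lines can help. This is only a sketch: the output depends on where Stata is installed, and it assumes that the c-class value c(sysdir_personal) is available in your version of Stata.

```stata
* Sketch: look at the directories Stata searches for ado-files.
sysdir                               // the system directories
adopath                              // the ado-path, in the order it is searched
display "`c(sysdir_personal)'"       // the PERSONAL directory on its own
which table                          // where the table command comes from
```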
Stata ships with a range command. range generates a numerical range, which is useful for evaluating and graphing functions. Here is its syntax:
range varname #first #last [#obs]
Let's create our own version of this range command: rangeours
program rangeours
    // arguments are n a b
    drop _all
    args n a b
    set obs `n'
    gen x = (_n-1)/(_N-1)*(`b'-`a') + `a'
end
Then save this as the file rangeours.ado in the current directory.
Now, type
rangeours 100 1 2
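Note that rangeours divides by _N-1, so it yields a missing value when n is 1. One possible way to guard against that, and against missing arguments, is sketched below. The name rangeours2 and the error messages are hypothetical; this is an illustration, not a replacement for the official range command.

```stata
* Sketch: rangeours with simple argument checks.
capture program drop rangeours2     // hypothetical program name
program rangeours2
    version 8
    args n a b
    if "`b'" == "" {
        display as error "usage: rangeours2 n a b"
        exit 198
    }
    if `n' < 2 {
        display as error "n must be at least 2"
        exit 198
    }
    drop _all
    set obs `n'
    generate x = (_n-1)/(_N-1)*(`b'-`a') + `a'
end

rangeours2 5 0 1
list x          // x takes the values 0, .25, .5, .75, 1
```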
Accessing results calculated by commands/programs
You can access the results of Stata commands after they are executed. In terms of the way their results are accessed, there are four types of commands in Stata:
r-class commands (most commands)
e-class commands (estimation commands)
s-class and n-class commands (which are rarely used)
After running an r-class command, say summarize, type return list to get the list of saved results such as the mean, variance, maximum, and minimum of the variable. After running the
regress command, which is an e-class command, type ereturn list to get the list of saved estimation results, such as the number of observations, degrees of freedom, R-squared, and coefficient estimates. If you type creturn list, you get the list of system values, such as today's date, the current time, the current directory, and the current system settings. Your own programs can save results too. See Section 21.10 of the User's Guide for how to do that.
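The following sketch shows one way to pick up saved results and to return a result from your own program. It uses the auto data set that ships with Stata; the program name intwidth and the macro name mean_mpg are hypothetical.

```stata
* Sketch: using r() and e() results, and returning your own.
sysuse auto, clear

quietly summarize mpg
return list                          // everything summarize left behind in r()
local mean_mpg = r(mean)             // copy a result before another command overwrites it
display "mean of mpg = " `mean_mpg'

quietly regress mpg weight
ereturn list                         // everything regress left behind in e()
display "N = " e(N) ",  R-squared = " e(r2)
display "slope on weight = " _b[weight]

* An r-class program of our own: it returns b - a in r(width).
capture program drop intwidth        // hypothetical program name
program intwidth, rclass
    args a b
    return scalar width = `b' - `a'
end

intwidth 1 5
display "width = " r(width)          // shows width = 4
```

Copying r(mean) into a local macro right away is a useful habit, since the next r-class command replaces the contents of r().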
Some Examples
Example 1:
Let's write the following in the do-file editor.
version 8
// This tells Stata the version under which this do-file is written
set more off
// Now Stata does not pause every time the screen is full
log using /* unless you specify an explicit address, myjob.log is saved in the
             current directory */ myjob, replace text
use http://www.stata-press.com/data/r8/census /// the command continues on the next line
    , clear
log close
Some string processing:
upper(A): changes string A to uppercase
lower(A): changes string A to lowercase
word(A,n): returns the nth word in string A
substr(A,m,n): returns the substring of A that starts at the mth character and is n characters long
index(A,B): returns the position in string A at which string B is first found (0 if B is not found)
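A quick way to get a feel for these functions is to try them on a fixed string, as in this sketch. The macro name d and the sample date are illustrative only.

```stata
* Sketch: the string functions above applied to a sample date string.
local d "6 Apr 2004"
display upper("`d'")             // 6 APR 2004
display lower("`d'")             // 6 apr 2004
display word("`d'", 2)           // Apr
display substr("`d'", 3, 3)      // Apr  (3 characters starting at position 3)
display index("`d'", "Apr")      // 3
```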
Example 2:
local logfilename1 = upper(word(c(current_date),1) + ///
                           word(c(current_date),2) + ///
                           word(c(current_date),3))
log using `logfilename1', text replace
Example 3:
Let's say we want to create a log file and name it after the data set. c(filename) gives us the name of the data set with the whole path to the data set. We are interested only in the name of the data set. The following do-file first trims the extension off the name, and then trims the path all the way up to the name of the data set. Note that we make use of the fact that the upper and lower cases of "." and "/" are the same. The same holds for digits, underscores, and spaces, so this trick works only when the data set name consists entirely of letters.
local logfilename2 = ///
    substr("`c(filename)'", 1, index("`c(filename)'", ".") - 1)
di "logfilename2 = " "`logfilename2'"
local i 0
while ///
    upper(substr(reverse("`logfilename2'"),`i'+1,1)) != ///
    lower(substr(reverse("`logfilename2'"),`i'+1,1)) {
    di "i = " `i'
    local ++i    // equivalently, local i = `i' + 1
}
di "final i = " `i'
local logfilename2 = substr("`logfilename2'",-`i',.)
log using `logfilename2', replace text
Example 4:
forvalues repeatedly sets local macro macroname to each element of range and executes the commands enclosed in braces.
forvalues x = 1/10 {
    if mod(`x',2) {
        display "`x' is odd"
        continue
    }
    display "`x' is even"
}
foreach repeatedly sets local macro macroname to each element of the list and executes the commands enclosed in braces. In Example 5 the list is "newlist" (we are creating new variables) and in Example 6 the list is a "numlist" (we are doing things for each number in the number list).
Example 5:
foreach var of newlist z1-z20 {
    gen `var' = uniform()
}
su
Example 6:
foreach num of numlist 1(1)4 6(2)13 {
    if mod(`num',2) {
        display "`num' is odd"
        continue
    }
    display "`num' is even"
}
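foreach accepts other kinds of lists as well. As one more illustration, not part of the numbered examples, here is a sketch that loops over a list of existing variables using of varlist, with the auto data set that ships with Stata. The choice of variables is arbitrary.

```stata
* Sketch: foreach over existing variables (of varlist).
sysuse auto, clear
foreach v of varlist mpg weight length {
    quietly summarize `v'
    display "`v'" _col(12) "mean = " %9.2f r(mean)
}
```

Inside the loop, `v' holds the name of the current variable, so the same commands are applied to each variable in turn.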
Example 7:
clear
set obs 100
* Generate 10 uniform random variables named x1, x2, ..., x10.
set seed 12345
forvalues i = 1(1)10 {    // equivalently 1/10, or 1 2 to 10
    generate x`i' = uniform()
    qui cou if x`i' < .1
    display " % of x`i' < 1/10 = " round(100*r(N)/_N,.01)
    gen x`i'ltdec = x`i' < .1
}
sum x*dec
Do the same for obs 1,000, and for 1,000,000.
Example 8:
We can do exactly the same thing by using a while loop:
* Start from an empty data set with 100 observations, as in Example 7.
clear
set obs 100
set seed 12345
local i = 1
while `i' < 11 {
    generate x`i' = uniform()
    qui cou if x`i' < .1
    display " % of x`i' < 1/10 = " round(100*r(N)/_N,.01)
    gen x`i'ltdec = x`i' < .1
    local i = `i' + 1
}
sum x*dec
SAS data sets as inputs and outputs to Stata on SUNNY
savas
When you are on sunny, you can convert a SAS data set (say, filename.sas7bdat) into a Stata data set by simply typing:
savas filename.sas7bdat
Then, sunny will create filename.dta for you.
You can also convert a Stata data set (say, filename.dta) into a SAS data set, by simply typing
savas filename.dta
Then, sunny will create filename.sas7bdat for you.
usesas & savasas
These are two user-written Stata programs installed on sunny. In other words, they are two user-written commands that can be run within Stata running on sunny.
usesas allows you to read a SAS data set directly into Stata:
usesas using filename.sas7bdat
savasas allows you to save the current data set in Stata's memory as a SAS data set:
savasas using filename.sas7bdat
savas, usesas, and savasas are all written by Dan Blanchette of UNC's Carolina Population Center. You can install these programs on your personal copies of Stata as well. For more information please see the following Stata Resources section.
sas2stata
When you are on sunny, there is another simple way you can convert a SAS data set (say, filename.sas7bdat) into a Stata data set:
sas2stata filename.sas7bdat
Then, sunny will create filename.dta for you.
sas2stata is a Unix utility, written by the RAND Corporation, which converts SAS data sets into Stata format. For more information on sas2stata on sunny, visit:
http://www.unc.edu/atn/hpc/applications/index.shtml?id=4208
Stata Resources
There are several excellent Stata resources available on the Internet. They include program databases, discussion forums, and task-specific user web sites. Here I will list only a few of them:
• Stata Corporation (www.stata.com)
This site offers a wealth of information about Stata, including its new features, a large database of frequently asked questions, an extensive selection of procedures, and links to other Stata-related sites. Some of the useful pages within this site are:
FAQ page: http://www.stata.com/support/faqs/
A large and very useful database of frequently asked questions about statistics, data management, graphics, programming, etc.
A list of resources for learning Stata, such as tutorials and FAQs: http://www.stata.com/links/resources1.html
• UCLA Academic Technology Services: Resources to help you learn and use Stata: http://www.ats.ucla.edu/stat/stata
Hosted by UCLA, this site provides an extensive collection of Stata information, including FAQs, learning modules, a quick reference guide, annotated output, textbook examples, and more.
• Statalist: http://www.stata.com/support/statalist/faq/
Statalist is an active group of users who exchange information via email about using Stata. This is where you can ask questions about Stata and statistics, and get some help and guidance.
• Carolina Population Center's (CPC) Stata learning resources:
Stata Tutorial: http://www.cpc.unc.edu/services/computer/presentations/statatutorial/
A SAS User's Guide to Stata: http://www.cpc.unc.edu/services/computer/presentations/sas_to_stata/
On the above page, you will also find links to CPC's programs that convert between SAS and Stata data sets, such as savastata, savasas, usesas, and savas.
• The Odum Institute at UNC-Chapel Hill offers free Stata classes at Manning Hall on the UNC-CH campus. Visit http://www2.irss.unc.edu/irss/shortcourses/shortcourse.asp