Most Stata commands can be abbreviated. For example, instead of typing generate, Stata will also accept gen. The help file for each command shows how far it can be abbreviated
by underlining the shortest accepted abbreviation in the syntax section.
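As a small illustration of abbreviations, the following sketch runs each command once under its full name and once abbreviated. It assumes the auto example dataset that ships with Stata, and it is meant to be run from a do-file, because // comments are not recognized when typed interactively. The new variable names mpg_sq and mpg_cu are made up for this example.

```stata
sysuse auto, clear        // example dataset shipped with Stata (an assumption of this sketch)
summarize price mpg       // full command name
su price mpg              // su is the shortest abbreviation shown for summarize
generate mpg_sq = mpg^2   // full command name
gen mpg_cu = mpg^3        // gen is an accepted abbreviation of generate
```

Abbreviations save typing at the command line, but full command names are usually easier to read in do-files that other people will use.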
Most Stata commands follow this basic structure:
Syntax:
[by varlist1:] command [varlist2] [=exp] [if] [in] [using filename] [, options]
where square brackets show optional qualifiers.
Example:
bysort gender: tabulate age if weight < 50, nolabel
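To see how the pieces of the general syntax fit together on real data, here is a hedged sketch using the auto example dataset that ships with Stata (an assumption; the variables gender, age and weight above belong to a hypothetical dataset). It covers the by prefix, varlist, =exp, if, in and options. The using qualifier is not shown. The name price_k is made up for this example, and the // comments require running the lines from a do-file.

```stata
sysuse auto, clear
// by prefix + command + varlist + if qualifier + option
bysort foreign: summarize price if weight < 3000, detail
// in qualifier: restrict the command to observations 1 through 5
list make price in 1/5
// =exp with an if qualifier: price_k is missing wherever foreign is not 1
generate price_k = price / 1000 if foreign == 1
```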
A variable list (varlist) is a list of variable names with blanks in between. There are a number of shorthand conventions to reduce the amount of typing. For instance:
myvar            Just one variable
myvar var1 var2  Three variables
myvar*           All variables starting with myvar
*var             All variables ending with var
my*var           All variables starting with my and ending with var
my~var           A single variable starting with my and ending with var
my?var           All variables starting with my and ending with var, with exactly one other character between
myvar1-myvar6    myvar1, myvar2, ..., myvar6 (probably)
this-that        All variables from this through that, in the order shown in the Variables window
The * character matches zero or more characters, and all variables matching the pattern are returned. The ~ character also matches zero or more characters but, unlike *, only one variable is allowed to match; if more than one variable matches, an error message is returned. The ? character matches exactly one character, and all variables matching the pattern are returned. The - character returns all variables in dataset order, starting with the variable to the left of the - and ending with the variable to the right of the -. This is why myvar1-myvar6 only "probably" means myvar1 through myvar6: the result depends on how the variables are ordered in the dataset.
Any command that takes a varlist understands the keyword _all to mean all variables. Some commands use all variables by default if none are specified (e.g., summarize shows summary statistics for all variables and is equivalent to summarize _all).
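The varlist shorthands can be tried out on the auto example dataset that ships with Stata (an assumption of this sketch; its variables are, in order, make price mpg rep78 headroom trunk weight length turn displacement gear_ratio foreign). The comments name the variables each pattern should select. Run the lines from a do-file, since // comments are not recognized interactively.

```stata
sysuse auto, clear
describe m*               // make and mpg
describe *t               // weight and displacement
describe t???k            // trunk: one character per ?
describe gear~o           // gear_ratio, the only variable that matches
describe price-headroom   // price, mpg, rep78 and headroom, in dataset order
summarize _all            // every variable, same as plain summarize
```

A pattern such as t~n would match no variable in this dataset (turn is the only one starting with t and ending with n only if no other such variable exists, so it is worth checking with describe before relying on ~ in a do-file).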