You can set formula macros globally with setFixest_fml. These macros can then be used in fixest estimations or when using the function xpd.
setFixest_fml(..., reset = FALSE)
getFixest_fml()
Arguments
...: Definition of the macro variables. Each argument name corresponds to the name of the macro variable. Each macro variable name must start with two dots (e.g. ..ctrl). The value of each argument must be a one-sided formula or a character vector; this value is the definition of the macro variable. Example of a valid call: setFixest_fml(..ctrl = ~ var1 + var2). In the function xpd, the default macro variables are taken from getFixest_fml; any variable in ... will replace these values. You can enclose values in .[], in which case they are evaluated from the current environment. For example, ..ctrl = ~ x.[1:2] + .[z] will lead to ~x1 + x2 + var if z is equal to "var".
reset: A logical scalar, defaults to FALSE. If TRUE, all macro variables are first reset (i.e. deleted).
Returns
The function getFixest_fml() returns a list of character strings: the names are the macro variable names and the character strings are their definitions.
Details
In xpd, the default macro variables are taken from getFixest_fml. Any value in the ... argument of xpd will replace these default values.
The macro variables are replaced verbatim by their definitions. You can therefore include multipart formulas if you wish, but then beware of the position of the macro variable in the formula. For example, using the airquality data, say you want to set as controls the variable Temp and Day fixed-effects. You can do setFixest_fml(..ctrl = ~Temp | Day), but then feols(Ozone ~ Wind + ..ctrl, airquality) will be quite different from feols(Ozone ~ ..ctrl + Wind, airquality), so beware!
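To make the effect of the macro's position concrete, here is a minimal sketch that uses xpd to show what each of the two formulas becomes once ..ctrl is substituted. It assumes fixest is installed and attached. Note that reset = TRUE is used here only to start from a clean state, so it deletes any macros you defined earlier.

```r
library(fixest)

# Multipart macro: one control (Temp) and a fixed-effects part (Day).
# reset = TRUE deletes previously defined macros (used here only for a clean state).
setFixest_fml(..ctrl = ~ Temp | Day, reset = TRUE)

# The definition is pasted verbatim, so whatever follows '|' in the
# expanded formula belongs to the fixed-effects part.
f_last  = xpd(Ozone ~ Wind + ..ctrl)
f_first = xpd(Ozone ~ ..ctrl + Wind)

print(f_last)   # Wind and Temp are regressors, Day is after the '|'
print(f_first)  # Temp is the only regressor, Day and Wind are after the '|'
```

Placing a multipart macro last in the formula keeps the other regressors out of the fixed-effects part.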
Examples
# Small examples with airquality data
data(airquality)
# we set two macro variables
setFixest_fml(..ctrl = ~ Temp + Day,
              ..ctrl_long = ~ poly(Temp, 2) + poly(Day, 2))

# Using the macro in lm with xpd:
lm(xpd(Ozone ~ Wind + ..ctrl), airquality)
lm(xpd(Ozone ~ Wind + ..ctrl_long), airquality)

# You can use the macros without xpd() in fixest estimations
a = feols(Ozone ~ Wind + ..ctrl, airquality)
b = feols(Ozone ~ Wind + ..ctrl_long, airquality)
etable(a, b, keep = "Int|Win")

# Using .[]
base = setNames(iris, c("y", "x1", "x2", "x3", "species"))
i = 2:3
z = "species"
lm(xpd(y ~ x.[2:3] + .[z]), base)

# No xpd() needed in feols
feols(y ~ x.[2:3] + .[z], base)

#
# Auto completion with '..' suffix
#

# You can trigger variables autocompletion with the '..' suffix
# You need to provide the argument data
base = setNames(iris, c("y", "x1", "x2", "x3", "species"))
xpd(y ~ x.., data = base)

# In fixest estimations, this is automatically taken care of
feols(y ~ x.., data = base)

#
# You can use xpd for stepwise estimations
#

# Note that for stepwise estimations in fixest, you can use
# the stepwise functions: sw, sw0, csw, csw0
# -> see help in feols or in the dedicated vignette

# we want to look at the effect of x1 on y
# controlling for different variables
base = iris
names(base) = c("y", "x1", "x2", "x3", "species")

# We first create a matrix with all possible combinations of variables
my_args = lapply(names(base)[-(1:2)], function(x) c("", x))
(all_combs = as.matrix(do.call("expand.grid", my_args)))

res_all = list()
for(i in 1:nrow(all_combs)){
  res_all[[i]] = feols(xpd(y ~ x1 + ..v, ..v = all_combs[i, ]), base)
}

etable(res_all)
coefplot(res_all, group = list(Species = "^^species"))

#
# You can use macros to grep variables in your data set
#

# Example 1: setting a macro variable globally
data(longley)
setFixest_fml(..many_vars = grep("GNP|ployed", names(longley), value = TRUE))
feols(Armed.Forces ~ Population + ..many_vars, longley)

# Example 2: using ..("regex") or regex("regex") to grep the variables "live"
feols(Armed.Forces ~ Population + ..("GNP|ployed"), longley)

# Example 3: same as Ex.2 but without using a fixest estimation
# Here we need to use xpd():
lm(xpd(Armed.Forces ~ Population + regex("GNP|ployed"), data = longley), longley)

# Stepwise estimation with regex: use a comma after the parenthesis
feols(Armed.Forces ~ Population + sw(regex(,"GNP|ployed")), longley)

# Multiple LHS
etable(feols(..("GNP|ployed") ~ Population, longley))

#
# lhs and rhs arguments
#

# to create a one sided formula from a character vector
vars = letters[1:5]
xpd(rhs = vars)

# Alternatively, to replace the RHS
xpd(y ~ 1, rhs = vars)

# To create a two sided formula
xpd(lhs = "y", rhs = vars)

#
# argument 'add'
#

xpd(~x1, add = ~ x2 + x3)

# also works with character vectors
xpd(~x1, add = c("x2", "x3"))

# only adds to the RHS
xpd(y ~ x, add = ~bon + jour)

#
# Dot square bracket operator
#

# The basic use is to add variables in the formula
x = c("x1", "x2")
xpd(y ~ .[x])

# Alternatively, one-sided formulas can be used and their content will be inserted verbatim
x = ~x1 + x2
xpd(y ~ .[x])

# You can create multiple variables at once
xpd(y ~ x.[1:5] + z.[2:3])

# You can summon variables from the environment to complete variables names
var = "a"
xpd(y ~ x.[var])

# ... the variables can be multiple
vars = LETTERS[1:3]
xpd(y ~ x.[vars])

# You can have "complex" variable names but they must be nested in character form
xpd(y ~ .["x.[vars]_sq"])

# DSB can be used within regular expressions
re = c("GNP", "Pop")
xpd(Unemployed ~ regex(".[re]"), data = longley)
# => equivalent to regex("GNP|Pop")

# Use .[,var] (NOTE THE COMMA!) to expand with commas
# !! can break the formula if misused
vars = c("wage", "unemp")
xpd(c(y.[,1:3]) ~ csw(.[,vars]))

# Example of use of .[] within a loop
res_all = list()
for(p in 1:3){
  res_all[[p]] = feols(Ozone ~ Wind + poly(Temp, .[p]), airquality)
}

etable(res_all)

# The former can be compactly estimated with:
res_compact = feols(Ozone ~ Wind + sw(.[,"poly(Temp, .[1:3])"]), airquality)

etable(res_compact)

# How does it work?
# 1) .[, stuff] evaluates stuff and, if a vector, aggregates it with commas
#    Comma aggregation is done thanks to the comma placed after the square bracket
#    If .[stuff], then aggregation is with sums.
# 2) stuff is evaluated, and if it is a character string, it is evaluated with
#    the function dsb which expands values in .[]
#
# Wrapping up:
# 2) evaluation of dsb("poly(Temp, .[1:3])") leads to the vector:
#    c("poly(Temp, 1)", "poly(Temp, 2)", "poly(Temp, 3)")
# 1) .[, c("poly(Temp, 1)", "poly(Temp, 2)", "poly(Temp, 3)")] leads to
#    poly(Temp, 1), poly(Temp, 2), poly(Temp, 3)
#
# Hence sw(.[, "poly(Temp, .[1:3])"]) becomes:
#    sw(poly(Temp, 1), poly(Temp, 2), poly(Temp, 3))

#
# In non-fixest functions: guessing the data allows to use regex
#

# When used in non-fixest functions, the algorithm tries to "guess" the data
# so that ..("regex") can be directly evaluated without passing the argument 'data'
data(longley)
lm(xpd(Armed.Forces ~ Population + ..("GNP|ployed")), longley)

# same for the auto completion with '..'
lm(xpd(Armed.Forces ~ Population + GN..), longley)