# Fixed effects vs. random effects for web browsing data

A simulation

When you work with trace data — data that emerge when people interact with technology — you will notice that such data often have properties that open up questions about statistical modelling. I currently work with browsing records, obtained at several times from the same users (i.e., a panel data set). A first typical characteristic of such data: Browsing behaviors are skewed. If you’re interested in people reading politically extreme web sites, you will find a few people doing it a lot, and most people doing little to none.

A second characteristic relates to the panel nature of the data. If you’re looking at people’s visits to online shops, they do not change their habits much over time — at least these within-person differences are not as great as the differences across people. Panel data lend itself to hierarchical modelling, for example with (1) fixed effects (FE) or (2) random-effects multilevel modelling (RE). For commonalities and differences between these two modelling approaches, see for example Gelman and Hill (2006).[1] The amount of within-person variation relative to between-person variation has important implications for these two approaches.

Below, I simulate the performance of FE vs. RE models, with these data characteristics in mind. I am mainly interested in the statistical power of each model, although my code can be easily adapted to examine, for example, bias and the RSME. The code builds on two sources: Conceptually, on the paper “Should I Use Fixed or Random Effects?” by Clark and Linzer (2015)[2]. In terms of code architecture, on the R package `simhelpers`

, and its documentation.[3]

I consider a model with independent variable xᵢ and dependent variable yᵢ, for which we have i = 1, … m observations grouped in j = 1, … n units. Importantly, we model theunit effect (or unit intercept) aⱼ, which intuitively can be understood as explaining everything about y that cannot be explained by xᵢ at the unit level. The model formally: yᵢ = aⱼ+ bxᵢ + eᵢ. To understand how the unit effects are estimated differently by FE and RE, I again recommend Gelman and Hill (2007). One key difference between RE and FE is that for the former to be unbiased, it requires a zero correlation between unit effects aⱼ and xᵢ.

Let’s start with the required packages.

`library(tidyverse)`

library(broomExtra)

library(mnonr)

library(MASS)

library(plm)

library(lme4)

library(lmerTest)

library(simhelpers)

In the simulation, I assume a small true effect b that is kept constant across simulations. I vary the following four parameters: (1) the number of units (500, 1000 or 2000); (2) the number of observations per unit (2, 3 or 5); (3) the true correlation between unit effects aⱼ and xᵢ (0 or 0.8); (4) the within-unit standard deviation in xᵢ (between 0.1 and 1, while the standard deviation of xᵢ *between* units is fixed at 1).

`design_factors <- list(`

n = c(500, 1000, 2000),

n_perunit = c(2, 3, 5),

truecor = c(0, 0.8),

b_1 = c(0.2),

x_sd = seq(0.1, 1, 0.1))

I start by defining a data-generating function. It first creates a unit-level average of xᵢ, as well as the unit effects aⱼ. To account for the skewness of browsing data, I draw the unit-level averages of xᵢ from a skewed normal distribution (note that the values of 3 for skew and 33 for kurtosis are motivated by actual browsing data). I then create values of yᵢ according to the model outlined above, adding the normally distributed error eᵢ.

`generate_dat <- function(n, n_perunit, truecor, b_1, x_sd) {`

# Generate unit effects and unit-level means

v <- unonr(n, c(0, 0),

matrix(c(1, truecor, truecor, 1), 2, 2),

skewness = c(0, 3), kurtosis = c(0, 33))

b_0 <- v[, 1]

xmean <- v[, 2]

# Generate observations per unit

dat <- NULL

for (unit in 1:n) {

x <- rnorm(n_perunit, xmean[unit], x_sd)

y <- b_0[unit] + b_1*x + rnorm(n_perunit, 0, 1)

dat <- rbind(dat, cbind(x, y, unit = unit))

}

return(dat)

}

Next, I define a function that estimates the models , and one that pulls the estimate, standard error and p-value for b. Note that I also estimate a standard “pooled” model for reference.

pull_stats <- function(model, method) {

estimate <- broomExtra::tidy(model) %>%

filter(term == “x”) %>% pull(estimate)

p.value <- broomExtra::tidy(model) %>%

filter(term == “x”) %>% pull(p.value)

std.error <- broomExtra::tidy(model) %>%

filter(term == “x”) %>% pull(std.error)

return(tibble(method = method,

est = estimate,

se = std.error,

p = p.value))

}estimate <- function(dat){# Pooled model

reg_pool <- lm(y ~ x, data = dat)

# Random-effects model

reg_fe <- plm(y ~ x, data = dat, index = “unit”, model = “within”)

# Fixed-effects model

reg_re <- lmerTest::lmer(y ~ x + (1|unit), data = dat)

results <- bind_rows(

pull_stats(reg_pool, "pooled"),

pull_stats(reg_fe, “FE”),

pull_stats(reg_re, “RE”))

return(results)

}

Next, the function driving the simulation:

`run_sim <- function(iterations, n, n_perunit, truecor, b_1, x_sd, seed = NULL) {`

if (!is.null(seed)) set.seed(seed)

results <- rerun(iterations, {

dat <- generate_dat(n, n_perunit, truecor, b_1, x_sd)

estimate(dat)}) %>% bind_rows()

}

At last, run the simulation:

set.seed(20220516)params <-

cross_df(design_factors) %>%

mutate(

iterations = 1000,

seed = round(runif(1) * 2³⁰) + 1:n()

)system.time(

results <-

params %>%

mutate(

res = pmap(., .f = run_sim)

) %>%

unnest(cols = c(res))

)

Now you can inspect and plot the results.

power <- results %>%

group_by(method, n, n_perunit, truecor, x_sd, b_1) %>%

do(calc_rejection(., p_values = p))power %>%

as.data.frame() %>%

filter(truecor == 0) %>%

rename(“Number of units” = n,

“Observations per unit” = n_perunit) %>%

ggplot(aes(x = x_sd, y = rej_rate,

group = method, color = method)) +

geom_smooth(se = FALSE) +

scale_y_continuous(name = “Rejection rate”) +

scale_x_continuous(name = “Within-unit SD”) +

scale_color_discrete(name = “Method”) +

theme_light() +

theme(legend.position = “bottom”) +

facet_wrap(~ `Number of units` + `Observations per unit`,

labeller = label_both)

The plot below shows that, as intuition would tell us, when the within-unit variation is small, FE models have a hard time detecting a true effect, especially when the number of units, and the number of observations per unit, are small.

Sources:

[1] Gelman, A., & Hill, J. (2006). *Data analysis using regression and multilevel/hierarchical models*. Cambridge university press.

[2] Clark, T. S., & Linzer, D. A. (2015). Should I use fixed or random effects?. *Political science research and methods*, *3*(2), 399–408.

[3] https://cran.r-project.org/web/packages/simhelpers/vignettes/MCSE.html and https://github.com/meghapsimatrix/simhelpers/blob/master/data-raw/welch_res_dat.R