How much do teachers value pension flexibility?

Analysis of a discrete choice experiment

Analysis

Author

James Zuccollo

Published

February 11, 2025

Abstract

This study explores the preferences of teachers in England regarding compensation packages through a discrete choice experiment. Conducted on December 4, 2024, via Teacher Tapp, the survey presented five choice sets to each respondent, each containing two compensation options with varying attributes such as current salary adjustments, retirement income variations, and the certainty of retirement income. The objective was to identify the compensation packages that best align with teachers’ preferences and their willingness to pay for specific pension features.

The analysis included responses from a large sample of teachers, considering factors such as age, current pension scheme, salary, future career expectations, and household financial stability. It estimates the average marginal component effects of each attribute level on the probability of selecting a compensation package for both the full sample and across different demographic groups.

Results indicate a strong preference for guaranteed retirement incomes, with significant variations across demographic groups. Teachers showed a clear preference for higher current salaries and more secure retirement incomes. Younger teachers, in particular, preferred higher current salaries over future pension benefits compared to older teachers.

The study also simulates potential policy changes, suggesting that offering teachers a choice between higher current salaries and lower pensions could lead to many teachers opting for the higher salary option. These findings have implications for policymakers designing compensation packages to attract and retain teachers.

Introduction

This document presents analysis of a survey of teachers in England. The survey asked teachers about their preferences for pension flexibility. This analysis document is intended to be read by other researchers and a separate report will be produced for public release.

Method

Survey

The survey implemented a discrete choice experiment where each teacher was presented with 5 choice sets. Each choice set presented two compensation options and teachers were asked to choose the option they preferred. The options varied on three attributes, which had the following possible levels:

The current salary
- 10 per cent less than the teacher’s current salary.
- 5 per cent less than the teacher’s current salary.
- The teacher’s current salary.
- 5 per cent more than the teacher’s current salary.
- 10 per cent more than the teacher’s current salary.
The retirement income
- 20 per cent less than the teacher’s current pension provides.
- 10 per cent less than the teacher’s current pension provides.
- The same as the teacher’s current pension.
- 10 per cent more than the teacher’s current pension provides.
- 20 per cent more than the teacher’s current pension provides.
The certainty of the retirement income
- Pension guarantees final retirement income.
- Pension income depends on stock market performance.

Attributes were randomly assigned to each option in each choice set.
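
The random assignment can be sketched in R as follows. This is a minimal illustration, not the survey's actual implementation: the helper name `draw_choice_set` is hypothetical, and we assume each attribute is drawn independently and uniformly, which the description above does not confirm.

```r
# Attribute levels as described in the survey design above
salary_levels  <- c("10% lower", "5% lower", "Same", "5% higher", "10% higher")
pension_levels <- c("20% lower", "10% lower", "Same", "10% higher", "20% higher")
type_levels    <- c("Defined benefit", "Defined contribution")

# Hypothetical helper: draw one choice set of two options, sampling each
# attribute independently and uniformly (an assumption on our part)
draw_choice_set <- function() {
  data.frame(
    option = 1:2,
    choice_salary = sample(salary_levels, 2, replace = TRUE),
    choice_pension = sample(pension_levels, 2, replace = TRUE),
    choice_pensiontype = sample(type_levels, 2, replace = TRUE)
  )
}

set.seed(1)
# Five choice sets for one respondent: 2 options x 5 sets = 10 rows
one_respondent <- do.call(
  rbind,
  lapply(1:5, function(s) cbind(choice_set = s, draw_choice_set()))
)
nrow(one_respondent)
```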

The survey also collected information on respondents’ demographics, and on four other relevant attributes:

The teacher’s current pension scheme.
The teacher’s current salary.
Whether the teacher expected to be a teacher in three years’ time (ie career intentions).
Whether the teacher’s household earns enough to live on and save (ie financial security).

The survey was conducted on 4 December 2024 through Teacher Tapp and received responses from 5,929 teachers.

Discrete choice analysis

Discrete choice experiments are built on a random utility function, which assumes that the utility a teacher derives from a particular compensation package can be decomposed into a systematic component and a random component. Let $U_{ij}$ represent the utility that teacher $i$ derives from choosing compensation package $j$. This utility can be expressed as:

\[ U_{ij} = V_{ij} + \epsilon_{ij} \]

where:

$V_{ij}$ is the systematic component of the utility, which is a function of the observed attributes of the compensation package.
$\epsilon_{ij}$ is the random component of the utility, capturing unobserved factors and assumed to be independently and identically distributed (i.i.d.) with a Type I Extreme Value distribution.

The systematic component of the utility, $V_{ij}$, is modeled as a linear function of the attributes of the compensation package:

\[ V_{ij} = \beta_1 \text{Salary}_{ij} + \beta_2 \text{RetirementIncome}_{ij} + \beta_3 \text{PensionType}_{ij} \]

where:

$\text{Salary}_{ij}$ represents the salary level of option $j$ for teacher $i$.
$\text{RetirementIncome}_{ij}$ represents the retirement income level of option $j$ for teacher $i$.
$\text{PensionType}_{ij}$ represents the certainty of the retirement income of option $j$ for teacher $i$.
$\beta_1$, $\beta_2$, $\beta_3$ are the coefficients to be estimated, representing the importance of each attribute.

The probability that teacher $i$ chooses compensation package $j$ over another package $k$ is given by the logistic choice probability:

\[ P_{ij} = \frac{\exp(V_{ij})}{\exp(V_{ij}) + \exp(V_{ik})} \]

where $V_{ij}$ and $V_{ik}$ are the systematic utilities of the two options in the choice set.
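
With only two options, this choice probability is the same as a logistic function of the utility difference $V_{ij} - V_{ik}$. A short numerical check in R, using made-up utilities (the values 0.4 and -0.3 are illustrative, not estimates):

```r
# Illustrative systematic utilities for two options (hypothetical values)
v_j <- 0.4
v_k <- -0.3

# Choice probability as written in the formula above
p_formula <- exp(v_j) / (exp(v_j) + exp(v_k))

# Equivalent form: inverse logit of the utility difference
p_logistic <- plogis(v_j - v_k)

all.equal(p_formula, p_logistic)
```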

The parameters $\beta_1$, $\beta_2$, $\beta_3$ are estimated using a logistic regression model, where the dependent variable is the binary choice indicator (1 if the option is chosen, 0 otherwise), and the independent variables are the attributes of the compensation packages. This setup allows us to quantify the impact of each attribute on the probability of choosing a compensation package, providing insights into teachers’ preferences for different aspects of their compensation.

The model we estimate is:

\[ \text{logit}(P_{ij}) = \beta_1 \text{Salary}_{ij} + \beta_2 \text{RetirementIncome}_{ij} + \beta_3 \text{PensionType}_{ij} \]

Estimation is performed using a survey-weighted logistic regression model in the R package survey. The survey weights are used to account for the complex sampling design of the survey and ensure that the results are representative of the population of teachers in England. Standard errors are adjusted for clustering at the respondent level.
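
The code later in this document relies on helper functions (`create_survey_design()` and `fit_glm()`) whose definitions are not shown. As a rough sketch of what such a setup might look like, the example below uses `survey::svydesign()` and `survey::svyglm()` on simulated data. The toy data, the single attribute, and the assumed effect size are ours and are for illustration only.

```r
library(survey)

set.seed(42)
n_resp <- 200
type_levels <- c("Defined benefit", "Defined contribution")

# Simulated long-format data: 10 option-rows per respondent (toy data only)
toy <- data.frame(
  respondent = rep(seq_len(n_resp), each = 10),
  sample_weight = rep(runif(n_resp, 0.5, 2), each = 10),
  choice_pensiontype = factor(
    sample(type_levels, n_resp * 10, replace = TRUE),
    levels = type_levels
  )
)

# Simulate choices with an assumed log-odds penalty of -1 for defined contribution
toy$selected <- rbinom(
  nrow(toy), 1,
  plogis(0.5 - 1 * (toy$choice_pensiontype == "Defined contribution"))
)

# Respondent is the cluster id, so standard errors allow for
# correlation between the rows answered by the same person
design <- svydesign(ids = ~respondent, weights = ~sample_weight, data = toy)

# quasibinomial avoids warnings about non-integer counts under weighting
fit <- svyglm(selected ~ choice_pensiontype, design = design,
              family = quasibinomial())
coef(fit)
```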

We do not model question set effects because each question set is equally randomized and no different from the others.

The results of the logistic regression model are presented as both odds ratios and average marginal component effects (AMCEs), which represent the average change in the probability of choosing an option associated with a one-unit change in the attribute level. AMCEs provide a straightforward way to interpret the impact of each attribute on teachers’ choices.
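
To make the AMCE concrete, the sketch below computes one by hand. It predicts the probability of selection for every row twice, once with the attribute set to each level, and averages the difference. It uses a plain `glm()` on simulated data with a hypothetical binary salary attribute. The document's own estimates come from `marginaleffects::avg_comparisons()` applied to the survey-weighted model.

```r
set.seed(7)
n <- 5000
type_levels <- c("Defined benefit", "Defined contribution")

toy <- data.frame(
  choice_pensiontype = factor(sample(type_levels, n, replace = TRUE),
                              levels = type_levels),
  choice_salary_up = rbinom(n, 1, 0.5)  # hypothetical binary salary attribute
)
toy$selected <- rbinom(
  n, 1,
  plogis(0.3 + 0.4 * toy$choice_salary_up -
           1 * (toy$choice_pensiontype == "Defined contribution"))
)

fit <- glm(selected ~ choice_salary_up + choice_pensiontype,
           data = toy, family = binomial())

# Copy the data twice, fixing the pension type at each level in turn
as_db <- toy
as_db$choice_pensiontype <- factor(rep("Defined benefit", n), levels = type_levels)
as_dc <- toy
as_dc$choice_pensiontype <- factor(rep("Defined contribution", n), levels = type_levels)

# AMCE: average difference in predicted probability between the two levels
amce_dc <- mean(predict(fit, as_dc, type = "response") -
                  predict(fit, as_db, type = "response"))
amce_dc
```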

We also calculate the willingness to pay (WTP) for each attribute level, which represents the percentage change in salary that would make a teacher indifferent between two compensation packages. It is calculated by dividing the AMCE for that attribute level by the AMCE for salary.
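
As a worked example of this calculation, the sketch below uses two AMCEs quoted in the results: 22 percentage points for losing the guaranteed pension and 21 percentage points for a 10 per cent salary cut. Using the salary-cut AMCE as the denominator, and treating the salary effect as linear over that range, are our assumptions for illustration.

```r
# AMCEs quoted in the results below, in percentage points
amce_dc_pension   <- -22  # income depends on stock market vs guaranteed
amce_salary_cut10 <- -21  # 10 per cent lower salary vs current salary

# Assumption: the salary effect is roughly linear between 0 and -10 per cent,
# giving an approximate effect per 1 per cent of salary
amce_per_pct_salary <- amce_salary_cut10 / 10

# Salary cut (per cent) with the same effect on choice as losing the guarantee
wtp_guarantee <- amce_dc_pension / amce_per_pct_salary
wtp_guarantee  # about 10.5, i.e. roughly a 10 per cent salary cut
```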

Data

Data cleaning

The data were provided by Teacher Tapp with some initial cleaning and exclusions already performed. They reported to EPI that there were 7,437 teachers who responded to at least one of the questions in the survey. Of those, 6,658 answered all five rounds, and 5,929 gave a valid phase (either primary or secondary), seniority (classroom teacher, middle leader, SLT excl head, or headteacher), and country (ie they teach in England).

That brings the total number of survey respondents in the dataset provided to EPI to 5,929, which is the dataset described below. All respondents have 10 records associated with them, one for each of two options in each of five choice sets they were presented with.

For the analysis, we have excluded respondents who answered “Not relevant / cannot answer” to any of the demographic questions we rely upon, or who did not answer them at all. Those key demographic questions relate to:

School type
Age
Salary
Financial security
Career intentions

Doing that drops 1,780 responses, which accounts for 178 of our 5,929 respondents, and leaves us with 57,510 responses for our discrete choice analysis from 5,751 respondents.
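
The exclusion rule can be sketched as below. This is an illustration on toy data rather than the cleaning code actually used; the toy covers only two of the key variables, and the column names are taken from Figure 1.

```r
library(dplyr)

# Toy long-format data: two rows per respondent for brevity
toy <- data.frame(
  respondent = rep(1:3, each = 2),
  demog_age = rep(c("Age in 30s", NA, "Age in 40s"), each = 2),
  demog_salary = rep(c("£35,000 to £44,999", "£45,000 to £54,999",
                       "Not relevant / cannot answer"), each = 2)
)
key_vars <- c("demog_age", "demog_salary")

# Drop every row where any key variable is missing or "Not relevant"
cleaned <- toy |>
  dplyr::filter(!if_any(
    all_of(key_vars),
    \(x) is.na(x) | x == "Not relevant / cannot answer"
  ))

unique(cleaned$respondent)  # only respondent 1 remains

# Consistency check on the counts reported above
(5929 - 178) * 10 == 57510
```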

Descriptive statistics

Figure 1 displays unweighted descriptive statistics for the key demographic variables in the sample at respondent level.

Show the code

# create a respondent-level dataset
df_demog <- rawdata |>
  distinct(respondent, .keep_all = TRUE)

df_demog |>
  select(starts_with("demog")) |>
  gtsummary::tbl_summary()

Figure 1: Descriptive statistics

Characteristic	N = 5,929¹
demog_age
Age in 20s	602 (10%)
Age in 30s	1,876 (32%)
Age in 40s	2,051 (35%)
Age in 50s+	1,391 (23%)
Unknown	9
demog_gender
Female	4,406 (75%)
Male	1,489 (25%)
Unknown	34
demog_experience
Less than 5 years	730 (12%)
Between 5 and 10 years	1,194 (20%)
Between 10 and 20 years	2,204 (37%)
Over 20 years	1,770 (30%)
Unknown	31
demog_funding
State-funded school	5,480 (93%)
Private School	434 (7.3%)
Unknown	15
demog_phase
Primary	2,054 (35%)
Secondary	3,875 (65%)
demog_seniority
Headteacher	352 (5.9%)
SLT (excl head)	1,194 (20%)
Middle Leader	2,419 (41%)
Classroom Teacher	1,964 (33%)
demog_region
North West	653 (11%)
Yorkshire and North East	723 (12%)
East of England	781 (13%)
Midlands	1,051 (18%)
South West	659 (11%)
London	735 (12%)
South East	1,327 (22%)
demog_subject
Science	839 (18%)
KS2	1,150 (25%)
Maths	668 (14%)
Humanities	734 (16%)
English	678 (15%)
EYFS/KS1	541 (12%)
Unknown	1,319
demog_children
No children at home	2,681 (46%)
Under 5	808 (14%)
5-11 years	969 (16%)
Over 11 years	1,424 (24%)
Unknown	47
demog_financial
Comfortable	2,061 (35%)
Reasonable	2,954 (50%)
Scraping by	767 (13%)
Falling short	75 (1.3%)
Prefer not to say	51 (0.9%)
Not relevant / cannot answer	20 (0.3%)
Unknown	1
demog_pension
Teachers' Pension Scheme (TPS)	5,519 (93%)
Another employer pension scheme	226 (3.8%)
I don't know	88 (1.5%)
Not currently enrolled in an employer pension scheme	80 (1.4%)
Not relevant / cannot answer	3 (<0.1%)
Unknown	13
demog_stay_in_teaching
Yes, most likely	3,506 (59%)
Perhaps	1,519 (26%)
No, probably not	742 (13%)
Don't know	133 (2.2%)
Not relevant / cannot answer	16 (0.3%)
Unknown	13
demog_salary
less than £24,000	96 (1.6%)
£24,000 to £34,999	695 (12%)
£35,000 to £44,999	1,338 (23%)
£45,000 to £54,999	1,838 (31%)
£55,000 to £64,999	929 (16%)
£65,000 to £74,999	460 (7.8%)
£75,000 to £84,999	189 (3.2%)
£85,000 to £94,999	67 (1.1%)
£95,000 to £104,999	42 (0.7%)
£105,000 or more	39 (0.7%)
Not relevant / cannot answer	23 (0.4%)
I don't want to say	212 (3.6%)
Unknown	1
demog_financial_binned
Comfortable	2,061 (35%)
Reasonable	2,954 (50%)
Struggling	842 (14%)
Not relevant / cannot answer	20 (0.3%)
Unknown	52
¹ n (%)

Which pension scheme do teachers currently have?

The default pension scheme for teachers in England is the Teachers’ Pension Scheme (TPS). However, some teachers may have opted out of the TPS and have a pension scheme with another employer or, if they are teaching at a private school, their school may no longer offer it as an option. Figure 2 shows the distribution of pension schemes by school type.

Show the code

create_demographics_crosstab(df_demog, demog_funding, demog_pension)

Figure 2: Pension scheme by school type

demog_funding	Teachers’ Pension Scheme (TPS)	Another employer pension scheme	I don’t know	Not currently enrolled in an employer pension scheme	Not relevant / cannot answer	NA_
State-funded school	96.1% (5,267)	0.7% (38)	1.6% (86)	1.4% (74)	0.0% (2)	0.2% (13)
Private School	55.1% (239)	43.1% (187)	0.5% (2)	1.4% (6)	0.0% (0)	0.0% (0)
NA	86.7% (13)	6.7% (1)	0.0% (0)	0.0% (0)	6.7% (1)	0.0% (0)

Missing data

At respondent level, the survey has missing data on the variables shown in Figure 3.

Show the code

naniar::gg_miss_var(df_demog) +
  labs(title = "Missing data by variable")

Some of these variables are jointly missing, as shown in Figure 4.

Show the code

naniar::gg_miss_upset(df_demog)

The variable with the most missing data is the subject a teacher teaches, which is not relevant to this analysis. It is typically missing because the respondent has declined to provide the information to Teacher Tapp.

The other missing data is on the expected proportion of teachers in a demographic group, which is relevant because it affects the generated sample weights. Teacher Tapp weights against the School Workforce Census using the following demographics:

Phase (either Primary or Secondary)
Seniority (either Classroom Teacher, Middle Leader, SLT (excl head) or Headteacher)
Funding (either State or Private)
Mainstream (either Mainstream or Special/AP)
Age (either Age in 20s, 30s, 40s or 50s+)
Gender (either Male or Female)

For any respondent who has at least one of those characteristics missing from their profile, it is not possible to generate a weight for them, which entirely explains this missingness.
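
A simplified sketch of why a missing characteristic leaves a respondent without a weight is shown below. It assumes a basic cell-weighting scheme in which the weight is the expected proportion divided by the observed proportion. Teacher Tapp's actual procedure is not described here and may differ, and the cells and proportions are invented.

```r
# Toy weighting cells (hypothetical proportions, not Teacher Tapp's figures)
cells <- data.frame(
  weight_group_name = c("Primary classroom teacher", "Secondary classroom teacher"),
  expected_proportion = c(0.30, 0.20),  # share in the population
  observed_proportion = c(0.20, 0.25)   # share in the sample
)
cells$sample_weight <- cells$expected_proportion / cells$observed_proportion

# A respondent with a missing characteristic cannot be placed in a cell,
# so the merge leaves their weight as NA
respondents <- data.frame(
  respondent = 1:3,
  weight_group_name = c("Primary classroom teacher",
                        "Secondary classroom teacher", NA)
)
weighted <- merge(respondents, cells, by = "weight_group_name", all.x = TRUE)
weighted[, c("respondent", "sample_weight")]
```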

Full sample analysis

Effect of pension attributes on choice

Figure 5 shows the results of the logistic regression model outlined above as log-odds ratios.

Show the code

# Create the survey design object
pension_svy_design <- create_survey_design(analysisdata, sample_weight)

# Fit the generalized linear model
model_svy <- fit_glm(
  selected ~ choice_salary + choice_pension + choice_pensiontype,
  pension_svy_design
)

# print the regression results
regression_table(model_svy)

Figure 5: Results of logistic regression model

Characteristic	N	log(OR)¹	SE
choice_salary	57,439
Same		—	—
10% lower		-1.0***	0.036
5% lower		-0.49***	0.034
5% higher		0.13***	0.034
10% higher		0.43***	0.035
choice_pension	57,439
Same		—	—
20% lower		-1.1***	0.036
10% lower		-0.55***	0.034
10% higher		0.26***	0.035
20% higher		0.50***	0.034
choice_pensiontype	57,439
Defined benefit		—	—
Defined contribution		-1.0***	0.025
Abbreviations: CI = Confidence Interval, OR = Odds Ratio, SE = Standard Error
¹ *p<0.05; **p<0.01; ***p<0.001
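
To help read the table, the sketch below converts three of the log odds ratios in Figure 5 to odds ratios. It then applies the two-option choice probability from the method section to a package combining all three changes, set against the all-reference package. The choice of package is ours, for illustration.

```r
# Log odds ratios copied from Figure 5 (relative to each reference level)
log_or <- c(
  salary_10_higher = 0.43,
  pension_20_lower = -1.1,
  defined_contribution = -1.0
)

# Odds ratios: exponentiate the log odds ratios
round(exp(log_or), 2)

# Utility of a package combining all three changes, relative to the
# all-reference package (same salary, same pension, defined benefit)
v_diff <- sum(log_or)

# Probability of choosing it in a head-to-head choice against the reference
p_switch <- plogis(v_diff)
round(p_switch, 2)  # about 0.16
```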

Average marginal component effects

Figure 6 below shows the average marginal component effects of the pension attributes on the probability of choosing an option.

Show the code

logit_margeff <- model_svy |>
  marginaleffects::avg_comparisons() |>
  tidy_amces(variable_lookup, choice_lookup)

ggplot2::ggplot(
  logit_margeff,
  aes(x = estimate, y = variable_level, color = variable_nice)
) +
  geom_vline(xintercept = 0) +
  geom_pointrange(aes(xmin = conf.low, xmax = conf.high)) +
  scale_x_continuous(labels = label_amce) +
  guides(color = "none") +
  labs(
    x = "Percentage point change in probability of pension selection",
    y = NULL,
    title = "AMCEs from logistic regression marginal effects"
  ) +
  ggforce::facet_col(vars(variable_nice), scales = "free_y", space = "free")

Figure 6: AMCEs from logistic regression model

Four things are immediately apparent from Figure 6:

Teachers prefer higher incomes.
Teachers prefer certainty over their retirement income.
Teachers prefer money today over money in the future.
Teachers exhibit loss aversion.

The first is intuitively obvious, but the other three deserve further discussion.

The value of certainty

In general, people prefer certainty and are willing to pay a premium for it. They will accept a lower expected income if it is guaranteed. In the context of pensions, a defined benefit scheme, like the TPS, guarantees a certain level of income in retirement. A defined contribution scheme, on the other hand, depends on the performance of the pension scheme’s assets, which fluctuates.

Figure 6 shows that teachers are 22 percentage points less likely to switch to a compensation package that depends on the stock market compared to one that guarantees their final retirement income. To put it in context, this is equivalent to a teacher being willing to accept a 10 per cent lower salary to ensure their retirement income is guaranteed.

It is worth noting that the phrasing of this attribute referred to ‘stock market performance’. In reality, schemes are likely to invest in a blend of equities, bonds, and other assets, and the performance of the scheme will depend on the mix of assets. The phrasing was chosen to be simple and clear to respondents but it is possible that respondents were reacting to the specifics of a relatively risky asset class.

Salary and retirement income

Teachers, unsurprisingly, had a strong preference for more income, both today and in the future. They were 9.1 percentage points more likely to choose a compensation package with a 10 per cent higher salary, and 5.7 percentage points more likely to choose a compensation package with 10 per cent more retirement income.

However, as those figures show, they do not value retirement income as highly as salary today. The respondents valued a 1 per cent increase in retirement income as much as a 0.63 per cent increase in salary, meaning that salary increases are 1.6 times as valuable as pension increases.

This finding is consistent with the only other similar experiment to address this question with teachers in England. Burge, Lu, and Phillips (2021) found that a “1% increase in final pension was valued equivalent to a 0.5% increase in annual pay”.

Loss aversion

Finally, teachers weigh losses more heavily than gains. This phenomenon is known as loss aversion and means that cutting a benefit is more painful than increasing it is pleasurable. Teachers were 21 percentage points less likely to choose a compensation package with a 10 per cent salary cut, but only 9.1 percentage points more likely to choose a compensation package with a 10 per cent salary increase.

Similarly, a 10 per cent increase in retirement income made a teacher 5.7 percentage points more likely to choose a compensation package, but a 10 per cent cut made them 12 percentage points less likely to choose it. Despite the differences between gains and losses, the trade-off between salary and pension is very similar across both.
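
The arithmetic behind these comparisons can be reproduced from the rounded AMCEs quoted in the text, so the results are approximate:

```r
# AMCEs quoted in the text, in percentage points
gain <- c(salary = 9.1, pension = 5.7)  # 10 per cent increases
loss <- c(salary = -21, pension = -12)  # 10 per cent cuts

# Value of pension relative to salary, for gains and for losses
mrs_gain <- gain[["pension"]] / gain[["salary"]]  # about 0.63
mrs_loss <- loss[["pension"]] / loss[["salary"]]  # about 0.57

# Size of each loss effect relative to the matching gain effect
loss_to_gain <- abs(loss) / gain  # salary about 2.3, pension about 2.1
round(c(mrs_gain = mrs_gain, mrs_loss = mrs_loss, loss_to_gain), 2)
```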

Differences by teacher characteristics

The analysis above shows the average effect of the pension attributes on the probability of choosing an option. However, teachers are a diverse group, and their preferences may vary based on their individual characteristics. To explore this, we can estimate interaction effects between the pension attributes and teacher demographics.

Drawing on our previous analysis, we have focused on the following teacher characteristics:

School funding
Age
Salary
Current pension scheme
Career intentions
Financial security

In this section we interact each of these characteristics with the pension attributes to see how preferences vary across different groups of teachers. Each group is considered separately because we do not have the sample size to consider all interactions simultaneously. However, that also means that the results are not independent of each other where demographic variables are correlated.

School funding

Teachers in private schools are already far more likely to have a defined contribution pension scheme than those in state schools, who are typically in the TPS, which is a defined benefit scheme. Figure 2 shows that, in our sample, 43 per cent of teachers in private schools are in another employer pension scheme, most likely a defined contribution one, compared to only 0.7 per cent of teachers in state schools.

Despite that, Figure 7 shows that teachers in private schools are as averse to DC pensions as teachers in state schools. They also appear to care slightly more about salary than teachers in state schools, and perhaps to be slightly less loss averse.

Show the code

school_type_model <- get_interaction_model(tbl_interactions, "demog_funding")

# regression_table(school_type_model)

Show the code

plot_interaction_effects(
  model = school_type_model,
  demog_var = "demog_funding",
  predictors = choice_vars,
  variable_lookup = variable_lookup,
  choice_lookup = choice_lookup,
  colour_guide_title = "Type of school"
)

Figure 7: Interaction AMCEs by school type

Age

In our earlier analysis, young teachers were far more likely to say they’d be willing to trade pension for salary. This analysis bears that out: Figure 8 shows that, the younger a teacher is, the more valuable they find salary and the less valuable they find their pension. They are also slightly less averse to the idea of a DC pension.

A straightforward explanation would be that young people discount the future more heavily than older people. However, meta-analyses suggest that is unlikely to be the case (Seaman et al., 2022). If so, young teachers simply have different preferences over current and future income from older teachers.

TODO: Add more analysis of the relative importance of salary and pension by age group.

Show the code

age_model <- get_interaction_model(tbl_interactions, "demog_age")

# regression_table(age_model)

Show the code

plot_interaction_effects(
  model = age_model,
  demog_var = "demog_age",
  predictors = choice_vars,
  variable_lookup = variable_lookup,
  choice_lookup = choice_lookup,
  colour_guide_title = "Age group"
) + geom_smooth(method = "lm", se = FALSE)

Earnings

One might expect that teachers who earn more would be less concerned about salary than pension. First, they have more income, so any marginal increase is less valuable. Second, they are likely to be older and closer to retirement, so the value of a higher pension is greater. However, Figure 9 does not bear that out: teachers with high earnings are fairly unconcerned about increases in their retirement income and do not appear to feel very differently about salary.

Show the code

salary_model <- get_interaction_model(tbl_interactions, "demog_salary")

# regression_table(salary_model)

Show the code

plot_interaction_effects(
  model = salary_model,
  demog_var = "demog_salary",
  predictors = choice_vars,
  variable_lookup = variable_lookup,
  choice_lookup = choice_lookup,
  colour_guide_title = "Current salary"
)

Current pension scheme

Teachers who are enrolled in a pension scheme other than TPS appear more sensitive to pay increases, and less loss averse. However, this is confounded by the fact that these teachers are more likely to be in private schools, and there is not the sample size to disentangle these effects.

Show the code

pension_scheme_model <- get_interaction_model(tbl_interactions, "demog_pension")

# regression_table(pension_scheme_model)

Show the code

plot_interaction_effects(
  model = pension_scheme_model,
  demog_var = "demog_pension",
  predictors = choice_vars,
  variable_lookup = variable_lookup,
  choice_lookup = choice_lookup,
  colour_guide_title = "Current pension scheme"
)

Figure 10: Interaction AMCEs by pension scheme

Career intentions

This question asked teachers whether they expected to be teaching in three years’ time. One might expect that teachers who have a shorter planning horizon in the profession might be more sensitive to salary, but Figure 11 shows that this is not the case. Teachers who expect to leave the profession may perhaps exhibit slightly less loss aversion in salary than those who expect to stay, but the differences are not large.

Show the code

career_intentions_model <- get_interaction_model(tbl_interactions, "demog_stay_in_teaching")

# regression_table(career_intentions_model)

Show the code

plot_interaction_effects(
  model = career_intentions_model,
  demog_var = "demog_stay_in_teaching",
  predictors = choice_vars,
  variable_lookup = variable_lookup,
  choice_lookup = choice_lookup,
  colour_guide_title = "Career intentions"
)

Figure 11: Interaction AMCEs by career intentions

Financial security

Teachers who are more financially secure are less sensitive to salary increases and more sensitive to pension increases. That is to be expected because the marginal value of money is lower for them, and they may be less myopic about the future.

Show the code

financial_security_model <- get_interaction_model(tbl_interactions, "demog_financial")

# regression_table(financial_security_model)

Show the code

plot_interaction_effects(
  model = financial_security_model,
  demog_var = "demog_financial",
  predictors = choice_vars,
  variable_lookup = variable_lookup,
  choice_lookup = choice_lookup,
  colour_guide_title = "Financial security"
) +
  theme(legend.position = "bottom")

Figure 12: Interaction AMCEs by financial security

Simulating policy changes

One of the key findings from the analysis is that teachers value salary more than pension. This suggests that some teachers would be willing to accept a lower pension in exchange for a higher salary. To test this hypothesis, we can simulate a policy change where teachers are offered a choice between a compensation package with a higher salary and a lower pension, and their current compensation package, and estimate how many teachers would switch.

United Learning has proposed a scheme that would allow teachers to swap their defined benefit pension for a defined contribution pension, in exchange for a 10 per cent increase in their salary and a corresponding decrease in pension contributions. The scheme is designed to attract and retain teachers by offering them more flexibility in their compensation package. Using the coefficients from the regression model, we can estimate how many teachers would switch to the new compensation package. Estimation uses preference shares, as described in Chapman and Feit (2019).

To estimate the difference we need to include the change in retirement income, which depends on the current level of savings and a teacher’s time until retirement. Using online pension calculators suggests that a 10 percentage point reduction in pension contributions would lead to roughly a 20 per cent reduction in retirement income. This will not hold for all teachers but it is a reasonable approximation for the purposes of this analysis.

Show the code

# Define the levels of demog_age
demog_age_levels <- c("Age in 20s", "Age in 30s", "Age in 40s", "Age in 50s+")

# Create the tibble with the fixed set of choice variables
policy_changes <- tibble::tribble(
  ~choice_salary, ~choice_pension, ~choice_pensiontype,
  "10% higher", "20% lower", "Defined contribution",
  "Same", "Same", "Defined benefit"
)

# Cross demog_age_levels with policy_changes
policy_changes_with_age <- tidyr::crossing(
  policy_changes,
  demog_age = demog_age_levels
)

# Calculate the preference shares
age_predictions <- marginaleffects::predictions(
  age_model,
  newdata = policy_changes_with_age,
  type = "link"
) |>
  predictions_to_shares(demog_age)

# Format the preference shares
age_predictions |>
  janitor::adorn_pct_formatting(, , , share) |>
  knitr::kable()

Figure 13: Simulated policy change

choice_salary	choice_pension	choice_pensiontype	demog_age	utility	share
Same	Same	Defined benefit	Age in 20s	2.3544790	81.1%
10% higher	20% lower	Defined contribution	Age in 20s	0.5471176	18.9%
Same	Same	Defined benefit	Age in 30s	2.2855325	83.2%
10% higher	20% lower	Defined contribution	Age in 30s	0.4618039	16.8%
Same	Same	Defined benefit	Age in 40s	2.5353952	86.2%
10% higher	20% lower	Defined contribution	Age in 40s	0.4074504	13.8%
Same	Same	Defined benefit	Age in 50s+	2.4294401	88.7%
10% higher	20% lower	Defined contribution	Age in 50s+	0.3095351	11.3%

The results of the simulation are shown in Figure 13. The table shows the percentage of teachers in each age group who would choose the new compensation package, if choosing between that and the status quo. The results suggest that younger teachers are more likely to switch to the new compensation package, with teachers in their 20s being the most likely to switch. However, even among young teachers, only a minority would choose the new compensation package, largely because they value the defined benefit pension highly. In a hypothetical scenario where teachers were offered a choice between a higher salary and a lower pension, but with the ability to remain in a defined benefit scheme, around half of teachers in their 20s and 30s would switch, falling to about 40 per cent of those in their 50s or older (Figure 14).
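
The share calculation can be checked by hand. The `predictions_to_shares()` helper is not shown in this document, so the sketch below is our reconstruction from the numbers in Figure 13, not the helper's actual code. Each reported share equals the option's "utility" value divided by the total for the pair, which suggests the column is already on the exponentiated scale.

```r
# Values from the "utility" column of Figure 13 for teachers in their 20s
utility <- c(status_quo = 2.3544790, new_package = 0.5471176)

# Preference share: each option's value over the total for the choice set
shares <- utility / sum(utility)
round(100 * shares, 1)  # 81.1 and 18.9, matching Figure 13

# Starting from linear predictors instead (the link scale), the
# equivalent calculation is a softmax over the two options
link <- log(utility)
all.equal(exp(link) / sum(exp(link)), shares)
```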

Show the code

# Create the tibble with the fixed set of choice variables
policy_changes_db_only <- tibble::tribble(
  ~choice_salary, ~choice_pension, ~choice_pensiontype,
  "10% higher", "10% lower", "Defined benefit",
  "Same", "Same", "Defined benefit"
)

# Cross demog_age_levels with policy_changes
policy_changes_with_age_db_only <- tidyr::crossing(
  policy_changes_db_only,
  demog_age = demog_age_levels
)

# Calculate the preference shares
age_predictions_db_only <- marginaleffects::predictions(
  age_model,
  newdata = policy_changes_with_age_db_only,
  type = "link"
) |>
  predictions_to_shares(demog_age)

# Format the preference shares
age_predictions_db_only |>
  janitor::adorn_pct_formatting(, , , share) |>
  knitr::kable()

Figure 14: Simulated policy change with defined benefit only

choice_salary	choice_pension	choice_pensiontype	demog_age	utility	share
Same	Same	Defined benefit	Age in 20s	2.354479	50.2%
10% higher	10% lower	Defined benefit	Age in 20s	2.332319	49.8%
10% higher	10% lower	Defined benefit	Age in 30s	2.342940	50.6%
Same	Same	Defined benefit	Age in 30s	2.285532	49.4%
Same	Same	Defined benefit	Age in 40s	2.535395	54.5%
10% higher	10% lower	Defined benefit	Age in 40s	2.119306	45.5%
Same	Same	Defined benefit	Age in 50s+	2.429440	59.8%
10% higher	10% lower	Defined benefit	Age in 50s+	1.635220	40.2%

Conclusion

Lessons for policymakers:

Teachers value certainty in their retirement income.
Teachers value salary more than pension.
Teachers are loss averse.

Implications for policy design:

TO DO

Appendices

Appendix A: Sample demographics

Comparison of the demographics of the analysis sample with the population of teachers in England (Figure 15).

Show the code

demog_chart_data <- analysisdata |>
  select(weight_group_name, ends_with("proportion")) |>
  distinct() |>
  drop_na() |>
  rename(
    "Expected" = expected_proportion,
    "Sample" = observed_proportion
  ) |>
  mutate(weight_group_name = fct_reorder(weight_group_name, Expected))

# pull out the data to make the embedded bar labels
demog_chart_labels <- demog_chart_data |>
  arrange(desc(Expected)) |>
  slice(1) |>
  pivot_longer(cols = -weight_group_name, names_to = "variable", values_to = "value")

demog_chart_data |>
  pivot_longer(cols = -weight_group_name, names_to = "variable", values_to = "value") |>
  drop_na() |>
  ggplot(aes(x = weight_group_name, y = value, fill = variable)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = scales::percent(value, accuracy = 0.1)),
    position = position_dodge(width = 0.9),
    vjust = 0.5, hjust = -0.1
  ) +
  geom_text(
    data = demog_chart_labels,
    aes(label = variable, y = 0),
    position = position_dodge(width = 0.9),
    vjust = 0.5, hjust = -0.1, color = "white"
  ) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  guides(fill = "none") +
  labs(
    title = "Comparison of sample demographics with population of teachers in England",
    subtitle = "Proportion of teachers in each demographic group, for sample and population",
    x = "Demographic group",
    y = "Proportion"
  )

Figure 15: Comparison of sample demographics with population of teachers in England

Appendix B: Implied discount rate

Knowing how much a teacher values a 1 per cent increase in their retirement income, relative to a 1 per cent increase in their salary today, we can roughly calculate the implied discount rate. This is the rate at which a teacher is indifferent between receiving a 1 per cent increase in their salary today and a 1 per cent increase in their retirement income in the future. Figure 16 shows the implied discount rate for teachers in different age groups.

Show the code

discount_rates <- purrr::map_dfr(
  list("choice_salary", "choice_pension"),
  \(x) {
    age_model |>
      marginaleffects::avg_comparisons(
        variables = x,
        by = "demog_age",
        wts = "sample_weight",
        newdata = analysisdata
      ) |>
      tidy_amces(variable_lookup, choice_lookup) |>
      dplyr::filter(choice == 10) |>
      dplyr::select(demog_age, estimate, term)
  }
) |>
  pivot_wider(names_from = term, values_from = estimate) |>
  dplyr::mutate(
    marg_rate_substitution = choice_pension / choice_salary,
    demog_age_midpoint = dplyr::case_match(
      demog_age,
      "Age in 20s" ~ 25,
      "Age in 30s" ~ 35,
      "Age in 40s" ~ 45,
      "Age in 50s+" ~ 58,
      .default = NA
    ),
    implied_discount_rate = -1 * cagr(1, marg_rate_substitution, (67 - demog_age_midpoint))
  )

discount_rates |>
  janitor::adorn_pct_formatting(, , , implied_discount_rate) |>
  janitor::adorn_rounding(digits = 2) |>
  dplyr::select(-demog_age_midpoint) |>
  knitr::kable(
    col.names = c("Age Group", "Value of 10% more salary", "Value of 10% more pension", "Marginal rate of substitution", "Implied discount rate")
  )

Figure 16: Implied discount rate

Age Group	Value of 10% more salary	Value of 10% more pension	Marginal rate of substitution	Implied discount rate
Age in 20s	0.10	0.02	0.23	3.4%
Age in 30s	0.12	0.06	0.52	2.0%
Age in 40s	0.08	0.05	0.69	1.7%
Age in 50s+	0.06	0.09	1.60	-5.3%
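
The arithmetic behind Figure 16 can be sketched from the rounded marginal rates of substitution in the table. The `cagr()` helper used in the analysis is not shown in this document, so the `cagr_sketch()` function below is an assumption: it uses the standard compound annual growth rate formula, which appears to match the figures reported.

```r
# Hypothetical stand-in for the report's cagr() helper (its definition is not
# shown, so this standard compound annual growth rate formula is an assumption).
cagr_sketch <- function(start, end, years) {
  (end / start)^(1 / years) - 1
}

# Rounded marginal rates of substitution, as displayed in Figure 16
mrs <- c(
  "Age in 20s" = 0.23,
  "Age in 30s" = 0.52,
  "Age in 40s" = 0.69,
  "Age in 50s+" = 1.60
)

# Age-group midpoints and retirement age of 67, as used in the analysis code
age_midpoint <- c(25, 35, 45, 58)
years_to_retirement <- 67 - age_midpoint

# Same sign convention as the analysis: a positive rate means that future
# pension income is valued less than salary today
implied_discount_rate <- -1 * cagr_sketch(1, mrs, years_to_retirement)

print(round(100 * implied_discount_rate, 1))
```

Because the inputs here are rounded to two decimal places, the results can differ from Figure 16 in the final digit.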

Appendix C: Inattention

Inattention in a discrete choice experiment can bias the estimates if teachers answer without properly considering the options. There are several ways to check for inattention:

We can look at the distribution of the time taken to complete the survey. A very short completion time suggests that a teacher did not read the options, while a very long one could indicate that they were not giving the survey their full attention.
We can look at the pattern of responses to the choice sets. If teachers are not paying attention, we would expect their responses to be either random or to always select the same option.

Time taken to complete the survey

Show the code

ttc_data <- rawdata |>
  dplyr::filter(selected == 1) |>
  summarise(
    # convert from milliseconds to seconds
    duration = sum(response_duration) / 1000,
    .by = respondent
  )

# Define the cutoffs
cutoff_lower <- quantile(ttc_data$duration, 0.025)
cutoff_upper <- quantile(ttc_data$duration, 0.975)
ttc_data <- ttc_data |>
  dplyr::mutate(is_outlier = duration < cutoff_lower | duration > cutoff_upper)

# how many responses are excluded
n_excluded <- sum(ttc_data$is_outlier)
message("Excluding ", n_excluded, " respondents of ", nrow(ttc_data), " (", scales::percent(n_excluded / nrow(ttc_data)), ")")

# Filter the data
inatt_ttc_data <- analysisdata |>
  dplyr::left_join(ttc_data, by = "respondent") |>
  dplyr::filter(!is_outlier)

The survey recorded response times in milliseconds, which we sum across each respondent's choices and convert to seconds. The median time taken to complete the survey was 81 seconds. Figure 17 below shows the distribution of time taken to complete the survey.

Show the code

ttc_data |>
  ggplot(ggplot2::aes(x = duration)) +
  geom_histogram() +
  # Draw each cutoff once with annotate() rather than once per row of data
  annotate(
    "segment",
    x = cutoff_lower, xend = cutoff_lower, y = 0, yend = 300,
    linetype = "dashed",
    color = "red"
  ) +
  annotate(
    "segment",
    x = cutoff_upper, xend = cutoff_upper, y = 0, yend = 300,
    linetype = "dashed",
    color = "red"
  ) +
  # Place text labels just above the segment endpoints
  annotate(
    "text",
    x = cutoff_lower,
    y = 320,
    vjust = 0,
    label = "Bottom 2.5%\ncutoff",
    color = "red"
  ) +
  annotate(
    "text",
    x = cutoff_upper,
    y = 320,
    vjust = 0,
    label = "Top 2.5%\ncutoff",
    color = "red"
  ) +
  scale_x_log10(labels = scales::comma) +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "Time taken to complete the survey",
    subtitle = "Histogram of sum of response_duration by respondent",
    x = "Duration (seconds)",
    y = "Number of teachers"
  )

Figure 17: Time taken to complete the survey

To check whether very fast or very slow responses are biasing the results, we can re-estimate the core results after dropping the quickest and slowest 2.5 per cent of respondents, which retains those who took between 19 and 573 seconds.

Show the code

# Create the survey design object
inatt_ttc_svy_design <- create_survey_design(inatt_ttc_data, sample_weight)

# Fit the generalized linear model
model_inatt_ttc <- fit_glm(
  selected ~ choice_salary + choice_pension + choice_pensiontype,
  inatt_ttc_svy_design
)

Dominated responses

Inattention can also show up in the choices themselves. If teachers are not paying attention, we would expect their responses to be either random or to always select the same option.

Random responses are hard to detect because they can be indistinguishable from true preferences. However, we can look for choice sets where a teacher chooses a strictly dominated option. A strictly dominated option is one where there is another option that is better in every respect. If a teacher chooses a strictly dominated option, it suggests that they are not paying attention.
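
The definition above can be illustrated with a toy choice task. This is a minimal sketch rather than the analysis code: the column names and index values are made up, and it assumes that a higher index means a more attractive level. It also shows how strict dominance differs from weak dominance when two profiles tie on an attribute.

```r
# Toy choice task with two profiles; higher index = more attractive level.
# Column names and values are illustrative, not taken from the survey data.
task <- data.frame(
  profile = c(1, 2),
  salary_index = c(2, 4),
  pension_index = c(1, 3),
  pensiontype_index = c(1, 2)
)

index_cols <- grep("_index$", names(task), value = TRUE)

# TRUE if profile `a` is worse than profile `b` on every attribute
is_strictly_dominated <- function(a, b) all(a < b)

# TRUE if `a` is no better on any attribute and worse on at least one
is_weakly_dominated <- function(a, b) all(a <= b) && any(a < b)

p1 <- unlist(task[1, index_cols])
p2 <- unlist(task[2, index_cols])

strict <- is_strictly_dominated(p1, p2)
weak <- is_weakly_dominated(p1, p2)

# A tie on one attribute rules out strict, but not weak, dominance
tie_strict <- is_strictly_dominated(c(1, 2), c(1, 3))
tie_weak <- is_weakly_dominated(c(1, 2), c(1, 3))

print(c(strict = strict, weak = weak, tie_strict = tie_strict, tie_weak = tie_weak))
```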

Figure 18 below shows the number of dominated options chosen by respondents.

Show the code

# Relies on the level indices being sorted by attractiveness, so within each
# task we can flag a profile that has the minimum index on every attribute.
# Note that a tie on an attribute counts as the minimum for both profiles.
dominated_data <- analysisdata |>
  dplyr::group_by(respondent, task) |>
  dplyr::mutate(
    is_selected = selected == 1,
    is_minimum = dplyr::if_all(
      ends_with("_index"),
      \(x) x == min(x, na.rm = TRUE)
    ),
    is_dominated = is_selected & is_minimum
  ) |>
  dplyr::ungroup()

# tabulate the number of dominated options chosen by each respondent
dominated_data |>
  dplyr::group_by(respondent) |>
  dplyr::summarise(
    dominated = sum(is_dominated),
    total = sum(selected),
    prop_dominated = dominated / total
  ) |>
  dplyr::count(dominated) |>
  knitr::kable(
    col.names = c("Number of dominated options chosen (of 5 total)", "Number of respondents"),
  )

Figure 18: Number of dominated options chosen by respondents

Number of dominated options chosen (of 5 total)	Number of respondents
0	4503
1	940
2	251
3	52
4	5

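Respondents chose 1,618 dominated options in total, and 1,248 of the 5,751 respondents (21.7 per cent) chose at least one, so it is possible that inattention is a significant issue in this survey. To check, we re-estimate the core model after excluding the dominated choices.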

Show the code

# Filter the data
inatt_dom_data <- dominated_data |>
  dplyr::filter(!is_dominated)

# Create the survey design object
inatt_dom_svy_design <- create_survey_design(inatt_dom_data, sample_weight)

# Fit the generalized linear model
model_inatt_dom <- fit_glm(
  selected ~ choice_salary + choice_pension + choice_pensiontype,
  inatt_dom_svy_design
)

Straightlining

Straightlining is a form of inattention where respondents always choose the same option, regardless of its content. We can check for it by counting the respondents who selected the same profile in every choice set.

Figure 19 below shows the number of respondents who always chose the same option.

Show the code

# flag respondents who selected the same profile in every task
straightlining_data <- analysisdata |>
  dplyr::filter(selected == 1) |>
  dplyr::group_by(respondent) |>
  dplyr::summarise(straightlining = n_distinct(profile) == 1)

straightlining_data |>
  janitor::tabyl(straightlining) |>
  janitor::adorn_pct_formatting(digits = 2) |>
  knitr::kable(
    col.names = c("Always chose the same option", "Number of respondents", "Proportion of respondents"),
  )

Figure 19: Number of respondents who always chose the same option

Always chose the same option	Number of respondents	Proportion of respondents
FALSE	5081	88.35%
TRUE	670	11.65%

If respondents chose randomly between the two options in each of the five choice sets, we would expect the proportion who always chose the same option to be around 1 in 16, or 6.3%: there are two options to stick with, and each has a 1 in 32 chance of being chosen five times in a row. If respondents are straightlining, we would expect this proportion to be higher, which it is, at 11.7%. That suggests straightlining may be affecting up to 670 respondents, though it is also possible they have legitimately chosen those options.
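
The benchmark for random responding can be worked through directly. The sketch below assumes five choice sets with two options each, as described in the abstract, and independent coin-flip choices. It separates the chance of picking one particular option position every time from the chance of picking the same position every time, whichever it is, and compares both with the observed share in Figure 19.

```r
n_tasks <- 5    # choice sets per respondent
n_options <- 2  # profiles per choice set

# Chance of picking one particular position (say, the first) every time
p_specific <- (1 / n_options)^n_tasks

# Chance of picking the same position every time, whichever position it is
p_any <- n_options * p_specific

# Observed share of respondents who always chose the same option (Figure 19)
p_observed <- 670 / (670 + 5081)

print(round(100 * c(specific = p_specific, any = p_any, observed = p_observed), 2))
```

Under either benchmark, the observed share is higher than random responding alone would produce.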

Show the code

# Filter the data
inatt_straightlining_analysisdata <- analysisdata |>
  dplyr::left_join(straightlining_data, by = "respondent") |>
  dplyr::filter(!straightlining)

# Create the survey design object
inatt_straightlining_svy_design <- create_survey_design(
  inatt_straightlining_analysisdata, sample_weight
)

# Fit the generalized linear model
model_inatt_straightlining <- fit_glm(
  selected ~ choice_salary + choice_pension + choice_pensiontype,
  inatt_straightlining_svy_design
)

Comparison of inattention results

For each of the three types of inattention, we have re-estimated the core results using only the unaffected responses. Figure 20 shows that the coefficient estimates are similar across all four models. For example, the coefficient on a 10 per cent higher salary ranges from 0.43 to 0.50, and the coefficient on a defined contribution pension ranges from -1.0 to -1.1. This suggests that the core results are robust to inattention and that inattention is not materially affecting the findings.

Show the code

# Create a tibble of the fitted models and their regression tables
inatt_models <- tibble(
  model_name = c("Core", "Time taken", "Dominated", "Straightlining"),
  model = list(model_svy, model_inatt_ttc, model_inatt_dom, model_inatt_straightlining),
  gtsummary_results = purrr::map(model, regression_table)
)

# Create a gtsummary table of the model coefficients
gtsummary::tbl_merge(
  inatt_models$gtsummary_results,
  tab_spanner = inatt_models$model_name
)

Figure 20: Comparison of regression results with and without inattention

Characteristic	Core			Time taken			Dominated			Straightlining
Characteristic	N	log(OR)¹	SE	N	log(OR)¹	SE	N	log(OR)¹	SE	N	log(OR)¹	SE
choice_salary	57,439			54,342			55,789			50,730
Same		—	—		—	—		—	—		—	—
10% lower		-1.0***	0.036		-1.0***	0.036		-1.1***	0.037		-1.1***	0.038
5% lower		-0.49***	0.034		-0.52***	0.035		-0.55***	0.035		-0.53***	0.036
5% higher		0.13***	0.034		0.13***	0.035		0.16***	0.035		0.14***	0.036
10% higher		0.43***	0.035		0.43***	0.036		0.50***	0.036		0.45***	0.037
choice_pension	57,439			54,342			55,789			50,730
Same		—	—		—	—		—	—		—	—
20% lower		-1.1***	0.036		-1.2***	0.038		-1.3***	0.038		-1.2***	0.039
10% lower		-0.55***	0.034		-0.57***	0.035		-0.58***	0.035		-0.59***	0.037
10% higher		0.26***	0.035		0.27***	0.036		0.31***	0.036		0.28***	0.037
20% higher		0.50***	0.034		0.52***	0.035		0.58***	0.035		0.53***	0.037
choice_pensiontype	57,439			54,342			55,789			50,730
Defined benefit		—	—		—	—		—	—		—	—
Defined contribution		-1.0***	0.025		-1.0***	0.025		-1.1***	0.025		-1.1***	0.026
Abbreviations: CI = Confidence Interval, OR = Odds Ratio, SE = Standard Error
¹ *p<0.05; **p<0.01; ***p<0.001
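
The coefficients in Figure 20 are reported as log odds ratios relative to each attribute's baseline level. As a reading aid, the sketch below exponentiates a few of the core-model values, copied as rounded in the table, to express them as odds ratios. The labels are shorthand introduced here for illustration.

```r
# Core-model log odds ratios copied from Figure 20 (rounded as displayed).
# The labels are illustrative shorthand, not names used in the analysis.
log_or <- c(
  "Salary 10% lower" = -1.0,
  "Salary 10% higher" = 0.43,
  "Pension 20% higher" = 0.50,
  "Defined contribution" = -1.0
)

# Odds ratio relative to the baseline level of each attribute
odds_ratio <- exp(log_or)

print(round(odds_ratio, 2))
```

An odds ratio below 1 means the level makes a package less likely to be chosen than the baseline, and above 1 means more likely.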

References

Burge, Peter, Hui Lu, and William Phillips. 2021. “Understanding Teaching Retention: Using a Discrete Choice Experiment to Measure Teacher Retention in England,” February.

Chapman, Chris, and Elea McDonnell Feit. 2019. R For Marketing Research and Analytics. Use R! Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-14316-9.

Seaman, Kendra L., Sade J. Abiodun, Zöe Fenn, Gregory R. Samanez-Larkin, and Rui Mata. 2022. “Temporal Discounting Across Adulthood: A Systematic Review and Meta-analysis.” Psychology and Aging 37 (1): 111–24. https://doi.org/10.1037/pag0000634.