Data Analytics for Finance
| Assignment | Release | Deadline |
|---|---|---|
| A1 | 11.02.2026 | 25.02.2026 |
| A2 | 11.02.2026 | 25.02.2026 |
| A3 | 18.02.2026 | 04.03.2026 |
| A4 | 25.02.2026 | 11.03.2026 |
| A5 | 04.03.2026 | TBA |
| A6 | 11.03.2026 | TBA |
Important Note on Assignment Deadlines
Deadlines for A5-A6 will be announced in the coming weeks. Please stay tuned for updates and check Canvas regularly if you do not attend lectures.
We did not answer this question in the previous lecture!
We want to estimate the effect of `treated` (1 if student uses LLM, 0 otherwise) on `exam_score`. Candidate control variables:

- `ability` (confounder)
- `study_hours` (mediator)
- `attendance_rate` (mediator)
- `age` (confounder)
- `gender` (confounder)

“To fix a Confounder, you MUST include it. To avoid a Collider, you MUST exclude it.”
Takeaway: Be deliberate about controls
DiD with two-way fixed effects largely sidesteps the confounder problem by absorbing time-invariant differences between units. But mediators remain a live concern — if you control for a variable that lies on the causal path from your treatment to the outcome, you will underestimate the treatment effect.
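To see the mediator problem concretely, here is a small simulation. The data-generating process and all coefficient values are hypothetical, chosen to mirror the `treated` / `ability` / `study_hours` example above: including the confounder recovers the total effect, while additionally including the mediator wipes it out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (variable names follow the slides)
ability = rng.normal(size=n)                       # confounder
treated = (ability + rng.normal(size=n) > 0) * 1.0 # treatment depends on ability
study_hours = 2.0 * treated + rng.normal(size=n)   # mediator: caused by treatment
exam_score = study_hours + 1.5 * ability + rng.normal(size=n)
# True total effect of treated on exam_score = 2.0 (runs entirely through study_hours)

def ols(y, *xs):
    """OLS slopes via least squares; column 0 is the intercept."""
    X = np.column_stack([np.ones(len(y)), *xs])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_good = ols(exam_score, treated, ability)              # include confounder only
b_bad = ols(exam_score, treated, ability, study_hours)  # also include mediator

print(f"confounder only: {b_good[1]:.2f}")  # close to the true total effect, 2.0
print(f"plus mediator:   {b_bad[1]:.2f}")   # close to 0: total effect destroyed
```

Controlling for `study_hours` blocks the causal path `treated → study_hours → exam_score`, so the coefficient on `treated` collapses toward its direct effect (zero in this setup).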
What is Difference‑in‑Differences (DID)?
Difference‑in‑Differences (DID) is a quasi‑experimental method that exploits within‑group variation over time and cross‑group variation to identify a causal effect when random assignment is infeasible.
Parallel Trends Assumption
DiD only recovers the causal effect if the “parallel trends assumption” holds!
Compare outcomes before and after treatment implementation, e.g. pre- and post-policy change
But all variation in treatment then coincides with time, so any concurrent time trend is confounded with the treatment effect!
Compare outcomes between treated and control groups, e.g. those affected by a policy change vs those not affected
Differences between Treated and Control groups may be driven by time-invariant confounders, e.g. ability, demographics, location, etc.
Combining both allows us to isolate the causal impact of the treatment a.k.a. average treatment effect on the treated (ATT)
Compare grade changes in allowed vs. banned courses, before and after LLMs became available
DiD isolates the treated group’s response, conditional on the assumption that the untreated group’s changes represent the non-treatment counterfactual for the treated group
| | (1) After | (2) Before | (1) - (2) |
|---|---|---|---|
| (a) Treatment | Y\(_{treated,\ after}\) | Y\(_{treated,\ before}\) | \(\Delta_{treated}\) |
| (b) Control | Y\(_{control,\ after}\) | Y\(_{control,\ before}\) | \(\Delta_{control}\) |
| (a) - (b) | \(\Delta_{after}\) | \(\Delta_{before}\) | DiD |
\[Y = \beta_0 + \beta_1 Treated + \beta_2 After + \beta_3 Treated \times After + \epsilon\]
The difference-in-differences regression gives you the same estimate as if you took differences in the group averages
It also takes care of any unobserved constant differences between subjects and of common time trends!
| | (1) After | (2) Before | (1) - (2) |
|---|---|---|---|
| (a) Treatment | \(\beta_0 + \beta_1+\beta_2+\beta_3\) | \(\beta_0 + \beta_1\) | \(\beta_2+\beta_3\) |
| (b) Control | \(\beta_0 + \beta_2\) | \(\beta_0\) | \(\beta_2\) |
| (a) - (b) | \(\beta_1+\beta_3\) | \(\beta_1\) | \(\beta_3\) |
- `treated`: Indicator for whether the course allows LLM use (1 = yes, 0 = no)
- `after`: Indicator for whether the observation is from the post-LLM period (1 = after, 0 = before)
- `exam_score`: The student’s exam score

| | After | Before | Difference |
|---|---|---|---|
| Treated group | 6.57 | 7.12 | -0.55 |
| Control group | 7.42 | 7.22 | 0.20 |
| Difference | -0.85 | -0.10 | -0.75 |
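As a quick check, the four cell means above are enough to recover the DiD coefficient: running the interaction regression on the saturated design reproduces the double difference exactly. A minimal sketch using plain least squares:

```python
import numpy as np

# Cell means from the table: (treated, after, mean exam score)
cells = [(1, 1, 6.57), (1, 0, 7.12), (0, 1, 7.42), (0, 0, 7.22)]

# Design matrix [1, Treated, After, Treated x After]
X = np.array([[1, t, a, t * a] for t, a, _ in cells], dtype=float)
y = np.array([m for _, _, m in cells])

beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(beta[3], 2))  # beta_3, the DiD estimate: -0.75, as in the table
```

With four cells and four parameters the system is exactly identified, so \(\beta_3\) equals \((6.57 - 7.12) - (7.42 - 7.22) = -0.75\) by construction.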
Canonical: \(Y = \underbrace{\beta_0 + \beta_1 Treated}_{\alpha_i} + \underbrace{\beta_2 After}_{\alpha_t} + \beta_3 Treated \times After + \epsilon\)
TWFE: \(Y = \alpha_i + \alpha_t + \beta_{DiD}\ Treated \times After + \epsilon\)
`ability` drops out automatically because it is constant per student and is fully absorbed by the student fixed effect.

`ability` and `female` drop out automatically: once you subtract each student’s own mean, a constant value cancels out completely.

Fixed effects solve endogeneity from time-invariant omitted variables. They do not solve endogeneity from time-varying confounders: if something unobserved changes over time and also affects both treatment and outcome, fixed effects won’t help. This is exactly what the parallel trends assumption guards against in DiD.
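A minimal numerical sketch of why constants vanish under the within transformation (panel dimensions are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n_students, n_periods = 5, 4

# ability is constant within each student (rows = students, columns = periods)
ability = rng.normal(size=n_students)[:, None] * np.ones((1, n_periods))

# Within transformation: subtract each student's own mean over time
demeaned = ability - ability.mean(axis=1, keepdims=True)
print(np.allclose(demeaned, 0))  # True: a constant regressor cancels completely
```

Any regressor that is constant within a unit is identically zero after demeaning, so it cannot be included alongside unit fixed effects.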
Why do standard errors matter?
Limitations
Limitations arise in rollout (staggered) designs, where treatment timing varies across groups; TWFE can perform poorly in such settings.
| | Before | After |
|---|---|---|
| Treated | ✓ | ✓ |
| Control | ✓ | ✓ |
Staggered (rollout) DiD
When units receive treatment at different points in time, we call this a staggered adoption or rollout design. This is extremely common in finance and economics research.
The standard Two-Way Fixed Effects (TWFE) estimator implicitly averages over all available 2×2 comparisons, including some you probably don’t want, such as comparisons that use already-treated units as controls.
The negative weights problem
When treatment effects are heterogeneous, TWFE can assign negative weights to some group-time comparisons. This means TWFE can produce a negative estimate even when every single unit has a positive treatment effect. Goodman-Bacon (2021) decomposed TWFE into its component 2×2 DiDs and showed this explicitly.
The problem is not staggered adoption per se — it is staggered adoption combined with treatment effect heterogeneity.
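A tiny constructed example (all numbers hypothetical) makes the point: two units, four periods, staggered adoption, every unit-time treatment effect strictly positive, yet TWFE delivers a negative estimate because the early adopter's growing effect enters some 2×2 comparisons with negative weight.

```python
import numpy as np

# Balanced panel, rows = units, columns = periods t = 1..4.
# "Early" is treated from t=2 with effects 1, 4, 7 (growing over time);
# "late" is treated only at t=4 with effect 1. Every effect is positive.
Y = np.array([[0., 1., 4., 7.],   # early adopter
              [0., 0., 0., 1.]])  # late adopter
D = np.array([[0., 1., 1., 1.],
              [0., 0., 0., 1.]])

def twfe(Y, D):
    """TWFE via the two-way within transformation (exact for balanced panels)."""
    def demean(M):
        return M - M.mean(1, keepdims=True) - M.mean(0, keepdims=True) + M.mean()
    Yd, Dd = demean(Y), demean(D)
    return (Dd * Yd).sum() / (Dd ** 2).sum()

print(twfe(Y, D))  # negative, despite all-positive treatment effects
```

The late period compares the newly treated "late" unit against the already-treated "early" unit, whose effect is still growing, which flips the sign of the aggregate estimate; this is exactly the forbidden comparison the heterogeneity-robust estimators below exclude.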
New approaches share a common idea: only compare treated units to clean controls
| Approach | Key idea | Reference |
|---|---|---|
| Callaway & Sant’Anna | Estimate separate ATTs for each (group, time) cell, then aggregate | Callaway and Sant’Anna (2021) |
| Sun & Abraham | Interaction-weighted estimator; heterogeneity-robust | Sun and Abraham (2021) |
| Borusyak, Jaravel & Spiess | Imputation-based; efficient | Borusyak, Jaravel, and Spiess (2024) |
| Stacked DiD | Manually construct clean 2×2 datasets and stack them | Cengiz et al. (2019) and Gormley and Matsa (2011) |
A way to extend your MSc replication project is to apply some of the new DiD methods to your data and compare the results to the standard TWFE approach. This can provide insights into the robustness of your findings and demonstrate your ability to apply cutting-edge econometric techniques.
Thank You for Your Attention!
See You in the Next One!
The parallel trends assumption states that, in the absence of treatment, the average change in the outcome variable would have been the same for both the treatment and control groups.
\[Y_{i,t} = \gamma_{State} + \theta_{Month} + \beta_3Ban_{i} \times Post_{t} + \epsilon_{i,t}\]