Data Analytics for Finance

BM17FI · Rotterdam School of Management

ASSIGNMENT 02

Data Visualization

Learning Objectives¶

In this assignment, you will learn to:

  1. Create time series plots with twoway line
  2. Customize graph appearance (colors, labels, legends, titles)
  3. Add reference lines and shaded regions for events
  4. Create multi-panel graphs comparing firms
  5. Examine distributions with histograms
  6. Export publication-quality figures as PNG files

Context: Visualizing the Dieselgate Impact¶

On September 18, 2015, the U.S. Environmental Protection Agency issued a Notice of Violation to Volkswagen for emissions cheating. In this assignment, you'll create visualizations to explore how this event affected stock prices of German automakers.

Effective data visualization is crucial in finance for:

  • Communication: Presenting findings to stakeholders
  • Pattern recognition: Identifying trends and anomalies
  • Hypothesis generation: Visualizations often reveal unexpected relationships

Exercises¶

📝 Assignment Tasks
  1. Load cleaned data and verify panel structure
  2. Create a time series plot of VW stock prices (2013-2017)
  3. Plot all German automakers on the same graph for comparison
  4. Create an event window visualization with reference lines
  5. Examine the distribution of daily returns with a histogram
  6. Compare return distributions across firms
  7. Calculate and plot cumulative returns over time
  8. Export all graphs as publication-quality PNG files

Setup¶

Clear Environment¶

We start by clearing Stata's memory and disabling pagination.

✅ The environment is cleared and ready.
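
A minimal sketch of this setup step. The handout does not show the exact commands, so treat the two below as one common way to clear memory and turn off pagination, not as the official solution.

```stata
* Remove data, stored results, and user-defined programs from memory
clear all

* Let long output scroll without pausing at a --more-- prompt
set more off
```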

Set Graph Scheme¶

We'll use the stcolor scheme for clean, publication-ready graphs.

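Stata Tip: Graph Schemes
Schemes control the overall look of a graph. Some ship with Stata; others are user-written and must be installed first:
  • stcolor — Clean, minimalist; the default since Stata 18 (recommended for this assignment)
  • s2color — Stata's default color scheme through Stata 17
  • economist — Mimics The Economist magazine style
  • s1rcolor — Color on a black background (s1color is the white-background version)
  • plottig — Grid-based layout (user-written; part of the blindschemes package)
  • lean2 — Minimalist black and white (user-written)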
Feel free to experiment with different schemes to see which you prefer!
✅ Graph scheme set to stcolor.
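
As a hedged sketch, the scheme can be set once for the session and then checked. This assumes Stata 18 or later, where stcolor is available.

```stata
* Apply the scheme to every graph drawn in this session
set scheme stcolor

* Confirm which scheme is currently active
display "`c(scheme)'"
```

To try a different look on a single graph without changing the session default, add the scheme() option to that graph command, for example scheme(s2color).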

Set File Paths¶

We define global macros for all data and output directories.

/Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assignment
> s/02-assignment
📁 Base directory: /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/02-assignment
📁 Raw data folder: /Users/casparm4/Github/rsm-data-analytics-in-finance-privat
> e/private/assignments/02-assignment/data/raw
📁 Processed data folder: /Users/casparm4/Github/rsm-data-analytics-in-finance-
> private/private/assignments/02-assignment/data/processed
📁 Output directory: /Users/casparm4/Github/rsm-data-analytics-in-finance-priva
> te/private/assignments/02-assignment/output
📁 Figures folder: /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/02-assignment/output/figures
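
One possible way to define these macros is sketched below. The folder names follow the output above, and $raw and $figures are the names used in later tips. The names base, processed, and output are assumptions, as is the idea of starting from the current working directory.

```stata
* Assumption: Stata's working directory is the 02-assignment folder
global base "`c(pwd)'"

* Data and output locations relative to the base directory
global raw       "$base/data/raw"
global processed "$base/data/processed"
global output    "$base/output"
global figures   "$output/figures"

* Create the output folders if they do not exist yet
* (capture suppresses the error when a folder is already there)
capture mkdir "$output"
capture mkdir "$figures"

display "Figures folder: $figures"
```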

Section 1: Load Data and Setup¶

Task 1.1: Load Cleaned Dataset¶

What you'll do: Load the cleaned dataset from Assignment 1 (auto_firms_g_daily_clean.dta located in the "raw" folder) containing daily stock prices for 4 German automakers.

Why this matters: This dataset is the foundation for all our visualizations. It contains the panel data structure (firms × time) that we'll explore graphically.

What to expect: The dataset contains ~5,200 observations across 4 firms over the 2013-2017 period.

Stata Tip
Use use "$raw/filename.dta", clear to load Stata datasets. The clear option removes any data currently in memory.
---- CHECKPOINT: data loaded ----
Number of observations: 5216
Number of variables: 13
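
A short sketch of the loading step, assuming the $raw global points to the folder that holds the file. The two display lines reproduce the checkpoint counts shown above.

```stata
* Load the cleaned dataset, replacing anything currently in memory
use "$raw/auto_firms_g_daily_clean.dta", clear

* _N is the number of observations; c(k) is the number of variables
display "Number of observations: " _N
display "Number of variables: " c(k)
```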

Task 1.2: Examine Data Structure¶

What you'll do: Use describe and summarize to understand the data structure and verify key variables.

Why this matters: Before creating visualizations, you should always verify that the data matches your expectations.

Key variables to check:

  • gvkey: Firm identifier
  • conm: Company name
  • date: Trading date (Stata date format)
  • prccd: Closing price in EUR
  • ret: Daily log return

What to expect: 4 firms, dates formatted as "01jan2013" style, prices in EUR.

Contains data from /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/02-assignment/data/raw/auto_firms_g_daily_clean.dta
 Observations:         5,216                  
    Variables:            13                  6 Jan 2026 15:28
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
gvkey           long    %12.0g                Compustat company identifier
conm            str28   %28s                  Company name
date            float   %td                   Trading date
year            float   %9.0g                 Year
month           float   %9.0g                 Month
prccd           float   %9.0g                 Closing price (local currency)
ret             float   %9.0g                 Daily log return
ajexdi          float   %9.0g                 
sic             int     %8.0g                 
naics           long    %12.0g                
fic             str3    %9s                   
isin            str12   %12s                  
sedol           str7    %9s                   
-------------------------------------------------------------------------------
Sorted by: gvkey  date

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       prccd |      5,216    100.8507    39.92719     38.645     247.55
         ret |      4,172    .0003003    .0144726  -.1842681   .0801751
        date |      5,216     20270.7    527.0549      19359      21182

     +--------------------------------+
     | gvkey                     conm |
     |--------------------------------|
  1. | 17828   MERCEDES BENZ GROUP AG |
  2. | 17828   MERCEDES BENZ GROUP AG |
  3. | 17828   MERCEDES BENZ GROUP AG |
  4. | 17828   MERCEDES BENZ GROUP AG |
  5. | 17828   MERCEDES BENZ GROUP AG |
     |--------------------------------|
  6. | 17828   MERCEDES BENZ GROUP AG |
  7. | 17828   MERCEDES BENZ GROUP AG |
  8. | 17828   MERCEDES BENZ GROUP AG |
  9. | 17828   MERCEDES BENZ GROUP AG |
 10. | 17828   MERCEDES BENZ GROUP AG |
     |--------------------------------|
 11. | 17828   MERCEDES BENZ GROUP AG |
 12. | 17828   MERCEDES BENZ GROUP AG |
 13. | 17828   MERCEDES BENZ GROUP AG |
 14. | 17828   MERCEDES BENZ GROUP AG |
 15. | 17828   MERCEDES BENZ GROUP AG |
     |--------------------------------|
 16. | 17828   MERCEDES BENZ GROUP AG |
 17. | 17828   MERCEDES BENZ GROUP AG |
 18. | 17828   MERCEDES BENZ GROUP AG |
 19. | 17828   MERCEDES BENZ GROUP AG |
 20. | 17828   MERCEDES BENZ GROUP AG |
     +--------------------------------+
---- CHECKPOINT: data structure examined ----
Variables confirmed
✅ Test passed: Data loaded with expected dimensions.
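
The output above is consistent with the commands sketched below. The final tabulate is an extra check that is not part of the output shown: it counts observations per firm, which makes it easy to confirm that there are exactly 4 firms.

```stata
* Variable names, storage types, formats, and labels
describe

* Summary statistics for the key numeric variables
summarize prccd ret date

* First 20 rows of the firm identifiers
list gvkey conm in 1/20

* Extra check (assumption, not in the output above): rows per firm
tabulate conm
```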

Task 1.3: Declare Panel Structure¶

What you'll do: Set the panel structure using xtset to identify firms and time periods.

Why this matters: Even though we're not using time-series operators in this assignment, declaring the panel structure:

  • Verifies that each firm-date combination appears only once (xtset stops with an error on repeated time values)
  • Allows Stata's panel-aware commands to work
  • Makes the data structure explicit

Panel structure:

  • Panel variable: gvkey (identifies which firm)
  • Time variable: date (identifies when)

What to expect: Stata will report the number of panels (firms) and time periods per panel.

Stata Tip
Panel Structure
Syntax: xtset panelvar timevar
  • panelvar: Variable identifying panel units (firms) → gvkey
  • timevar: Variable identifying time periods → date
After xtset, use xtdescribe to examine panel characteristics.
Panel variable: gvkey (strongly balanced)
 Time variable: date, 01jan2013 to 29dec2017, but with gaps
         Delta: 1 day

   gvkey:  17828, 100022, ..., 100737                        n =          4
    date:  01jan2013, 02jan2013, ..., 29dec2017              T =       1304
           Delta(date) = 1 day
           Span(date)  = 1824 periods
           (gvkey*date uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                      1304    1304    1304      1304      1304    1304    1304

     Freq.  Percent    Cum. |  Pattern*
 ---------------------------+--------------------------------------------------
> ------------------------------------------------------
        4    100.00  100.00 |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX5
 ---------------------------+--------------------------------------------------
> ------------------------------------------------------
        4    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 ------------------------------------------------------------------------------
> ------------------------------------------------------
 *Each column represents 18 periods.

---- CHECKPOINT: panel structure declared ----
Panel data structure set successfully
✅ Test passed: Panel structure correctly declared.
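
A minimal sketch of the two commands behind the output above. The "but with gaps" message is expected here: date counts calendar days, while each firm has only 1,304 dated rows within a span of 1,824 days, so days without a row (most likely weekends) show up as gaps.

```stata
* Declare gvkey as the panel identifier and date as the time variable
xtset gvkey date

* Summarize the number of panels, periods per panel, and participation pattern
xtdescribe
```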

Section 2: Basic Time Series Plot¶

Task 2.1: Create VW Price Time Series¶

What you'll do: Create a line plot showing Volkswagen's stock price over the entire 2013-2017 period.

Why this matters: Time series plots are the foundation of financial data visualization. They reveal:

  • Trends: Long-term price movements
  • Volatility: Periods of stability vs. turbulence
  • Events: Sharp changes that warrant investigation

What to expect: You'll see VW's price trajectory from 2013-2017, with a dramatic drop in September 2015.

Stata Tip: twoway line
The twoway line command creates line plots:
twoway line yvar xvar if condition, ///
        title("Graph title") ///
        subtitle("Graph subtitle") ///
        xtitle("X-axis label") ytitle("Y-axis label")
Key options:
  • if — Filter observations (e.g., if gvkey == 100737)
  • title() — Main graph title
  • subtitle() — Subtitle
  • xtitle() — X-axis label
  • ytitle() — Y-axis label
  • note() — E.g. data source(s)
  • /// — Line continuation for readability
---- CHECKPOINT: VW price plot created ----
Graph displayed
✅ Test passed: Time series plot created successfully.

Task 2.2: Improve Graph Appearance¶

What you'll do: Enhance the graph with better formatting and save it to a file.

Why this matters: Small improvements in graph formatting dramatically increase readability:

  • Descriptive titles help readers understand context
  • Proper axis labels include units
  • Saving graphs allows you to use them in reports

What to add:

  • Title: "Volkswagen Stock Price (2013-2017)"
  • Subtitle: "Daily Closing Prices (2013-2017)"
  • X-axis label: "Date"
  • Y-axis label: "Price (EUR)"

What to expect: A polished graph saved as vw_price_timeseries.png in your figures folder.

Stata Tip: Saving Graphs
Use graph export to save graphs:
graph export "$figures/filename.png", replace width(1200)
Options:
  • replace — Overwrite existing file
  • width() — Image width in pixels (1200-1600 recommended)
  • Supported formats: .png, .pdf, .eps, .svg
file /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assig
> nments/02-assignment/output/figures/vw_price_timeseries.png written in PNG fo
> rmat
---- CHECKPOINT: graph exported ----
Saved to: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/
> assignments/02-assignment/output/figures/vw_price_timeseries.png
✅ Test passed: Graph exported successfully.
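
Putting Tasks 2.1 and 2.2 together, a sketch of the full command might look like this. It assumes the dataset is loaded and $figures is defined; the titles and labels are the ones listed in the task.

```stata
* Line plot of VW's closing price (gvkey 100737) with titles and axis labels
twoway line prccd date if gvkey == 100737, ///
    title("Volkswagen Stock Price (2013-2017)") ///
    subtitle("Daily Closing Prices (2013-2017)") ///
    xtitle("Date") ytitle("Price (EUR)")

* Save the graph currently in memory as a 1200-pixel-wide PNG
graph export "$figures/vw_price_timeseries.png", replace width(1200)
```

graph export always writes the most recently drawn graph, so it should directly follow the twoway command it belongs to.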

Section 3: Multi-Firm Comparison Plot¶

Task 3.1: Plot All Firms on Same Graph¶

What you'll do: Create a single graph showing stock prices for all 4 German automakers simultaneously.

Why this matters: Comparing multiple series on one graph reveals:

  • Relative performance: Which firms outperformed or underperformed
  • Correlation: Do prices move together or independently?
  • Peer effects: Did the Dieselgate scandal affect other German automakers?

What to expect: Four colored lines on one graph, one for each firm (VW, BMW, Mercedes-Benz, MAN).

Stata Tip: Multiple Lines with twoway
To plot multiple lines, use parentheses for each series:
twoway ///
    (line prccd date if gvkey == 100737, lcolor(navy)) ///
    (line prccd date if gvkey == 100022, lcolor(maroon)) ///
    (line prccd date if gvkey == 017828, lcolor(forest_green)) ///
    (line prccd date if gvkey == 100042, lcolor(orange)), ///
    legend(label(1 "Volkswagen AG") label(2 "BMW AG") ///
           label(3 "Mercedes-Benz Group") label(4 "MAN SE"))
Each set of parentheses creates one line. Use lcolor() to assign distinct colors, and legend(label(...)) to label each line with the full company name.

Firm identifiers:
  • 100737 — Volkswagen AG → navy
  • 100022 — BMW AG → maroon
  • 017828 — Mercedes-Benz Group → forest_green (gvkey is stored as a number, so this matches the value 17828 in the data)
  • 100042 — MAN SE → orange
Don't forget to add axis labels with xtitle("Date") and ytitle("Price (EUR)").
---- CHECKPOINT: multi-firm plot created ----
Graph displayed with 4 firms

Task 3.2: Refine Legend and Save¶

What you'll do: Improve the legend placement and save the comparison graph.

Why this matters: A well-placed legend enhances readability without obscuring data. Legend options include:

  • Position (inside or outside the plot area)
  • Number of columns
  • Title

What to expect: A polished multi-firm comparison saved as german_autos_comparison.png. Keep the same colors, axis labels, and legend labels (full company names) from Task 3.1.

Stata Tip: Legend Positioning
Control legend appearance with these suboptions inside legend():
  • position(6) — Position: 6=bottom, 3=right, 11=top-left
  • cols(2) — Number of columns in legend
  • size(small) — Legend text size
  • off — Hide legend entirely
file /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assig
> nments/02-assignment/output/figures/german_autos_comparison.png written in PN
> G format
---- CHECKPOINT: comparison graph exported ----
Saved to: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/
> assignments/02-assignment/output/figures/german_autos_comparison.png
✅ Test passed: Comparison graph exported.
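
As a sketch, the legend suboptions from the tip can be combined with the multi-line command from Task 3.1 as follows. The specific choices position(6), cols(2), and size(small) are one reasonable layout, not a required one.

```stata
* One line per firm; the legend sits below the plot in two columns
twoway ///
    (line prccd date if gvkey == 100737, lcolor(navy)) ///
    (line prccd date if gvkey == 100022, lcolor(maroon)) ///
    (line prccd date if gvkey == 17828,  lcolor(forest_green)) ///
    (line prccd date if gvkey == 100042, lcolor(orange)), ///
    xtitle("Date") ytitle("Price (EUR)") ///
    legend(label(1 "Volkswagen AG") label(2 "BMW AG") ///
           label(3 "Mercedes-Benz Group") label(4 "MAN SE") ///
           position(6) cols(2) size(small))

* Export the comparison graph
graph export "$figures/german_autos_comparison.png", replace width(1200)
```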

Section 4: Event Window Visualization¶

Task 4.1: Define Event Date¶

What you'll do: Define the Dieselgate event date (September 18, 2015) as a local macro for use in graphs.

Why this matters: Using macros for important dates makes your code:

  • Reusable: Change the date once, update all graphs
  • Readable: event_date is clearer than "td(18sep2015)"
  • Standard practice: Professional Stata code uses macros for key parameters

What to expect: A local macro containing the numeric Stata date for September 18, 2015.

Stata Tip: Local Macros and Dates
Local macros store values for temporary use:
local varname = value        // Store a value
display `varname'            // Use the value (opening backtick, closing apostrophe)
Stata date functions:
  • td(18sep2015) — Convert date to Stata numeric
  • td(1aug2015) — August 1, 2015
  • td(30nov2015) — November 30, 2015
These convert human-readable dates to Stata's internal date format (days since 1960-01-01).
---- CHECKPOINT: event dates defined ----
Event date: 18sep2015
Window: 01aug2015 to 30nov2015
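
A minimal sketch that reproduces the checkpoint above. The macro names event_date, window_start, and window_end are the ones referred to in Tasks 4.1 and 4.2; the %td format turns the stored number back into a readable date.

```stata
* Store the event date and the window boundaries as numeric Stata dates
local event_date   = td(18sep2015)
local window_start = td(01aug2015)
local window_end   = td(30nov2015)

* Display the macros in date format
display "Event date: " %td `event_date'
display "Window: " %td `window_start' " to " %td `window_end'
```

Local macros exist only within the do-file run (or selection) that defines them, so the definitions and the graph commands that use them need to be executed together.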

Task 4.2: Create Event Window Plot with Vertical Line¶

What you'll do: Zoom the graph to August-November 2015 and add a vertical line at the event date.

Why this matters: Event studies require zooming to the relevant period to see:

  • Pre-event stability: Normal price behavior before the shock
  • Event impact: Immediate market reaction
  • Post-event dynamics: Recovery or continued decline

A vertical line clearly marks when the event occurred.

What to expect: A focused plot showing the four-month window (August 1 to November 30, 2015) around September 18, 2015, with a vertical line marking the EPA announcement. Use the same firm colors, legend labels, and axis labels from Section 3. Restrict the date range using the window_start and window_end local macros you defined in Task 4.1.

Stata Tip: Vertical Reference Lines
Add vertical lines with xline():
twoway ..., ///
    xline(`event_date', lpattern(dash) lcolor(red)) ///
    ttext(100 `event_date' "Event", placement(e))
Options:
  • xline(value) — Draw vertical line at x=value
  • lpattern(dash) — Dashed line
  • lcolor(red) — Line color
  • if date >= ... & date <= ... — Restrict date range
---- CHECKPOINT: event window plot created ----
Graph shows Aug-Nov 2015 with event marker
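
One way to assemble the event window plot is sketched below. The macros are redefined at the top so the block runs on its own, and inrange() is used here as a compact alternative to writing date >= ... & date <= ... for each line.

```stata
* Event date and window boundaries (same values as in Task 4.1)
local event_date   = td(18sep2015)
local window_start = td(01aug2015)
local window_end   = td(30nov2015)

* Four firms, restricted to the window, with a dashed red line at the event
twoway ///
    (line prccd date if gvkey == 100737 & inrange(date, `window_start', `window_end'), lcolor(navy)) ///
    (line prccd date if gvkey == 100022 & inrange(date, `window_start', `window_end'), lcolor(maroon)) ///
    (line prccd date if gvkey == 17828  & inrange(date, `window_start', `window_end'), lcolor(forest_green)) ///
    (line prccd date if gvkey == 100042 & inrange(date, `window_start', `window_end'), lcolor(orange)), ///
    xline(`event_date', lpattern(dash) lcolor(red)) ///
    xtitle("Date") ytitle("Price (EUR)") ///
    legend(label(1 "Volkswagen AG") label(2 "BMW AG") ///
           label(3 "Mercedes-Benz Group") label(4 "MAN SE") ///
           position(6) cols(2))

* Task 4.3: export the graph
graph export "$figures/event_window_zoom.png", replace width(1200)
```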

Task 4.3: Export Event Window Plot¶

What you'll do: Save the event window visualization.

What to expect: event_window_zoom.png saved to your figures folder.

file /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assig
> nments/02-assignment/output/figures/event_window_zoom.png written in PNG form
> at
---- CHECKPOINT: event window exported ----
Saved to: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/
> assignments/02-assignment/output/figures/event_window_zoom.png
✅ Test passed: Event window graph exported.

Section 5: Returns Distribution¶

Task 5.1: Create Histogram of Daily Returns¶

What you'll do: Create a histogram of daily log returns across all firms and time periods.

Why this matters: Histograms reveal the distribution shape:

  • Normality: Are returns normally distributed? (important for statistical tests)
  • Fat tails: Are extreme returns more common than normal distribution predicts?
  • Skewness: Is the distribution symmetric?

What to expect: A bell-shaped distribution centered near zero, with some outliers.

Stata Tip: Histograms
The histogram command creates distribution plots:
histogram varname, ///
    normal ///
    bin(30) ///
    title("Title")
Key options:
  • normal — Overlay normal density curve
  • bin(n) — Number of bins (default: the smaller of √N and 10·log10(N), where N is the number of observations)
  • frequency — Show counts (default is density)
  • percent — Show percentages
(bin=50, start=-.1842681, width=.00528886)
---- CHECKPOINT: returns histogram created ----
Histogram displayed with normal overlay
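
A sketch of a histogram command consistent with the output above, which reports 50 bins. The title text is an assumption; the task does not prescribe one.

```stata
* Pooled histogram of daily log returns with a normal density overlay
histogram ret, normal bin(50) ///
    title("Distribution of Daily Log Returns") ///
    xtitle("Daily Log Return") ytitle("Density")
```

Bars that rise above the normal curve near zero and in the far tails are the visual signature of fat tails.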

Task 5.2: Export Returns Histogram¶

What you'll do: Save the returns distribution histogram as returns_histogram.png in your figures folder.

file /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assig
> nments/02-assignment/output/figures/returns_histogram.png written in PNG form
> at
---- CHECKPOINT: histogram exported ----
Saved to: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/
> assignments/02-assignment/output/figures/returns_histogram.png
✅ Test passed: Histogram exported.

Section 6: Firm-Specific Returns Distribution¶

Task 6.1: Compare Return Distributions by Firm¶

What you'll do: Create histograms of returns for each firm separately to compare volatility.

Why this matters: Different firms may have different volatility profiles:

  • VW likely has more extreme returns (especially around September 2015)
  • BMW and Mercedes may show more stable distributions
  • Comparing distributions reveals these differences

What to expect: Four separate histograms (or overlaid) showing that VW has wider tails than peers.

Stata Tip: Histograms by Group
Create separate histograms by group with by():
histogram ret, by(conm, note("")) normal bin(40) ///
    xtitle("Daily Log Return") ytitle("Density")
This creates a panel of histograms, one for each value of conm (company name). Use bin(40) for a good level of detail, and add axis labels for clarity.
---- CHECKPOINT: firm-specific histograms created ----
Panel of 4 histograms displayed
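
The visual comparison can be backed up with numbers. The sketch below is an optional addition, not a required task: it tabulates return statistics by firm so that differences in volatility (sd) and tail behavior (kurtosis, min, max) can be read off directly.

```stata
* Per-firm return statistics; one row per firm, one column per statistic
tabstat ret, by(conm) ///
    statistics(n mean sd skewness kurtosis min max) ///
    columns(statistics)
```

Stata reports raw kurtosis, for which a normal distribution has a value of 3, so values well above 3 point to fat tails.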

Section 7: Cumulative Returns Plot¶

ℹ️ Background: What are Cumulative Log Returns?
Cumulative log returns measure the total return over a period by summing daily log returns:

Formula: $\text{Cumulative Return}_t = \sum_{s=1}^{t} r_s$, where $r_s$ is the daily log return on day $s$ and the sum runs from the start of the sample to day $t$

Why cumulative log returns?
  • Additivity: Log returns sum correctly across time periods (simple returns don't)
  • Comparability: Starting all firms at 0 shows relative performance clearly
  • Interpretation: For small values, a cumulative log return approximates the percentage change; larger values understate it (0.50 corresponds to a simple return of exp(0.50) − 1 ≈ 65%)
Note: This is a simplified calculation for visualization. In Assignment 5 (Event Study), you'll learn more sophisticated methods for calculating cumulative abnormal returns (CARs) that account for expected returns.
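
The additivity property can be made explicit with a short derivation. The notation is an assumption introduced here: $P_s$ is the (adjusted) closing price on day $s$, $P_0$ the price at the start of the sample, and $r_s$ the daily log return.

```latex
\begin{align*}
r_s &= \ln\frac{P_s}{P_{s-1}} \\
\sum_{s=1}^{t} r_s
  &= \ln\frac{P_1}{P_0} + \ln\frac{P_2}{P_1} + \dots + \ln\frac{P_t}{P_{t-1}}
   = \ln\frac{P_t}{P_0} \\
\frac{P_t}{P_0} - 1 &= \exp\!\Big(\sum_{s=1}^{t} r_s\Big) - 1
\end{align*}
```

The intermediate prices cancel, which is why log returns can simply be summed. The last line converts back to a simple return: a cumulative log return of 0.81 corresponds to exp(0.81) − 1 ≈ 1.25, a gain of roughly 125%, while −0.156 corresponds to a loss of about 14.4%.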

Task 7.1: Calculate Cumulative Returns¶

What you'll do: Calculate cumulative log returns for each firm, normalized to start at 0.

Why this matters: Cumulative returns show total performance over time:

  • Which firm performed best/worst over the full sample?
  • When did major divergences occur?
  • How much wealth would an investor have gained/lost?

Formula: Cumulative return = sum of daily log returns within each firm

What to expect: A new variable cumret showing each firm's total return from start of sample.

Stata Tip: Cumulative Sums by Group
Calculate cumulative sums within groups:
bysort gvkey (date): gen cumret = sum(ret)
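This sums ret within each gvkey, sorted by date. The cumulative sum resets for each firm. Because sum() treats missing values as zero, cumret is non-missing even on days where ret is missing.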
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
      cumret |      5,216    .2332495    .1720732  -.1559657   .8104563
---- CHECKPOINT: cumulative returns calculated ----
Cumret variable created for all firms
✅ Test passed: Cumulative returns calculated.
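
To see how the running sum behaves, here is a self-contained toy example. The data are made up for illustration (two hypothetical firms, three days each, first return missing) and are unrelated to the assignment dataset; note that running it replaces the data in memory.

```stata
* Toy data: illustrative values only, not taken from the assignment dataset
clear
input firm day ret
1 1 .
1 2 0.01
1 3 -0.02
2 1 .
2 2 0.03
2 3 0.01
end

* Running sum of returns within each firm, in day order
bysort firm (day): gen cumret = sum(ret)

* One block per firm; cumret restarts at the first row of each firm
list, sepby(firm)
```

On the first day of each firm, ret is missing but cumret is 0, because sum() counts a missing value as zero. This explains why cumret has 5,216 observations in the summary above although ret has only 4,172.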

Task 7.2: Plot Cumulative Returns¶

What you'll do: Create a time series plot showing cumulative returns for all 4 firms, then save it.

Why this matters: Cumulative returns show total investment performance over time. A horizontal reference line at zero helps distinguish gains from losses.

What to expect: Four lines starting near 0 in 2013, diverging over time, with VW showing a sharp drop in September 2015. Save the graph as cumulative_returns.png in your figures folder.

Use the same firm colors (navy, maroon, forest_green, orange), legend labels (full company names), and legend positioning from Section 3. Add axis labels: "Date" for the x-axis and "Cumulative Log Return" for the y-axis. Include a horizontal dashed reference line at zero using yline(0, lpattern(dash) lcolor(gs10)).

file /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assig
> nments/02-assignment/output/figures/cumulative_returns.png written in PNG for
> mat
---- CHECKPOINT: cumulative returns plot exported ----
Saved to: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/
> assignments/02-assignment/output/figures/cumulative_returns.png
✅ Test passed: Cumulative returns plot exported.
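
A sketch of the full plotting command under the specifications above. It assumes cumret has been created as in Task 7.1 and reuses the legend layout sketched for Section 3 (bottom position, two columns), which is one possible choice.

```stata
* Cumulative log returns for all four firms with a zero reference line
twoway ///
    (line cumret date if gvkey == 100737, lcolor(navy)) ///
    (line cumret date if gvkey == 100022, lcolor(maroon)) ///
    (line cumret date if gvkey == 17828,  lcolor(forest_green)) ///
    (line cumret date if gvkey == 100042, lcolor(orange)), ///
    yline(0, lpattern(dash) lcolor(gs10)) ///
    xtitle("Date") ytitle("Cumulative Log Return") ///
    legend(label(1 "Volkswagen AG") label(2 "BMW AG") ///
           label(3 "Mercedes-Benz Group") label(4 "MAN SE") ///
           position(6) cols(2))

* Export the graph
graph export "$figures/cumulative_returns.png", replace width(1200)
```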

Section 8: Publication-Quality Export¶

Task 8.1: Review Graph Export Best Practices¶

What you'll do: Review the key principles for creating publication-quality graphs.

Why this matters: Professional presentations and publications require high-quality visualizations that are:

  • Readable: Clear labels, appropriate font sizes
  • Scalable: High resolution for printing
  • Consistent: Uniform style across all figures
  • Informative: Titles and notes provide context
Stata Tip: Publication-Quality Graphs
Best practices for graph export:
  • Resolution: Use width(1200) or higher (1200-1600 pixels)
  • Format: PNG for web/presentations, PDF/EPS for print publications
  • Scheme: Use consistent scheme across all figures (stcolor recommended)
  • Labels: Always include axis labels with units
  • Titles: Descriptive title + subtitle if needed
  • Legend: Clear labels, positioned to not obscure data
  • Notes: Data source and key details in note

Task 8.2: Verify All Figures Exported¶

What you'll do: List all files in your figures folder to confirm all 5 required graphs were created.

Expected files:

  1. vw_price_timeseries.png
  2. german_autos_comparison.png
  3. event_window_zoom.png
  4. returns_histogram.png
  5. cumulative_returns.png
Stata Tip: Shell commands
Prefix a command with ! to run it in your operating system's shell:
  • List the files in the figures folder with !ls -lh "$figures" (ls is a macOS/Linux command; Windows uses dir instead)
    • ls = "list" command that shows files and directories
    • -l = long format (shows detailed information like permissions, size, date)
    • -h = human-readable sizes (e.g., "1.2M" instead of "1234567")
---- CHECKPOINT: figures exported ----
All figures should appear in the list above
total 1168
-rw-r--r--@ 1 casparm4  staff   183K Feb 10 09:24 cumulative_returns.png
-rw-r--r--@ 1 casparm4  staff   115K Feb 10 09:24 event_window_zoom.png
-rw-r--r--@ 1 casparm4  staff   156K Feb 10 09:24 german_autos_comparison.png
-rw-r--r--@ 1 casparm4  staff    47K Feb 10 09:24 returns_histogram.png
-rw-r--r--@ 1 casparm4  staff    80K Feb 10 09:24 vw_price_timeseries.png
✅ Test passed: All 5 figures exported successfully.
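
Because !ls depends on the operating system, a portable check can be written in Stata itself. The sketch below is an optional alternative that loops over the five expected file names and uses confirm file to test whether each one exists in $figures.

```stata
* Expected figure names (without the .png extension)
local expected vw_price_timeseries german_autos_comparison ///
    event_window_zoom returns_histogram cumulative_returns

* Report for each file whether it exists in the figures folder
foreach f of local expected {
    capture confirm file "$figures/`f'.png"
    if _rc == 0 {
        display "found:   `f'.png"
    }
    else {
        display as error "missing: `f'.png"
    }
}
```

Stata's built-in dir command, for example dir "$figures/*.png", is another way to list the files without going through the shell.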

Summary¶

Congratulations! You have successfully:

✅ Loaded and verified the cleaned panel dataset from Assignment 1
✅ Created a time series plot showing VW stock prices over 2013-2017
✅ Compared all 4 German automakers on a multi-line plot
✅ Zoomed to the event window and added a vertical reference line
✅ Examined the distribution of daily returns with histograms
✅ Compared return distributions across firms
✅ Calculated and plotted cumulative returns over time
✅ Exported all 5 graphs as publication-quality PNG files

Key Concepts Learned¶

Time Series Visualization:

  • Creating line plots with twoway line
  • Multi-line plots with color-coded legends
  • Date range filtering with if date >= ... & date <= ...

Event Visualization:

  • Adding vertical reference lines with xline()
  • Defining dates with td() function
  • Using local macros for reusable parameters

Distribution Analysis:

  • Histograms with histogram command
  • Normal density overlays
  • Panel histograms with by() option

Publication Quality:

  • Consistent graph schemes (stcolor)
  • Descriptive titles, axis labels, and notes
  • High-resolution export with graph export

Return Calculations:

  • Cumulative calculations with bysort and sum()

These visualization skills are fundamental for:

  • Exploratory data analysis: Understanding patterns before modeling
  • Communicating findings: Presenting results to stakeholders
  • Event studies: Visualizing market reactions to corporate events

The graphs you created today will be essential for Assignment 3, where you'll use regression analysis to quantify the effects you've visualized.


References¶

  • Stata Graphics Reference Manual
  • twoway Documentation
  • Stata Graph Schemes
  • EPA Notice of Violation (2015)

BM17FI · Academic Year 2025–26

Erasmus University Rotterdam

Created by: Caspar David Peter

© 2026 Rotterdam School of Management