Data Analytics for Finance

BM17FI · Rotterdam School of Management


ASSIGNMENT 06

Replication Exercise

Learning Objectives¶

This assignment prepares you for MSc thesis Module 2/3 (replication and extension) by having you:

  1. Read and understand methodology from a published empirical finance paper
  2. Build an analysis panel from multiple raw data sources (merge, reshape)
  3. Calculate returns following published methodology (compound, standardized)
  4. Create derived variables (terciles, interactions)
  5. Work with panel data at geographic (municipality) level
  6. Estimate fixed effects regressions with appropriate standard errors
  7. Interpret results and discuss limitations honestly
  8. Understand the gap between "ideal" identification and feasible tests
📝 Assignment Tasks
In this assignment, you will:
  1. Load and examine three raw datasets (crime, income, stock returns)
  2. Calculate monthly stock returns following published methodology
  3. Reshape crime data from long to wide format
  4. Merge datasets to create a municipality-month panel
  5. Create analysis variables (income terciles, interactions)
  6. Estimate panel regressions testing differential effects
  7. Interpret findings and discuss limitations
  8. Reflect on thesis Module 2/3 connection

Setup¶

Environment Setup¶

We begin by clearing the environment and preparing Stata for analysis.
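The setup cell itself is hidden in this export; a minimal sketch of what such a cell typically contains (the scheme name matches the confirmation printed below):

```stata
* Minimal environment reset (sketch; the actual setup cell is not shown)
clear all           // remove data, matrices, and programs from memory
set more off        // print long output without pausing
set scheme stcolor  // graph scheme used for all figures
```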

✅ Graph scheme set to stcolor.
✅ The environment is cleared and ready.

Set File Paths¶

We define global macros for all file paths to keep code clean and portable.
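A sketch of how such globals can be defined — the base path below is a placeholder; substitute your own project directory:

```stata
* Path globals (sketch; replace the base path with your own)
global base      "~/assignments/06-assignment"
global raw       "$base/data/raw"
global processed "$base/data/processed"
global output    "$base/output"
global tables    "$output/tables"
global figures   "$output/figures"
```

Defining paths once at the top means every later `use` and `save` stays portable: a collaborator only has to change `$base`.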

/Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assignment
> s/06-assignment
📁 Base directory: /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/06-assignment
📁 Raw data folder: /Users/casparm4/Github/rsm-data-analytics-in-finance-privat
> e/private/assignments/06-assignment/data/raw
📁 Processed data folder: /Users/casparm4/Github/rsm-data-analytics-in-finance-
> private/private/assignments/06-assignment/data/processed
📁 Output directory: /Users/casparm4/Github/rsm-data-analytics-in-finance-priva
> te/private/assignments/06-assignment/output
📁 Tables folder: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/
> private/assignments/06-assignment/output/tables
📁 Figures folder: /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/06-assignment/output/figures

Part A: Data Preparation¶

Section 1: Introduction and Paper Context¶

Research Question¶

This assignment is based on Huck (2024) "The Psychological Externalities of Investing: Evidence from Stock Returns and Crime" published in The Review of Financial Studies.

Key finding from Huck (2024):

  • When stock markets rise, violent crime decreases in high-income areas (where people are more likely to be investors)
  • When stock markets rise, violent crime increases in low-income areas (where people are less likely to be investors)
  • This suggests that relative wealth/status affects psychological well-being and behavior

What We're Testing¶

We test whether this differential relationship holds in the Netherlands using:

  • Monthly crime data from CBS/Politie (~350 municipalities, 2012-2019)
  • Monthly AEX index returns
  • Cross-sectional income data by municipality

Why This Assignment Matters for Your Thesis¶

Published research often starts with an observation in one setting. A valuable contribution is testing whether findings generalize to new contexts. Even null results are informative.

In your thesis, you will spend 60-80% of your time preparing data. This assignment gives you practice with the exact operations you'll need:

  • Merging datasets from different sources
  • Calculating returns properly (following published methodology)
  • Reshaping data
  • Creating analysis variables

Assignment Roadmap¶

Part A: Data Preparation

  1. Load raw data (crime, income, AEX prices)
  2. Calculate monthly returns (compound and standardized)
  3. Reshape crime data (long → wide)
  4. Merge datasets into a municipality-month panel
  5. Create analysis variables (terciles, interactions)
ℹ️ Why Return Calculation Matters
Huck (2024) uses standardized returns: daily returns divided by trailing 252-day standard deviation. This follows Engelberg & Parsons (2016) and Edmans, Garcia & Norli (2007).

Standardization matters because a 2% return during low volatility is more "newsworthy" than 2% during a crisis. A standardized return of 2 means the market moved 2 standard deviations—regardless of the volatility regime.

For monthly analysis, we aggregate standardized daily returns. This preserves the "surprise" interpretation while matching our crime data frequency.

⚠️ Important Limitation
The original paper uses daily data to identify contemporaneous same-day effects. Our monthly data cannot test this timing mechanism.

We are testing a weaker hypothesis: whether monthly returns correlate with monthly crime differently across income groups.

This is a common constraint in replication work—data availability often limits what can be tested. Your thesis will face similar challenges.


Section 2: Load and Examine Raw Data¶

We have three raw datasets to work with:

  1. crime.dta: Monthly crime counts by municipality (long format by crime type)
  2. income_data.dta: Average median income by municipality (cross-section)
  3. aex_daily_data.dta: Daily AEX index prices (we'll compute returns)

Let's load and examine each dataset to understand its structure.

Task 2.1: Load and Examine Crime Data¶

📝 Task
Load the crime dataset from $raw/crime.dta and examine its structure. The data should contain monthly crime counts for each municipality and crime type. Use describe to see variables, list to see the first few observations, and summarize to get basic statistics.
💡 Stata Tip
Consider using the following commands:
  • use "$raw/crime.dta", clear — to load the dataset
  • describe — to see variable names and types
  • list in 1/10 — to view first 10 rows
  • tab crime_type — to see what crime types are included
  • summarize reported_crimes — to get basic statistics
Contains data from /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/06-assignment/data/raw/crime.dta
 Observations:       172,344                  
    Variables:             5                  06 Jan 2026 22:55
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
crime_type      str21   %-9s                  Type of Crime
municipality~de str6    %-9s                  Municipality Code
municipality~me str29   %-9s                  Municipality Name
reported_crimes long    %12.0g                
date            double  %td                   Date (Year-Month)
-------------------------------------------------------------------------------
Sorted by: 

     +-------------------------------------------------------------------+
     | crime_type          munic~de   municipa~me   report~s        date |
     |-------------------------------------------------------------------|
  1. | Totaal misdrijven   GM1680     Aa en Hunze         84   01jan2012 |
  2. | Totaal misdrijven   GM1680     Aa en Hunze        103   01feb2012 |
  3. | Totaal misdrijven   GM1680     Aa en Hunze        101   01mar2012 |
  4. | Totaal misdrijven   GM1680     Aa en Hunze         71   01apr2012 |
  5. | Totaal misdrijven   GM1680     Aa en Hunze         78   01may2012 |
     |-------------------------------------------------------------------|
  6. | Totaal misdrijven   GM1680     Aa en Hunze         50   01jun2012 |
  7. | Totaal misdrijven   GM1680     Aa en Hunze         69   01jul2012 |
  8. | Totaal misdrijven   GM1680     Aa en Hunze         99   01aug2012 |
  9. | Totaal misdrijven   GM1680     Aa en Hunze         73   01sep2012 |
 10. | Totaal misdrijven   GM1680     Aa en Hunze         77   01oct2012 |
     +-------------------------------------------------------------------+

        Type of Crime |      Freq.     Percent        Cum.
----------------------+-----------------------------------
1.4.2 Moord, doodslag |     57,448       33.33       33.33
   1.4.5 Mishandeling |     57,448       33.33       66.67
    Totaal misdrijven |     57,448       33.33      100.00
----------------------+-----------------------------------
                Total |    172,344      100.00

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
reported_c~s |    126,324    101.4997     377.645          1       9239
---- CHECKPOINT: Crime data loaded ----
Observations: 172344
Variables: 5
✅ Task 2.1 tests passed

Task 2.2: Load and Examine Income Data¶

📝 Task
Load the income dataset from $raw/income_data.dta and examine its structure. This is a cross-sectional dataset with one observation per municipality containing average median standardized income for 2011-2018.
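A sketch of the loading step; the income variable's full name is truncated in the `describe` output below, so a wildcard is used rather than guessing it:

```stata
* Load and inspect the income cross-section (sketch)
use "$raw/income_data.dta", clear
describe
list in 1/10
summarize avg_*   // wildcard: the full variable name is truncated in describe
```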
Contains data from /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/06-assignment/data/raw/income_data.dta
 Observations:           355                  
    Variables:             2                  06 Jan 2026 22:55
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
municipality_~e str6    %-9s                  Municipality Code
avg_median_st~e double  %10.0g                Average Median Standardized
                                                Income
-------------------------------------------------------------------------------
Sorted by: 

     +----------------------+
     | munici~e   avg_med~e |
     |----------------------|
  1. | GM0003     20.606186 |
  2. | GM0010     21.084211 |
  3. | GM0014     19.855506 |
  4. | GM0024     22.255462 |
  5. | GM0034     22.317833 |
     |----------------------|
  6. | GM0037     20.437981 |
  7. | GM0047      20.88774 |
  8. | GM0050     23.638929 |
  9. | GM0059     20.745074 |
 10. | GM0060     23.382847 |
     +----------------------+

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
avg_median~e |        355    23.53126    1.867351   19.83227   36.45822
---- CHECKPOINT: Income data loaded ----
Municipalities: 355
✅ Task 2.2 tests passed

Task 2.3: Load and Examine AEX Daily Price Data¶

📝 Task
Load the daily AEX index data from $raw/aex_daily_data.dta and examine its structure. This dataset contains daily closing prices for the AEX index. We will calculate returns from these prices in the next section.
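A sketch of the loading step; `summarize date, format` is a quick way to see the sample's start and end as readable dates:

```stata
* Load and inspect daily AEX prices (sketch)
use "$raw/aex_daily_data.dta", clear
describe
list in 1/5
summarize price
summarize date, format   // the format option prints min/max as readable dates
```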
Contains data from /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/06-assignment/data/raw/aex_daily_data.dta
 Observations:         2,609                  
    Variables:             2                  06 Jan 2026 22:55
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
date            double  %td                   Date
price           double  %10.0g                AEX Index Closing Price
-------------------------------------------------------------------------------
Sorted by: 

     +----------------------+
     |      date      price |
     |----------------------|
  1. | 03jan2011   359.8629 |
  2. | 04jan2011   358.8617 |
  3. | 05jan2011   357.2818 |
  4. | 06jan2011   356.8858 |
  5. | 07jan2011   356.4425 |
     |----------------------|
  6. | 10jan2011   354.2092 |
  7. | 11jan2011    358.303 |
  8. | 12jan2011   362.4137 |
  9. | 13jan2011   360.7873 |
 10. | 14jan2011   361.3202 |
     +----------------------+

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       price |      2,609    451.2762    92.43622   263.4371   629.2317

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        date |      2,609     20454.8    1054.616      18630      22280
---- CHECKPOINT: AEX data loaded ----
Trading days: 2609
✅ Task 2.3 tests passed

Section 3: Calculate AEX Returns (Following Huck 2024 Methodology)¶

Following Huck (2024), we need to calculate standardized returns. This involves:

  1. Calculate daily returns
  2. Calculate trailing 252-day standard deviation (one trading year)
  3. Standardize daily returns by dividing by trailing SD
  4. Aggregate to monthly frequency

We'll also calculate compound monthly returns for comparison.

Task 3.1: Calculate Daily Returns¶

📝 Task
Calculate daily returns from the AEX price data. Create both:
  • Simple returns: ret_daily = (price_t / price_(t-1)) - 1
  • Log returns: ret_log = ln(price_t / price_(t-1))
Make sure the data is sorted by date first.
💡 Stata Tip
For time series operations:
  • sort date — sort data chronologically
  • gen ret_daily = (price / price[_n-1]) - 1 — simple return
  • gen ret_log = ln(price / price[_n-1]) — log return
  • summarize ret_daily, detail — check for reasonable range
(1 missing value generated)
(1 missing value generated)

                     Daily simple return
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -.0323815      -.1075333
 5%    -.0172826      -.0764755
10%    -.0116792        -.05704       Obs               2,608
25%    -.0047384       -.052363       Sum of wgt.       2,608

50%     .0004368                      Mean           .0002727
                        Largest       Std. dev.      .0110496
75%     .0058393       .0421586
90%     .0118066       .0452309       Variance       .0001221
95%     .0171069       .0461471       Skewness      -.5256861
99%     .0295844       .0897213       Kurtosis       10.82545
Non-missing returns: 2608
Mean daily return: 0.0003
Std dev: 0.0110
✅ Task 3.1 tests passed

Task 3.2: Calculate Trailing 252-Day Standard Deviation¶

📝 Task
Calculate the trailing 252-day (approximately one trading year) standard deviation of daily returns. This will be used to standardize returns. Use the rangestat package (install it once with ssc install rangestat) to calculate rolling statistics efficiently.
💡 Stata Tip
The rangestat command calculates statistics over a moving window:
  • rangestat (sd) trailing_sd = ret_daily, interval(date -252 -1)
  • The interval -252 -1 means "from 252 days ago to 1 day ago" (excludes today)
  • Early observations have missing trailing_sd until at least two prior returns fall in the window; rangestat uses however many observations the window contains, so only the first few days are missing
               Trailing 252-day SD of returns
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0052468       .0011458
 5%     .0058144       .0015453
10%     .0067857       .0016472       Obs               2,606
25%     .0075677       .0021919       Sum of wgt.       2,606

50%     .0084327                      Mean           .0102832
                        Largest       Std. dev.      .0038261
75%     .0130778       .0203429
90%     .0162531       .0203489       Variance       .0000146
95%     .0187518       .0203502       Skewness       .9560908
99%      .020195       .0203506       Kurtosis       2.984837
  3
Missing SD values (first ~252 days): 3
Mean trailing SD: 0.0103
✅ Task 3.2 tests passed

Task 3.3: Calculate Standardized Daily Returns¶

📝 Task
Create standardized daily returns by dividing each day's return by its trailing standard deviation. This follows the Huck (2024) methodology and creates a measure of "surprise" that accounts for the volatility regime.

Name the new variable ret_std_daily.
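With trailing_sd from Task 3.2 in memory, the standardization is a single division — a minimal sketch:

```stata
* Standardize daily returns by the trailing 252-day volatility (sketch)
gen ret_std_daily = ret_daily / trailing_sd
label variable ret_std_daily "Standardized daily return (ret/SD)"
summarize ret_std_daily, detail
```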

(3 missing values generated)

             Standardized daily return (ret/SD)
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -3.347214      -9.463235
 5%    -1.738406      -7.754665
10%    -1.204699      -5.648883       Obs               2,606
25%    -.5011903       -4.94842       Sum of wgt.       2,606

50%      .039744                      Mean           .0189555
                        Largest       Std. dev.      1.077812
75%     .6185057       3.729102
90%     1.172071       4.049637       Variance       1.161679
95%     1.639769        5.27277       Skewness      -.7527249
99%      2.53291       5.910226       Kurtosis       8.611722
Non-missing standardized returns: 2606
Mean: 0.0190
Std dev: 1.0778
✅ Task 3.3 tests passed

Task 3.4: Create Year-Month Variable¶

📝 Task
Create a year-month variable that we'll use to aggregate daily data to monthly frequency. Use Stata's mofd() function to convert daily dates to monthly format.
💡 Stata Tip
To work with monthly dates:
  • gen year_month = mofd(date) — convert daily date to monthly
  • format year_month %tm — format as YYYY-MM
✅ Task 3.4 tests passed

Task 3.5: Aggregate to Monthly Frequency¶

📝 Task
Aggregate daily data to monthly frequency. Calculate:
  • Compound return: exp(sum of log returns) - 1
  • Standardized monthly return: sum of standardized daily returns
  • Volatility: standard deviation of daily returns
  • Trading days: count of observations per month
💡 Stata Tip
First create monthly aggregates within groups, then collapse:
  • bysort year_month: egen ret_log_sum = total(ret_log)
  • gen ret_compound = exp(ret_log_sum) - 1
  • bysort year_month: egen ret_std_monthly = total(ret_std_daily)
  • collapse (first) ret_compound ret_std_monthly (sd) volatility = ret_daily (count) n_trading_days = ret_daily, by(year_month)
  • gen date = dofm(year_month) — create a daily date variable for later merging with crime data
  • format date %td — apply date format
     +--------------------------------------------------------------------+
     | year_m~h   ret_com~d   ret_std~y   volati~y   n_trad~s        date |
     |--------------------------------------------------------------------|
  1. |   2011m1    .0024548    .5534533    .007666         20   01jan2011 |
  2. |   2011m2    .0232263    2.948537   .0068946         20   01feb2011 |
  3. |   2011m3   -.0094981   -2.251385   .0097187         23   01mar2011 |
  4. |   2011m4   -.0155369   -1.979866   .0076843         21   01apr2011 |
  5. |   2011m5   -.0291778   -3.587304   .0083145         22   01may2011 |
     |--------------------------------------------------------------------|
  6. |   2011m6    -.028016   -3.423634   .0093389         22   01jun2011 |
  7. |   2011m7   -.0308594   -3.683877    .010141         21   01jul2011 |
  8. |   2011m8   -.1100779   -13.34819   .0223073         23   01aug2011 |
  9. |   2011m9   -.0435389   -3.706579    .023562         22   01sep2011 |
 10. |  2011m10    .0975143    6.884334   .0174383         21   01oct2011 |
     |--------------------------------------------------------------------|
 11. |  2011m11   -.0254433   -1.531934   .0208009         22   01nov2011 |
 12. |  2011m12    .0427059    2.665466   .0124832         22   01dec2011 |
     +--------------------------------------------------------------------+

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
  year_month |        120       671.5    34.78505        612        731
ret_compound |        120    .0054344    .0408573  -.1100779   .1351352
ret_std_mo~y |        120    .4116507    4.289544  -13.34819    8.55268
  volatility |        120    .0098128    .0052895    .003573   .0422672
n_trading_~s |        120    21.73333    .9850422         20         23
-------------+---------------------------------------------------------
        date |        120    20438.45    1058.793      18628      22250
---- CHECKPOINT: Monthly aggregation complete ----
Months: 120
Mean monthly return: 0.0054
SD of monthly returns: 0.0409
✅ Task 3.5 tests passed

Task 3.6: Save Monthly Returns Data¶

📝 Task
Save the monthly returns data to $processed/aex_monthly.dta. This will be merged with the crime data later.
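A sketch of the save step; the replace option lets the do-file be rerun without a "file already exists" error:

```stata
* Save monthly returns for the later merge (sketch)
save "$processed/aex_monthly.dta", replace
```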
file
    /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assi
    > gnments/06-assignment/data/processed/aex_monthly.dta saved
✅ Task 3.6 tests passed

Section 4: Reshape Crime Data (Long to Wide)¶

The crime data is currently in long format: one row per municipality × month × crime_type combination.

We need wide format: one row per municipality × month, with separate columns for each crime type.

Why? For regression analysis, we need a single outcome variable per observation.

⚠️ Important: Reshape Syntax
Stata's reshape command requires:
  1. Unique i() identifier — here: municipality_code date
  2. j() variable with values to become column suffixes — here: crime_id (numeric)
  3. Variables to reshape — here: reported_crimes
Common errors:
  • Duplicates in i() j() combinations → check with: duplicates report municipality_code date crime_id
  • String j() variable → reshape needs the string option for string j(); we create a numeric crime_id instead

Task 4.1: Examine Current Structure¶

📝 Task
Load the crime data and examine its current long format structure. Use tab crime_type to see what crime types are included.
Contains data from /Users/casparm4/Github/rsm-data-analytics-in-finance-private
> /private/assignments/06-assignment/data/raw/crime.dta
 Observations:       172,344                  
    Variables:             5                  06 Jan 2026 22:55
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
crime_type      str21   %-9s                  Type of Crime
municipality~de str6    %-9s                  Municipality Code
municipality~me str29   %-9s                  Municipality Name
reported_crimes long    %12.0g                
date            double  %td                   Date (Year-Month)
-------------------------------------------------------------------------------
Sorted by: 

        Type of Crime |      Freq.     Percent        Cum.
----------------------+-----------------------------------
1.4.2 Moord, doodslag |     57,448       33.33       33.33
   1.4.5 Mishandeling |     57,448       33.33       66.67
    Totaal misdrijven |     57,448       33.33      100.00
----------------------+-----------------------------------
                Total |    172,344      100.00

     +--------------------------------------------------------+
     | municipa~me        date   crime_type          report~s |
     |--------------------------------------------------------|
  1. | Aa en Hunze   01jan2012   Totaal misdrijven         84 |
  2. | Aa en Hunze   01feb2012   Totaal misdrijven        103 |
  3. | Aa en Hunze   01mar2012   Totaal misdrijven        101 |
  4. | Aa en Hunze   01apr2012   Totaal misdrijven         71 |
  5. | Aa en Hunze   01may2012   Totaal misdrijven         78 |
     |--------------------------------------------------------|
  6. | Aa en Hunze   01jun2012   Totaal misdrijven         50 |
  7. | Aa en Hunze   01jul2012   Totaal misdrijven         69 |
  8. | Aa en Hunze   01aug2012   Totaal misdrijven         99 |
  9. | Aa en Hunze   01sep2012   Totaal misdrijven         73 |
 10. | Aa en Hunze   01oct2012   Totaal misdrijven         77 |
     |--------------------------------------------------------|
 11. | Aa en Hunze   01nov2012   Totaal misdrijven         67 |
 12. | Aa en Hunze   01dec2012   Totaal misdrijven         53 |
 13. | Aa en Hunze   01jan2013   Totaal misdrijven         76 |
 14. | Aa en Hunze   01feb2013   Totaal misdrijven         65 |
 15. | Aa en Hunze   01mar2013   Totaal misdrijven         56 |
     +--------------------------------------------------------+
---- CHECKPOINT: Crime data structure examined ----
Total observations (municipality × month × crime_type): 172344
Crime types: 3
✅ Task 4.1 tests passed

Task 4.2: Create Numeric Crime Type Identifier¶

📝 Task
Create a numeric identifier for crime types, since reshape works most naturally with a numeric j() variable:
  • 1 = "Totaal misdrijven" (Total crimes)
  • 2 = "1.4.5 Mishandeling" (Assault)
  • 3 = "1.4.2 Moord, doodslag" (Murder/manslaughter)

Name the new variable crime_id.
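One way to build the mapping — a sketch whose value labels match the cross-tabulation shown below:

```stata
* Map crime types to a numeric id for reshape's j() option (sketch)
gen crime_id = .
replace crime_id = 1 if crime_type == "Totaal misdrijven"
replace crime_id = 2 if crime_type == "1.4.5 Mishandeling"
replace crime_id = 3 if crime_type == "1.4.2 Moord, doodslag"
label define crime_lbl 1 "total" 2 "assault" 3 "murder"
label values crime_id crime_lbl
tab crime_id crime_type   // verify the mapping is one-to-one
```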

(172,344 missing values generated)
(57,448 real changes made)
(57,448 real changes made)
(57,448 real changes made)

           |          Type of Crime
  crime_id | 1.4.2 M..  1.4.5 M..  Totaal .. |     Total
-----------+---------------------------------+----------
     total |         0          0     57,448 |    57,448 
   assault |         0     57,448          0 |    57,448 
    murder |    57,448          0          0 |    57,448 
-----------+---------------------------------+----------
     Total |    57,448     57,448     57,448 |   172,344 
✅ Task 4.2 tests passed

Task 4.3: Reshape Wide¶

📝 Task
Reshape the data from long to wide format, creating separate columns for each crime type. After reshaping, rename the variables to meaningful names (total_crimes, assault_crimes, murder_crimes).
💡 Stata Tip
Reshaping steps:
  • keep municipality_code municipality_name date crime_id reported_crimes — keep only needed variables
  • reshape wide reported_crimes, i(municipality_code date) j(crime_id)
  • rename reported_crimes1 total_crimes — rename to meaningful names
Duplicates in terms of municipality_code date crime_id

--------------------------------------
   Copies | Observations       Surplus
----------+---------------------------
        1 |       172344             0
--------------------------------------
(j = 1 2 3)

Data                               Long   ->   Wide
-----------------------------------------------------------------------------
Number of observations          172,344   ->   57,448      
Number of variables                   5   ->   6           
j variable (3 values)          crime_id   ->   (dropped)
xij variables:
                        reported_crimes   ->   reported_crimes1 reported_crimes
> 2 reported_crimes3
-----------------------------------------------------------------------------

Contains data
 Observations:        57,448                  
    Variables:             6                  
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
municipality~de str6    %-9s                  Municipality Code
date            double  %td                   Date (Year-Month)
total_crimes    long    %12.0g                Total crimes reported
assault_crimes  long    %12.0g                Assault crimes reported
murder_crimes   long    %12.0g                Murder/manslaughter crimes
                                                reported
municipality~me str29   %-9s                  Municipality Name
-------------------------------------------------------------------------------
Sorted by: municipality_code  date
     Note: Dataset has changed since last saved.

     +-------------------------------------------------------+
  1. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01jan2012 |     1562 |      111 |        5 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
  2. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01feb2012 |     1373 |       95 |        4 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
  3. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01mar2012 |     1663 |      105 |        8 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
  4. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01apr2012 |     1520 |       91 |        3 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
  5. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01may2012 |     1728 |       90 |        3 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
  6. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01jun2012 |     1547 |       98 |        8 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
  7. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01jul2012 |     1598 |       86 |        4 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
  8. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01aug2012 |     1450 |       83 |        5 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
  9. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01sep2012 |     1742 |      100 |        6 |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+

     +-------------------------------------------------------+
 10. | munic~de |      date | total_~s | assaul~s | murder~s |
     | GM0014   | 01oct2012 |     1703 |      108 |        . |
     |-------------------------------------------------------|
     |                 municipality_name                     |
     |                 Groningen (gemeente)                  |
     +-------------------------------------------------------+
---- CHECKPOINT: Data reshaped wide ----
Observations (municipality × month): 57448

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
total_crimes |     57,243    212.5807    539.8091          1       9239
assault_cr~s |     51,940    11.79105    28.66676          1        463
murder_cri~s |     17,141    2.372032    3.827666          1         48
✅ Task 4.3 tests passed

Task 4.4: Handle Missing Values¶

📝 Task
Check for and handle missing values in the crime count variables. Missing crime counts should be treated as zero (no crimes reported that month).
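A minimal sketch of one way to do this (the counting and display steps are an assumed implementation; variable names are taken from the output below):

```stata
* Replace missing crime counts with zero, reporting how many values change
foreach var of varlist total_crimes assault_crimes murder_crimes {
    count if missing(`var')
    display "Replacing " r(N) " missing values in `var' with 0"
    replace `var' = 0 if missing(`var')
}
```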
  205
Replacing 205 missing values in total_crimes with 0
(205 real changes made)
  5,508
Replacing 5508 missing values in assault_crimes with 0
(5,508 real changes made)
  40,307
Replacing 40307 missing values in murder_crimes with 0
(40,307 real changes made)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
total_crimes |     57,448    211.8221    538.9942          0       9239
assault_cr~s |     57,448    10.66055    27.47804          0        463
murder_cri~s |     57,448    .7077531     2.35568          0         48
  0
Remaining missing values: 0
✅ Task 4.4 tests passed

Task 4.5: Save Wide Crime Data¶

📝 Task
Save the reshaped wide crime data to $processed/crime_wide.dta.
file
    /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assi
    > gnments/06-assignment/data/processed/crime_wide.dta saved
✅ Wide crime data saved to: /Users/casparm4/Github/rsm-data-analytics-in-financ
> e-private/private/assignments/06-assignment/data/processed/crime_wide.dta
Observations: 57448
✅ Task 4.5 tests passed

Section 5: Merge Datasets¶

Now we merge our three datasets:

  1. Crime data (wide format): municipality × month panel
  2. Income data: municipality-level cross-section
  3. AEX returns: monthly time series

The goal is to create a municipality-month panel with all variables.

Task 5.1: Start with Crime Panel¶

📝 Task
Load the wide crime data from $processed/crime_wide.dta to use as the base dataset for merging. This will be our "master" dataset since it defines the municipality-month panel structure.
Observations: 57448
Variables: 6
Time period: 01jan2012 to 01nov2025
✅ Task 5.1 tests passed

Task 5.2: Merge Income Data¶

📝 Task
Merge the income data using municipality_code. This is a many-to-one merge (m:1) because:
  • Crime data has many observations per municipality (one per month)
  • Income data has one observation per municipality (cross-section)
Keep only municipalities that appear in both datasets (_merge == 3).
Stata Tip: Merge Types
  • merge m:1 municipality_code using "$raw/income_data.dta"
  • tab _merge — check merge results
  • keep if _merge == 3 — keep only matched
  • drop _merge — remove merge indicator
    Result                      Number of obs
    -----------------------------------------
    Not matched                         1,187
        from master                     1,169  (_merge==1)
        from using                         18  (_merge==2)

    Matched                            56,279  (_merge==3)
    -----------------------------------------

   Matching result from |
                  merge |      Freq.     Percent        Cum.
------------------------+-----------------------------------
        Master only (1) |      1,169        2.03        2.03
         Using only (2) |         18        0.03        2.07
            Matched (3) |     56,279       97.93      100.00
------------------------+-----------------------------------
                  Total |     57,466      100.00
(1,187 observations deleted)
Observations after merge: 56279

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
avg_median~e |     56,279    23.50222    1.863046   19.83227   36.45822
✅ Task 5.2 tests passed

Task 5.3: Create Year-Month Variable for Merging¶

📝 Task
Create a year_month variable in the crime dataset to match the one in the AEX returns dataset. This will be the merge key for the time series merge.
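One way to construct the monthly key from the daily date (a sketch, assuming `date` holds a daily Stata date, as the `%td` format elsewhere in the dataset suggests):

```stata
* Convert the daily date to a monthly date for merging with the AEX series
gen year_month = mofd(date)
format year_month %tm
```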
---- CHECKPOINT: Year-month variable created ----
Unique months: 167

     +---------------------------------+
     | munic~de        date   year_m~h |
     |---------------------------------|
  1. | GM0014     01jan2012     2012m1 |
  2. | GM0014     01feb2012     2012m2 |
  3. | GM0014     01mar2012     2012m3 |
  4. | GM0014     01apr2012     2012m4 |
  5. | GM0014     01may2012     2012m5 |
     +---------------------------------+
✅ Task 5.3 tests passed

Task 5.4: Merge AEX Returns¶

📝 Task
Merge the monthly AEX returns using year_month. This is also a many-to-one merge (m:1) because:
  • Crime data has many observations per month (one per municipality)
  • Returns data has one observation per month (time series)
Merge using $processed/aex_monthly.dta. Keep only matched observations.
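A sketch of the merge, mirroring the pattern from Task 5.2:

```stata
* Many-to-one merge: one AEX return per month, many municipalities per month
merge m:1 year_month using "$processed/aex_monthly.dta"
tab _merge
keep if _merge == 3   // keep only municipality-months with a matching return
drop _merge
```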
    Result                      Number of obs
    -----------------------------------------
    Not matched                        19,895
        from master                    19,883  (_merge==1)
        from using                         12  (_merge==2)

    Matched                            36,396  (_merge==3)
    -----------------------------------------

   Matching result from |
                  merge |      Freq.     Percent        Cum.
------------------------+-----------------------------------
        Master only (1) |     19,883       35.32       35.32
         Using only (2) |         12        0.02       35.34
            Matched (3) |     36,396       64.66      100.00
------------------------+-----------------------------------
                  Total |     56,291      100.00
(19,895 observations deleted)
---- CHECKPOINT: AEX returns merged ----
Observations after merge: 36396

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
ret_compound |     36,396    .0072072    .0393381  -.1037125   .1351352
ret_std_mo~y |     36,396    .6468432    4.149869  -12.38631    8.55268
✅ Task 5.4 tests passed

Task 5.5: Verify Panel Structure and Save¶

📝 Task
Declare the panel structure using xtset and save the merged analysis panel to $processed/analysis_panel.dta. Use xtdescribe to verify the panel is balanced and check the time coverage.
Stata Tip: Panel Structure
  • encode municipality_code, gen(muni_id) — create numeric municipality ID
  • xtset muni_id year_month — declare panel structure
  • xtdescribe — verify panel balance
  • save "$processed/analysis_panel.dta", replace — save the merged panel
Panel variable: muni_id (strongly balanced)
 Time variable: year_month, 2012m1 to 2020m12
         Delta: 1 month

 muni_id:  1, 2, ..., 337                                    n =        337
year_month:  2012m1, 2012m2, ..., 2020m12                    T =        108
           Delta(year_month) = 1 month
           Span(year_month)  = 108 periods
           (muni_id*year_month uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                       108     108     108       108       108     108     108

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+--------------------------------------------------
> ------------------------------------------------------------
      337    100.00  100.00 |  111111111111111111111111111111111111111111111111
> 111111111111111111111111111111111111111111111111111111111111
 ---------------------------+--------------------------------------------------
> ------------------------------------------------------------
      337    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
file
    /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assi
    > gnments/06-assignment/data/processed/analysis_panel.dta saved
✅ Merged analysis panel saved to: /Users/casparm4/Github/rsm-data-analytics-in-
> finance-private/private/assignments/06-assignment/data/processed/analysis_pan
> el.dta
Municipalities: 337
Time periods: .
Total observations: 36396
✅ Task 5.5 tests passed

Section 6: Create Analysis Variables¶

Following Huck (2024), we test whether stock returns have differential effects on crime across income groups.

To do this, we need to:

  1. Create income terciles (high, medium, low income municipalities)
  2. Generate interaction terms (returns × income terciles)
  3. Create crime rate variables (per capita)
  4. Verify the variables are ready for regression

This section prepares all variables needed for the panel regressions in Part B.

Task 6.1: Load Analysis Panel¶

📝 Task
Load the merged analysis panel from $processed/analysis_panel.dta and verify the panel structure is intact.
Panel variable: muni_id (strongly balanced)
 Time variable: year_month, 2012m1 to 2020m12
         Delta: 1 month
✅ Task 6.1 tests passed

Task 6.2: Create Income Terciles¶

📝 Task
Create income tercile groups by splitting municipalities into three equal groups based on average median income:
  • Tercile 1: Low-income municipalities (bottom 33%)
  • Tercile 2: Medium-income municipalities (middle 33%)
  • Tercile 3: High-income municipalities (top 33%)
Use xtile to create the tercile variable.
Stata Tip
  • xtile income_tercile = avg_median_std_income, nq(3) — create terciles
  • tab income_tercile — verify distribution
  • bysort income_tercile: summarize avg_median_std_income — check income ranges
       Income |
      tercile |
      (1=low, |
       2=mid, |
      3=high) |      Freq.     Percent        Cum.
--------------+-----------------------------------
   Low income |     12,204       33.53       33.53
Medium income |     12,096       33.23       66.77
  High income |     12,096       33.23      100.00
--------------+-----------------------------------
        Total |     36,396      100.00

-------------------------------------------------------------------------------
-> income_tercile = Low income

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
avg_median~e |     12,204     21.7271    .7816501   19.83227   22.80175

-------------------------------------------------------------------------------
-> income_tercile = Medium income

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
avg_median~e |     12,096    23.34742    .2878872   22.82476   23.89005

-------------------------------------------------------------------------------
-> income_tercile = High income

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
avg_median~e |     12,096      25.448    1.659939   23.91478   36.45822

---- CHECKPOINT: Income terciles created ----
Tercile groups: 3
✅ Task 6.2 tests passed

Task 6.3: Create Tercile Dummy Variables¶

📝 Task
Create dummy (0/1) variables for each income tercile. These will be used to create interaction terms with returns:
  • low_income: 1 if tercile 1, 0 otherwise
  • mid_income: 1 if tercile 2, 0 otherwise
  • high_income: 1 if tercile 3, 0 otherwise
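One straightforward way to generate the dummies (a sketch; the variable labels match the dataset description in Task 6.5):

```stata
* 0/1 indicators from the tercile variable
gen low_income  = (income_tercile == 1)
gen mid_income  = (income_tercile == 2)
gen high_income = (income_tercile == 3)
label variable low_income  "Low income tercile (1=yes, 0=no)"
label variable mid_income  "Medium income tercile (1=yes, 0=no)"
label variable high_income "High income tercile (1=yes, 0=no)"
```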
✅ Task 6.3 tests passed

Task 6.4: Create Interaction Terms¶

📝 Task
Create interaction terms between standardized monthly returns and income tercile dummies. These interactions allow us to test whether the effect of returns on crime differs across income groups:
  • ret_x_low = ret_std_monthly × low_income
  • ret_x_mid = ret_std_monthly × mid_income
  • ret_x_high = ret_std_monthly × high_income
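The three products can be generated directly (a sketch):

```stata
* Interactions: standardized monthly return × income-group dummies
gen ret_x_low  = ret_std_monthly * low_income
gen ret_x_mid  = ret_std_monthly * mid_income
gen ret_x_high = ret_std_monthly * high_income
```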
ℹ️ Why Interactions?
The Huck (2024) hypothesis is that returns affect crime differently in high-income vs low-income areas. A simple regression of crime on returns would only tell us the average effect.

By including interaction terms, we estimate separate effects:

  • β₁: Effect of returns in low-income areas
  • β₂: Effect of returns in medium-income areas
  • β₃: Effect of returns in high-income areas

    We expect β₃ < 0 (returns ↓ crime in high-income areas) and β₁ > 0 (returns ↑ crime in low-income areas).
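As a preview of how these coefficients are estimated (a sketch only; the exact fixed effects and standard-error choices are made in Part B):

```stata
* Sketch: tercile-specific return effects with municipality fixed effects
* xtreg total_crimes ret_x_low ret_x_mid ret_x_high, fe vce(cluster muni_id)
```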

        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
       ret_x_low |     36,396     .216894    2.422353  -12.38631    8.55268
       ret_x_mid |     36,396    .2149746    2.411696  -12.38631    8.55268
      ret_x_high |     36,396    .2149746    2.411696  -12.38631    8.55268
    ---- CHECKPOINT: Interaction terms created ----
    Interactions ready for regression analysis
    
    ✅ Task 6.4 tests passed
    

    Task 6.5: Save Final Analysis Dataset¶

    📝 Task
    Verify all required variables are present and save the final analysis dataset save "$processed/analysis_panel.dta", replace. Use describe and summarize to document what's in the final dataset.
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    -------------------------------------------------------------------------------
    municipality~de str6    %-9s                  Municipality Code
    municipality~me str29   %-9s                  Municipality Name
    date            double  %td                   Date (Year-Month)
    year_month      float   %tm                   Year-month
    total_crimes    long    %12.0g                Total crimes reported
    assault_crimes  long    %12.0g                Assault crimes reported
    murder_crimes   long    %12.0g                Murder/manslaughter crimes
                                                    reported
    avg_median_st~e double  %10.0g                Average Median Standardized
                                                    Income
    income_tercile  byte    %13.0g     tercile_lbl
                                                  Income tercile (1=low, 2=mid,
                                                    3=high)
    low_income      float   %9.0g                 Low income tercile (1=yes, 0=no)
    mid_income      float   %9.0g                 Medium income tercile (1=yes,
                                                    0=no)
    high_income     float   %9.0g                 High income tercile (1=yes, 0=no)
    ret_compound    float   %9.0g                 Monthly compound return
    ret_std_monthly float   %9.0g                 Monthly standardized return (sum
                                                    of daily std ret)
    ret_x_low       float   %9.0g                 Standardized return × Low income
    ret_x_mid       float   %9.0g                 Standardized return × Medium
                                                    income
    ret_x_high      float   %9.0g                 Standardized return × High income
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
    total_crimes |     36,396     223.713     568.559          0       9239
    assault_cr~s |     36,396    11.27756    28.59991          0        463
    murder_cri~s |     36,396    .7485163      2.4274          0         46
    avg_median~e |     36,396    23.50222    1.863055   19.83227   36.45822
    ret_std_mo~y |     36,396    .6468432    4.149869  -12.38631    8.55268
    -------------+---------------------------------------------------------
       ret_x_low |     36,396     .216894    2.422353  -12.38631    8.55268
       ret_x_mid |     36,396    .2149746    2.411696  -12.38631    8.55268
      ret_x_high |     36,396    .2149746    2.411696  -12.38631    8.55268
    file
        /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assi
        > gnments/06-assignment/data/processed/analysis_panel.dta saved
    Observations: 36396
    Variables: 20
    File saved: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/privat
    > e/assignments/06-assignment/data/processed/analysis_panel.dta
    
    ✅ Part A: Data Preparation COMPLETE
     Dataset ready for regression analysis in Part B
    
    ✅ Task 6.5 tests passed
    

    Part A Summary¶

    Congratulations! You have completed Part A: Data Preparation.

    What you accomplished:

    1. ✅ Section 1: Understood the research question from Huck (2024)
    2. ✅ Section 2: Loaded three raw datasets (crime, income, AEX)
    3. ✅ Section 3: Calculated monthly standardized returns following published methodology
    4. ✅ Section 4: Reshaped crime data from long to wide format
    5. ✅ Section 5: Merged datasets into municipality-month panel
    6. ✅ Section 6: Created analysis variables (terciles, interactions)

    Final dataset: $processed/analysis_panel.dta

    Panel structure:

    • Unit: Municipality (muni_id)
    • Time: Year-month (year_month)
    • Observations: 337 municipalities × 108 months (36,396 municipality-month observations)

    Key variables for Part B:

    • Outcome: total_crimes, assault_crimes, murder_crimes
    • Treatment: ret_std_monthly (standardized monthly return)
    • Moderator: income_tercile, low_income, mid_income, high_income
    • Interactions: ret_x_low, ret_x_mid, ret_x_high

    Next steps (Part B):

    • Descriptive statistics and visualizations
    • Panel regression with fixed effects
    • Testing differential effects across income groups
    • Robustness checks

    Your thesis replication will follow this exact workflow:

    1. Find published paper with interesting result
    2. Obtain raw data (harder than it looks!)
    3. Clean and merge datasets (60-80% of your time)
    4. Replicate baseline result
    5. Test extension or alternative mechanism
    6. Discuss limitations honestly

    Part B: Analysis¶

    Now that we have prepared the data, we can analyze the relationship between stock returns and crime across income groups.

    Section 7: Descriptive Statistics¶

    Before running regressions, we examine the data through summary statistics and understand the distribution of our key variables.

    Following standard empirical practice, we create:

    1. Summary statistics by income tercile (like Huck's Table 2)
    2. Return distribution statistics
    3. A formatted table for presentation

    Task 7.1: Summary Statistics by Income Tercile¶

    📝 Task
    Load the analysis panel and create summary statistics for crime variables broken down by income tercile. This shows whether crime levels differ systematically across low, medium, and high-income municipalities.
Stata Tip
    • use "$processed/analysis_panel.dta", clear — load the final panel
    • tabstat total_crimes assault_crimes murder_crimes, by(income_tercile) stat(n mean sd min max)
    • This shows N, mean, SD, min, max for each income group
    Panel variable: muni_id (strongly balanced)
     Time variable: year_month, 2012m1 to 2020m12
             Delta: 1 month
    
    Summary for variables: total_crimes assault_crimes murder_crimes
    Group variable: income_tercile (Income tercile (1=low, 2=mid, 3=high))
    
    income_tercile |         N      Mean        SD       Min       Max
    ---------------+--------------------------------------------------
        Low income |  12204.00    451.41    927.86     12.00   9239.00
                   |  12204.00     23.77     46.21      0.00    463.00
                   |  12204.00      1.63      3.93      0.00     46.00
    ---------------+--------------------------------------------------
     Medium income |  12096.00    131.25    132.65      0.00   1159.00
                   |  12096.00      6.37      7.19      0.00     58.00
                   |  12096.00      0.39      0.78      0.00      8.00
    ---------------+--------------------------------------------------
       High income |  12096.00     86.45     82.52      1.00    964.00
                   |  12096.00      3.58      3.83      0.00     39.00
                   |  12096.00      0.22      0.55      0.00      8.00
    ---------------+--------------------------------------------------
             Total |  36396.00    223.71    568.56      0.00   9239.00
                   |  36396.00     11.28     28.60      0.00    463.00
                   |  36396.00      0.75      2.43      0.00     46.00
    ------------------------------------------------------------------
    
    ✅ Task 7.1 tests passed
    

    Task 7.2: Summary Statistics for Returns¶

    📝 Task
    Examine the distribution of both return measures (compound and standardized). Use summarize, detail to see the full distribution including percentiles, skewness, and kurtosis. This helps identify outliers and understand the variation in market returns over the sample period.
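A sketch of the commands behind the output below:

```stata
display "Compound Monthly Returns:"
summarize ret_compound, detail

display "Standardized Monthly Returns:"
summarize ret_std_monthly, detail
```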
    Compound Monthly Returns:
    
                       Monthly compound return
    -------------------------------------------------------------
          Percentiles      Smallest
     1%     -.101367      -.1037125
     5%    -.0590024      -.1037125
    10%    -.0470165      -.1037125       Obs              36,396
    25%    -.0185647      -.1037125       Sum of wgt.      36,396
    
    50%     .0115142                      Mean           .0072072
                            Largest       Std. dev.      .0393381
    75%     .0340431       .1351352
    90%     .0509977       .1351352       Variance       .0015475
    95%     .0623475       .1351352       Skewness      -.1718706
    99%     .0973073       .1351352       Kurtosis       3.780191
    
    Standardized Monthly Returns:
    
         Monthly standardized return (sum of daily std ret)
    -------------------------------------------------------------
          Percentiles      Smallest
     1%    -9.791107      -12.38631
     5%    -7.349111      -12.38631
    10%     -4.55779      -12.38631       Obs              36,396
    25%    -1.753273      -12.38631       Sum of wgt.      36,396
    
    50%     1.219513                      Mean           .6468432
                            Largest       Std. dev.      4.149869
    75%      3.87236        8.55268
    90%     5.401238        8.55268       Variance       17.22141
    95%     6.930266        8.55268       Skewness       -.625603
    99%     7.930677        8.55268       Kurtosis       3.264927
    
    ✅ Task 7.2 tests passed
    

    Task 7.3: Create Formatted Summary Table¶

    📝 Task
    Create a publication-quality summary statistics table and export it to LaTeX format. This table will show:
    • Crime variables by income tercile
    • Income and return variables for the full sample
    • Mean, SD, Min, Max, and N for each variable
    The table will be saved as $tables/summary_stats.tex for use in reports.
Stata Tip: Creating LaTeX Tables
    • estpost tabstat — prepare statistics for export
    • esttab using "filename.tex" — export to LaTeX
    • Use cells("mean(fmt(2)) sd(fmt(2)) min max count") to specify statistics
    • Add label option to use variable labels instead of names
    • Export as two separate files:
      • summary_stats_panelA.tex — crime statistics by income tercile (use by(income_tercile))
      • summary_stats_panelB.tex — full sample statistics
    (Note: the code below is run with echo so that the preserve/restore pair executes within a single block.)
    
    . di as txt ""
    
    
    . di as txt "Creating Panel A: Crime Statistics by Income Tercile..."
    Creating Panel A: Crime Statistics by Income Tercile...
    
    . preserve
    
    . estpost tabstat total_crimes assault_crimes murder_crimes, by(income_tercile)
    >  statistics(mean sd min max count) columns(statistics)
    
    Summary statistics: mean sd min max count
         for variables: total_crimes assault_crimes murder_crimes
      by categories of: income_tercile
    
    income_terci |   e(mean)      e(sd)     e(min)     e(max)   e(count) 
    -------------+-------------------------------------------------------
    Low income   |                                                       
    total_crimes |  451.4092   927.8566         12       9239      12204 
    assault_cr~s |   23.7718   46.20558          0        463      12204 
    murder_cri~s |  1.633972   3.933833          0         46      12204 
    -------------+-------------------------------------------------------
    Medium inc~e |                                                       
    total_crimes |  131.2493   132.6499          0       1159      12096 
    assault_cr~s |  6.371197   7.192719          0         58      12096 
    murder_cri~s |  .3865741   .7776529          0          8      12096 
    -------------+-------------------------------------------------------
    High income  |                                                       
    total_crimes |   86.4475   82.52387          1        964      12096 
    assault_cr~s |  3.578125   3.831013          0         39      12096 
    murder_cri~s |  .2170966   .5548765          0          8      12096 
    -------------+-------------------------------------------------------
    Total        |                                                       
    total_crimes |   223.713    568.559          0       9239      36396 
    assault_cr~s |  11.27756   28.59991          0        463      36396 
    murder_cri~s |  .7485163     2.4274          0         46      36396 
    
    . esttab using "$tables/summary_stats_panelA.tex", replace cells("mean(fmt(2)) 
    > sd(fmt(2)) min(fmt(0)) max(fmt(0)) count(fmt(0))") noobs nonumber nomtitle ti
    > tle("Panel A: Crime Statistics by Income Tercile")
    (output written to /Users/casparm4/Github/rsm-data-analytics-in-finance-private
    > /private/assignments/06-assignment/output/tables/summary_stats_panelA.tex)
    
    . restore
    
    . di as txt "Creating Panel B: Full Sample Statistics..."
    Creating Panel B: Full Sample Statistics...
    
    . estpost summarize total_crimes assault_crimes murder_crimes avg_median_std_in
    > come ret_compound ret_std_monthly
    
                 |  e(count)   e(sum_w)    e(mean)     e(Var)      e(sd) 
    -------------+-------------------------------------------------------
    total_crimes |     36396      36396    223.713   323259.3    568.559 
    assault_cr~s |     36396      36396   11.27756   817.9551   28.59991 
    murder_cri~s |     36396      36396   .7485163    5.89227     2.4274 
    avg_median~e |     36396      36396   23.50222   3.470975   1.863055 
    ret_compound |     36396      36396   .0072072   .0015475   .0393381 
    ret_std_mo~y |     36396      36396   .6468432   17.22141   4.149869 
    
                 |    e(min)     e(max)     e(sum) 
    -------------+---------------------------------
    total_crimes |         0       9239    8142258 
    assault_cr~s |         0        463     410458 
    murder_cri~s |         0         46      27243 
    avg_median~e |  19.83227   36.45822   855386.9 
    ret_compound | -.1037125   .1351352   262.3129 
    ret_std_mo~y | -12.38631    8.55268    23542.5 
    
    . esttab using "$tables/summary_stats_panelB.tex", replace cells("mean(fmt(2)) 
    > sd(fmt(2)) min(fmt(2)) max(fmt(2)) count(fmt(0))") noobs nonumber nomtitle ti
    > tle("Panel B: Full Sample Statistics") label
    (output written to /Users/casparm4/Github/rsm-data-analytics-in-finance-private
    > /private/assignments/06-assignment/output/tables/summary_stats_panelB.tex)
    
    . di as txt "---- CHECKPOINT: Summary tables exported ----"
    ---- CHECKPOINT: Summary tables exported ----
    
    . di as result "✅ Panel A saved: $tables/summary_stats_panelA.tex"
    ✅ Panel A saved: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/p
    > rivate/assignments/06-assignment/output/tables/summary_stats_panelA.tex
    
    . di as result "✅ Panel B saved: $tables/summary_stats_panelB.tex"
    ✅ Panel B saved: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/p
    > rivate/assignments/06-assignment/output/tables/summary_stats_panelB.tex
    
    . 
    
    ✅ Task 7.3 tests passed
    
    ℹ️ What to Look For in Descriptive Statistics

    When examining summary statistics, pay attention to:

    Crime levels across terciles:

    • Do high-income areas have more or less crime than low-income areas?
    • This affects interpretation—if high-income has zero crime, returns cannot reduce it further

    Return distribution:

    • Mean compound return tells us average market performance (should be slightly positive)
    • SD shows volatility—higher SD means more variation to identify effects
    • Daily standardized returns have mean ≈ 0 and SD ≈ 1 by construction; the monthly measure sums the daily values, so its SD is larger (with ~21 trading days, roughly √21 ≈ 4.6, close to the 4.15 observed)

    Sample balance:

    • Each tercile should have roughly N/3 observations
    • Large imbalances suggest issues with tercile creation

    Section 8: Visualize Crime and Returns Over Time¶

    Visual inspection is an essential first step before formal statistical tests. We create three plots:

    1. National crime trends — How has total crime evolved over time?
    2. Crime and returns overlay — Do they move together visually?
    3. Crime by income tercile — Do trends differ across income groups?

    These plots provide intuition before we estimate regressions.

    Task 8.1: National Crime Time Series¶

    📝 Task
    Create a time series plot showing total crime aggregated to the national level. Collapse the panel data by summing crime across all municipalities for each month, then plot the national total over time. This shows the overall crime trend in the Netherlands during the sample period. Save the figure as crime_timeseries.png.
Stata Tip: Creating Time Series Plots
    • preserve — save current dataset state
    • collapse (sum) total_crimes, by(year_month) — aggregate to national level
    • twoway line total_crimes year_month — create line plot
    • graph export "$figures/crime_timeseries.png", replace width(1200) — save figure
    • restore — return to original dataset
    (Note: the code below is run with echo so that the preserve/restore pair executes within a single block.)
    
    . preserve
    
    . collapse (sum) total_crimes assault_crimes murder_crimes, by(year_month)
    
    . twoway (line total_crimes year_month, lcolor(navy) lwidth(medium)), title("To
    > tal Crime in the Netherlands", size(medium)) subtitle("Monthly aggregates acr
    > oss all municipalities", size(small)) xtitle("Year-Month") ytitle("Total Crim
    > es Reported") xlabel(, angle(45) labsize(small)) ylabel(, labsize(small) form
    > at(%12.0fc)) graphregion(color(white)) plotregion(color(white)) note("Source:
    >  CBS/Politie crime data, 2012-2019", size(vsmall))
    
    . graph export "$figures/crime_timeseries.png", replace width(1200)
    file /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assig
    > nments/06-assignment/output/figures/crime_timeseries.png written in PNG forma
    > t
    
    . di as txt "---- CHECKPOINT: National crime time series ----"
    ---- CHECKPOINT: National crime time series ----
    
    . di as result "✅ Figure saved: $figures/crime_timeseries.png"
    ✅ Figure saved: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/pr
    > ivate/assignments/06-assignment/output/figures/crime_timeseries.png
    
    . di as result "Months plotted: " _N
    Months plotted: 108
    
    . restore
    
    . 
    
    [Figure: Total Crime in the Netherlands, monthly aggregates across all municipalities]
    ✅ Task 8.1 tests passed
    

    Task 8.2: Overlay Crime and Returns¶

    📝 Task
    Create a dual-axis plot showing both crime and stock returns over time. To make them visually comparable, standardize both series (subtract mean, divide by SD). This allows visual inspection of co-movement: when returns are high, is crime higher or lower?

    For the standardization, create variables named crime_z and ret_z (z-scores for crime and returns respectively). Add a horizontal reference line at zero, and include a legend labelling the two series as "Total Crime (standardized)" and "AEX Returns (standardized)". Save the figure as crime_returns_overlay.png.

    ℹ️ Why Standardize for Visualization?
    Crime and returns have different units and scales:
    • Crime: counts ranging from ~10,000 to ~50,000 per month
    • Returns: percentages ranging from -10% to +10%

    By standardizing both (z-scores), we can plot them on the same axis and see whether they move together. A standardized value of +1 means "one standard deviation above average" for that series.

    If crime and returns move in opposite directions (negative correlation), we should see peaks in one series align with troughs in the other.
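    The standardization described above can be sketched outside Stata; this Python snippet mirrors the egen mean/sd and gen z-score steps from the tip below (the series values are hypothetical, and Stata's egen sd uses the sample SD, i.e. ddof=1):

    ```python
    import numpy as np

    def zscore(x):
        """Standardize a series: subtract the mean, divide by the sample SD."""
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / x.std(ddof=1)

    # Illustrative monthly series on very different scales (hypothetical numbers)
    crime   = np.array([41000, 43500, 39800, 45200, 44100, 40300])  # counts
    returns = np.array([0.021, -0.034, 0.008, -0.051, 0.017, 0.029])  # decimal returns

    crime_z, ret_z = zscore(crime), zscore(returns)
    # Both series now have mean ~0 and SD 1, so they can share one axis
    ```

    After this transformation a value of +1 on either line reads as "one standard deviation above that series' own average", which is what makes the overlay interpretable.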

    Stata Tip
    • preserve — save current dataset state
    • collapse (sum) total_crimes (mean) ret_std_monthly, by(year_month) — aggregate to national level
    • egen mean_crime = mean(total_crimes) — calculate mean for z-score
    • egen sd_crime = sd(total_crimes) — calculate SD for z-score
    • gen crime_z = (total_crimes - mean_crime) / sd_crime — standardize
    • Repeat for returns: create mean_ret, sd_ret, and ret_z
    • twoway (line crime_z year_month) (line ret_z year_month) — overlay both series
    • graph export "$figures/crime_returns_overlay.png", replace width(1200)
    • restore — return to original dataset
    (Note: the code below is run with echo to enable preserve/restore across the cell.)
    
    . preserve
    
    . collapse (sum) total_crimes (mean) ret_std_monthly, by(year_month)
    
    . egen mean_crime = mean(total_crimes)
    
    . egen sd_crime = sd(total_crimes)
    
    . gen crime_z = (total_crimes - mean_crime) / sd_crime
    
    . egen mean_ret = mean(ret_std_monthly)
    
    . egen sd_ret = sd(ret_std_monthly)
    
    . gen ret_z = (ret_std_monthly - mean_ret) / sd_ret
    
    . twoway (line crime_z year_month, lcolor(maroon) lwidth(medium) lpattern(solid
    > )) (line ret_z year_month, lcolor(navy) lwidth(medium) lpattern(dash)), title
    > ("Crime and Stock Returns Over Time", size(medium)) subtitle("Standardized (z
    > -scores) for visual comparison", size(small)) xtitle("Year-Month") ytitle("St
    > andard Deviations from Mean") xlabel(, angle(45) labsize(small)) ylabel(, lab
    > size(small)) legend(label(1 "Total Crime (standardized)") label(2 "AEX Return
    > s (standardized)") rows(1) position(6) size(small)) graphregion(color(white))
    >  plotregion(color(white)) yline(0, lcolor(gray) lpattern(dot)) note("Both ser
    > ies standardized to mean=0, SD=1 for comparability", size(vsmall))
    
    . graph export "$figures/crime_returns_overlay.png", replace width(1200)
    file /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assig
    > nments/06-assignment/output/figures/crime_returns_overlay.png written in PNG 
    > format
    
    . di as txt "---- CHECKPOINT: Crime and returns overlay ----"
    ---- CHECKPOINT: Crime and returns overlay ----
    
    . di as result "✅ Figure saved: $figures/crime_returns_overlay.png"
    ✅ Figure saved: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/pr
    > ivate/assignments/06-assignment/output/figures/crime_returns_overlay.png
    
    . restore
    
    . 
    
    ✅ Task 8.2 tests passed
    

    Task 8.3: Crime Time Series by Income Tercile¶

    📝 Task
    Create a plot showing crime trends separately for low, medium, and high-income municipalities. This visualizes the key hypothesis: if Huck's theory holds, crime in high-income areas should move differently than crime in low-income areas when market returns change.

    First collapse the data to the tercile-month level (sum of total_crimes by year_month and income_tercile). Include a legend with labels "Low Income (Tercile 1)", "Medium Income (Tercile 2)", and "High Income (Tercile 3)". Save the figure as crime_by_income.png.

    Stata Tip
    • preserve
    • collapse (sum) total_crimes, by(year_month income_tercile)
    • Use twoway with three line plots, each filtered by if income_tercile==1, ==2, ==3
    • graph export "$figures/crime_by_income.png", replace width(1200)
    • restore
    (Note: the code below is run with echo to enable preserve/restore across the cell.)
    
    . preserve
    
    . collapse (sum) total_crimes, by(year_month income_tercile)
    
    . twoway (line total_crimes year_month if income_tercile==1, lcolor(red) lwidth
    > (medium) lpattern(solid)) (line total_crimes year_month if income_tercile==2,
    >  lcolor(orange) lwidth(medium) lpattern(dash)) (line total_crimes year_month 
    > if income_tercile==3, lcolor(green) lwidth(medium) lpattern(longdash)), title
    > ("Crime Trends by Income Tercile", size(medium)) subtitle("Total crime aggreg
    > ated by municipality income group", size(small)) xtitle("Year-Month") ytitle(
    > "Total Crimes Reported") xlabel(, angle(45) labsize(small)) ylabel(, labsize(
    > small) format(%12.0fc)) legend(label(1 "Low Income (Tercile 1)") label(2 "Med
    > ium Income (Tercile 2)") label(3 "High Income (Tercile 3)") rows(1) position(
    > 6) size(small)) graphregion(color(white)) plotregion(color(white)) note("Sour
    > ce: CBS/Politie crime data by municipality income tercile", size(vsmall))
    
    . graph export "$figures/crime_by_income.png", replace width(1200)
    file /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assig
    > nments/06-assignment/output/figures/crime_by_income.png written in PNG format
    
    . di as txt "---- CHECKPOINT: Crime by income tercile ----"
    ---- CHECKPOINT: Crime by income tercile ----
    
    . di as result "✅ Figure saved: $figures/crime_by_income.png"
    ✅ Figure saved: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/pr
    > ivate/assignments/06-assignment/output/figures/crime_by_income.png
    
    . restore
    
    . 
    
    ✅ Task 8.3 tests passed
    
    🚀 Interpreting the Visualizations
    After creating these three plots, examine them carefully:

    Figure 1 (National trend): Is crime increasing, decreasing, or stable? Are there seasonal patterns?

    Figure 2 (Overlay): Do crime and returns move together (positive correlation) or in opposite directions (negative correlation)? Remember: we expect crime to decrease when returns are high (negative relationship).

    Figure 3 (By tercile): Do the three income groups have similar levels and trends? High-income areas typically have lower crime levels. If high-income municipalities have near-zero crime, it may be difficult to detect a relationship with returns (floor effect).

    Important: Visual patterns are suggestive but not conclusive. We need regression analysis (next sections) to control for confounders and test statistical significance.


    Section 9: Baseline Panel Regression¶

    We now estimate our first formal statistical tests. The baseline regressions estimate the average effect of stock returns on crime across all municipalities, without distinguishing by income group.

    Specification:

    $$\text{Crime}_{it} = \beta_0 + \beta_1 \text{Returns}_t + \alpha_i + \varepsilon_{it}$$

    Where:

    • $\text{Crime}_{it}$ = Total crimes in municipality $i$ at month $t$
    • $\text{Returns}_t$ = Monthly AEX return (compound or standardized)
    • $\alpha_i$ = Municipality fixed effects (absorbs time-invariant municipality characteristics)
    • $\varepsilon_{it}$ = Error term (clustered at municipality level)

    Key question: Is $\beta_1$ positive or negative? Does higher return correlate with more or less crime on average?
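    The within (fixed-effects) estimator behind this specification can be illustrated outside Stata: demean crime and returns within each municipality (which removes $\alpha_i$), then run OLS on the demeaned data. A minimal Python sketch with simulated data — all names and numbers are made up, with a true return effect of 0.5:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_muni, n_months = 5, 24
    muni  = np.repeat(np.arange(n_muni), n_months)            # municipality id per row
    ret   = np.tile(rng.normal(0, 2, n_months), n_muni)       # same return for every municipality
    alpha = np.repeat(rng.normal(200, 50, n_muni), n_months)  # municipality fixed effects
    crime = alpha + 0.5 * ret + rng.normal(0, 1, muni.size)   # true beta1 = 0.5

    def demean_by(x, groups):
        """Within transformation: subtract each group's mean."""
        out = x.astype(float).copy()
        for g in np.unique(groups):
            out[groups == g] -= x[groups == g].mean()
        return out

    y, x = demean_by(crime, muni), demean_by(ret, muni)
    beta1 = (x * y).sum() / (x * x).sum()   # within estimate of the return effect
    ```

    Demeaning wipes out the large level differences in alpha, so beta1 recovers the true 0.5 despite fixed effects that dwarf the return effect — exactly why rho (the fraction of variance due to u_i) can be near 1 in the xtreg output while β₁ is still well identified.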

    Task 9.1: Load Analysis Panel and Verify Structure¶

    📝 Task
    Load the analysis panel from $processed/analysis_panel.dta and verify the panel structure is correctly set. Check that we have the right number of municipalities and time periods before running regressions.
    Panel variable: muni_id (strongly balanced)
     Time variable: year_month, 2012m1 to 2020m12
             Delta: 1 month
    
     muni_id:  1, 2, ..., 337                                    n =        337
    year_month:  2012m1, 2012m2, ..., 2020m12                    T =        108
               Delta(year_month) = 1 month
               Span(year_month)  = 108 periods
               (muni_id*year_month uniquely identifies each observation)
    
    Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                           108     108     108       108       108     108     108
    
         Freq.  Percent    Cum. |  Pattern
     ---------------------------+--------------------------------------------------
    > ------------------------------------------------------------
          337    100.00  100.00 |  111111111111111111111111111111111111111111111111
    > 111111111111111111111111111111111111111111111111111111111111
     ---------------------------+--------------------------------------------------
    > ------------------------------------------------------------
          337    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    ---- CHECKPOINT: Panel structure verified ----
    Panel variable: muni_id
    Time variable: year_month
    Total observations: 36396
    
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    -------------------------------------------------------------------------------
    total_crimes    long    %12.0g                Total crimes reported
    ret_compound    float   %9.0g                 Monthly compound return
    ret_std_monthly float   %9.0g                 Monthly standardized return (sum
                                                    of daily std ret)
    income_tercile  byte    %13.0g     tercile_lbl
                                                  Income tercile (1=low, 2=mid,
                                                    3=high)
    
    ✅ Task 9.1 tests passed
    

    Task 9.2: Estimate Baseline with Compound Returns¶

    📝 Task
    Estimate the baseline panel regression using compound monthly returns as the independent variable. Use municipality fixed effects and cluster standard errors at the municipality level. Interpret the coefficient: what does it mean economically?
    Stata Tip: Panel Regression
    • xtreg depvar indepvar, fe — fixed effects regression
    • vce(cluster muni_id) — cluster standard errors by municipality
    • estimates store modelname — save estimates for later comparison
    • Coefficient interpretation: A 1 percentage point increase in returns → change of β₁ crimes
    ========================================
    Baseline Model: Compound Returns
    ========================================
    
    
    Fixed-effects (within) regression               Number of obs     =     36,396
    Group variable: muni_id                         Number of groups  =        337
    
    R-squared:                                      Obs per group:
         Within  = 0.0002                                         min =        108
         Between =      .                                         avg =      108.0
         Overall = 0.0000                                         max =        108
    
                                                    F(1, 336)         =      36.31
    corr(u_i, Xb) = 0.0000                          Prob > F          =     0.0000
    
                                  (Std. err. adjusted for 337 clusters in muni_id)
    ------------------------------------------------------------------------------
                 |               Robust
    total_crimes | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ret_compound |   33.71929   5.595892     6.03   0.000     22.71189    44.72668
           _cons |     223.47   .0403306  5540.95   0.000     223.3906    223.5493
    -------------+----------------------------------------------------------------
         sigma_u |  562.89334
         sigma_e |  86.074038
             rho |  .97715169   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    ---- CHECKPOINT: Baseline regression (compound) ----
    Coefficient on ret_compound: 33.719288
    Standard error: 5.5958915
    t-statistic: 6.0257223
    
    Interpretation:
     → Positive coefficient: Higher returns associated with MORE crime
    
    ✅ Task 9.2 tests passed
    

    Task 9.3: Estimate Baseline with Standardized Returns¶

    📝 Task
    Estimate the same baseline specification using standardized returns (ret_std_monthly) instead of compound returns. Use the same fixed effects and clustered standard errors as before: xtreg total_crimes ret_std_monthly, fe vce(cluster muni_id). This follows Huck's (2024) methodology more closely. Compare the results: do both return measures give similar findings about the average relationship? Store the estimates as m1_std.
    ========================================
    Baseline Model: Standardized Returns
    ========================================
    
    
    Fixed-effects (within) regression               Number of obs     =     36,396
    Group variable: muni_id                         Number of groups  =        337
    
    R-squared:                                      Obs per group:
         Within  = 0.0010                                         min =        108
         Between =      .                                         avg =      108.0
         Overall = 0.0000                                         max =        108
    
                                                    F(1, 336)         =      43.37
    corr(u_i, Xb) = -0.0000                         Prob > F          =     0.0000
    
                                  (Std. err. adjusted for 337 clusters in muni_id)
    ------------------------------------------------------------------------------
                 |               Robust
    total_crimes | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ret_std_mo~y |   .6651736   .1009989     6.59   0.000     .4665037    .8638435
           _cons |   223.2827   .0653305  3417.74   0.000     223.1542    223.4112
    -------------+----------------------------------------------------------------
         sigma_u |  562.89334
         sigma_e |  86.039671
             rho |  .97716951   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    ---- CHECKPOINT: Baseline regression (standardized) ----
    Coefficient on ret_std_monthly: .6651736
    Standard error: .10099893
    t-statistic: 6.5859472
    
    Interpretation:
     → A 1 SD increase in returns → change of .6651736 crimes
    
    ✅ Task 9.3 tests passed
    
    ℹ️ Comparing the Two Baseline Models
    You have now estimated the same model with two different return measures:

    Model 1 (Compound Returns):

    • Coefficient is in "crimes per 1 percentage point return"
    • Example: β = -100 means a 1% return reduces crime by 100 incidents
    • Easy to interpret economically (percentage points are intuitive)

      Model 2 (Standardized Returns):

      • Coefficient is in "crimes per 1 standard deviation return"
      • Example: β = -50 means a 1 SD return surprise reduces crime by 50 incidents
      • Captures the "surprise" interpretation (Huck's methodology)
      • Controls for volatility regime changes

        What to expect: Both should show the same sign (positive or negative) but different magnitudes. The standardized version is preferred for comparability to Huck (2024).

        Important limitation: These baseline models estimate the average effect across all income groups. They do not test Huck's key hypothesis about differential effects. That comes in Section 10.
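        A sketch of how the two monthly return measures could be built from daily AEX returns, based on the variable labels in the panel ("Monthly compound return" and "sum of daily std ret"). The daily numbers and the trailing standardization window are assumptions for illustration, not the assignment's actual construction:

        ```python
        import numpy as np

        daily_ret = np.array([0.4, -1.1, 0.7, 0.2, -0.3])  # one month of daily returns, in % (hypothetical)

        # Compound monthly return: chain-link the daily returns
        ret_compound = (np.prod(1 + daily_ret / 100) - 1) * 100

        # Standardized monthly return: z-score each daily return against a trailing
        # sample (the window choice here is an assumption), then sum the surprises
        hist = np.array([0.1, -0.2, 0.3, 0.5, -0.4, 0.0, 0.2, -0.1])  # trailing daily returns (hypothetical)
        mu, sigma = hist.mean(), hist.std(ddof=1)
        z_daily = (daily_ret - mu) / sigma
        ret_std_monthly = z_daily.sum()
        ```

        Standardizing against recent volatility is what gives the "surprise" reading: the same −1.1% day counts for more in a calm regime (small sigma) than in a turbulent one.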


        Section 10: Differential Effects by Income (Core Test)¶

        This is the key test of Huck's hypothesis: Do stock returns affect crime differently in high-income vs. low-income areas?

        We use standardized returns for the interaction analysis, following Huck (2024). The compound return baseline from Section 9 serves as a naive comparison in the final regression table.

        Huck's (2024) finding:

        • High-income areas (investors): Returns ↑ → Crime ↓ (negative coefficient)
        • Low-income areas (non-investors): Returns ↑ → Crime ↑ (positive coefficient)

        Mechanism: Relative wealth/status affects psychological well-being. When the market rises:

        • Investors feel wealthier → better mood → less crime
        • Non-investors feel relatively poorer → worse mood → more crime

        Our specification (with interactions):

        $$\text{Crime}_{it} = \beta_0 + \beta_1 \text{Returns}_t + \beta_2 (\text{Returns}_t \times \text{High}_{i}) + \beta_3 (\text{Returns}_t \times \text{Low}_{i}) + \alpha_i + \varepsilon_{it}$$

        Where:

        • $\beta_1$ = Effect in medium-income areas (reference group)
        • $\beta_1 + \beta_2$ = Effect in high-income areas
        • $\beta_1 + \beta_3$ = Effect in low-income areas

        Key test: Is $\beta_2 < 0$ (high-income negative) and $\beta_3 > 0$ (low-income positive)?
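        The group-specific effects implied by this specification are simple linear combinations of the coefficients. A quick sketch with purely illustrative coefficient values (hypothetical, not estimates):

        ```python
        # Hypothetical estimates from the interaction specification (mid = reference group)
        beta1 = 0.44    # effect of a 1 SD return in mid-income municipalities
        beta2 = -0.20   # additional effect in high-income municipalities
        beta3 = 0.87    # additional effect in low-income municipalities

        effect_mid  = beta1
        effect_high = beta1 + beta2   # Huck's hypothesis: negative overall
        effect_low  = beta1 + beta3   # Huck's hypothesis: positive and larger

        # The key differential equals beta3 - beta2; testing it against zero is
        # the formal version of "is the high-low gap significant?"
        differential = effect_low - effect_high
        ```

        In Stata, the same linear combinations would come from lincom or test after estimation rather than by hand.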

        Task 10.1: Estimate Interactions with Standardized Returns¶

        📝 Task
        Estimate the interaction model using standardized returns. The interaction terms ret_x_high, ret_x_mid, and ret_x_low were already created in Part A (Section 6). Use municipality fixed effects with clustered standard errors: xtreg total_crimes ret_x_low ret_x_mid ret_x_high, fe vce(cluster muni_id). Note that this saturated parameterization omits the main return term, so each coefficient is the total return effect for that tercile rather than a difference from a reference group. Store the estimates as m2_std. This specification follows Huck's (2024) methodology.
        Variable      Storage   Display    Value
            name         type    format    label      Variable label
        -------------------------------------------------------------------------------
        ret_x_high      float   %9.0g                 Standardized return × High income
        ret_x_mid       float   %9.0g                 Standardized return × Medium
                                                        income
        ret_x_low       float   %9.0g                 Standardized return × Low income
        
        ========================================
        Interaction Model: Standardized Returns
        ========================================
        
        
        Fixed-effects (within) regression               Number of obs     =     36,396
        Group variable: muni_id                         Number of groups  =        337
        
        R-squared:                                      Obs per group:
             Within  = 0.0015                                         min =        108
             Between = 0.0835                                         avg =      108.0
             Overall = 0.0010                                         max =        108
        
                                                        F(3, 336)         =      55.37
        corr(u_i, Xb) = 0.0256                          Prob > F          =     0.0000
        
                                      (Std. err. adjusted for 337 clusters in muni_id)
        ------------------------------------------------------------------------------
                     |               Robust
        total_crimes | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
           ret_x_low |    1.30699   .2843909     4.60   0.000     .7475796    1.866401
           ret_x_mid |   .4399819   .0600953     7.32   0.000     .3217714    .5581924
          ret_x_high |   .2428179   .0254009     9.56   0.000     .1928532    .2927826
               _cons |   223.2827   .0632572  3529.76   0.000     223.1583    223.4072
        -------------+----------------------------------------------------------------
             sigma_u |  562.80676
             sigma_e |   86.02041
                 rho |  .97717264   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        ---- CHECKPOINT: Interaction model (standardized) ----
        
        Effect by income group (per 1 SD return):
         Low-income: 1.3069905
         Mid-income: .43998192
         High-income: .24281791
        
        ✗ Both positive: Returns increase crime in all groups
        
        ✅ Task 10.1 tests passed
        

        Task 10.2: Interpret the Differential Effects¶

        📝 Task
        Compare the estimated coefficients across income groups and answer the following questions in the markdown cell below:
        1. Are the coefficients for high-income and low-income areas statistically significant?
        2. What are the signs of the coefficients? Do they match Huck's findings?
        3. What is the economic magnitude? (How many crimes per SD of return?)
        4. Why might the pattern differ from Huck's US findings?
        ℹ️ Interpreting Interaction Coefficients
        What Huck (2024) found in the US:
        Using daily data from 2,700 police agencies (1991-2015):
        • High-income areas (investors): A 1 SD return increase → 37 bps decrease in violent crime (−12.5 crimes per 100M population)
        • Low-income areas (non-investors): A 1 SD return increase → 25 bps increase in violent crime (+15.2 crimes per 100M population)
        • The difference between high and low income coefficients is statistically significant (t = 3.31)

        What you should look for:

        • Sign: Is high-income negative and low-income positive?
        • Significance: Are t-statistics > 2 (roughly 5% significance)?
        • Magnitude: Are effects economically meaningful?
        • Difference: Is the high-low gap statistically significant?

          Why results might differ:

          • Monthly vs daily data (timing mechanism lost)
          • Different country context (NL vs US investor participation rates)
          • Different time period (2012-2019 vs 1991-2015)
          • Income terciles are imperfect proxy for investor status
          • Different crime types/reporting systems


            Section 11: Two-Way Fixed Effects¶

            So far we have used one-way fixed effects (municipality FE only). This controls for time-invariant municipality characteristics (e.g., urban vs rural, demographics) but does not control for time shocks that affect all municipalities simultaneously.

            Two-way fixed effects (TWFE) adds time fixed effects to absorb common shocks:

            $$\text{Crime}_{it} = \beta_2 (\text{Returns}_t \times \text{High}_{i}) + \beta_3 (\text{Returns}_t \times \text{Low}_{i}) + \alpha_i + \gamma_t + \varepsilon_{it}$$

            Where:

            • $\alpha_i$ = Municipality fixed effects
            • $\gamma_t$ = Time (year-month) fixed effects
            • Note: The main return effect $\beta_1 \text{Returns}_t$ is absorbed by time FE because all municipalities have the same return in month $t$

            What time FE controls for:

            • Seasonal crime patterns (more crime in summer?)
            • National policy changes (new policing strategies)
            • Economic conditions (unemployment, GDP growth)
            • Any factor that affects all municipalities equally in a given month

            Key advantage: Better identification of differential effects across income groups by removing confounding time trends.
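            The absorption of the main return effect can be verified mechanically: a regressor that is constant within each month becomes identically zero after demeaning by month, which is the time-FE part of the two-way within transformation. A small Python sketch (all numbers made up):

            ```python
            import numpy as np

            n_muni, n_months = 4, 6
            month = np.tile(np.arange(n_months), n_muni)
            # Same market return for every municipality in a given month
            ret = np.tile(np.array([1.2, -0.5, 0.3, 2.0, -1.1, 0.7]), n_muni)

            # Demean by month: subtract each month's cross-municipality mean
            ret_demeaned = ret.copy()
            for t in range(n_months):
                ret_demeaned[month == t] -= ret[month == t].mean()

            # Every entry is exactly zero: under time FE the main return effect
            # cannot be estimated. Interactions like ret * high_income survive
            # because they still vary across municipalities within a month.
            ```

            This is why the reghdfe call below includes only the interaction terms.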

            Task 11.1: Estimate TWFE with reghdfe¶

            📝 Task
            Estimate the interaction model with two-way fixed effects using reghdfe. This package efficiently handles high-dimensional fixed effects (many municipalities × many months). Include only the interaction terms, as the main return effect is absorbed by time FE.
            Stata Tip: reghdfe
            • reghdfe depvar indepvars, absorb(fe1 fe2) vce(cluster var)
            • absorb(muni_id year_month) — absorbs both municipality and time FE
            • Do NOT include main return effect (it's collinear with time FE)
            • Only include interactions that vary within time periods (income × returns)
            ========================================
            Two-Way Fixed Effects Model
            ========================================
            
            Note: Main return effect absorbed by time FE
             (all municipalities have same return each month)
            
            (MWFE estimator converged in 2 iterations)
            
            HDFE Linear regression                            Number of obs   =     36,396
            Absorbing 2 HDFE groups                           F(   2,    336) =      11.10
            Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                              R-squared       =     0.9805
                                                              Adj R-squared   =     0.9802
                                                              Within R-sq.    =     0.0006
            Number of clusters (muni_id) =        337         Root MSE        =    79.9471
            
                                          (Std. err. adjusted for 337 clusters in muni_id)
            ------------------------------------------------------------------------------
                         |               Robust
            total_crimes | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
              ret_x_high |   -.197164   .0653383    -3.02   0.003    -.3256876   -.0686404
               ret_x_low |   .8670085   .2910952     2.98   0.003     .2944098    1.439607
                   _cons |   223.5673   .0672407  3324.88   0.000     223.4351    223.6996
            ------------------------------------------------------------------------------
            
            Absorbed degrees of freedom:
            -----------------------------------------------------+
             Absorbed FE | Categories  - Redundant  = Num. Coefs |
            -------------+---------------------------------------|
                 muni_id |       337         337           0    *|
              year_month |       108           1         107     |
            -----------------------------------------------------+
            * = FE nested within cluster; treated as redundant for DoF computation
            
            ---- CHECKPOINT: Two-way FE model ----
            
            Effect by income group (relative to mid-income):
             High-income: -.19716402
             Low-income: .86700855
            
            Fixed effects absorbed:
             Municipalities: .
             Time periods: .
            
            ✅ Task 11.1 tests passed
            

            Task 11.2: Compare One-Way vs Two-Way Fixed Effects¶

            📝 Task
            Display the results side-by-side to compare how coefficients change when adding time fixed effects. Use esttab to create a clean comparison table showing both specifications.
            ========================================
            Comparison: One-Way FE vs Two-Way FE
            ========================================
            
            
            Effect of Standardized Returns on Crime by Income Group
            ----------------------------------------------------
                                          (1)             (2)   
                                   One-Way FE      Two-Way FE   
            ----------------------------------------------------
            Monthly standardiz~m                                
                                                                
            
            Standardized retur~e        1.307***        0.867***
                                      (0.284)         (0.291)   
            
            Standardized retur~c        0.440***                
                                      (0.060)                   
            
            Standardized retur~m        0.243***       -0.197***
                                      (0.025)         (0.065)   
            ----------------------------------------------------
            Observations               36,396          36,396   
            Adjusted R²                 0.001           0.980   
            ----------------------------------------------------
            Standard errors in parentheses
            * p<0.10, ** p<0.05, *** p<0.01
            
            ---- CHECKPOINT: Model comparison ----
            
            Key differences:
             • One-Way FE: Controls for municipality characteristics only
             • Two-Way FE: Controls for municipality + time shocks
            
            Questions to consider:
             1. Do coefficients change substantially?
             2. Do significance levels change?
             3. Which specification is more credible?
            
            ✅ Task 11.2 tests passed
            
            ℹ️ Why Two-Way Fixed Effects?
            What changes with TWFE:
            • Identification: Effects now identified purely from differential responses across income groups (within-month variation)
            • Confounders removed: Any national trend (seasonality, policy changes, economic conditions) is absorbed
            • Main effect absorbed: Cannot estimate average return effect (all municipalities have same return each month)

            How to interpret coefficient changes:

            • If coefficients shrink: one-way FE suffered omitted variable bias from time trends
            • If coefficients grow: time FE removed noise, revealing stronger differential effects
            • If signs flip: confounding was severe (unusual but possible)
            • If nothing changes: time trends were uncorrelated with treatment (rare)

              Which is more credible?
              Two-way FE is generally preferred because:

              • Controls for more confounders
              • Standard in modern applied work (Angrist & Pischke, Cunningham)
              • Necessary when there are common time shocks (which there always are)

                However, TWFE has limitations:

                • Fewer degrees of freedom (absorbing ~108 time periods)
                • Cannot estimate main treatment effect
                • If treatment varies over time AND across units, identification comes solely from differential time trends


                  Section 12: Robustness Checks¶

                  Good empirical work tests whether findings are robust to alternative specifications. If results change dramatically across reasonable alternatives, they may be fragile or driven by specific modeling choices.

                  We test three alternative specifications:

                  1. Log specification — Test whether effects are proportional (% changes) rather than absolute
                  2. Assault crimes only — Focus on violent crime (closer to Huck's dependent variable)
                  3. Volatility instead of returns — Test whether market uncertainty (not direction) affects crime

                  What makes a good robustness check?

                  • Addresses a legitimate alternative hypothesis
                  • Changes one thing at a time (outcome, specification, or sample)
                  • Has a clear interpretation if results differ

                  All robustness checks use the TWFE specification (municipality + time FE) from Section 11.

                  Task 12.1: Log Specification¶

                  📝 Task
                  Re-estimate the TWFE model using log-transformed crime as the dependent variable. This tests whether returns affect crime in percentage terms rather than in absolute counts. Create the log variable log_total_crimes if it doesn't exist (use `ln(total_crimes + 1)` to handle zeros), then estimate the TWFE model with reghdfe using absorb(muni_id year_month) and vce(cluster muni_id). Store the estimates as m4_log.
                  ℹ️ Why Log Specification?
                  Levels vs Logs:
                  • Levels: β measures change in crime counts (e.g., 10 more crimes)
                  • Logs: β measures change in percentage (e.g., 2% increase)

                  When logs are preferred:

                  • Effects are proportional to baseline (bigger cities have bigger absolute changes)
                  • Outcome is skewed (few very high-crime municipalities)
                  • Interpretation in % is more natural

                    Technical note: We use ln(crime + 1) to handle zeros (some municipalities have zero assault/murder in some months).
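A minimal sketch of the commands for this task (assuming the interaction variables `ret_x_high` and `ret_x_low` created earlier in the assignment; `eststo` requires the estout package):

```stata
* Robustness check 1: log specification (ln(x + 1) handles zero-crime months)
capture confirm variable log_total_crimes
if _rc gen log_total_crimes = ln(total_crimes + 1)

* TWFE: municipality + year-month fixed effects, clustered by municipality
reghdfe log_total_crimes ret_x_high ret_x_low, ///
    absorb(muni_id year_month) vce(cluster muni_id)
eststo m4_log
```

The `capture confirm variable` guard makes the do-file re-runnable: the variable is only generated if it does not already exist.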

                    ========================================
                    Robustness Check 1: Log Specification
                    ========================================
                    
                    (MWFE estimator converged in 2 iterations)
                    
                    HDFE Linear regression                            Number of obs   =     36,396
                    Absorbing 2 HDFE groups                           F(   2,    336) =       0.54
                    Statistics robust to heteroskedasticity           Prob > F        =     0.5830
                                                                      R-squared       =     0.9676
                                                                      Adj R-squared   =     0.9672
                                                                      Within R-sq.    =     0.0000
                    Number of clusters (muni_id) =        337         Root MSE        =     0.1909
                    
                                                  (Std. err. adjusted for 337 clusters in muni_id)
                    ------------------------------------------------------------------------------
                                 |               Robust
                    log_total_~s | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                      ret_x_high |   .0001186   .0005546     0.21   0.831    -.0009723    .0012095
                       ret_x_low |  -.0003916   .0004915    -0.80   0.426    -.0013584    .0005752
                           _cons |   4.680233   .0001938  2.4e+04   0.000     4.679852    4.680615
                    ------------------------------------------------------------------------------
                    
                    Absorbed degrees of freedom:
                    -----------------------------------------------------+
                     Absorbed FE | Categories  - Redundant  = Num. Coefs |
                    -------------+---------------------------------------|
                         muni_id |       337         337           0    *|
                      year_month |       108           1         107     |
                    -----------------------------------------------------+
                    * = FE nested within cluster; treated as redundant for DoF computation
                    
                    ---- CHECKPOINT: Log specification ----
                    
                    Interpretation:
                     Coefficients now measure % changes in crime
                     High-income: 0.0001 (1 SD return →   0.01% change)
                     Low-income: -0.0004 (1 SD return →  -0.04% change)
                    
                    ✅ Task 12.1 tests passed
                    
                    ⚠️ Why Do the Results Disappear in Logs?
                    You should find that the interaction coefficients become tiny and statistically insignificant in the log specification. This is an important finding, not a problem.

                    The level specification asks: "Do returns cause a different number of crimes across income groups?" — and finds significant effects.
                    The log specification asks: "Do returns cause a different percentage change in crime across income groups?" — and finds nothing.

                    This happens because large municipalities drive the level results. In levels, a municipality with 5,000 monthly crimes contributes far more variance than one with 20 crimes. The log transformation compresses the scale and equalizes the influence of small and large municipalities — and the effect vanishes.

                    This tells us the effect is about absolute crime counts shifting in specific large municipalities, not about crime rates proportionally changing across all income groups. Keep this distinction in mind when interpreting your results and discussing limitations.

                    Task 12.2: Assault Crimes Only¶

                    📝 Task
                    Test whether the differential effects hold when focusing on assault crimes only. Huck (2024) examines violent crime, and assault is the closest measure we have in the Netherlands data. Use the same TWFE specification with reghdfe, absorb(muni_id year_month), and vce(cluster muni_id). Use assault_crimes as the dependent variable instead of total_crimes. Store the estimates as m5_assault. This tests whether the psychological mechanism primarily affects violent vs property crime.
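One way to implement this task (a sketch; only the dependent variable changes relative to the main TWFE model):

```stata
* Robustness check 2: violent crime only (assault as dependent variable)
summarize assault_crimes

reghdfe assault_crimes ret_x_high ret_x_low, ///
    absorb(muni_id year_month) vce(cluster muni_id)
eststo m5_assault
```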
                    ========================================
                    Robustness Check 2: Assault Crimes
                    ========================================
                    
                    
                        Variable |        Obs        Mean    Std. dev.       Min        Max
                    -------------+---------------------------------------------------------
                    assault_cr~s |     36,396    11.27756    28.59991          0        463
                    (MWFE estimator converged in 2 iterations)
                    
                    HDFE Linear regression                            Number of obs   =     36,396
                    Absorbing 2 HDFE groups                           F(   2,    336) =       1.41
                    Statistics robust to heteroskedasticity           Prob > F        =     0.2466
                                                                      R-squared       =     0.9677
                                                                      Adj R-squared   =     0.9672
                                                                      Within R-sq.    =     0.0001
                    Number of clusters (muni_id) =        337         Root MSE        =     5.1757
                    
                                                  (Std. err. adjusted for 337 clusters in muni_id)
                    ------------------------------------------------------------------------------
                                 |               Robust
                    assault_cr~s | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                      ret_x_high |  -.0009849   .0071113    -0.14   0.890    -.0149731    .0130033
                       ret_x_low |    .026294   .0166023     1.58   0.114    -.0063635    .0589514
                           _cons |   11.27207   .0042482  2653.40   0.000     11.26371    11.28042
                    ------------------------------------------------------------------------------
                    
                    Absorbed degrees of freedom:
                    -----------------------------------------------------+
                     Absorbed FE | Categories  - Redundant  = Num. Coefs |
                    -------------+---------------------------------------|
                         muni_id |       337         337           0    *|
                      year_month |       108           1         107     |
                    -----------------------------------------------------+
                    * = FE nested within cluster; treated as redundant for DoF computation
                    
                    ---- CHECKPOINT: Assault crimes specification ----
                    
                    Question: Does focusing on violent crime change results?
                     High-income effect: -.00098488
                     Low-income effect: .02629398
                    
                    ✓ Pattern holds for violent crime (assault)
                    
                    ✅ Task 12.2 tests passed
                    

                    Task 12.3: Volatility Instead of Returns¶

                    📝 Task
                    Test an alternative mechanism: Does market uncertainty (volatility) affect crime, regardless of return direction? Create interaction terms between volatility and income terciles, then estimate the model. Name the interaction variables `vol_high` (`volatility * high_income`) and `vol_low` (`volatility * low_income`). Use the same TWFE specification: reghdfe total_crimes vol_high vol_low, absorb(muni_id year_month) vce(cluster muni_id). Store the estimates as m6_volatility. This tests whether psychological effects come from uncertainty rather than wealth changes.
                    ℹ️ Returns vs Volatility
                    Two psychological channels:
                    • Returns (direction): Investors feel richer when market rises → better mood → less crime
                    • Volatility (uncertainty): Market turbulence creates stress/anxiety → worse mood → more crime

                    Expected pattern if volatility matters:
                    Higher volatility → more crime in high-income areas (investors stressed by uncertainty)

                    Variable used: volatility = within-month SD of daily returns (calculated in Section 3)
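Putting the task together, a sketch of the volatility specification (assuming `high_income` and `low_income` are the tercile indicators created earlier):

```stata
* Robustness check 3: volatility channel (uncertainty rather than direction)
gen vol_high = volatility * high_income
gen vol_low  = volatility * low_income
summarize volatility vol_high vol_low

reghdfe total_crimes vol_high vol_low, ///
    absorb(muni_id year_month) vce(cluster muni_id)
eststo m6_volatility
```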

                        Variable |        Obs        Mean    Std. dev.       Min        Max
                    -------------+---------------------------------------------------------
                      volatility |     36,396    .0094554    .0050592    .003573   .0422672
                        vol_high |     36,396    .0031425     .005324          0   .0422672
                         vol_low |     36,396    .0031705    .0053394          0   .0422672
                    (MWFE estimator converged in 2 iterations)
                    
                    HDFE Linear regression                            Number of obs   =     36,396
                    Absorbing 2 HDFE groups                           F(   2,    336) =       4.31
                    Statistics robust to heteroskedasticity           Prob > F        =     0.0142
                                                                      R-squared       =     0.9805
                                                                      Adj R-squared   =     0.9802
                                                                      Within R-sq.    =     0.0006
                    Number of clusters (muni_id) =        337         Root MSE        =    79.9474
                    
                                                  (Std. err. adjusted for 337 clusters in muni_id)
                    ------------------------------------------------------------------------------
                                 |               Robust
                    total_crimes | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                        vol_high |   56.42181   51.53063     1.09   0.274    -44.94149    157.7851
                         vol_low |  -769.2239   298.4051    -2.58   0.010    -1356.201   -182.2463
                           _cons |   225.9745   .9781892   231.01   0.000     224.0504    227.8987
                    ------------------------------------------------------------------------------
                    
                    Absorbed degrees of freedom:
                    -----------------------------------------------------+
                     Absorbed FE | Categories  - Redundant  = Num. Coefs |
                    -------------+---------------------------------------|
                         muni_id |       337         337           0    *|
                      year_month |       108           1         107     |
                    -----------------------------------------------------+
                    * = FE nested within cluster; treated as redundant for DoF computation
                    
                    ---- CHECKPOINT: Volatility specification ----
                    
                    Question: Does market uncertainty (not direction) affect crime?
                     High-income: 56.42181
                     Low-income: -769.2239
                    
                    High volatility → more crime in high-income areas (stress channel)
                    
                    ✅ Task 12.3 tests passed
                    
                    🚀 Interpreting Robustness Checks
                    What to conclude from robustness checks:

                    If results are similar across specifications:
                    → Findings are robust; not driven by specific modeling choices
                    → Confidence in main result increases

                    If results differ substantially:
                    → Effects may be heterogeneous (e.g., only for violent crime)
                    → Or main specification had issues (e.g., skewed outcome needed logs)
                    → Discuss why differences arise and which specification is preferred

                    If volatility matters more than returns:
                    → Uncertainty/stress channel dominates wealth effect
                    → Challenges Huck's specific mechanism

                    Common practice: Present main result (TWFE with levels) in text, show robustness in appendix table. If robustness checks reveal important heterogeneity, discuss in main text.


                    Section 13: Regression Table¶

                    We now create a publication-quality regression table combining all our main results. This table summarizes the entire empirical analysis in a format suitable for a thesis or research paper.

                    Good regression tables should:

                    1. Show coefficients with standard errors (in parentheses)
                    2. Use stars to indicate significance levels (*** p<0.01, ** p<0.05, * p<0.10)
                    3. Report key model statistics (N, R², fixed effects)
                    4. Have clear column headers describing each specification
                    5. Be formatted for easy comparison across models

                    We will create a table with the following models:

                    • Column 1: Compound baseline (compound returns, one-way FE) — naive benchmark
                    • Column 2: Standardized baseline (standardized returns, one-way FE)
                    • Column 3: Interactions (standardized returns, one-way FE)
                    • Column 4: TWFE (standardized returns, municipality + time FE)
                    • Column 5: Log specification (TWFE)
                    • Column 6: Assault crimes (TWFE)

                    Column 1 uses raw compound returns as a naive benchmark. Columns 2–6 use standardized returns following Huck (2024), which account for time-varying volatility.

                    This table will be exported as LaTeX for inclusion in reports.

                    Task 13.1: Create Publication-Quality Regression Table¶

                    📝 Task
                    Use esttab to combine all stored estimates into a professional table. Export to both screen display (for review) and LaTeX format (for thesis). Include coefficient estimates, standard errors, significance stars, and model statistics.
                    Stata Stata Tip: Creating Professional Tables with esttab
                    Key options for esttab:
                    • se(3) — show standard errors with 3 decimals
                    • star(* 0.10 ** 0.05 *** 0.01) — significance stars
                    • mtitles("Model 1" "Model 2" ...) — column headers
                    • keep(varlist) — which variables to show
                    • order(varlist) — row order
                    • stats(N r2_a, labels("Observations" "Adjusted R²")) — model statistics
                    • label — use variable labels instead of names
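Putting these options together, a sketch of the `esttab` call. The names for the first four stored models (`m1_compound`, `m2_std`, `m3_interactions`, `m4_twfe`) are placeholders — use whatever names you stored in Sections 9–11; `m4_log` and `m5_assault` come from Tasks 12.1 and 12.2, and `$tables` is the output-path global defined in the setup:

```stata
* Combine stored models into one table and export to LaTeX
esttab m1_compound m2_std m3_interactions m4_twfe m4_log m5_assault ///
    using "$tables/main_results.tex", replace ///
    se(3) star(* 0.10 ** 0.05 *** 0.01) label ///
    mtitles("Compound" "Std. Baseline" "Interactions" "TWFE" "Log" "Assault") ///
    stats(N r2_a, labels("Observations" "Adjusted R²")) ///
    title("Table 1: Effect of Stock Returns on Crime by Income Group")
```

To review the table on screen first, run the same `esttab` command without the `using` clause.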
                    ========================================
                    Main Regression Results Table
                    ========================================
                    
                    
                     Table 1: Effect of Stock Returns on Crime by Income Group
                     --------------------------------------------------------------------------------------------------------------------
                                              Compound    Std. Basel~e    Interactions            TWFE             Log         Assault
                     --------------------------------------------------------------------------------------------------------------------
                     Monthly compound r~n        33.72***
                                               (5.596)
                     
                     Monthly standardiz~m                        0.665***
                                                               (0.101)
                     
                     Standardized retur~e                                        1.307***        0.867***      -0.000392          0.0263
                                                                               (0.284)         (0.291)         (0.000)         (0.017)
                     
                     Standardized retur~c                                        0.440***
                                                                               (0.060)
                     
                     Standardized retur~m                                        0.243***       -0.197***       0.000119       -0.000985
                                                                               (0.025)         (0.065)         (0.001)         (0.007)
                     --------------------------------------------------------------------------------------------------------------------
                     Observations               36,396          36,396          36,396          36,396          36,396          36,396
                     Adjusted R²                 0.000           0.001           0.001           0.980           0.967           0.967
                     --------------------------------------------------------------------------------------------------------------------
                     Standard errors in parentheses
                     * p<0.10, ** p<0.05, *** p<0.01
                    
                     Notes for table:
                      • Dependent variable: Total crimes (Columns 1-4), Log(total crimes) (Column 5), Assault crimes (Column 6)
                      • Column 1 uses compound returns; Columns 2-6 use standardized returns (Huck 2024)
                      • All models include municipality fixed effects
                      • Columns 4-6 include time (year-month) fixed effects
                      • Standard errors clustered at municipality level in parentheses
                      • * p<0.10, ** p<0.05, *** p<0.01
                     (output written to /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assignments/06-assignment/output/tables/main_results.tex)
                    
                    ---- CHECKPOINT: Regression table created ----
                     ✅ Table exported to: /Users/casparm4/Github/rsm-data-analytics-in-finance-private/private/assignments/06-assignment/output/tables/main_results.tex
                     Models included: 6 (Compound baseline through Assault)
                    
                    ✅ Task 13.1 tests passed
                    
                    🚀 Above and Beyond: Volatility Results Table
                    The volatility specification (m6) tests a different mechanism and uses different independent variables. It shouldn't be in the main table but could go in an appendix or supplementary table.

                    If you want to create a separate volatility table, use:
                    esttab m6_volatility using "$tables/volatility_results.tex"

                    This keeps your main results table clean and focused on the return → crime relationship.

                    ℹ️ Reading the Table
                    How to interpret the table:

                    Column progression (left to right):

                    1. Compound: Naive baseline using raw compound returns (no volatility adjustment)
                    2. Std. Baseline: Average effect using standardized returns (Huck's approach)
                    3. Interactions: Effects by income group (key test!)
                    4. TWFE: Same as (3) but with time FE (preferred specification)
                    5. Log: Percentage changes instead of levels
                    6. Assault: Violent crime only

                      What to look for:

                      • Column 1 vs 2: Does volatility standardization matter? Compare the baseline coefficients.
                      • Consistency across columns 3-6: Do interaction coefficients have same signs?
                      • Statistical significance: How many stars (*** = strong, ** = moderate, * = weak)?
                      • Economic magnitude: Are effects large enough to matter?
                      • R² changes: Does adding interactions improve fit?

                        For your thesis: This table would go in the main body. The volatility table (if created) would go in an appendix. Always accompany tables with 2-3 paragraphs of interpretation in text.


                        Section 14: Interpretation and Limitations (Ungraded and voluntary)¶

                        The empirical analysis is complete. Now we interpret the findings and honestly discuss limitations.

                        Good empirical papers have three key elements:

                        1. Clear interpretation of what the results mean economically and statistically
                        2. Honest discussion of limitations (what the study cannot test or conclude)
                        3. Constructive suggestions for how future work could improve identification

                        Why limitations matter:

                        • Prevents over-claiming (saying more than the data can support)
                        • Shows you understand the research design deeply
                        • Guides readers on how to interpret your findings
                        • In your thesis, examiners expect thoughtful discussion of what you cannot claim

                        This section has three writing tasks that synthesize everything you've learned.

                        Task 14.1: Summarize Your Findings¶

                        📝 Task
                        Write a summary of your main findings (minimum 200 words). Address the following questions:
                        1. Main result: Do you find differential effects of returns across income groups?
                        2. Signs: Are the signs consistent with the theory in Huck (2024)? (High-income negative, low-income positive?)
                        3. Statistical significance: Are the key coefficients statistically significant?
                        4. Economic magnitude: How large are the effects? (e.g., "A 1 SD positive return reduces crime by X incidents in high-income areas")
                        5. Robustness: Do results hold across specifications (TWFE, log, assault)?
                        6. Comparison to Huck: Are your findings consistent with or different from the US results?

                        Your findings summary (minimum 200 words):

                        (Write your answer below)

                        Task 14.2: Discuss Limitations¶

                        📝 Task
                        Discuss at least three substantive limitations of this analysis (minimum 150 words). For each limitation:
                        • Explain what the limitation is
                        • Why it matters (what it prevents you from concluding)
                        • How it differs from the original Huck (2024) study

                        Suggested limitations to consider:

                        1. Monthly vs daily frequency: Cannot test contemporaneous/hourly timing effects
                        2. Single market return: No cross-sectional variation in "treatment" within months
                        3. Income as proxy: Municipality income ≠ individual investor status (ecological fallacy)
                        4. Standard error concerns: Few time periods, spatial correlation across municipalities
                        5. Crime reporting differences: Measurement error varies by municipality/crime type
                        6. Selection/attrition: Municipality boundaries changed over time (mergers)

                          ⚠️ Important: Be Honest
                          Good researchers are honest about limitations. This doesn't weaken your work—it shows you understand the research design deeply.

                          What NOT to do:

                          • ❌ "We don't have enough data" (too vague)
                          • ❌ "Results might be wrong" (undermines everything)
                          • ❌ List minor issues without explaining why they matter

                            What TO do:

                            • ✓ "Monthly data cannot test within-day timing effects (Huck's Table 5), limiting our ability to distinguish same-day from lagged effects"
                            • ✓ "Municipality-level income is an imperfect proxy for investor participation; Huck uses actual brokerage data, providing more precise treatment assignment"
                            • ✓ Explain the consequence of each limitation

                              Your limitations discussion (minimum 150 words, at least 3 points):

                              (Write your answer below)

                              Limitation 1:

                              Limitation 2:

                              Limitation 3:

                              Task 14.3: What Would Strengthen This Analysis?¶

                              📝 Task
                              Suggest at least three improvements that would strengthen the identification or generalizability of this analysis (minimum 100 words). Think creatively about:
                              • Better data: What data would you ideally have?
                              • Natural experiments: Are there shocks you could exploit for causal identification?
                              • Alternative mechanisms: What else could you test?
                              • Extensions: How could you expand this to related questions?

                              Examples to consider:

                              • Daily crime data (if it existed in NL)
                              • Individual investor data (brokerage accounts by municipality)
                              • Natural experiment: Neo-broker entry (Trade Republic, DEGIRO expansion) for DiD
                              • Cross-country comparison (UK has better crime data?)
                              • Alternative outcomes: Health, well-being, Google searches
                              • Crypto markets (more volatile, younger investors)

                                Your suggestions for improvement (minimum 100 words, at least 3 ideas):

                                (Write your answer below)

                                Improvement 1:

                                Improvement 2:

                                Improvement 3:


                                Part B Summary¶

                                Congratulations! You have completed Part B: Analysis.

                                What you accomplished:

                                1. ✅ Section 7: Created descriptive statistics by income tercile
                                2. ✅ Section 8: Visualized crime and return trends over time
                                3. ✅ Section 9: Estimated baseline panel regressions
                                4. ✅ Section 10: Tested differential effects by income (core hypothesis)
                                5. ✅ Section 11: Added two-way fixed effects for better identification
                                6. ✅ Section 12: Conducted robustness checks (log, assault, volatility)
                                7. ✅ Section 13: Created publication-quality regression table
                                8. ✅ Section 14: Interpreted findings and discussed limitations honestly

                                Outputs created:

                                • Tables: summary_stats_panelA.tex, summary_stats_panelB.tex, main_results.tex
                                • Figures: crime_timeseries.png, crime_returns_overlay.png, crime_by_income.png
                                • Stored estimates: 6 regression models ready for comparison

                                Skills practiced:

                                • Panel data regression with fixed effects
                                • Testing interaction effects
                                • Two-way fixed effects estimation
                                • Robustness checks
                                • Creating professional tables and figures
                                • Interpreting results and discussing limitations

                                Your thesis Module 2/3 will look very similar:

                                • Part A (Data Preparation): 60-70% of your time
                                • Part B (Analysis): 20-30% of your time
                                • Part C (Writing/Interpretation): 10% of your time

                                The skills you practiced here—data wrangling, regression analysis, robustness checks, honest limitation discussion—are exactly what you'll need for your MSc thesis.


                                Part C: Thesis Connection and Reflection¶

                                This final section connects your work to the MSc thesis and reflects on what you've learned.

                                Section 15: Thesis Module 2/3 Connection¶

                                Your MSc Finance thesis has three modules:

                                • Module 1: Literature review and research question
                                • Module 2: Replication of published results with new/similar data
                                • Module 3: Extension or new analysis building on Module 2

                                This assignment mirrors the Module 2/3 workflow. You've essentially completed a mini-thesis:

                                • Read a published paper (Huck 2024)
                                • Replicated the methodology with new data (Netherlands instead of US)
                                • Tested whether findings generalize to a new setting

                                Section 16: Conclusion¶

                                You have completed a full replication/extension exercise from start to finish.

                                What you accomplished:

                                Part A (Data Preparation):

                                • Loaded and examined three raw datasets
                                • Calculated monthly returns following published methodology
                                • Reshaped data from long to wide format
                                • Merged multiple datasets into a panel
                                • Created analysis variables (terciles, interactions)
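The Part A pipeline above can be sketched in a few Stata lines. This is a minimal illustration, not the assignment's actual code: the file names, variable names, and globals (`crime_long.dta`, `returns_monthly.dta`, `crime_count`, `income`, `aex_return`) are placeholders for whatever your own datasets use.

```stata
* Sketch of the Part A workflow (file and variable names are illustrative)
use "$raw/crime_long.dta", clear

* Long -> wide: one row per municipality-month, one column per crime type
reshape wide crime_count, i(municipality month) j(crime_type) string

* Merge in monthly market returns and municipality-level income
merge m:1 month using "$processed/returns_monthly.dta", keep(match) nogenerate
merge m:1 municipality using "$raw/income.dta", keep(match) nogenerate

* Derived variables: income terciles and a return x low-income interaction
xtile income_tercile = income, nquantiles(3)
gen ret_x_lowinc = aex_return * (income_tercile == 1)
```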

                                Part B (Analysis):

                                • Created descriptive statistics and visualizations
                                • Estimated baseline and interaction regressions
                                • Tested differential effects across income groups (Huck's hypothesis)
                                • Added two-way fixed effects for better identification
                                • Conducted robustness checks
                                • Created publication-quality tables
                                • Interpreted results and discussed limitations honestly
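The core Part B estimation can likewise be sketched in Stata. Again, variable names (`municipality_id`, `crime_total`, `aex_return`, `income_tercile`) are illustrative assumptions; the pattern shown is a two-way fixed effects interaction model with standard errors clustered at the municipality level.

```stata
* Sketch of the Part B estimation (variable names are illustrative)
xtset municipality_id month

* Municipality FE via xtreg, month FE via dummies, interaction with income terciles
xtreg crime_total c.aex_return##i.income_tercile i.month, fe vce(cluster municipality_id)
```

Clustering at the municipality level allows for arbitrary serial correlation within each municipality, which is the standard choice when the treatment (market returns) varies over time but outcomes are persistent within units.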

                                Key lessons:

                                1. Data preparation is 60-80% of empirical work — You spent most time in Part A, not Part B. This is normal and expected.

2. Published papers use better data than you'll have — Huck (2024) works with daily crime data and actual investor accounts; you have monthly aggregates and income proxies. This is the reality of replication work.

                                3. Honest limitation discussion is a strength — Acknowledging what you cannot test shows deep understanding of research design.

                                4. Even null results are informative — If you don't find Huck's pattern in NL data, that's a contribution (generalizability is limited).

                                5. Extensions matter more than perfect replication — Your thesis should test something new, even if it's a simple extension.

                                For your MSc thesis:

                                • Start data preparation early (it always takes longer than expected)
                                • Be realistic about what you can test with available data
                                • Focus on a simple, clear extension rather than perfect replication
                                • Discuss limitations honestly (examiners value this highly)

                                Final thought:

                                Research is about answering questions with imperfect data and honest acknowledgment of limitations. You've practiced exactly that in this assignment. These skills—data wrangling, regression analysis, robustness testing, limitation discussion—will serve you well in your thesis and beyond.

                                Good luck with your MSc thesis! 🎓


                                Data Analytics for Finance

                                BM17FI · Academic Year 2025–26

                                Erasmus University Rotterdam

                                Created by: Caspar David Peter

                                © 2026 Rotterdam School of Management