Advances in Financial Machine Learning by Marcos López de Prado
Chap 11: The Dangers of Backtesting
Key takeaways:
Marcos’ 2nd Law: “Backtesting while researching is like drinking and driving. Do not research under the influence of a backtest.”
11.2 Mission impossible: the flawless backtest
- A backtest is NOT an experiment. It is a hypothetical simulation of how a strategy would have performed, so it does not prove anything.
- Read “Seven Sins of Quantitative Investing” (Luo et al. [2014])
The 7 sins of quantitative investing:
- Survivorship bias
- Look-ahead bias
- Storytelling (making up a story ex post to explain a result, instead of stating the hypothesis ex ante)
- Data mining and data snooping
- Transaction costs
- Outliers
- Shorting
The list can go on and on…
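As an illustration of one item on the list, look-ahead bias, here is a minimal sketch. The function names and the toy sign-following rule are assumptions made for this note, not something taken from the book or from Luo et al. The returns are pure noise, so any rule that only uses past information should earn roughly nothing; the version that peeks at the same day's return looks spectacular.

```python
import random

def simulate_returns(n_obs, seed=0):
    """Hypothetical daily returns: pure noise with zero mean and 1% volatility."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n_obs)]

def strategy_pnl(returns, lag):
    """Total P&L of a toy rule: go long if the return `lag` days ago was positive,
    otherwise go short, and earn the return of day t.

    lag=0 uses day t's own return to decide the position for day t, which is
    information that would not be available when trading (look-ahead bias).
    lag=1 uses only the previous day's return.
    """
    pnl = 0.0
    for t in range(lag, len(returns)):
        signal = 1.0 if returns[t - lag] > 0 else -1.0
        pnl += signal * returns[t]
    return pnl

if __name__ == "__main__":
    rets = simulate_returns(1000)
    print("with look-ahead   :", round(strategy_pnl(rets, lag=0), 4))
    print("without look-ahead:", round(strategy_pnl(rets, lag=1), 4))
```

With lag=0 the rule collects the absolute value of every return, so the P&L can never be negative. That is a sign of a bug, not of skill.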
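11.3 Even if your backtest is flawless, it is probably wrong
Let me reframe it:
Even if you deployed the backtested model and made 50 trillion from it, the backtest itself is PROBABLY wrong. The reason is selection bias: a flawless backtest is usually the survivor of many backtests that were tried and discarded.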
11.4 Backtesting is not a research tool
Key takeaways:
- Feature importance is a true research tool.
- The purpose of a backtest is to discard bad models, not to improve them.
- Adjusting your model based on the backtest results is a waste of time, and it’s dangerous.
- Never backtest until your model has been fully specified.
11.5 A few general recommendations
Key takeaways:
- Backtest overfitting can be defined as selection bias on multiple backtests.
- Every backtested strategy is overfit to some extent, as a result of selection bias.
- How to address backtest overfitting is arguably the most fundamental question in quantitative finance.
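To make "selection bias on multiple backtests" concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration (the function names, 252 daily observations, 1% volatility, Gaussian noise). Every simulated strategy has zero true skill, yet the best Sharpe ratio found keeps growing with the number of backtests tried.

```python
import random
import statistics

def sharpe_ratio(returns):
    """Non-annualized Sharpe ratio: mean return divided by sample standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)

def best_sharpe_among_trials(n_trials, n_obs=252, seed=0):
    """Backtest `n_trials` skill-less strategies and return the best Sharpe ratio found."""
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(n_trials):
        # Each "strategy" is pure noise: zero expected return, 1% daily volatility.
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_obs)]
        best = max(best, sharpe_ratio(returns))
    return best

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, "trials -> best Sharpe:", round(best_sharpe_among_trials(n), 3))
```

Reporting only the winner of such a search, without the number of trials behind it, is exactly the selection bias described above.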
Must follow:
- Develop models for entire asset classes or investment universes, rather than for specific securities.
- Apply bagging (Chap 6).
- Do not backtest until all your research is complete (Chaps 1-10).
- Record every backtest conducted on a dataset so that the probability of backtest overfitting may be estimated.
- Simulate scenarios rather than history. Your strategy should be profitable under a wide range of scenarios.
- If the backtest fails to identify a profitable strategy, start all over again.
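The "record every backtest" rule could be followed with something as simple as the sketch below. The class and field names (BacktestLog, BacktestRecord, dataset_id) are hypothetical and not from the book; the point is only that each trial is stored with its dataset and parameters, so the number of trials per dataset is known when overfitting is assessed later.

```python
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class BacktestRecord:
    """One backtest run: which dataset, which parameters, and the resulting Sharpe ratio."""
    dataset_id: str
    params: dict
    sharpe: float
    timestamp: float = field(default_factory=time.time)

class BacktestLog:
    """In-memory log of every backtest, including the ones that were discarded."""

    def __init__(self):
        self._records = []

    def record(self, dataset_id, params, sharpe):
        """Store one trial and return the stored record."""
        rec = BacktestRecord(dataset_id, dict(params), float(sharpe))
        self._records.append(rec)
        return rec

    def n_trials(self, dataset_id):
        """Number of backtests run so far on the given dataset."""
        return sum(1 for r in self._records if r.dataset_id == dataset_id)

    def to_jsonl(self):
        """Serialize the log as JSON lines, e.g. for appending to a file."""
        return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in self._records)

if __name__ == "__main__":
    log = BacktestLog()
    log.record("equities_daily", {"lookback": 20}, 0.8)
    log.record("equities_daily", {"lookback": 50}, 1.1)
    print(log.n_trials("equities_daily"))
    print(log.to_jsonl())
```

Failed and abandoned runs belong in the log too; dropping them hides the trial count that any overfitting estimate depends on.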
11.6 Strategy Selection
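Key takeaways:
- One disadvantage of the walk-forward approach (WFOV) is that it can be easily overfit, because it tests a single historical path that can be rerun until a good-looking result appears.
- Some randomization is needed to avoid backtest optimization (overfitting), on top of avoiding leakage from the test set into the training set.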
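The contrast between a single walk-forward path and many train/test configurations can be sketched as follows. This is an illustration written for these notes, not the chapter's exact procedure: the function names are hypothetical, the combinatorial splitter simply enumerates every way of choosing test blocks, and it leaves out the purging and embargo steps that would be needed to stop leakage between adjacent blocks.

```python
from itertools import combinations

def contiguous_groups(n_obs, n_groups):
    """Partition indices 0..n_obs-1 into n_groups contiguous, near-equal blocks."""
    base, extra = divmod(n_obs, n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups

def walk_forward_splits(n_obs, n_groups):
    """Single historical path: train on all earlier blocks, test on the next block."""
    groups = contiguous_groups(n_obs, n_groups)
    for i in range(1, n_groups):
        train = [idx for g in groups[:i] for idx in g]
        yield train, groups[i]

def combinatorial_splits(n_obs, n_groups, n_test_groups):
    """Every way of holding out n_test_groups blocks for testing (no purging applied)."""
    groups = contiguous_groups(n_obs, n_groups)
    for test_ids in combinations(range(n_groups), n_test_groups):
        test = [idx for g in test_ids for idx in groups[g]]
        train = [idx for g in range(n_groups) if g not in test_ids for idx in groups[g]]
        yield train, test

if __name__ == "__main__":
    print("walk-forward splits :", len(list(walk_forward_splits(12, 6))))
    print("combinatorial splits:", len(list(combinatorial_splits(12, 6, 2))))
```

Walk-forward always evaluates the same sequence of test blocks, so repeated tuning keeps hitting the same path. The combinatorial version produces many different train/test configurations from the same data, which makes it harder to fit one particular path by trial and error.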