Potential Issue: Survivorship Bias in Long-Term Backtesting #164

hjia8901 · 2024-07-09T12:48:20Z


In the backtest examples (e.g., dow30.py), the stock universe is derived directly from a static file. While this approach poses no problems for forward testing, it introduces a significant risk of survivorship bias when conducting backtests over extended historical periods. This is because the stocks currently in the index are those that have survived and performed well in the past, while stocks that were removed from the index or delisted along the way never enter the backtest.

Would it be possible to adjust the backtesting framework to account for historical index components, thereby providing a more accurate representation of past market conditions?
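To make the concern concrete, here is a toy calculation. It is only a sketch: the tickers, returns, and the `in_index_today` flag are made-up values, not real market data and not anything taken from the examples suite. It compares the average return of the full historical universe with the average over the names that are still in the index today, which is what a static universe effectively backtests.

```python
from statistics import mean

# Hypothetical data: total return of each stock over the backtest window,
# and whether it is still an index member at the end of the window.
stocks = {
    "AAA": {"total_return": 0.80, "in_index_today": True},
    "BBB": {"total_return": 0.35, "in_index_today": True},
    "CCC": {"total_return": -0.60, "in_index_today": False},  # dropped from index
    "DDD": {"total_return": -0.90, "in_index_today": False},  # delisted
}


def average_return(universe):
    """Equal-weighted average total return over the given universe."""
    return mean(s["total_return"] for s in universe.values())


# A static universe built from today's components only sees the survivors.
survivors = {k: v for k, v in stocks.items() if v["in_index_today"]}

full_avg = average_return(stocks)         # universe investable at the time
survivor_avg = average_return(survivors)  # universe seen by a static file
bias = survivor_avg - full_avg            # optimistic gap from survivorship
```

With these made-up numbers the survivor-only average is well above the full-universe average, which is the gap a long backtest on current components would silently absorb.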

Answered by enzbus


enzbus · 2024-07-09T13:03:22Z

Maintainer

Hello! Yes, you are right. The problem is that I'm not aware of any public historical data for stocks that no longer exist, and the examples are designed to run on anyone's computer in the simplest way possible (and to avoid shipping a frozen dataset, real or synthetic, with the examples suite). The universes are updated periodically by downloading index components from Wikipedia, so they should reflect the current components, possibly with some delay. If you have bought historical data that includes stocks that no longer trade, or in any case want to update your trading universe over time, you can use the optional universe_selection_in_time argument of UserProvidedMarketData. That is more advanced usage and goes beyond the scope of the examples, for which I prefer simpler settings. Does this answer your question?
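For readers who want to try this, below is a minimal sketch of how a point-in-time membership mask could be built from a log of index changes. Everything in it is an assumption for illustration: the helper `membership_masks`, the tickers, and the change dates are hypothetical, and the code uses only the standard library rather than any cvxportfolio call.

```python
from datetime import date


def membership_masks(initial_members, changes, dates):
    """Return {date: {ticker: bool}}: index membership as of each date.

    `changes` is a list of (effective_date, added_tickers, removed_tickers).
    """
    # Every ticker that was ever a member gets a column in the mask.
    all_tickers = sorted(
        set(initial_members)
        | {t for _, added, removed in changes for t in added + removed}
    )
    ordered_changes = sorted(changes, key=lambda change: change[0])
    masks = {}
    for day in sorted(dates):
        members = set(initial_members)
        # Replay every change that was already effective on `day`.
        for effective, added, removed in ordered_changes:
            if effective <= day:
                members |= set(added)
                members -= set(removed)
        masks[day] = {ticker: ticker in members for ticker in all_tickers}
    return masks


# Hypothetical index history (made-up tickers and dates).
initial = ["AAA", "BBB", "CCC"]
changes = [
    (date(2012, 6, 1), ["DDD"], ["CCC"]),
    (date(2018, 1, 2), ["EEE"], ["AAA"]),
]
refresh_dates = [date(2010, 1, 4), date(2015, 1, 2), date(2020, 1, 2)]

masks = membership_masks(initial, changes, refresh_dates)
```

The resulting dictionary can be turned into a boolean pandas DataFrame with the refresh dates as index and the tickers as columns. My understanding is that this is the kind of object universe_selection_in_time expects, but please check the UserProvidedMarketData documentation for the exact requirements before relying on it.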

1 reply

hjia8901 Jul 9, 2024
Author

Thank you for your detailed response. I appreciate the suggestion to use the universe_selection_in_time optional argument in UserProvidedMarketData for incorporating historical data, including stocks that no longer trade. I will take a closer look at this approach.

Your answer addressed my question perfectly. Thank you very much!!
