Perils of Backtesting

One thing considered absolutely essential when developing trading strategies or models is backtesting. Backtesting means testing trading strategies or models on historic data, and the approach is also applied in other fields such as oceanography or meteorology[1].

I am a cynic at heart, so I typically evaluate the issues with a method before starting an implementation. In backtesting, there are a lot of issues. Unfortunately, it’s also the best we have.

In this article, we might break hearts, but we will explain some methods for mitigating issues.

The Multiple Comparisons Problem

In the case of backtesting, likely the largest issue is the Multiple Comparisons Problem, best described as follows:

The more inferences are made, the more likely erroneous inferences occur.

It is a well-known issue and has been written about by large companies and small blogs alike, such as the Price Action Lab Blog:

The fundamental problem of backtesting for the purpose of finding an edge in the markets is that it introduces a dangerous form of data-mining bias caused by the reuse of data to test many different hypotheses.

Regardless of what name it goes by (Multiple Comparisons Problem or Data-Mining Bias), there is an issue with searching historic data for predictors. The more predictors we think we find, the higher the likelihood that at least some of them are erroneous.
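To put a rough number on this, here is a minimal sketch of how quickly false positives pile up. It assumes every test is independent and uses the same significance level, which real backtests on shared data rarely satisfy, so treat it as an illustration only. The function name is made up for this example.

```python
def family_wise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests.

    Assumes each test has a false-positive rate of `alpha` and that the
    tests are independent (a simplification).
    """
    return 1 - (1 - alpha) ** num_tests


# How the risk grows as more strategies are tested on the same history.
for m in (1, 5, 20, 100):
    rate = family_wise_error_rate(0.05, m)
    print(f"{m:>3} strategies tested -> {rate:.1%} chance of at least one fluke")
```

At the usual 5% level, testing 20 unrelated strategies already gives roughly a 64% chance that at least one looks like a winner purely by luck.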

Problematic Backtesting Example – Technical Analysis

It’s the same reason why “Technical Analysis” of stock trends is pretty questionable. The lines drawn on a chart are essentially just made up (they might be right some of the time, but we don’t know). Here’s an example, from stockcharts.com:

[Image: chart example from Stockcharts.com]

These kinds of “Technical Analysis” methods were originally developed in the 1920s and 1930s[2]. All evaluations had to be done by hand, and they are prone to hindsight bias in their stock selection. Personally, I highly doubt their overall effectiveness. Real-life research into “Technical Analysis” has had mixed results and seems fairly split on both methodology and outcomes[3].

The Efficient-Market Hypothesis

You are probably asking yourself, what’s the alternative!? I want to prove my trading strategy, financial engine or AI financial advisor works!

Unfortunately – there really isn’t a better way to test any sort of model or strategy.

In fact, there is the Efficient-Market Hypothesis which essentially states:

Stocks always trade at their fair value, making it impossible for investors to either purchase undervalued stocks or sell stocks for inflated prices. As such, it should be impossible to outperform the overall market through expert stock selection or market timing, and that the only way an investor can possibly obtain higher returns is by chance or by purchasing riskier investments.

From Wikipedia

The truth is, we know it’s possible to gain some advantage if you have more information or can execute a trade faster. For instance, you can gather information through ProjectPiglet.com, as we track experts, insiders, and the public in real time. That’s really the key to ProjectPiglet.com’s success: although you might not make money on every trade, we improve your chances by providing more information (until we saturate the market).

Which brings us to the next issue related to backtesting…

Today, we have more information about any historic event or price than we did at the time. This is one of the causes of the Data-Mining Bias, where a backtest uses data that we didn’t have, or that didn’t even exist, at the time.
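One common guard against this is to filter every input by the date it became available, not the date it describes. The sketch below is only an illustration: the records, field layout, and function name are hypothetical, and a real pipeline would need trustworthy publication timestamps for each data source.

```python
from datetime import date

# Hypothetical records:
# (period the figure describes, date it became public, value)
records = [
    (date(2018, 3, 31), date(2018, 5, 1), 1.10),
    (date(2018, 6, 30), date(2018, 8, 1), 1.25),
    (date(2018, 9, 30), date(2018, 11, 1), 0.90),
]


def known_as_of(rows, decision_date):
    """Keep only rows that were already public on the decision date."""
    return [row for row in rows if row[1] <= decision_date]


# On 2018-07-15 the June figure exists but has not been published yet,
# so a backtest making a decision on that day must not see it.
visible = known_as_of(records, date(2018, 7, 15))
print(len(visible), "record(s) visible on 2018-07-15")
```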

More importantly, historic stock prices were set using prior information, algorithms and strategies. This means the strategy being backtested may already have been discovered (at a date after our backtesting data)! There is no way we would know, and it would effectively make the strategy or algorithm in question obsolete.

For reference, this has been dubbed anti-inductive[3][4]: as soon as a pattern is discovered, the market creates a more complex pattern that corrects for it. Personally, I like to call this market entropy.

Mitigating the Issues

Although there are issues with backtesting, there are ways to mitigate them.

One such mitigation is called a Bonferroni Correction, and it is used to mitigate the Multiple Comparisons Problem. The purpose of the Bonferroni Correction is to reduce the likelihood of incorrectly rejecting the null hypothesis as the number of tests increases. It does so by dividing the significance threshold by the number of tests performed. In other words, it’s a way to reduce the likelihood that your strategy is just getting lucky.
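A minimal sketch of the correction, assuming you already have one p-value per backtested strategy. The p-values and the function name below are made up for illustration and are not results from our own tests.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p-values survive a Bonferroni correction.

    Each p-value is compared against alpha divided by the number of tests.
    """
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Hypothetical p-values from five backtested strategies.
p_values = [0.04, 0.008, 0.2, 0.03, 0.012]

# Uncorrected, four of the five would pass at alpha = 0.05.
# Corrected, the bar is 0.05 / 5 = 0.01 and only one strategy clears it.
print(bonferroni_significant(p_values))
```

The correction is conservative: it lowers the chance of a lucky strategy slipping through, at the cost of sometimes rejecting a strategy with a real edge.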

In addition, there are also methods for mitigating the Efficient-Market Hypothesis (although honestly, to a minimal extent). This can be accomplished by identifying or indicating causality between two time series, for example with Granger Causality:

[Image: Granger Causality, from Wikipedia]

This is useful for indicating that one series helps predict another, and it can be repeated over various time ranges to check that the inefficiency in the market is still present. An interesting application of this method is discovering when the market has identified some pattern, since the predictive relationship then fades. Granger Causality works best for linear relationships, and is difficult to use for the nonlinear case (AKA it’s most useful for basic causality comparisons).
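To make the idea concrete, here is a rough pure-Python sketch of a Granger-style test with a single lag. It compares a model that predicts a series from its own past against one that also uses the other series’ past, and reports an F statistic. The data is synthetic, the function names are made up for this example, and the single lag is a simplifying assumption; in practice a library routine such as `grangercausalitytests` in statsmodels is the more sensible choice.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f_stat(cause, effect):
    """F statistic for whether lagged `cause` improves a lag-1 model of `effect`."""
    y = effect[1:]
    steps = range(1, len(effect))
    restricted = [[1.0, effect[t - 1]] for t in steps]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in steps]
    rss_r = ols_rss(restricted, y)
    rss_u = ols_rss(unrestricted, y)
    n, k, q = len(y), 3, 1  # observations, unrestricted params, restrictions
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))


# Synthetic example: y follows x with a one-step delay, plus a little noise.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + random.gauss(0, 0.1) for t in range(1, 200)]

f_xy = granger_f_stat(x, y)  # does x help predict y?
f_yx = granger_f_stat(y, x)  # does y help predict x?
print(f"x -> y: F = {f_xy:.1f}")
print(f"y -> x: F = {f_yx:.1f}")
```

A large F statistic in one direction and a small one in the other suggests that the first series carries predictive information about the second. Re-running the test on successive time windows is one way to watch whether that information is fading.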

Conclusion

Backtesting is difficult, error-prone, and will not guarantee success; however, it is the best we have. Generally, it’s the only way to test if a strategy has any merit, but any insights should be reviewed carefully.

Over the next several articles we will explain our methods for mitigating the issues, as well as share our results. To date, our results typically beat or match what are called “buy and hold” strategies, but we bias towards reducing risk as opposed to maximizing potential profit, which is another whole domain.

If you’re interested in using ProjectPiglet.com, use the coupon code: pigletblog2018

It’s 25% off for 6 months!
