Computerized back testing has been a boon for us traders. With a few clicks we can evaluate new trading strategies and ideas across different instruments and time frames and, through optimization, determine the strategy parameters that yield the best results. For those new to trading, here is how Investopedia defines back testing:
Backtesting is the process of testing a trading strategy on relevant historical data to ensure its viability before the trader risks any actual capital. A trader can simulate the trading of a strategy over an appropriate period of time and analyze the results for the levels of profitability and risk. – Investopedia
Unfortunately with computerized back testing we also need to deal with the curse of curve fitting. Curve fitting occurs when the strategy parameters are tuned so that they produce optimized results for the specific set of historical data that was tested.
With any other set of testing data the results might be radically different. For example we might run a test over a period that saw a huge price swing due to a major news event.
A curve fitted strategy may be tuned to capture the maximum profit from those swings thus inflating its overall results. Take away that swing and the same parameters would yield drastically reduced or even negative results. This image shows what a curve fitted system looks like.
Trading System Back Tested Using Curve Fitted Parameters
To the left of the vertical line we see the optimized results; to the right we see the subsequent system performance using those same values. After an initial run-up the equity curve hits its peak, then the system falls apart and all of the initial gains are lost.
How Do You Deal With Curve Fitting During Back Testing?
Curve fitting is potentially destructive, and you must find ways to guard against it when testing any trading system or you run the risk of trading an inferior system.
There are three backtesting strategies we can use to alleviate the curve fitting issue:
- Optimize one variable at a time and look for ranges of variable values that all produce profitable results, then pick a value from the middle of the range. This value may not give the optimal result, but it makes it more likely that small variations in the value will still be profitable.
- Optimize over several different historical data sets and identify those strategy variables that produce profitable results across all of them. Look for overlaps and select the variable mix that has good results in each of the test periods.
- Optimize the variables on a historical set of data and then validate that they continue to perform well by applying them to a different set of data. This is called out of sample testing.
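The first approach in the list above (picking a value from the middle of a profitable range) can be sketched in a few lines of Python. This is only an illustration: the function name `pick_robust_value` and the profit figures are hypothetical, and it assumes you have already exported the net profit for each tested parameter value from your back testing software.

```python
def pick_robust_value(profit_by_value):
    """Return the middle of the longest run of consecutive profitable values.

    profit_by_value maps each tested parameter value to its net profit.
    Returns None if no tested value was profitable.
    """
    best_run, current_run = [], []
    for value in sorted(profit_by_value):
        if profit_by_value[value] > 0:
            current_run.append(value)
            if len(current_run) > len(best_run):
                best_run = list(current_run)
        else:
            # A losing value ends the current profitable range.
            current_run = []
    if not best_run:
        return None
    return best_run[len(best_run) // 2]


# Hypothetical optimization results: parameter value -> net profit.
profit_by_value = {
    5: -120, 10: 40, 15: 310, 20: 280, 25: 350,
    30: 90, 35: -60, 40: 900, 45: -200,
}

print(pick_robust_value(profit_by_value))  # 20
```

Note that the value 40 has the highest profit in this made-up data, but it is an isolated spike with losing neighbors on both sides, so the sketch ignores it in favor of the middle of the 10 to 30 range.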
Of course you can apply all three of the above to your strategy but for this article we’ll focus on the out of sample testing.
Out Of Sample Testing
In out of sample testing we separate the available historical test data into two sets.
The first set will be used for the computerized back testing and optimization. This is called the in-sample period. The optimized parameter values are then applied to the remaining, untested data set, referred to as the out of sample period.
The out of sample data can come from the beginning of the historical data or from the end, although typically we use the most recent data for the out of sample testing.
The ratio of in-sample to out of sample typically ranges from 2:1 to 4:1, in other words 33% to 20% of the total data will be reserved for the out of sample test. The image below shows a 2:1 split applied to six months of daily data.
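To make the ratio arithmetic concrete, here is a minimal Python sketch of the split. The function name `split_in_out_of_sample` is hypothetical, the figure of 126 bars is an assumption standing in for roughly six months of daily data, and the out of sample block is taken from the end of the data, as described above.

```python
def split_in_out_of_sample(bars, in_parts=2, out_parts=1):
    """Split bars so that in-sample : out-of-sample is in_parts : out_parts.

    The most recent bars (the end of the list) become the out-of-sample set.
    """
    # Integer arithmetic avoids floating point rounding surprises.
    n_out = len(bars) * out_parts // (in_parts + out_parts)
    cut = len(bars) - n_out
    return bars[:cut], bars[cut:]


# Assumption: about 126 trading days in six months of daily data.
bars = list(range(126))

in_sample, out_sample = split_in_out_of_sample(bars, 2, 1)
print(len(in_sample), len(out_sample))  # 84 42  (42 / 126 = 33%)

in_sample, out_sample = split_in_out_of_sample(bars, 4, 1)
print(len(in_sample), len(out_sample))  # 101 25  (25 / 126 is about 20%)
```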
Performing an out of sample test in TradeStation (which is a powerful piece of backtesting software) is extremely easy. Begin by adding your strategies and setting the optimization parameters.
For this example I used three of the strategies that come with TradeStation and you can see the optimization parameters in the figure below:
Once that’s done and before we run the optimization we click on the Advanced Settings button to open the Advanced Settings window as shown below:
The Advanced Settings window is where you define the size of the out of sample period as a percent of the total historical data set. Note that you can set separate out of sample periods: one at the beginning and one at the end of the data block. Alternatively you could specify dates for defining the out of sample periods.
For simplicity I set aside the last 30% of data for out of sample testing.
After you run your optimization you can pull up the Strategy Performance Report to see how well the system performed overall and in the out of sample period. At the top of the report you can click the drop-down menu and select “All data”, “In sample” or “Out of sample” to filter the results for the selected data period.
You can see the drop-down menu in the report below. On the equity curve for the overall results I placed a red vertical line: the in-sample period is to the left of the line and the out of sample period is to the right.
Compare this equity curve with the first equity curve that appeared at the beginning of this article.
Unlike in our first test, the trading system continued to perform well even after the optimization period, and this gives us greater confidence in trading it going forward.
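If you want to run the same in-sample versus out of sample comparison outside TradeStation, a rough sketch follows. It is not how TradeStation computes its report: the function name `summarize` and the daily profit and loss figures are made up for illustration, and the last 30% of the data is held out to match the setting used above.

```python
def summarize(pnl):
    """Return net profit and maximum drawdown for a list of per-period P&L."""
    equity = 0.0
    peak = 0.0
    max_drawdown = 0.0
    for change in pnl:
        equity += change
        peak = max(peak, equity)
        # Drawdown is the distance from the highest equity seen so far.
        max_drawdown = max(max_drawdown, peak - equity)
    return {"net_profit": equity, "max_drawdown": max_drawdown}


# Hypothetical daily profit and loss produced by the optimized parameters.
daily_pnl = [100, -50, 200, -30, 80, 60, -40, 120, -20, 90]

# Hold out the last 30% of the data as the out-of-sample period.
n_out = len(daily_pnl) * 30 // 100
cut = len(daily_pnl) - n_out

report = {
    "All data": summarize(daily_pnl),
    "In sample": summarize(daily_pnl[:cut]),
    "Out of sample": summarize(daily_pnl[cut:]),
}
for period, stats in report.items():
    print(period, stats)
```

A system that is profitable in sample but loses money or suffers a much deeper drawdown out of sample is a candidate for the curve fitting problem described earlier.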
This approach works well to test the robustness of the system. The drawback is that the testing is performed over a limited period of time. During that testing period the market may have experienced different levels of price volatility or prices may have changed significantly, and the strategy parameters don’t reflect this evolution.
In practice a more accurate assessment of overall performance would look at the results of consecutive or rolling out of sample tests. This is called walk forward testing and this will be covered in an upcoming article.
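As a preview of the idea only, the rolling windows can be sketched as below. This is an assumption about one common way to lay out the windows, not a description of any particular software: the function name `walk_forward_windows` and the window lengths are hypothetical, and each window simply slides forward by the length of the out of sample block.

```python
def walk_forward_windows(n_bars, in_len, out_len):
    """Return a list of ((in_start, in_end), (out_start, out_end)) index ranges.

    End indexes are exclusive. Each window slides forward by out_len bars so
    that the out-of-sample blocks are consecutive and do not overlap.
    """
    windows = []
    start = 0
    while start + in_len + out_len <= n_bars:
        in_range = (start, start + in_len)
        out_range = (start + in_len, start + in_len + out_len)
        windows.append((in_range, out_range))
        start += out_len
    return windows


# Tiny example: 10 bars, optimize on 4 bars, then test on the next 2.
for in_range, out_range in walk_forward_windows(10, in_len=4, out_len=2):
    print("optimize on", in_range, "then test on", out_range)
```

With these numbers the sketch yields three windows, with out of sample blocks covering bars 4 to 5, 6 to 7 and 8 to 9.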
For more updates from Mark and the team at NetPicks, be sure to visit their trading tips blog at NetPicks.com.