
Backtesting vs Live Performance: Why Transparency Matters

April 4, 2026 · 7 min read

Almost any trading strategy can be made to look profitable in a backtest. With enough parameter optimization, nearly any rule set will produce a beautiful historical equity curve. The question that matters is whether those results survive contact with the live market. The gap between backtested performance and live performance is one of the most important concepts in algorithmic trading, and it is the reason why transparency from strategy vendors should be non-negotiable.

Why Backtests Always Look Better Than Live

There are structural reasons why backtested results overstate real-world performance. These are not bugs in backtesting software — they are fundamental limitations of simulating a complex adaptive system.

Hindsight Bias (Overfitting)

When you develop a strategy, you test it against historical data. If the strategy loses money, you adjust the parameters until it profits. Then you test again. Each adjustment is informed by knowledge of what the market did, which means the strategy is not predicting the future — it is memorizing the past. This process is called overfitting, and it is the primary reason backtests exaggerate performance.

A strategy that uses 15 parameters optimized against two years of data will almost certainly fit that data well. But the market in the next two years will be different. Correlations will shift, volatility regimes will change, and the parameter set that worked perfectly in hindsight will degrade in real-time.
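The selection effect described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real strategy: each "strategy" is pure Gaussian noise with zero expected profit, standing in for one parameter set, and the function name `best_of_random_strategies` and its defaults are made up for this example.

```python
import random


def best_of_random_strategies(n_strategies=200, n_days=250, seed=0):
    """Pick the best in-sample performer among strategies with no edge.

    Every strategy's daily P&L is drawn from a zero-mean normal
    distribution, so none of them has any real predictive power.
    Returns (in_sample_total, out_of_sample_total) for the winner.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_strategies):
        in_sample = sum(rng.gauss(0, 1) for _ in range(n_days))
        out_of_sample = sum(rng.gauss(0, 1) for _ in range(n_days))
        results.append((in_sample, out_of_sample))
    # Tuples compare on their first element, so this selects the
    # strategy with the highest in-sample total, as an optimizer would.
    return max(results)


in_sample, out_of_sample = best_of_random_strategies()
print(f"best in-sample total:     {in_sample:.1f}")
print(f"same strategy afterwards: {out_of_sample:.1f}")
```

The winner's in-sample total is large only because it was chosen for being large. Its out-of-sample total is an independent draw with an expected value of zero, which is the same mechanism that makes an optimized parameter set degrade in real time.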

Perfect Fill Assumptions

Backtesting engines typically assume that your orders fill at the exact price specified. In reality, market orders experience slippage (filling 1 to 2 ticks worse than expected), and limit orders may not fill at all if the market only touches your price briefly. NinjaTrader's backtester offers a slippage setting, but even with it enabled, the simulation does not capture the full complexity of real order book dynamics.

Over hundreds of trades, even 1 tick of unaccounted slippage per trade adds up. On ES, 1 tick is worth $12.50 per contract. At 200 trades per year, 1 tick per trade comes to $2,500 per contract in performance degradation that the backtest does not show.
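The arithmetic above is easy to rerun for other trade counts or slippage levels. A minimal sketch, assuming slippage is a fixed number of ticks per trade (the function name and parameters are illustrative, not part of any platform's API):

```python
ES_TICK_VALUE = 12.50  # dollars per tick per contract, as stated above


def annual_slippage_cost(trades_per_year, ticks_per_trade=1.0,
                         tick_value=ES_TICK_VALUE, contracts=1):
    """Dollar cost per year of slippage that the backtest did not model."""
    return trades_per_year * ticks_per_trade * tick_value * contracts


print(annual_slippage_cost(200))       # 2500.0
print(annual_slippage_cost(200, 2.0))  # 5000.0
```

Real slippage varies from trade to trade, so a fixed per-trade figure is only a rough lower bound for planning purposes.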

Cleaned Historical Data

Historical data has been cleaned and corrected. Bad ticks have been removed, data gaps have been filled, and exchange outages have been smoothed over. The live market has none of these corrections. Your strategy will encounter bad ticks, data gaps, and exchange halts in real time, and it needs to handle them gracefully without entering phantom trades or missing real ones.

Market Impact

In a backtest, your orders do not affect the market. In live trading, especially with larger position sizes, your order can move the market against you. This is less of a concern for retail traders on ES (where daily volume is massive), but it becomes significant on less liquid instruments or during low-volume periods.

How to Evaluate Backtest Results Honestly

Backtests are not useless. They are a necessary first step in strategy development. But you need to interpret them correctly. Here is what to look for:

Number of trades: a backtest with 50 trades is too small a sample to separate a real edge from luck. As a rule of thumb, look for at least 200 trades before placing any confidence in the results. More is better.
Time period: the backtest should cover multiple market regimes: trending markets, range-bound markets, high-volatility events (like 2020 or 2022), and low-volatility periods. A strategy that only works in bull markets is not robust.
Profit factor: total gross profit divided by total gross loss. A profit factor of 1.3 to 1.8 is realistic for a day trading strategy. A profit factor above 2.5 in a backtest almost certainly indicates overfitting.
Maximum drawdown: what is the largest peak-to-trough decline? Is it something you could psychologically and financially survive? If the backtest shows a $5,000 max drawdown, plan for roughly 1.5 times that in live trading: $7,500 or more.
Drawdown duration: how long does the strategy stay underwater? A drawdown that lasts 3 weeks is very different from one that lasts 3 months, even if the dollar amount is the same.
Parameter sensitivity: change each parameter by 10% to 20% and re-run the backtest. If the results collapse, the strategy is overfitted to the exact parameter values. A robust strategy produces positive results across a range of reasonable parameter values.
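Several of the metrics in the checklist above can be recomputed from a plain list of per-trade P&L values. The sketch below is one way to do it, with assumptions: the sample P&L values are made up, the function names are illustrative, and drawdown duration is counted in trades rather than calendar time because no timestamps are involved.

```python
def profit_factor(pnls):
    """Total gross profit divided by total gross loss."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    if gross_loss == 0:
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


def drawdown_stats(pnls):
    """Return (max drawdown in dollars, longest underwater run in trades).

    The equity curve is the running sum of trade P&L starting at zero.
    A trade counts as "underwater" while equity is below its prior peak.
    """
    equity = peak = worst = 0.0
    underwater = longest = 0
    for p in pnls:
        equity += p
        if equity >= peak:
            peak = equity
            underwater = 0
        else:
            underwater += 1
            longest = max(longest, underwater)
            worst = max(worst, peak - equity)
    return worst, longest


# Made-up per-trade P&L in dollars, for illustration only.
pnls = [500, -200, -300, 400, -250, 150, 300]
print(profit_factor(pnls))   # 1.8
print(drawdown_stats(pnls))  # (500.0, 5)
```

With timestamped trades, the same loop can track the date of the last equity peak instead of a trade counter, which gives drawdown duration in weeks or months as discussed above.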

The Out-of-Sample Test

The single most important validation step is the out-of-sample test. This means developing the strategy on one period of data (the "in-sample" period) and testing it on a completely separate period that was not used during development (the "out-of-sample" period).

For example, if you develop the strategy using 2022 to 2024 data, test it on 2025 data that you did not look at during development. If the strategy is profitable out of sample, you have stronger evidence that the edge is real rather than a product of overfitting.
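The split itself is simple to express in code. A minimal sketch, assuming the data is a list of `(date, close)` rows; the rows shown are placeholders rather than real prices, and the cutoff mirrors the 2022 to 2024 versus 2025 example above:

```python
from datetime import date


def split_in_out_of_sample(rows, cutoff):
    """Split dated rows into in-sample (before cutoff) and out-of-sample."""
    in_sample = [r for r in rows if r[0] < cutoff]
    out_of_sample = [r for r in rows if r[0] >= cutoff]
    return in_sample, out_of_sample


# Placeholder (date, close) rows, not real market data.
rows = [
    (date(2022, 3, 1), 100.0),
    (date(2023, 6, 15), 101.5),
    (date(2024, 12, 31), 102.0),
    (date(2025, 1, 2), 101.0),
    (date(2025, 7, 9), 103.5),
]
develop_on, validate_on = split_in_out_of_sample(rows, date(2025, 1, 1))
print(len(develop_on), len(validate_on))  # 3 2
```

The code is the easy part. The discipline is in not looking at the out-of-sample rows, and not re-tuning after seeing their results, since each re-tune turns that data into in-sample data.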

An even more rigorous approach is walk-forward analysis, where you repeatedly optimize on a rolling window of data and test on the subsequent period. This simulates the real-world experience of running a strategy forward in time.
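The rolling windows can be generated as index ranges over a bar series. This is a sketch of one common layout (fixed-size training window, stepped forward by the test size); the function name and the window sizes are assumptions for illustration, and the optimization and backtest steps are left to whatever engine you use.

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield (train, test) index ranges for a rolling walk-forward run.

    Each test range starts immediately after its training range, and the
    whole window then advances by test_size so test ranges never overlap.
    """
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


for train, test in walk_forward_windows(n_bars=10, train_size=4, test_size=2):
    print(list(train), "->", list(test))
# [0, 1, 2, 3] -> [4, 5]
# [2, 3, 4, 5] -> [6, 7]
# [4, 5, 6, 7] -> [8, 9]
```

In practice you would optimize parameters on each `train` range, run the strategy with those parameters on the matching `test` range, and judge the strategy on the stitched-together test results only.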

What Honest Vendors Do

The algo trading space is full of vendors who show only backtest results. Those results nearly always look impressive because a backtest can be optimized endlessly. An honest vendor does things differently:

Shows both backtest and live results — backtest results demonstrate the strategy's edge in historical conditions. Live or forward-test results demonstrate that the edge persists in real-time. Both are necessary.
Includes losing periods — every strategy has drawdowns. If a vendor only shows winning months, they are hiding the full picture. The losing periods are more informative than the winning ones because they show how the strategy behaves under stress.
Provides trade-level data — aggregate statistics like profit factor and win rate can be misleading. Trade-level data (entry time, exit time, direction, P&L) lets you verify the numbers and check for anomalies like suspiciously perfect fills or unrealistic trade frequency.
Discloses the backtest methodology — what bar type was used? What slippage was applied? What commission was assumed? What date range? Without this information, the backtest results are unverifiable.
Updates results regularly — performance that has not been updated in months is stale. Markets change, and results should be current.
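Trade-level data makes it possible to recompute a vendor's summary statistics yourself. The sketch below assumes a CSV with the columns `entry_time`, `exit_time`, `direction`, and `pnl`; that layout and the sample rows are hypothetical, so adjust the column names to whatever file you are actually given.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample file contents, for illustration only.
SAMPLE_CSV = """entry_time,exit_time,direction,pnl
2025-01-06T09:35:00,2025-01-06T10:10:00,long,250.00
2025-01-06T11:02:00,2025-01-06T11:20:00,short,-125.00
2025-01-07T09:41:00,2025-01-07T09:58:00,long,-187.50
2025-01-07T13:15:00,2025-01-07T14:05:00,short,312.50
"""


def summarize_trades(csv_text):
    """Recompute summary statistics and flag rows with impossible times."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    pnls = [float(r["pnl"]) for r in rows]
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    # A trade that exits at or before its entry time is a data anomaly.
    bad_time_rows = [
        i for i, r in enumerate(rows)
        if datetime.fromisoformat(r["exit_time"])
        <= datetime.fromisoformat(r["entry_time"])
    ]
    return {
        "trades": len(pnls),
        "net_profit": sum(pnls),
        "win_rate": sum(p > 0 for p in pnls) / len(pnls),
        "profit_factor": (gross_profit / gross_loss
                          if gross_loss else float("inf")),
        "bad_time_rows": bad_time_rows,
    }


print(summarize_trades(SAMPLE_CSV))
```

If the recomputed trade count, net profit, or profit factor differs from the published summary, that discrepancy is worth raising with the vendor before subscribing.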

How HuntersAlgo Approaches Transparency

We publish our backtest results on the Results page with full trade-level data that anyone can download and verify. Each result card shows the strategy name, date range, instrument, number of trades, net profit, profit factor, maximum drawdown, and a link to the trade-by-trade CSV file.

We label results honestly. Backtest results are labeled as backtests. We include the bar type, slippage settings, and commission assumptions so you can replicate the test in your own NinjaTrader environment. When a strategy underperforms or enters a drawdown, we do not hide it or remove the result.

This approach costs us some marketing appeal. A vendor who only shows cherry-picked winning periods will always look better in a side-by-side comparison. But we believe that traders who make informed decisions based on complete data become better long-term customers than traders who are sold an illusion and cancel when reality does not match the screenshot.

What You Should Demand from Any Vendor

Before subscribing to any algo trading product, ask these questions:

Can I see trade-level data, not just summary statistics?
Are the results from a backtest, forward test, or live account?
What slippage and commission assumptions were used?
How often are results updated?
Can I see the losing periods, not just the highlights?

If the vendor cannot or will not answer these questions, that tells you everything you need to know.
