How Portfoliowiser Backtests Work

When you view a strategy's performance on Portfoliowiser, you are looking at a historical simulation — a reconstruction of how that strategy would have performed if it had been followed mechanically over a defined historical period. This is called a backtest, and it is the primary tool the investment industry uses to evaluate whether a strategy's underlying logic has held up across different market environments.

Backtests are powerful and informative. They are also frequently misused and misunderstood. This article explains exactly how Portfoliowiser constructs its backtests, what the results represent, and — critically — what they do not tell you. Reading this before you make any decisions based on backtest data is not a legal formality. It will make you a meaningfully better interpreter of the results.

What a Backtest Is and Why It Matters

A backtest applies a strategy's rules to historical data and calculates what would have happened. If a strategy says "hold the top two momentum assets from a universe of six ETFs, rebalance monthly," the backtest applies that rule to every month in the historical data set, records the resulting returns, and aggregates them into performance metrics.
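The example rule in the paragraph above can be sketched for a single month as follows. This is a minimal illustration, not Portfoliowiser's engine: the six-ticker universe, the trailing-return figures, and the equal weighting of the two selected assets are all assumptions made for the example.

```python
def top_n_by_momentum(trailing_returns, n=2):
    """Return the n tickers with the highest trailing return, best first."""
    ranked = sorted(trailing_returns, key=trailing_returns.get, reverse=True)
    return ranked[:n]


def equal_weight(tickers):
    """Split the portfolio evenly across the selected tickers (an assumption)."""
    return {ticker: 1.0 / len(tickers) for ticker in tickers}


# Hypothetical trailing returns for a six-ETF universe (illustrative numbers only).
trailing = {
    "SPY": 0.08, "EFA": 0.03, "AGG": 0.01,
    "GLD": 0.11, "VNQ": -0.02, "DBC": 0.05,
}

selected = top_n_by_momentum(trailing, n=2)
weights = equal_weight(selected)
print(selected, weights)
```

A backtest repeats this selection at every month end in the data set and records the return earned by the chosen allocation over the following month.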

The value of a backtest is that it provides a window into how a strategy's logic responds to real market conditions that have already occurred — bull markets, bear markets, inflationary regimes, deflationary shocks, rate cycles, and geopolitical events. A strategy that has performed poorly across multiple historical stress periods has revealed a structural weakness. A strategy that has navigated varied conditions with consistent risk-adjusted returns has demonstrated a degree of robustness.

Backtests cannot tell you what will happen in the future. But they are the most rigorous tool available for stress-testing a strategy's logic before committing real capital.

How Portfoliowiser Runs Backtests

Monthly Rebalancing

All strategies on Portfoliowiser use monthly rebalancing as the standard cadence. At the end of each month, the strategy's signals are evaluated using data available as of that date, and the portfolio is rebalanced to the new target allocation.

Monthly rebalancing was chosen because it matches the cadence of the academic research underlying most tactical asset allocation strategies, it is practical for individual investors to implement, and it avoids the transaction cost and tax drag associated with daily or weekly rebalancing. Strategies that require daily rebalancing to work tend to be more fragile and less accessible to retail investors in practice.

Real ETF Price Data

Portfoliowiser backtests use actual historical price data for the ETFs in each strategy's universe. This is not synthetic or modeled data — it is the real total return (price plus dividends) for each ETF over its trading history.
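For readers who want the arithmetic behind "price plus dividends", a single-period total return can be written as below. This is the standard textbook formula shown as a sketch; the function name and the sample prices are illustrative, and it assumes the dividend is received within the period and not reinvested until the period ends.

```python
def total_return(previous_price, price, dividend=0.0):
    """Single-period total return: price change plus any dividend paid in the period."""
    return (price + dividend) / previous_price - 1.0


# Hypothetical ETF: price moves from 100 to 101 and pays a 1.00 dividend.
print(total_return(100.0, 101.0, dividend=1.0))  # about 0.02, versus 0.01 on price alone
```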

We source price data from established market data providers and maintain both daily and monthly price series. Daily price data is used for signals that require intra-month precision, such as simple moving average trend filters. Monthly price data is used for end-of-month signal calculation and return attribution.
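A simple moving average trend filter of the kind mentioned above could look like the sketch below. The 200-day window and the "last close above the average" rule are common choices assumed here for illustration; the article does not state which window or rule any specific strategy uses.

```python
def simple_moving_average(prices, window):
    """Average of the most recent `window` prices, or None if there is too little data."""
    if len(prices) < window:
        return None
    return sum(prices[-window:]) / window


def trend_is_up(daily_closes, window=200):
    """True when the latest close is above its simple moving average."""
    average = simple_moving_average(daily_closes, window)
    return average is not None and daily_closes[-1] > average


# Tiny hypothetical series with a 5-day window so the numbers are easy to check.
rising = [100.0, 101.0, 102.0, 103.0, 104.0]
falling = [104.0, 103.0, 102.0, 101.0, 100.0]
print(trend_is_up(rising, window=5), trend_is_up(falling, window=5))
```

Because the filter needs a full window of daily closes, it returns False rather than guessing when the history is shorter than the window.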

The use of real ETF data is important because it captures what actually happened in the market, including the specific behavior of instruments during stress periods. A strategy that uses SPY (the S&P 500 ETF) has its backtest anchored to what SPY actually did, including its roughly 55% peak-to-trough drawdown from late 2007 to early 2009 and its roughly 34% decline in early 2020.

Signal Calculation and Trade Execution

For each month in the backtest:

1. The strategy calculates its signals using data available as of the last trading day of the month.
2. The resulting target allocation is recorded.
3. The portfolio is assumed to rebalance to the new target allocation at the close of the last trading day of the month (or the open of the first trading day of the following month, depending on the strategy configuration).
4. The return for the following month is then calculated based on the new allocation.

This "signal at month end, execute at or immediately after month end" approach is the most common convention in tactical asset allocation (TAA) backtesting research and is consistent with how most individual investors can actually implement these strategies.
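The four steps listed above can be sketched as a loop in which each month's allocation is decided only from earlier months and then earns the next month's return. This is a simplified illustration under stated assumptions: momentum is measured as the compounded return over a hypothetical `lookback` window, the top `top_n` assets are held at equal weight, and costs are ignored. None of these parameters come from Portfoliowiser's actual configuration.

```python
def run_backtest(monthly_returns, lookback=3, top_n=2):
    """monthly_returns: list of {ticker: return} dicts, one per month, oldest first.

    Returns the portfolio's return for each month from index `lookback` onward.
    """
    portfolio_returns = []
    for t in range(lookback, len(monthly_returns)):
        # Step 1: signals use only months t-lookback .. t-1, i.e. data
        # that was known at the end of month t-1.
        window = monthly_returns[t - lookback:t]
        momentum = {}
        for ticker in window[0]:
            growth = 1.0
            for month in window:
                growth *= 1.0 + month[ticker]
            momentum[ticker] = growth - 1.0
        # Steps 2 and 3: record the target allocation and rebalance into it.
        held = sorted(momentum, key=momentum.get, reverse=True)[:top_n]
        # Step 4: the new allocation earns the following month's return.
        portfolio_returns.append(sum(monthly_returns[t][k] for k in held) / top_n)
    return portfolio_returns


# Hypothetical three-asset universe over four months.
months = [
    {"A": 0.02, "B": 0.01, "C": 0.00},
    {"A": 0.03, "B": -0.01, "C": 0.01},
    {"A": -0.05, "B": 0.02, "C": 0.01},
    {"A": 0.01, "B": 0.04, "C": 0.00},
]
result = run_backtest(months, lookback=2, top_n=1)
print(result)
```

In this toy data the loop holds A during the third month (after A led over the first two months) and C during the fourth, which shows how a momentum rule can rotate into an asset just before it stumbles.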

What Happens When an ETF Has Limited History

Some ETFs in Portfoliowiser's strategy library have launch dates that do not extend to the full backtest period. In these cases, the platform uses a clearly disclosed proxy — a related ETF or index fund with a longer history — for the period before the ETF began trading. Proxy usage is always documented in the strategy description, and the use of proxies is one reason to treat very early backtest periods with additional caution.
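One way to picture proxy substitution is as a splice of two return series, with each month tagged by its source so the proxy period stays visible. The sketch below is an assumption about how such a splice might be done, not a description of Portfoliowiser's internal process; the function name, the `"YYYY-MM"` key format, and the sample returns are hypothetical.

```python
def splice_with_proxy(etf_returns, proxy_returns):
    """Use proxy returns only for months before the ETF's first available month.

    Both inputs map 'YYYY-MM' strings to monthly returns. Each output value is
    a (return, source) pair so proxy months can be flagged in reports.
    """
    first_etf_month = min(etf_returns)
    spliced = {}
    for month in sorted(proxy_returns):
        if month < first_etf_month:
            spliced[month] = (proxy_returns[month], "proxy")
    for month in sorted(etf_returns):
        spliced[month] = (etf_returns[month], "etf")
    return spliced


# Hypothetical data: the ETF starts trading in 2005-01, the proxy goes back further.
etf = {"2005-01": 0.01, "2005-02": -0.02}
proxy = {"2004-11": 0.03, "2004-12": 0.00, "2005-01": 0.015}
history = splice_with_proxy(etf, proxy)
print(history)
```

Keeping the source label alongside each return makes it straightforward to report how much of a backtest rests on proxy data.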

What the Results Show

Portfoliowiser presents backtest results across several views designed to give a comprehensive picture of how a strategy has performed.

Equity Curve

The equity curve plots the growth of a hypothetical $10,000 investment over the backtest period. It shows the compound growth path of the strategy, including all peaks, troughs, and recovery periods. Reading an equity curve tells you not just the final return, but the journey — whether growth was steady or volatile, whether there were extended flat periods, and how quickly the strategy recovered from drawdowns.
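The equity curve is simply the monthly returns compounded from the starting value. A minimal sketch, using the $10,000 starting amount from the text and hypothetical monthly returns:

```python
def equity_curve(monthly_returns, start_value=10_000.0):
    """Compound a series of monthly returns into a series of portfolio values."""
    values = [start_value]
    for monthly_return in monthly_returns:
        values.append(values[-1] * (1.0 + monthly_return))
    return values


# Hypothetical returns: +10%, then -50%, then flat.
curve = equity_curve([0.10, -0.50, 0.0])
print([round(value, 2) for value in curve])
```

The sample also shows why the path matters: a 10% gain followed by a 50% loss leaves the portfolio at about $5,500, well below where it started.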

Drawdown Chart

The drawdown chart shows the percentage decline from each historical peak to the subsequent trough. Maximum drawdown — the single largest peak-to-trough decline in the backtest period — is one of the most important risk metrics for evaluating whether a strategy is suitable for your risk tolerance and investment timeline.

A strategy with a 15% maximum drawdown is fundamentally different in character from one with a 35% maximum drawdown, even if their long-term compound annual growth rates (CAGRs) are similar.
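The drawdown calculation, and the reason a deeper drawdown is so different in character, can be sketched as follows. The equity values are hypothetical; `gain_to_recover` is an illustrative helper showing the gain needed to return to the prior peak after a given loss.

```python
def drawdown_series(values):
    """Decline of each point from the running peak (0.0 at new highs)."""
    peak = values[0]
    series = []
    for value in values:
        peak = max(peak, value)
        series.append(value / peak - 1.0)
    return series


def max_drawdown(values):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    return min(drawdown_series(values))


def gain_to_recover(drawdown):
    """Gain needed to regain the prior peak after a loss of `drawdown` (positive fraction)."""
    return 1.0 / (1.0 - drawdown) - 1.0


values = [100.0, 120.0, 90.0, 110.0, 130.0, 104.0]  # hypothetical equity curve
print(max_drawdown(values))
print(gain_to_recover(0.15), gain_to_recover(0.35))
```

A 15% drawdown needs a gain of roughly 17.6% to recover, while a 35% drawdown needs roughly 53.8%, so the recovery burden grows faster than the loss itself.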

Monthly Return Heatmap

The monthly heatmap displays the return for every month in the backtest period in a color-coded grid — typically green for positive months and red for negative ones, with shading to indicate magnitude. The heatmap is one of the most useful tools for understanding the texture of a strategy's returns: how it behaves in specific market environments, whether losses tend to cluster in certain periods, and how it compares to a benchmark across individual months.
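Behind the heatmap is a year-by-month grid of returns. The sketch below shows one way to build that grid and assign a basic color; the `"YYYY-MM"` key format, the function names, and the choice to treat a zero return as green are assumptions for illustration, and the shading by magnitude described above is left out.

```python
def heatmap_grid(returns_by_month):
    """Arrange {'YYYY-MM': return} into {year: [12 cells]}, with None for missing months."""
    grid = {}
    for key, monthly_return in returns_by_month.items():
        year, month = key.split("-")
        row = grid.setdefault(int(year), [None] * 12)
        row[int(month) - 1] = monthly_return
    return grid


def cell_color(monthly_return):
    """Green for gains, red for losses, blank where there is no data."""
    if monthly_return is None:
        return "blank"
    return "green" if monthly_return >= 0 else "red"


sample = {"2020-01": 0.01, "2020-03": -0.02, "2021-12": 0.03}  # hypothetical returns
grid = heatmap_grid(sample)
print([cell_color(cell) for cell in grid[2020][:3]])
```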

Annual Return Table

The annual return table shows performance by calendar year, allowing direct comparison to benchmarks and to other strategies across specific historical periods. It reveals consistency (or lack thereof) in a strategy's return profile and highlights the years in which the strategy significantly over- or underperformed.
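Calendar-year figures come from compounding the monthly returns within each year, not from adding them. A minimal sketch with hypothetical data (the `"YYYY-MM"` key format is an assumption):

```python
def annual_returns(returns_by_month):
    """Compound {'YYYY-MM': return} into {year: calendar-year return}."""
    growth = {}
    for key in sorted(returns_by_month):
        year = int(key[:4])
        growth[year] = growth.get(year, 1.0) * (1.0 + returns_by_month[key])
    return {year: factor - 1.0 for year, factor in growth.items()}


sample = {"2020-01": 0.10, "2020-02": -0.10, "2021-01": 0.05}  # hypothetical returns
table = annual_returns(sample)
print(table)
```

In the sample, a +10% month followed by a -10% month compounds to about -1% for the year, which is why summing monthly figures would mislead.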

Trade Log

The trade log records every allocation change over the backtest period — which ETFs were bought, which were sold, and at what prices. This is a transparency feature that allows users to verify the strategy's behavior and understand its activity level. A strategy that traded frequently in certain years but rarely in others reveals its responsiveness to different market regimes.
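A trade log can be derived by comparing each month's target allocation with the previous one. The sketch below is a simplification: it records weight changes only, ignores the drift of weights between rebalances, and omits the execution prices that the platform's log includes. The dates, tickers, and weights are hypothetical.

```python
def trade_log(allocations, tolerance=1e-9):
    """allocations: list of (date, {ticker: target_weight}), oldest first.

    Returns (date, ticker, action, weight_change) for every change in target weight.
    """
    log = []
    previous = {}
    for date, weights in allocations:
        for ticker in sorted(set(previous) | set(weights)):
            change = weights.get(ticker, 0.0) - previous.get(ticker, 0.0)
            if abs(change) > tolerance:
                action = "BUY" if change > 0 else "SELL"
                log.append((date, ticker, action, change))
        previous = weights
    return log


allocations = [
    ("2020-01-31", {"SPY": 0.5, "GLD": 0.5}),
    ("2020-02-28", {"SPY": 0.5, "AGG": 0.5}),
]
log = trade_log(allocations)
for entry in log:
    print(entry)
```

Counting the entries per year gives a rough measure of the activity level discussed above.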

Critical Limitations Every User Must Understand

Backtest results are simulated historical data. They are not a record of actual trading, and they are subject to several well-documented limitations that investors must understand before drawing any conclusions.

Hindsight Bias

The most fundamental limitation of any backtest is that the strategy is being evaluated on data that already happened. The researcher who designed the strategy — or who selected its parameters — had the benefit of knowing how history unfolded when constructing the rules.

Even when great care is taken to avoid deliberately fitting a strategy to past data, the process of reviewing and refining a strategy against historical results introduces some degree of hindsight bias. A strategy that "works" in backtests may have been implicitly shaped by knowledge of how markets behaved during the test period.

Portfoliowiser addresses this by using well-established, academically grounded strategy designs with published research behind them, rather than novel strategies developed primarily through data mining. The underlying logic of momentum rotation, trend following, and macro regime models has been documented in peer-reviewed literature over decades, which provides some independent validation of the approach beyond a single backtest.

No Transaction Costs Included

Portfoliowiser backtests do not deduct brokerage commissions, bid-ask spreads, or slippage from results. In the current environment of zero-commission trading and tight ETF spreads, transaction costs for monthly-rebalancing strategies are relatively small — typically 0.01% to 0.05% per trade for liquid ETFs. However, they are not zero, and over a 20-year period they will reduce actual returns somewhat relative to the backtest figures.

For strategies with higher turnover, the impact of transaction costs is proportionally larger. Users should apply a modest reduction (perhaps 0.1% to 0.3% per year, depending on strategy turnover) when thinking about what backtest returns imply for real-world performance.
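To see what a reduction of 0.1% to 0.3% per year means in dollars, the sketch below compares the ending value of $10,000 with and without an annual cost drag. The 10% gross CAGR and the 20-year horizon are hypothetical inputs chosen for illustration, and the drag is modeled as a simple subtraction from the annual growth rate.

```python
def terminal_value(cagr, years, start_value=10_000.0):
    """Ending value of start_value compounded at `cagr` for `years` years."""
    return start_value * (1.0 + cagr) ** years


gross_cagr = 0.10  # hypothetical backtest CAGR
years = 20         # hypothetical horizon

gross = terminal_value(gross_cagr, years)
for drag in (0.001, 0.003):
    net = terminal_value(gross_cagr - drag, years)
    print(f"drag {drag:.1%}: net ${net:,.0f} vs gross ${gross:,.0f}")
```

Even a small annual drag compounds into a visible gap over two decades, which is why it is worth applying before comparing a backtest with expectations for a real account.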

Survivorship Bias

Survivorship bias refers to the tendency of historical databases to include only ETFs and funds that survived to the present, while excluding those that were closed or merged away — typically because they performed poorly. If a strategy's historical universe included ETFs that were subsequently delisted, the backtest may be missing some of the poorest performers from the period, which would cause it to overstate returns.

Portfoliowiser uses major, well-established ETFs with long track records and high liquidity. These instruments are not immune to survivorship bias at the underlying holdings level, but the ETFs themselves (SPY, AGG, GLD, and similar instruments) are very unlikely to be closed given their size and trading volume, which limits the survivorship bias concern relative to strategies built around smaller, more speculative funds.

Look-Ahead Bias

Look-ahead bias occurs when a backtest uses information that would not have been available at the time the trading decision was made. For example, if a backtest used a month's final closing price as the signal to rebalance and then also used that same price as the entry price, it would assume the ability to trade at a price that only became known at the market close.

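Portfoliowiser's backtesting engine uses end-of-month prices consistently, with the same price used for signal generation and execution within each period, following established conventions in TAA research. This convention never uses data from after the signal date, so the signal itself is free of look-ahead bias. It does, however, assume that a trade can be filled at the same close that produced the signal, which is the approximation described above. In practice an investor would trade near the close or at the next open, so actual fills will differ slightly from the backtest. The trade log can be reviewed to verify execution prices against the signal dates.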
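The effect of look-ahead bias is easiest to see in a deliberately exaggerated toy example. The rule below holds an asset only when a signal month had a positive return; with `lag=1` the signal comes from the prior month, while `lag=0` lets the rule peek at the very month it is trading. The rule, the `lag` parameter, and the returns are hypothetical and far cruder than any real strategy.

```python
def strategy_returns(asset_returns, lag):
    """Hold the asset in month t only if its return in month t - lag was positive.

    lag=1 uses information known before month t; lag=0 uses month t itself,
    which is look-ahead bias.
    """
    results = []
    for t in range(1, len(asset_returns)):
        invested = asset_returns[t - lag] > 0
        results.append(asset_returns[t] if invested else 0.0)
    return results


returns = [0.02, -0.03, 0.04, -0.01, 0.05]  # hypothetical monthly returns
biased = strategy_returns(returns, lag=0)
honest = strategy_returns(returns, lag=1)
print("with look-ahead:   ", biased)
print("without look-ahead:", honest)
```

The biased version never records a losing month because it already knows each month's outcome, while the honest version takes the losses a real investor would have taken.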

Past Performance Does Not Predict Future Results

This statement appears on nearly every investment document as a legal disclosure, and it is frequently ignored as a result. It deserves genuine consideration.

A strategy that produced a 12% CAGR over the past 15 years did so in a specific set of market conditions, such as long stretches of low or declining interest rates, globalization, particular patterns of economic growth, and specific monetary policy regimes. Some of those conditions will persist; others will not. The future will present new conditions that the backtest did not encounter.

The appropriate use of backtest data is not to project that the same return will continue. It is to understand the strategy's logic, verify that it has held up across a range of historical conditions, and make a judgment about whether that logic is likely to remain relevant in the environments you might face.

How to Interpret Backtest Results Responsibly

Focus on Risk-Adjusted Metrics

Absolute return numbers are less informative than risk-adjusted metrics. A strategy that returned 10% CAGR with a 12% maximum drawdown is a fundamentally different proposition from one that returned 12% CAGR with a 40% maximum drawdown. The Sharpe ratio (return in excess of the risk-free rate per unit of volatility) and the Calmar ratio (CAGR divided by maximum drawdown) provide more useful comparisons than raw return alone.
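The metrics named above can be computed from a monthly return series as sketched below. These are common textbook definitions, and the conventions chosen here (annualizing the Sharpe ratio by the square root of 12, using the sample standard deviation, defaulting the risk-free rate to zero) are assumptions; the platform's exact conventions may differ.

```python
import math


def cagr(monthly_returns):
    """Compound annual growth rate implied by a series of monthly returns."""
    growth = 1.0
    for monthly_return in monthly_returns:
        growth *= 1.0 + monthly_return
    years = len(monthly_returns) / 12.0
    return growth ** (1.0 / years) - 1.0


def sharpe_ratio(monthly_returns, annual_risk_free=0.0):
    """Annualized mean excess return divided by annualized volatility."""
    monthly_rf = (1.0 + annual_risk_free) ** (1.0 / 12.0) - 1.0
    excess = [r - monthly_rf for r in monthly_returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(12.0)


def calmar_ratio(cagr_value, max_drawdown):
    """CAGR divided by the size of the maximum drawdown."""
    return cagr_value / abs(max_drawdown)


# The two strategies from the paragraph above.
print(calmar_ratio(0.10, -0.12))  # about 0.83
print(calmar_ratio(0.12, -0.40))  # about 0.30
```

On the Calmar ratio, the strategy with the lower CAGR scores nearly three times higher, which is the comparison the raw return figures hide.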

Examine Multiple Market Environments

Rather than evaluating a backtest by its overall statistics, examine performance across specific periods: 2008-2009 (credit crisis), 2011 (European debt crisis), 2018 (Q4 drawdown), 2020 (pandemic), and 2022 (rate shock). A strategy that holds up reasonably across all of these distinct environments has demonstrated greater robustness than one that excelled in some and collapsed in others.
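Examining specific periods amounts to compounding the monthly returns that fall inside each window. In the sketch below, the month boundaries assigned to each episode are assumptions made for illustration (whole calendar years, except the fourth quarter of 2018); the article names the episodes but not their exact dates, and the sample returns are hypothetical.

```python
# Assumed month ranges for the episodes named above, as 'YYYY-MM' strings.
STRESS_PERIODS = {
    "2008-2009 credit crisis": ("2008-01", "2009-12"),
    "2011 European debt crisis": ("2011-01", "2011-12"),
    "2018 Q4 drawdown": ("2018-10", "2018-12"),
    "2020 pandemic": ("2020-01", "2020-12"),
    "2022 rate shock": ("2022-01", "2022-12"),
}


def period_return(returns_by_month, start, end):
    """Compounded return over the months from start to end, inclusive."""
    growth = 1.0
    for key in sorted(returns_by_month):
        if start <= key <= end:
            growth *= 1.0 + returns_by_month[key]
    return growth - 1.0


sample = {  # hypothetical monthly returns around late 2018
    "2018-09": 0.01, "2018-10": -0.07, "2018-11": 0.02,
    "2018-12": -0.09, "2019-01": 0.08,
}
start, end = STRESS_PERIODS["2018 Q4 drawdown"]
q4_2018 = period_return(sample, start, end)
print(round(q4_2018, 4))
```

Running the same function for a strategy and its benchmark across every window gives a compact view of where the strategy held up and where it did not.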

Compare to a Relevant Benchmark

Every backtest result should be contextualized against a benchmark. If a strategy returned 9% CAGR, that result means something very different if the buy-and-hold S&P 500 returned 12% over the same period (the strategy lagged badly) versus 6% (it added meaningful value). Portfoliowiser displays benchmark comparison alongside strategy results for this reason.

Stress-Test Your Assumptions

Consider what happens to the backtest conclusions if the future looks different from the past. If long-term expected returns are lower than historical averages, what does that imply for the strategy's absolute performance? If volatility is higher, what might that do to drawdowns? Developing a range of scenarios is more useful than anchoring to the central backtest estimate.

The Difference Between Backtested and Live Performance

A backtest represents perfect implementation — every trade executed at exactly the right price, on exactly the right day, with no emotional interference. Live performance is different.

In live trading, prices may gap against you before you can execute. You may be traveling when a monthly rebalance is due. Unexpected events — platform outages, personal emergencies — can disrupt the discipline that a backtest assumes is perfect.

The gap between backtested and live performance tends to be driven less by market factors and more by implementation consistency. Investors who follow their strategy mechanically, rebalancing on schedule regardless of current news flow, tend to achieve results closer to the backtest. Those who override the strategy based on current sentiment — skipping a rebalance because the market feels dangerous, or adding exposure because recent performance has been good — tend to diverge significantly.

Portfoliowiser's monthly heatmaps and trade logs are designed to support disciplined implementation: by making the expected trades transparent and verifiable, they reduce the temptation to second-guess the system at individual decision points.

Why Transparency Matters

Portfoliowiser is built on the principle that every parameter of every strategy should be visible. There are no black boxes. The strategy descriptions explain what signals are used, which ETFs are in the universe, how allocation decisions are made, and what the rebalancing rules are.

This transparency serves two purposes. First, it allows users to make informed judgments about whether the strategy's underlying logic makes sense to them — not just whether the historical numbers look good. Second, it allows users to implement the strategy accurately in their own accounts, with confidence that what they are doing matches what the backtest assumes.

A strategy you can understand and explain is one you are more likely to follow consistently through difficult periods. That consistency is the single most important factor in translating backtest results into real-world outcomes.

Explore Strategies with Full Backtest Data

Portfoliowiser gives you access to a library of tactical allocation strategies, each with complete backtest history, equity curves, drawdown charts, monthly heatmaps, and trade logs. You can view the full backtest data, compare strategies side by side, and explore blended portfolios — all before committing any capital.

Explore the strategy library on Portfoliowiser and see the full backtest data for strategies that match your objectives. If you have questions about what the results mean or how to apply them to your situation, the platform's AI Assistant can walk you through the methodology and help you interpret the outputs in context.

Understanding what backtests can and cannot tell you is the foundation of making sound tactical allocation decisions. The results are a starting point for analysis — not an ending point.

Summary

Portfoliowiser backtests apply strategy rules to real historical ETF price data, with monthly rebalancing, to calculate what a strategy would have returned over a defined historical period. The results are presented as equity curves, drawdown charts, monthly heatmaps, annual return tables, and trade logs — giving users a comprehensive view of both performance and risk.

The limitations are real and must be taken seriously: hindsight bias, no transaction costs included, survivorship bias, look-ahead bias risk, and the fundamental reality that past performance does not predict future results. These limitations do not make backtests useless — they make them a tool to be used carefully, as one input among several in forming a view about whether a strategy is worth implementing.

The appropriate posture is informed confidence: understanding the strategy's logic, verifying its robustness across varied historical environments, applying reasonable assumptions about the gap between backtest and live performance, and committing to consistent implementation once a decision is made.