Backtest Like a Quant: A No‑Fluff VectorBT Walkthrough

Some links in this article are affiliate links. We may earn a small
commission if you make a purchase or fund an account through these
links — at no extra cost to you. This helps fund our independent
research and testing.

For trading and crypto content specifically: information is for
educational purposes only and is NOT investment advice. Past
performance does not predict future results. Trading and crypto
involve substantial risk of loss including total loss of capital.
Crypto specifically is highly volatile and may lose 100% of value;
EU readers note MiCA regulation; US readers note rules vary by
state. Do your own research, never invest more than you can afford
to lose.

Why VectorBT Still Needs a Pragmatic Lens

Most articles treat vectorbt as a magic bullet that turns any CSV into alpha. In production, the magic wears off the moment you hit data‑quality issues, memory limits, or latency constraints. This tutorial embraces vectorbt’s strengths while pointing out where the library’s abstractions can mislead a seasoned practitioner.

If you’re looking for a glossy UI demo, stop reading now. If you want concrete, reproducible code you can ship to a live environment, keep going.

Installation & Environment Setup

VectorBT lives on top of pandas, numpy, and numba. A clean virtual environment prevents version clashes that are hard to debug later.

python -m venv vbt_env
source vbt_env/bin/activate
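pip install "vectorbt[full]" pandas==2.2.0 numpy==1.26.4 numba==0.58.1

Tip: pin pandas==2.2.0 – newer releases occasionally break the internal indexing logic vectorbt relies on. Keep NumPy on the 1.26 line next to numba==0.58.1, because Numba only gained NumPy 2.0 support in release 0.60, and quote "vectorbt[full]" so shells such as zsh do not try to expand the brackets.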
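
Before running anything, it can help to confirm which versions actually landed in the environment. The snippet below is a minimal sketch that uses only the standard library; the helper name installed_versions is hypothetical and not part of vectorbt.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["vectorbt", "pandas", "numpy", "numba"]).items():
        print(f"{name:10s} {ver or 'NOT INSTALLED'}")
```

Logging this output at service start-up makes version drift between machines much easier to spot.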

Core Data Structures You Must Master

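VectorBT's workflow rests on three building blocks that replace most custom code: indicators (built with vbt.IndicatorFactory or shipped ready-made, such as vbt.MA), signals (Boolean pandas objects handled through the .vbt.signals accessor), and vbt.Portfolio. Understanding their contract is more valuable than memorising every method.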

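Indicators (vbt.IndicatorFactory, vbt.MA) – Vectorized Technicals

An indicator takes a pd.Series or DataFrame and returns an object whose outputs (for example the .ma attribute of vbt.MA) share the input's index. Because the computation is vectorized, passing a list of parameter values computes dozens of variants in a single call without row-by-row loops.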
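
To make "dozens of variants in a single line" concrete, here is a plain-pandas stand-in, not vectorbt's own implementation. The helper sma_grid and the synthetic prices are assumptions for illustration. Note that the comprehension iterates over parameters, never over rows.

```python
import numpy as np
import pandas as pd


def sma_grid(close, windows):
    """Return a DataFrame with one simple-moving-average column per window."""
    return pd.concat(
        {f"sma_{w}": close.rolling(w).mean() for w in windows},
        axis=1,
    )


# Synthetic prices 1.0, 2.0, ..., 10.0 (illustrative data, not market data)
close = pd.Series(np.arange(1.0, 11.0), name="close")
grid = sma_grid(close, windows=[2, 3, 5])
print(grid.tail(3))
```

In vectorbt itself, the comparable move is to pass a list of windows to vbt.MA.run, which should return one output column per window; check the behaviour against the release you have installed.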

Signals (the .vbt.signals accessor) – Binary Masks, Not Trade Orders

A signal is a Boolean Series or DataFrame in which True marks a bar where a rule fires; entries and exits live in separate masks. The power comes from chaining masks with element-wise logical operators (&, |, ~) to enforce multi‑factor rules.
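
A short sketch of that chaining in plain pandas follows. The two factors and the volume threshold of 200 are purely illustrative assumptions, not a recommended rule.

```python
import pandas as pd

close = pd.Series([10.0, 11.0, 12.0, 11.5, 13.0, 12.0])
volume = pd.Series([100, 300, 250, 90, 400, 80])

# Factor 1: price closed higher than on the previous bar
price_up = close > close.shift(1)
# Factor 2: volume above a fixed, purely illustrative threshold
high_volume = volume > 200

# Enter only when both factors agree; exit when either one fails
entries = price_up & high_volume
exits = ~price_up | ~high_volume

print(entries.tolist())
```

Wrap each comparison in parentheses when you inline them, because & and | bind more tightly than > and < in Python.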

vbt.Portfolio – The Backtest Engine

The portfolio takes entry/exit signals, applies commission and slippage models, and exposes a full set of performance metrics. It also caches intermediate calculations, which can be a memory hog if you’re not careful.
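
To build intuition for what a signal-driven engine does with fees and slippage, here is a deliberately loop-based toy simulator. It is not vectorbt's engine and ignores position sizing, partial fills, and shorting; the function name and the all-in/all-out rule are assumptions made for the sketch.

```python
import pandas as pd


def simulate_long_only(close, entries, exits, init_cash=10_000.0,
                       fees=0.001, slippage=0.0005):
    """Toy all-in/all-out simulator returning the equity curve as a Series."""
    cash, units = init_cash, 0.0
    equity = []
    for price, enter, leave in zip(close, entries, exits):
        if units == 0.0 and enter:
            fill = price * (1 + slippage)        # buy slightly above the quote
            units = cash / (fill * (1 + fees))   # fee is paid out of cash
            cash = 0.0
        elif units > 0.0 and leave:
            fill = price * (1 - slippage)        # sell slightly below the quote
            cash = units * fill * (1 - fees)
            units = 0.0
        equity.append(cash + units * price)      # mark to market at the close
    return pd.Series(equity, index=close.index)


close = pd.Series([100.0, 110.0, 121.0])
entries = pd.Series([True, False, False])
exits = pd.Series([False, False, True])

free = simulate_long_only(close, entries, exits, fees=0.0, slippage=0.0)
costly = simulate_long_only(close, entries, exits)
print(free.iloc[-1], costly.iloc[-1])
```

Comparing the frictionless and the costly curve side by side is a quick sanity check that your cost assumptions move the result in the direction you expect.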

A Minimal Viable Strategy – Step by Step

Below is a complete, production‑ready example that you can copy‑paste into a script, schedule with cron, and monitor with your existing logging stack.

1. Data Ingestion

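import pandas as pd
import vectorbt as vbt
import yfinance as yf

# Pull 5 years of daily OHLC for SPY (auto_adjust=True returns split/dividend-adjusted prices)
price = yf.download('SPY', period='5y', interval='1d', auto_adjust=True)
# Recent yfinance releases return (field, ticker) MultiIndex columns; flatten to field names
if isinstance(price.columns, pd.MultiIndex):
    price.columns = price.columns.get_level_values(0)
price = price[['Open', 'High', 'Low', 'Close']].dropna()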

Pro tip: store the raw download as Parquet in a version‑controlled data lake and load it with pd.read_parquet for faster I/O in production.

2. Build Entry & Exit Signals

close = price['Close']
# 20-day and 50-day simple moving averages
sma20 = vbt.MA.run(close, 20)
sma50 = vbt.MA.run(close, 50)

# Long when SMA20 crosses above SMA50, exit on the opposite cross
entries = sma20.ma_crossed_above(sma50)
exits = sma20.ma_crossed_below(sma50)

Notice the lack of for loops – vectorbt pushes the work into NumPy and Numba‑compiled routines, giving you near‑native speed with Python syntax.
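
One subtlety is worth spelling out: a crossover is an event, whereas fast > slow is a state that stays True on every bar the fast average sits above the slow one. The plain-pandas sketch below, on made-up numbers, shows how to derive the event from the state. It treats a series that starts above as a cross on the first bar, which is a simplifying assumption.

```python
import pandas as pd

fast = pd.Series([1.0, 2.0, 3.0, 2.0, 1.0, 2.5, 3.0])
slow = pd.Series([2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0])

above = fast > slow                      # state: True on every bar fast is above slow
prev_above = above.shift(1, fill_value=False)

crossed_above = above & ~prev_above      # event: first bar of each "above" run
crossed_below = ~above & prev_above      # event: first bar after an "above" run ends

print(crossed_above.tolist())
print(crossed_below.tolist())
```

With an engine that ignores repeated entries while a position is open, both forms can produce the same trades, but the event form keeps signal counts and plots honest.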

3. Run the Backtest

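portfolio = vbt.Portfolio.from_signals(
    price['Close'],
    entries, exits,
    init_cash=10_000,
    fees=0.001,          # 0.1% of order value, charged on every buy and sell
    slippage=0.0005,     # 5 bps adverse price adjustment per fill
    freq='1D'            # bar frequency, needed to annualize Sharpe/Sortino
)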

# Quick performance snapshot
print(portfolio.stats())

The stats() call returns a pandas Series that includes the Sharpe ratio, Sortino ratio, max drawdown, and many other metrics, all computed on the fly without extra loops.
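
It is worth knowing how to reproduce two of those headline numbers by hand, if only to sanity-check a report. The sketch below assumes 252 trading periods per year and a zero risk-free rate; vectorbt's own conventions (sign of drawdown, percent versus fraction, annualization) may differ, so treat it as a cross-check rather than a replica.

```python
import numpy as np
import pandas as pd


def sharpe_ratio(equity, periods_per_year=252):
    """Annualized Sharpe ratio of an equity curve, risk-free rate assumed zero."""
    returns = equity.pct_change().dropna()
    return np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a negative fraction of the peak."""
    running_peak = equity.cummax()
    return (equity / running_peak - 1.0).min()


# Made-up equity curve for illustration only
equity = pd.Series([100.0, 110.0, 99.0, 120.0, 90.0])
print(f"max drawdown: {max_drawdown(equity):.2%}")
print(f"sharpe: {sharpe_ratio(equity):.2f}")
```

If your hand-rolled figure and the library's disagree by more than rounding, the usual suspects are the annualization factor and whether returns are simple or logarithmic.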

Common Pitfalls & Performance Tweaks

Even a simple strategy can explode memory usage if you leave default settings untouched.

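  • Data Types: Cast price series to np.float32 before feeding them to vectorbt. Going from 64‑bit to 32‑bit halves the memory of the input arrays, although parts of the library may upcast to float64 internally, so measure the effect before relying on it.
  • Caching: VectorBT caches the results of many intermediate calculations. In a long‑running service that re‑runs the same script many times, turn caching off through vbt.settings (in the 0.2x open-source releases the switch is vbt.settings.caching['enabled'] = False; check the settings reference for your version) and drop references to old portfolio objects so their memory can be reclaimed.
  • Look‑ahead Bias: Always verify that your signals are generated using only historical data. A common mistake is a negative shift such as shift(-1), which pulls tomorrow’s value into today’s row, or filling an order at the same close that produced the signal. Shifting the entry and exit masks forward with shift(1) delays execution by one bar and removes that peek.
  • Parallelism: For multi‑asset portfolios, first pass a wide DataFrame with one column per asset so a single call covers every symbol. The Numba-compiled core is fast but generally runs on one core, so to saturate the CPU you still need to split assets into chunks and run them in separate processes, for example with concurrent.futures or a Dask cluster.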
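
Two of the points above are easy to verify on toy data. The sketch below, in plain pandas and NumPy with made-up prices, contrasts a safe forward shift with a leaky negative shift and then measures the float32 saving on a raw array.

```python
import numpy as np
import pandas as pd

close = pd.Series([100.0, 101.0, 99.0, 102.0, 104.0])

# Signal computed from information available at the close of each bar
raw_entries = close > close.shift(1)

# Safe: act one bar later, so the order only uses already-known data
safe_entries = raw_entries.shift(1, fill_value=False)

# Biased: a negative shift drags the NEXT bar's signal into today's row
leaky_entries = raw_entries.shift(-1, fill_value=False)

# Down-casting halves the memory of the raw price buffer itself
prices64 = np.random.default_rng(0).random(1_000_000)
prices32 = prices64.astype(np.float32)
print(prices64.nbytes, prices32.nbytes)
```

A cheap regression test for look-ahead is to truncate the input by one bar and confirm that no earlier signal changes.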

From Notebook to Production Service

In the field, I moved this script into a Flask micro‑service that receives ticker symbols via HTTP, runs the backtest, and returns a JSON payload with the Sharpe ratio and equity curve. The key steps were:

  1. Pre‑load the data cache at service start‑up – avoid pulling from Yahoo Finance on every request.
  2. Serialize the portfolio object with pickle only after you disable caching to keep the payload lean.
  3. Expose the performance metrics via a REST endpoint; downstream systems (e.g., a monitoring dashboard) pull them every 15 minutes.
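
The request-handling core of such a service can be sketched without any web framework. Everything below is hypothetical: the cache layout, the function names, and the stand-in backtest are assumptions for illustration, and in a Flask app the handler would simply be called from a route function.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory cache: ticker -> list of closing prices,
# filled once at start-up instead of on every request.
PriceCache = Dict[str, List[float]]
BacktestFn = Callable[[List[float]], Tuple[float, List[float]]]


def handle_backtest_request(ticker: str, cache: PriceCache,
                            run_backtest: BacktestFn) -> Tuple[int, str]:
    """Return (HTTP status code, JSON body) for one backtest request."""
    prices = cache.get(ticker.upper())
    if prices is None:
        return 404, json.dumps({"error": f"unknown ticker: {ticker}"})
    sharpe, equity_curve = run_backtest(prices)
    body = {"ticker": ticker.upper(), "sharpe": sharpe, "equity_curve": equity_curve}
    return 200, json.dumps(body)


def fake_backtest(prices: List[float]) -> Tuple[float, List[float]]:
    """Stand-in for the real backtest: buy-and-hold equity, placeholder Sharpe."""
    equity = [10_000.0 * p / prices[0] for p in prices]
    return 0.0, equity


cache = {"SPY": [100.0, 101.0, 102.0]}          # pre-loaded at start-up
status, payload = handle_backtest_request("spy", cache, fake_backtest)
print(status, payload)
```

Injecting the backtest as a callable keeps the HTTP layer testable without market data, and returning plain lists rather than a pickled portfolio keeps the JSON payload small.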

This architecture proved robust on a 4‑core VM handling ~200 requests per hour without exceeding 1 GB RAM.

Conclusion – Your Next Move

VectorBT is a powerful toolbox, but it’s not a replacement for disciplined data engineering. Use the library to shave days off indicator development, double‑check every signal for look‑ahead bias, and keep an eye on memory when you scale to dozens of assets.

If you’ve built a strategy that survived a week of live paper‑trading, consider expanding the backtest horizon, adding transaction‑cost models, and finally integrating the script into your automated pipeline.

Ready to try it yourself? Clone the repo from GitHub, run the script, and let the numbers speak. Drop a comment below with your results – data‑driven discussion beats hype any day.

