Quick start

Minimal flow from zero → data → backtest. Run from the project root.

See also: Database setup and Backtest & DSL guide.

1) Install requirements

pip install -r requirements.txt

2) Set up environment variables

Copy .env.example to a new file named .env in the project root, then set your Polygon credentials.

Required keys: set POLYGON_API_KEY, POLYGON_S3_ACCESS_KEY_ID, and POLYGON_S3_SECRET_ACCESS_KEY.
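A quick pre-flight check can catch a missing key before the download scripts run. The sketch below is illustrative only: the helper name missing_polygon_keys is hypothetical and not part of the project, and it assumes the variables are already present in the process environment (for example, loaded from .env).

```python
import os

# Key names taken from the step above.
REQUIRED_KEYS = (
    "POLYGON_API_KEY",
    "POLYGON_S3_ACCESS_KEY_ID",
    "POLYGON_S3_SECRET_ACCESS_KEY",
)


def missing_polygon_keys(env=None):
    """Return the required keys that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Hypothetical usage: report what is missing in the current environment.
missing = missing_polygon_keys()
print("missing keys:", ", ".join(missing) if missing else "none")
```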

3) Get tickers + adjustments

Build a point-in-time ticker universe and download split/dividend adjustments. The adjustment download is the script's own internal Step 4 and runs by default.

python build_historical_tickers.py --start-date 2018-01-01 --end-date 2026-02-08

4) Download flat files (build data)

Download Polygon/Massive S3 flat files, stream 1m → 5m + daily, and build derived layers. Requires the S3 env vars in step 2.

python download_flatfiles.py --start-date 2018-01-01 --end-date 2026-02-08

Order: run tickers/adjustments first, then flat files.

5) Run a backtest

Use any strategy file (.json, .yaml, or .py).

python run_backtest.py <strategy_file>

Example:

python run_backtest.py strategies/short_touch.py

Extra columns in trades.xlsx: add the optional strategy.attributes list to the strategy file (values are computed on the entry bar). See Backtest & DSL: Strategy attributes.

Daily updates (schedule)

Recommended: Monday–Friday, tickers/adjustments first, then flat files. Example: 2:00 AM ET and 5:00 AM ET. On Linux/macOS, cron uses the machine’s local clock, so shift the hours if the host is not on ET. On Windows, use Task Scheduler and set the trigger to your desired wall-clock time (ET or local).

# 2:00 AM ET: tickers + adjustments; 5:00 AM ET: flat files (adjust hours if server is not ET)
0 2 * * 1-5 cd /path/to/polygon-scanner && python build_historical_tickers.py
0 5 * * 1-5 cd /path/to/polygon-scanner && python download_flatfiles.py --end-date $(date +\%Y-\%m-\%d)

On Windows, create two scheduled tasks (tickers at 2:00, flat files at 5:00; adjust for ET vs. your PC clock).

Open Task Scheduler → Create Task… (not “Create Basic Task”) so you can set weekday-only triggers.
General: Name e.g. Kwants — build_historical_tickers. Optional: enable “Run whether user is logged on or not” for an unattended machine.
Triggers: New → Weekly → time 2:00:00 AM → recur every 1 weeks on Monday, Tuesday, Wednesday, Thursday, Friday.
Actions: New → Start a program → Program/script: full path to python.exe (run "where python" in a terminal to find it) → Add arguments: build_historical_tickers.py → Start in: your repo folder, e.g. C:\path\to\polygon-scanner-faster.

Second task for flat files (5:00 AM weekdays): same pattern, but the action must set the working directory and pass today’s calendar date as --end-date in yyyy-MM-dd format. The easiest way is to launch the script through PowerShell:

Program/script: powershell.exe
Add arguments: -NoProfile -ExecutionPolicy Bypass -Command "Set-Location 'C:\path\to\polygon-scanner-faster'; python download_flatfiles.py --end-date (Get-Date -Format 'yyyy-MM-dd')"

Ensure .env (API and S3 keys) is in the project root so those runs load the same credentials as your interactive shell.

Note: if your server timezone isn’t ET, convert the schedule to your local time.

More detail: docs/database-setup-guide.html and docs/backtest-and-dsl-guide.html.

Database setup overview

The “database” is a directory tree under data/. There is no separate DB server: everything is file-based. You need (1) a ticker universe with date ranges, (2) split/dividend adjustments, and (3) bar data (5m + daily) and derived layers. All of it is stored as Parquet (and a little JSON) for fast, reproducible scans and backtests.

Paths are defined in config.py. Key roots: data/polygon/raw/ (tickers, adjustments, flat-file temp, m5_by_date, d_by_date) and data/polygon/processed/ (derived: intraday_daily, daily_features).

Tickers: build and update

The ticker list defines which symbols exist on which dates (point-in-time correct). It is used to filter the scan universe and to avoid survivorship bias. You build it once for history, then update it incrementally for new trading days.

What it does

Downloads daily ticker snapshots from the Polygon/Massive API for each trading day in the range.
Builds a master list with ticker, start_date, end_date, type (e.g. CS, ADRC), active, etc. Ticker recycling (same symbol reused after delisting) is represented as separate date ranges.
Merges duplicate entries where appropriate (e.g. consecutive listing periods).
Optionally downloads adjustments (see below) and runs comparison/retry steps.

Output: data/tickers_historical.parquet (master list) and per-day files in data/polygon/raw/tickers/ (e.g. YYYY-MM-DD.parquet).
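To illustrate how separate date ranges keep a recycled symbol point-in-time correct, here is a minimal sketch. The rows and the helper universe_on are hypothetical; it assumes end_date is empty (None) for a listing that is still active and that both ends of a range are inclusive, which may not match the real master list exactly.

```python
from datetime import date

# Hypothetical rows in the shape described above: (ticker, start_date, end_date).
# end_date=None marks a listing that is still active (an assumption).
MASTER = [
    ("ABC", date(2018, 1, 2), date(2019, 6, 28)),  # original listing
    ("ABC", date(2021, 3, 1), None),               # same symbol reused later
    ("XYZ", date(2018, 1, 2), None),
]


def universe_on(day, rows=MASTER):
    """Return tickers whose listing range contains `day` (both ends inclusive)."""
    return sorted(
        ticker
        for ticker, start, end in rows
        if start <= day and (end is None or day <= end)
    )
```

With these rows, ABC is in the universe in 2019 and 2022 but not in 2020, when no entity was trading under that symbol.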

How to run

python build_historical_tickers.py [--start-date YYYY-MM-DD] [--end-date YYYY-MM-DD]

By default the script runs in incremental mode: it only downloads new trading days and appends to the master list. For an initial build, pass a range; for daily updates, you can run with no args (it will use the last trading day) or pass --end-date.

Flag	Description
`--start-date`	Start date for initial bulk download (YYYY-MM-DD).
`--end-date`	End date. Defaults to last trading day.
`--skip-download`	Skip downloading ticker lists; only build master from existing files.
`--skip-build`	Only download ticker lists; do not build master.
`--skip-adjustments`	Do not download split/dividend adjustments (Step 4).
`--skip-comparisons`	Skip comparison and retry steps (5–7).
`--no-incremental`	Full rebuild from scratch (ignore existing master).
`--force-rebuild`	Force rebuild of master list from existing daily files.

The script has seven steps: (1) download ticker lists, (2) build master list, (3) merge duplicates, (4) download adjustments, (5) compare tickers vs API, (6) compare adjustments coverage, (7) retry missing adjustments. Use the flags above to skip steps when needed.

Split and dividend adjustments

Adjustments are corporate actions (splits, dividends) used to compute adjusted OHLC so that backtests and scans are split/dividend correct. They are stored per ticker and loaded when processing flat-file bar data.

What they do

Splits — The API returns split events; the code validates them (e.g. price-ratio checks) and marks “fake” splits so they are not applied. Valid splits are used to build an adjustment factor so that historical bars are comparable across the split.
Dividends — Used to adjust closing prices (e.g. for dividend-adjusted close).

When you run the flat-file downloader, it loads all adjustments from data/polygon/raw/adjustments/, filters by ticker entity date ranges (for recycled tickers), and applies them to produce adj_factor and adjusted OHLC in the derived data.
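As a rough illustration of what an adjustment factor does, the sketch below applies a common backward-adjustment convention for splits only. It is an assumption, not the project's code: the real adj_factor also involves fake-split filtering, dividends, and entity date ranges, and the function name and the (ex_date, ratio) layout are hypothetical.

```python
from datetime import date


def split_adj_factor(bar_date, splits):
    """Backward-adjustment factor for a bar on `bar_date`.

    `splits` is a list of (ex_date, ratio), where ratio is new shares per
    old share (4.0 for a 4-for-1 split). Bars dated before an ex-date are
    scaled by 1/ratio; bars on or after it are left unchanged.
    """
    factor = 1.0
    for ex_date, ratio in splits:
        if bar_date < ex_date:
            factor /= ratio
    return factor


splits = [(date(2020, 8, 31), 4.0)]  # hypothetical 4-for-1 split
raw_close = 400.0
adj_close = raw_close * split_adj_factor(date(2020, 8, 28), splits)
```

A raw close of 400 before the split becomes 100 after adjustment, so it is comparable with post-split prices.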

How to get them

Adjustments are downloaded as part of build_historical_tickers.py (Step 4) if you do not use --skip-adjustments. The date range is taken from the ticker files you already have. One Parquet file per ticker is written to config.ADJUSTMENTS_DIR (e.g. AAPL.parquet) with columns such as ticker, date, type, amount.

If you run tickers in incremental mode and include Step 4, new or updated tickers get their adjustment files; the script can also retry missing adjustments (Step 7) after the coverage comparison (Step 6).

Filling the database with data

Bar data comes from Polygon flat files (S3): one compressed CSV per trading day of 1-minute bars. The pipeline downloads them, streams 1m → 5m and daily, writes Parquet, then builds derived layers (intraday_daily, daily_features) used by the scanner and backtester.
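The 1m → 5m step is a standard OHLCV roll-up. The sketch below shows the aggregation rule in plain Python; it is not the project's implementation (the pipeline files listed in this guide suggest Polars is used), and the tuple layout and the name to_5m are assumptions made for illustration.

```python
def to_5m(bars_1m):
    """Aggregate 1-minute bars into 5-minute OHLCV bars.

    Each input bar is (minute_of_day, open, high, low, close, volume),
    sorted by time. Output bars start at the bucket's first minute.
    """
    buckets = {}
    for minute, o, h, l, c, v in bars_1m:
        start = minute - minute % 5
        if start not in buckets:
            buckets[start] = [o, h, l, c, v]  # open comes from the first 1m bar
        else:
            bar = buckets[start]
            bar[1] = max(bar[1], h)  # high
            bar[2] = min(bar[2], l)  # low
            bar[3] = c               # close = last 1m close seen
            bar[4] += v              # volume
    return [(start, *bar) for start, bar in buckets.items()]


one_minute = [
    (570, 10.0, 10.2, 9.9, 10.1, 100),   # 09:30
    (571, 10.1, 10.5, 10.0, 10.4, 200),  # 09:31
    (575, 10.4, 10.6, 10.3, 10.5, 50),   # 09:35, next bucket
]
five_minute = to_5m(one_minute)
```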

One-time / initial backfill

python download_flatfiles.py --start-date 2018-01-01 --end-date 2026-02-08

Step 1 — Download flat files from S3 (or reuse cached temp files), stream each day to m5_by_date and d_by_date (Parquet per date). Verification runs so that all dates have valid temp files before processing (up to 3 attempts for missing dates).
Step 2 — Build d_by_symbol and day index from d_by_date (m5/d already from Step 1).
Step 3 — Run build_derived.py in a subprocess: builds intraday_daily (5m bars per symbol/date) and daily_features (daily OHLCV and derived columns).
Step 4 — Verify adjustments (spot-check latest adjusted closes).

Default mode is “5m+1d only”: no persistent m1_by_date is written, and 1m bars are only streamed into 5m and daily bars. Use --full-pipeline if you need the full 1m pipeline and a persistent m1_by_date.

Paths written

Path (config)	Content
`M5_BY_DATE_DIR`	5m bars per date (Parquet).
`D_BY_DATE_DIR`	Daily bars per date (Parquet).
`FLATFILES_TEMP_DIR`	Temp downloaded CSV.gz; reused as download cache; deleted after successful processing.
`INTRADAY_DAILY_DIR`	Per-symbol Parquet with 5m (and optional 1m) arrays per date — used by scanner and backtester.
`DAILY_FEATURES_DIR`	Daily features (OHLCV, etc.) per symbol/date — used by scanner.

Why Parquet

All persistent output (tickers, adjustments, m5_by_date, d_by_date, intraday_daily, daily_features) is stored as Parquet. The project does not use a relational database.

Speed — Columnar format; the scanner and backtester read only the columns and date ranges they need. Fast for vectorized scans over many symbols and dates.
Size — Good compression (e.g. snappy) and no extra index overhead. Fits large histories on disk.
Portability — Single files or directories of files; easy to backup, move, or replicate. No server to run.
Ecosystem — Pandas and Polars read/write Parquet natively; same format from download pipeline through to backtest.

Ticker lists are also saved as tickers_historical.parquet in data/. Adjustments are one .parquet per ticker under data/polygon/raw/adjustments/.

Daily updates (schedule)

After the database is fully backfilled, you keep it up to date by (1) updating the ticker list and adjustments for new days, and (2) downloading flat files for the latest trading day(s) and rebuilding derived layers for those dates.

Recommended daily flow

Tickers (and adjustments) — Run build_historical_tickers.py without a date range (or with --end-date set to “yesterday”). Incremental mode will only fetch new trading days and update the master list and adjustment files.
Flat files + derived — Run download_flatfiles.py with a short range covering the latest trading day(s), e.g. --start-date 2026-02-05 --end-date 2026-02-06 (a Thursday and Friday). Incremental mode will only download and process dates that are missing from m5_by_date; then Step 2 and Step 3 will update d_by_symbol and derived layers for the new dates.

Schedule example (cron or Task Scheduler)

Example in ET: 2:00 for tickers (and adjustments), 5:00 for flat files. Use the last trading day as --end-date where needed.

Unix cron (adjust paths, Python, and hours if the server is not ET):

# 2:00 AM ET: tickers + adjustments; 5:00 AM ET: flat files (adjust hours if server is not ET)
0 2 * * 1-5 cd /path/to/polygon-scanner && python build_historical_tickers.py
0 5 * * 1-5 cd /path/to/polygon-scanner && python download_flatfiles.py --end-date $(date +\%Y-\%m-\%d)

Windows: create two tasks in Task Scheduler (Create Task…) with weekly weekday triggers at 2:00 and 5:00 (adjust for ET vs. local PC time).

Tickers task — Trigger: Weekly, Mon–Fri, 2:00 AM. Action: Program = full path to python.exe; Arguments = build_historical_tickers.py; Start in = repo root.
Flat files task — Trigger: Weekly, Mon–Fri, 5:00 AM. Action: Program = powershell.exe; Arguments = -NoProfile -ExecutionPolicy Bypass -Command "Set-Location 'C:\path\to\polygon-scanner-faster'; python download_flatfiles.py --end-date (Get-Date -Format 'yyyy-MM-dd')" (fix the path).

Keep .env in the project root so API and S3 keys load for scheduled runs.

If your shell cannot compute “last trading day” easily, you can call a small wrapper that sets --end-date to the previous weekday, or use a fixed offset (e.g. always “yesterday”). The scripts are incremental, so re-running with the same end date is safe.
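A wrapper of the kind described above could compute the previous weekday as follows. This is a minimal sketch: the function name is hypothetical, and it ignores market holidays, which is acceptable here only because the scripts are incremental and skip dates with nothing new.

```python
from datetime import date, timedelta


def previous_weekday(today):
    """Return the most recent Monday-Friday date strictly before `today`."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


# Value to pass as --end-date, formatted as YYYY-MM-DD.
end_date = previous_weekday(date.today()).isoformat()
```

The resulting string can be handed to download_flatfiles.py as --end-date, for example through subprocess.run from the same wrapper.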

Order: run tickers (and adjustments) first, then flat files. That way the ticker universe and adjustment files are up to date before new bar data is processed and used in scans/backtests.

Config: config.py. Tickers: build_historical_tickers.py, tickers.py. Adjustments: adjustments.py. Data pipeline: download_flatfiles.py, flatfile_downloader_polars.py, build_derived.py, derived_layer_builder.py.

Part I — Strategy & backtest configuration

The strategy file (JSON, YAML, or Python) defines entry/exit rules, execution, costs, portfolio, optional indicators and attributes, and the scan (daily_conditions, intraday_conditions). run_backtest.py loads it, validates it, then runs scan + backtest.

How to run a backtest

From the project root:

python run_backtest.py <strategy_file> [--normalize] [--no_normalize]

<strategy_file> — Path to a .json, .yaml, or .py strategy file (required).
Default (no flag): --normalize_api — After the main backtest, trades that entered between 07:00–09:00 ET are re-backtested using API-normalized minute data; those trades are replaced. Logs: logs/normalize_api_*.log.
--normalize — Same idea but re-backtest uses chart-normalized (anomaly-smoothed) 7–9 AM data.
--no_normalize — No re-backtest; all trades use the main bar data only.

Data: The backtester reads from config: DAILY_FEATURES_DIR, INTRADAY_DAILY_DIR, and raw M1 when needed. Make sure start_date / end_date in your strategy overlap your data range.

Universe cache: When USE_UNIVERSE_CACHE is enabled in config.py, scan results are cached by date range and condition key. Re-runs or extended ranges reuse the cache and only scan gaps. Optional universe_cache_name in the strategy writes a human-readable label under data/universe_cache/_names/ (see universe_cache.py).

Output: Results are written under a timestamped folder; the script prints the path.

Custom trade columns (reporting)

The usual way to add extra columns to trades.xlsx is strategy.attributes — declarative values evaluated on the entry bar and merged into each trade row by run_backtest.py. Alternatively, you can add columns only in Python at export time (below).

Export-only (Python): edit run_backtest.py in _build_trades_dataframe(), where each Trade becomes a row dict.

Example: add distance_from_high_pct (percent below the day high at entry).

# Pseudocode (export-only)
# day_high = max(high_unadj) for the trade's anchor date
# distance_from_high_pct = (day_high - entry_price) / day_high * 100

No look-ahead note: “Day high” uses information from later in the session. That’s fine for reporting, but do not use full-day values inside entry/exit logic. If you want a tradable version, compute the high only up to a time window (e.g. premarket high, or “high so far” up to the entry bar).
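The export-only pseudocode above can be written as a small runnable helper. This is a sketch under assumptions: how _build_trades_dataframe() exposes the day's bars is not shown in this guide, so the list of highs below is a hypothetical stand-in.

```python
def distance_from_high_pct(day_high, entry_price):
    """Percent that the entry price sits below the day's high."""
    if day_high <= 0:
        return None  # no valid high for this date
    return (day_high - entry_price) / day_high * 100


# Hypothetical unadjusted highs for the trade's anchor date.
bars_high_unadj = [4.80, 5.00, 4.95]
value = distance_from_high_pct(max(bars_high_unadj), entry_price=4.50)
```

With a day high of 5.00 and an entry at 4.50, the value is approximately 10 (percent below the high).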

File format and top-level keys

Supported formats:

Format	Extension	How loaded
JSON	`.json`	Parsed; `start_date`/`end_date` may be `"YYYY-MM-DD"` strings.
YAML	`.yaml`, `.yml`	`yaml.safe_load`; same structure as JSON.
Python	`.py`	Executed as module; must define `strategy`, `daily_conditions`, `intraday_conditions` (and optionally `types`, `start_date`, `end_date`, `name`, `universe_cache_name`).

Top-level keys:

Key	Required	Description
strategy	Yes	Backtest definition: timeframe, entry, exit, costs, portfolio, optional indicators and attributes.
daily_conditions	Yes	List of DSL strings. All must be true. (See Part II for DSL.)
intraday_conditions	Yes	List of DSL strings. All must be true.
types	No	Security types for universe, e.g. `["CS", "ADRC"]`. If omitted, all symbols.
start_date	No	Scan start (inclusive).
end_date	No	Scan end (inclusive). Must overlap your data.
name	No	Display name for the strategy.
universe_cache_name	No	Optional label recorded under `data/universe_cache/_names/` when the cache is written; lookup uses `condition_key` from conditions/universe/types/timeframe (`universe_cache.py`).

The strategy object

Key	Required	Description
timeframe	Yes	Bar resolution: `"5m"` or `"1d"`.
entry	Yes	Entry rules: side, conditions, execution, time window. See below.
exit	Yes	Exit rules: stop_loss, time_exit, optional take_profit and indicator_exits.
costs	Yes	Slippage and fees.
portfolio	Yes	Account and sizing.
indicators	No	Named indicators (SMA, EMA, RSI, etc.) for entry/exit conditions.
attributes	No	Optional list of extra trade-export columns evaluated at entry (DSL strings, indicator names, or `{ "name", "expr" }`). See Strategy attributes.
debug_symbol_date	No	Optional `[symbol, date]` for debug logging.

Entry (strategy.entry)

Defines how and when to enter a trade on a candidate (symbol, date) that passed the scan. Only one trade per symbol per anchor date. Bar data is in ET.

Field	Required	Description
side	Yes	`"long"` or `"short"`.
conditions	Yes	List of per-bar conditions. All must be true to enter. See below.
execution	Yes	One of `"close"` (fill at the close of the bar where conditions hold), `"next_open"` (fill at the next bar’s open), or `"touch"` (fill when price hits the trigger; set order_type as well).
time_window	No	`{ "start": "HH:MM", "end": "HH:MM" }` in ET. Entry only considered in this window.
order_type	No	Only when `execution: "touch"`. One of: `buy_limit`, `buy_stop`, `sell_limit`, `sell_stop`. Ignored for `close` and `next_open`.
trigger	No	For `"touch"`: explicit trigger price; otherwise inferred from conditions.

Touch order types: buy_limit = fill when price goes down to trigger; buy_stop = fill when price goes up to trigger; sell_limit = fill when price goes up (short the pop); sell_stop = fill when price goes down. If price gaps through the trigger, fill at bar open with slippage.
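The four touch order types and the gap rule can be sketched for a single bar as below. This is an illustration, not the engine's code: the function name is hypothetical, slippage is left out, and treating an open exactly at the trigger as a fill at the open is an assumption.

```python
def touch_fill(order_type, trigger, bar_open, bar_high, bar_low):
    """Return the fill price for a touch order on one bar, or None if not hit."""
    if order_type in ("buy_limit", "sell_stop"):
        # Filled when price trades down to the trigger.
        if bar_open <= trigger:  # gapped through: fill at the open
            return bar_open
        return trigger if bar_low <= trigger else None
    if order_type in ("buy_stop", "sell_limit"):
        # Filled when price trades up to the trigger.
        if bar_open >= trigger:  # gapped through: fill at the open
            return bar_open
        return trigger if bar_high >= trigger else None
    raise ValueError(f"unknown order_type: {order_type}")
```

For example, a sell_limit at 10.00 fills at 10.00 on a bar that opens at 9.50 and reaches 10.20, but fills at 10.50 if the bar opens there.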

Entry conditions (strategy.entry.conditions)

Each element is a condition object. All conditions must be true (AND). Supported types:

Comparison — type: "comparison", with lhs, op (>, >=, <, <=, ==, !=), rhs. lhs/rhs can be: numbers; bar fields (open, high, low, close); indicator names; DSL-style refs (same as in Part II); entry-only refs [last_bars][N][REDUCER][field][adjusted] (last N bars) and [intraday][0][-M][0][REDUCER][field][adjusted] (rolling M minutes); or expression dicts (add/sub/mul/div).
Green candle — type: "green_candle" (close > open).
Red candle — type: "red_candle" (close < open).
OR group — type: "or", conditions: list of condition objects; at least one must be true.
AND group — type: "and", conditions: list of condition objects; all must be true.

Exit (strategy.exit)

Exit checks are evaluated in priority order each bar. First match wins.

Priority	Key	Description
1	(gap fill)	When checking stop/TP, if price gapped through the level, fill at bar open with slippage.
2	stop_loss	Stop level. Types: `percent`, `dollar`, `prior_range_high`, `ref`/`expression`. Fields: `value`, `execution` (touch/close), and for prior_range_high: `start_time`, `multiplier`.
3	take_profit	Optional. Same types as stop, plus `risk_multiple` (uses `multiple`).
4	indicator_exits	Optional list. Each item: `condition` (same shape as entry conditions), `execution` (default `close`).
5	time_exit	`time` (ET, e.g. `"16:00"`; default 15:55 if omitted), `after_days` (default 0).

Costs and portfolio

Costs (strategy.costs) — All per-share: entry_slippage, exit_slippage, stop_slippage, locate_fee (shorts; can be numeric or "1%"), fee_per_share (numeric or "N%").

Portfolio (strategy.portfolio) — starting_balance (default 100000), risk_per_trade_pct (default 1.0), position_pct (default 10.0), skip_trade_probability (0–1), ignore_margin_requirements (default true). Balance = starting + realized P&L; sizing uses risk_per_trade_pct when stop is known, else position_pct.
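The two sizing modes might look like the sketch below. The formulas are an assumption: this guide states which percentage applies in each case but not the exact arithmetic, and skip_trade_probability and margin checks are not modeled. The function name is hypothetical.

```python
def position_size(balance, entry_price, stop_price=None,
                  risk_per_trade_pct=1.0, position_pct=10.0):
    """Whole-share position size under the two sizing modes described above."""
    if stop_price is not None and stop_price != entry_price:
        # Stop known: risk a fixed percent of balance between entry and stop.
        risk_dollars = balance * risk_per_trade_pct / 100
        shares = risk_dollars / abs(entry_price - stop_price)
    else:
        # No stop distance: allocate a fixed percent of balance.
        shares = balance * position_pct / 100 / entry_price
    return int(shares)
```

With a 100,000 balance and the defaults, a short at 10.00 with a stop at 12.00 sizes to 500 shares (1,000 at risk over a 2.00 distance); without a stop it sizes to 1,000 shares.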

Indicators (strategy.indicators)

Optional. Map of name → definition. Used in entry/exit conditions. Each definition: type (one of sma, ema, vwap, atr, adr, rsi, adx, roc), period (default 14 for RSI/ADX; VWAP can omit for session VWAP), source (e.g. close_adj, default close_adj). No look-ahead; referenced by name in conditions.

Strategy attributes (strategy.attributes)

Optional list on the strategy object: "attributes": [ ... ]. Each item is evaluated once when a trade opens, at the entry bar index (same bar the engine uses for the entry fill). Numeric results are attached to the Trade as a dict and copied into trades.xlsx by run_backtest.py (_build_trades_dataframe). Attributes are for reporting only; they do not change entries, exits, or position size.

Supported item shapes (validated by strategy_validator.py):

DSL string — Must start with [ (e.g. "[daily][-1][close][adjusted]"). Column header in Excel is a sanitized form of the string (brackets and spaces become underscores).
Indicator name — A non-bracket string that names a key in strategy.indicators (e.g. "vwap"). The exported value is that indicator series at the entry bar.
Named expression — { "name": "my_column", "expr": ... } where expr is either a DSL string or the same JSON expression object used in entry conditions (type: mul, div, indicator, add, sub, bar fields like { "close": "unadjusted" }, etc.). This gives stable, readable column names and is ideal for derived metrics.

Expression evaluation uses the same rules as entry conditions (including [intraday] rolling windows and [last_bars] where applicable). Follow the same lookahead discipline as trading logic: full-session aggregates are fine for reporting but do not use “future” intraday data in attributes if you intend them to describe what was knowable at entry.

Failures: if an attribute cannot be evaluated for a trade, the column still appears in the export with an empty / null value for that row.

Example (from strategies/touch_5m.json): pct_from_vwap as a named expression combining unadjusted close and the vwap indicator.

"attributes": [
  {
    "name": "pct_from_vwap",
    "expr": {
      "type": "mul",
      "lhs": {
        "type": "sub",
        "lhs": {
          "type": "div",
          "lhs": { "close": "unadjusted" },
          "rhs": { "type": "indicator", "name": "vwap" }
        },
        "rhs": 1
      },
      "rhs": 100
    }
  }
]
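To see what the nested expression computes, the sketch below evaluates it with a tiny hypothetical evaluator. It is not the engine's ConditionEvaluator: the evaluate function, the bar dict, and the indicators dict are assumptions, and the adjustment flag on bar fields is ignored.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "div": operator.truediv}


def evaluate(expr, bar, indicators):
    """Evaluate a nested expression dict against one bar."""
    if isinstance(expr, (int, float)):
        return expr
    if "type" not in expr:
        # Bar field such as {"close": "unadjusted"}; adjustment is ignored here.
        (field, _adjustment), = expr.items()
        return bar[field]
    if expr["type"] == "indicator":
        return indicators[expr["name"]]
    return OPS[expr["type"]](evaluate(expr["lhs"], bar, indicators),
                             evaluate(expr["rhs"], bar, indicators))


pct_from_vwap = {
    "type": "mul",
    "lhs": {"type": "sub",
            "lhs": {"type": "div",
                    "lhs": {"close": "unadjusted"},
                    "rhs": {"type": "indicator", "name": "vwap"}},
            "rhs": 1},
    "rhs": 100,
}
result = evaluate(pct_from_vwap, bar={"close": 10.5}, indicators={"vwap": 10.0})
```

The expression is (close / vwap - 1) * 100, so a close of 10.50 against a VWAP of 10.00 gives roughly 5 (percent above VWAP).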

Scan section (top-level)

daily_conditions — List of DSL strings. All must be true. Evaluated over daily_features (one row per symbol per date).

intraday_conditions — List of DSL strings. All must be true. Evaluated on candidates that passed daily, using intraday_daily arrays.

types, start_date, end_date — Optional; restrict universe and date range. Strategy timeframe selects the intraday resolution for both scan and backtest.

Validation

Required: strategy, daily_conditions, intraday_conditions. Strategy must have timeframe, entry, exit, costs, portfolio. Entry must have side, conditions (list), execution. Exit must have time_exit. Each DSL condition must parse. If any check fails, validate_strategy() returns errors and the script exits before running the scan + backtest.
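The structural part of these checks can be pictured with the sketch below. It is a simplified stand-in, not validate_strategy() itself: the function name and error strings are hypothetical, and DSL parsing of each condition is omitted.

```python
REQUIRED_TOP = ("strategy", "daily_conditions", "intraday_conditions")
REQUIRED_STRATEGY = ("timeframe", "entry", "exit", "costs", "portfolio")
REQUIRED_ENTRY = ("side", "conditions", "execution")


def check_required_keys(config):
    """Return a list of error strings; an empty list means nothing is missing."""
    errors = [f"missing top-level key: {k}" for k in REQUIRED_TOP if k not in config]
    strategy = config.get("strategy", {})
    errors += [f"missing strategy.{k}" for k in REQUIRED_STRATEGY if k not in strategy]
    entry = strategy.get("entry", {})
    errors += [f"missing strategy.entry.{k}" for k in REQUIRED_ENTRY if k not in entry]
    if not isinstance(entry.get("conditions", []), list):
        errors.append("strategy.entry.conditions must be a list")
    if "time_exit" not in strategy.get("exit", {}):
        errors.append("missing strategy.exit.time_exit")
    return errors
```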

Example (minimal strategy)

{
  "strategy": {
    "timeframe": "5m",
    "entry": {
      "side": "short",
      "conditions": [],
      "execution": "close",
      "time_window": { "start": "09:25", "end": "09:29" }
    },
    "exit": {
      "stop_loss": { "type": "percent", "value": 100, "execution": "touch" },
      "time_exit": { "after_days": 0, "time": "16:00" }
    },
    "costs": {
      "entry_slippage": 0.001,
      "exit_slippage": 0.001,
      "stop_slippage": 0.002,
      "locate_fee": 0.02,
      "fee_per_share": 0.005
    },
    "portfolio": {
      "starting_balance": 10000,
      "risk_per_trade_pct": 1,
      "position_pct": 0,
      "skip_trade_probability": 0
    }
  },
  "daily_conditions": ["[daily][0][close][unadjusted] > 0.01"],
  "intraday_conditions": ["[intraday][0][04:00][16:00][SUM][volume][unadjusted] > 0"],
  "types": ["CS", "ADRC"],
  "start_date": "2024-01-01",
  "end_date": "2026-02-24",
  "name": "My strategy"
}

To add entry conditions and indicators: define strategy.indicators (e.g. sma30 with type sma, period 30), then in strategy.entry.conditions use comparisons like close > sma30. Add strategy.exit.take_profit if desired.
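For a Python strategy file, the additions described above might look like the fragment below. Treat it as a sketch: writing the bar field as the plain string "close" is an assumption (this guide also shows the { "close": "unadjusted" } form), the take-profit multiple of 2 is arbitrary, and strategy_patch is a hypothetical name for the parts you would merge into your own strategy dict.

```python
# Fragment of a Python strategy file; only the parts that change are shown.
indicators = {
    "sma30": {"type": "sma", "period": 30, "source": "close_adj"},
}

entry_conditions = [
    # Enter only while the bar close is above the 30-period SMA.
    {"type": "comparison", "lhs": "close", "op": ">", "rhs": "sma30"},
]

# Optional take profit expressed as a multiple of the initial risk.
take_profit = {"type": "risk_multiple", "multiple": 2}

strategy_patch = {
    "indicators": indicators,
    "entry": {"conditions": entry_conditions},
    "exit": {"take_profit": take_profit},
}
```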

Flow summary

Load — load_strategy(path) → config dict.
Validate — validate_strategy(config); exit if errors.
Scan — Run daily_conditions over daily_features, then intraday_conditions over intraday_daily; candidates = (symbol, date) that pass both.
Backtest — For each candidate, load bar data, apply entry and exit rules, costs, and portfolio sizing. Optional strategy.attributes are evaluated at the entry bar and merged into trades.xlsx.

Part II — Scan condition DSL

The data-access language used in daily_conditions and intraday_conditions. It references daily bars, intraday windows, and combines them with constants and operators. The backtester also uses the same daily/intraday ref syntax in entry/exit expressions; entry conditions additionally support [last_bars] and rolling [intraday] (see Part I).

Where the DSL is used

daily_conditions — List of strings. All must be true. Run over daily_features (one row per symbol per date). Fast; no intraday data loaded.
intraday_conditions — List of strings. All must be true. Run only on candidates that passed daily; data from intraday_daily (arrays per symbol/date).
strategy.attributes — Optional list on the strategy object. Each string or { "name", "expr" } is evaluated at the entry bar and written as extra columns in trades.xlsx (see Strategy attributes).

Strategy timeframe determines which data is used: 5m uses intraday 5-minute bars, 1d uses daily bars.

Expression types

An expression is one of:

Constant — e.g. 0, 1.5, 2000000, 70
Daily reference — Single day or multi-day average (see Daily references)
Intraday reference — Time-window aggregate (see Intraday references)
Binary operation — (left op right) with arithmetic or comparison operators

Example combined: (([daily][0][open][adjusted] - [daily][-1][close][adjusted]) / [daily][-1][close][adjusted] * 100) > 70

Daily references

Daily data is one row per (symbol, date) in daily_features. Rows are keyed by session date; there is no intraday time component.

Single-day form

[daily][offset][field][adjusted]

Part	Values	Description
offset	`0`, `-1`, `-2`, …	`0` = current day (scan date), `-1` = previous trading day. Positive offsets are not supported.
field	`open`, `high`, `low`, `close`, `volume`	OHLCV for that day.
adjusted	`adjusted` or `unadjusted`	Price adjustment for splits/dividends. Use the same adjustment on both sides when comparing two days.

Examples: [daily][0][close][unadjusted] — today’s unadjusted close. [daily][-1][open][adjusted] — yesterday’s adjusted open.

Multi-day range (AVG, MAX, MIN)

[daily][start_offset][end_offset][AVG|MAX|MIN][field][adjusted]

start_offset and end_offset are negative (e.g. -20 to -1 = last 20 days). Supported reducers:

AVG / AVERAGE — Average over the range. E.g. [daily][-5][-1][AVG][close][adjusted] — 5-day average close.
MAX — Maximum over the range. E.g. [daily][-20][-1][MAX][high][adjusted] — highest high in the last 20 days.
MIN — Minimum over the range. E.g. [daily][-20][-1][MIN][low][adjusted] — lowest low in the last 20 days.

Example: 20-day “spike” in % = ([daily][-20][-1][MAX][high][adjusted] / [daily][-20][-1][MIN][low][adjusted] - 1) * 100.
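The offset arithmetic of a multi-day range can be sketched as follows. This is illustrative only: daily_range is a hypothetical helper, it assumes the series is ordered oldest to newest with the scan day last, and returning None when history is too short is an assumption about behavior the guide does not specify.

```python
def daily_range(values, start_offset, end_offset, reducer):
    """Reduce `values` over [start_offset, end_offset] relative to the scan day.

    `values` is ordered oldest to newest; its last element is offset 0.
    """
    if not (start_offset <= end_offset <= 0):
        raise ValueError("offsets must satisfy start <= end <= 0")
    first = len(values) - 1 + start_offset
    if first < 0:
        return None  # not enough history for this window
    window = values[first: len(values) + end_offset]
    if reducer in ("AVG", "AVERAGE"):
        return sum(window) / len(window)
    return {"MAX": max, "MIN": min}[reducer](window)


closes = [10, 11, 12, 13, 14, 15]  # hypothetical closes; 15 is the scan day
five_day_avg = daily_range(closes, -5, -1, "AVG")
```

Here [-5][-1] covers the five days before the scan day (10 through 14), so the average is 12.0 and the scan day's own close is excluded.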

Intraday references

Intraday data is arrays per (symbol, date) in intraday_daily (e.g. 192 bars for 5m, 04:00–19:59 ET). You select a time window and a reducer over that window.

Form

[intraday][day_offset][start_time][end_time][reducer][field][adjusted]

Part	Description
day_offset	`0` = scan date, `-1` = previous trading day.
start_time / end_time	`HH:MM` in ET. Session is 04:00–19:59. `current_time`/`now` = end of session (19:59) in scan.
reducer	MAX, MIN, SUM, AVG/AVERAGE, FIRST, LAST.
field	`open`, `high`, `low`, `close`, `volume`.
adjusted	For OHLC: only `unadjusted`. For volume: `unadjusted` or `adjusted`.

Reducers

Reducer	Meaning	Example
MAX	Maximum over window	Premarket high: `[intraday][0][04:00][09:29][MAX][high][unadjusted]`
MIN	Minimum over window	Premarket low: `[intraday][0][04:00][09:29][MIN][low][unadjusted]`
SUM	Sum	Session volume: `[intraday][0][04:00][16:00][SUM][volume][unadjusted]`
AVG / AVERAGE	Average	Price or volume average in window
FIRST	Value at first bar of window	Session open: `[intraday][0][04:00][16:00][FIRST][open][unadjusted]`
LAST	Value at last bar of window	Session close: `[intraday][0][04:00][16:00][LAST][close][unadjusted]`

MAX and MIN: Must use start_time = 04:00. Other start times are not supported for MAX/MIN.
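The mapping from clock times to bar positions can be sketched as below for 5m data. It is a simplified model, not the scanner's implementation: the helper names are hypothetical, and treating end_time as inclusive of the bar that contains it is an assumption inferred from the 04:00–09:29 premarket examples. The 04:00-only restriction on MAX/MIN is not enforced here.

```python
SESSION_START = 4 * 60  # 04:00 ET in minutes since midnight


def bar_index(hhmm, bar_minutes=5):
    """Index of the bar that contains the clock time `hhmm` (HH:MM, ET)."""
    hours, minutes = map(int, hhmm.split(":"))
    return (hours * 60 + minutes - SESSION_START) // bar_minutes


def reduce_window(values, start, end, reducer):
    """Apply a reducer to bars from `start` to `end`, both inclusive."""
    window = values[bar_index(start): bar_index(end) + 1]
    if not window:
        return None
    if reducer in ("AVG", "AVERAGE"):
        return sum(window) / len(window)
    funcs = {"MAX": max, "MIN": min, "SUM": sum,
             "FIRST": lambda w: w[0], "LAST": lambda w: w[-1]}
    return funcs[reducer](window)
```

Under this model 19:59 falls in bar 191, which matches the 192 bars per session, and a 04:00–09:29 window covers bars 0 through 65 (the last one starting at 09:25).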

Timeframe: Strategy timeframe (5m or 1d) selects which data is used. Your strategy timeframe must match your data.

Binary operators and precedence

Lowest to highest precedence:

Comparison: >, <, >=, <=, ==, !=
Additive: +, -
Multiplicative: *, /

Left-associative. Use parentheses to force order. Division by zero is treated as infinity; comparisons with NaN yield false.
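The two edge-case rules can be expressed in Python as below. This is a sketch of the stated behavior, not the scanner's code: the function names are hypothetical, and returning NaN for 0/0 is an assumption the guide does not cover. Note that the NaN rule needs an explicit check, because plain Python evaluates nan != x as True.

```python
import math


def safe_div(a, b):
    """Divide, mapping division by zero to signed infinity (0/0 gives NaN)."""
    if b == 0:
        if a == 0 or math.isnan(a):
            return math.nan
        return math.copysign(math.inf, a)
    return a / b


def compare(op, a, b):
    """Comparison that is False whenever either side is NaN."""
    if math.isnan(a) or math.isnan(b):
        return False
    return {">": a > b, "<": a < b, ">=": a >= b,
            "<=": a <= b, "==": a == b, "!=": a != b}[op]
```

A gap calculation with a zero previous close therefore evaluates to infinity and passes a "> 70" filter, while a missing value (NaN) fails every comparison.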

Adjustment rules (summary)

Context	open/high/low/close	volume
Daily	`adjusted` or `unadjusted`	`adjusted` or `unadjusted`
Intraday	Only `unadjusted`	`unadjusted` or `adjusted`

What is not in the DSL

No future data — Offsets are only 0 or negative. No [daily][1].
No custom indicators in DSL — Indicators are in strategy.indicators and used in entry conditions, not in scan DSL.
No “current bar” in scan — In scanner conditions, current_time/now = end of session (19:59).
MAX/MIN intraday — Must start at 04:00.

Complete examples

Condition	Meaning
`[daily][0][close][unadjusted] > 0.01`	Today’s close exists (minimal filter).
`[daily][0][open][unadjusted] > 0.15` and `[daily][0][open][unadjusted] < 15` (two separate conditions)	Open between $0.15 and $15.
`[daily][-1][close][adjusted] > [daily][-1][open][adjusted]`	Yesterday green.
`(([daily][0][open][adjusted] - [daily][-1][close][adjusted]) / [daily][-1][close][adjusted] * 100) > 70`	Gap up > 70%.
`([daily][-20][-1][MAX][high][adjusted] / [daily][-20][-1][MIN][low][adjusted] - 1) * 100 > 20`	20-day range (high/low) > 20%.
`[intraday][0][04:00][16:00][SUM][volume][unadjusted] > 0`	Session has some volume.
`[intraday][0][04:00][09:29][SUM][volume][unadjusted] > 100000`	Premarket volume > 100k.
`[intraday][0][04:00][16:00][MAX][high][unadjusted] > 1.5 * [intraday][0][04:00][16:00][FIRST][open][unadjusted]`	Session high more than 50% above session open.

Invariants (engine guarantees)

Daily: only offsets 0 and negative; no future dates.
Intraday OHLC: must be unadjusted; volume may be adjusted or unadjusted.
Intraday window: any HH:MM–HH:MM within 04:00–19:59 ET; MAX/MIN require start 04:00.
When comparing two days in daily expressions, use the same adjustment on both sides.

Implementation: strategy_validator.py (load + validate), backtester_engine.py (BacktesterEngine, ExecutionEngine, PortfolioEngine, ConditionEvaluator), scanner_engine.py (DSLParser, Compiler), run_backtest.py (orchestration).