This page shows the ongoing live performance of every AI model integrated into Suna AI Vision. Model predictions for Gold (XAU/USD) and EUR/USD are evaluated daily against real market ticks, normalized to R-multiples, and scored by rule-based logic (walk-forward methodology, defined thresholds). For each model, the page displays the number of evaluated trades, hit rate, average R-multiple, and current status (active / de-prioritized).
Every model prediction is evaluated only against data the model had not seen at prediction time (walk-forward evaluation). This eliminates look-ahead bias by design, so the evaluation reflects real-time trading conditions.
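The walk-forward rule can be illustrated with a minimal Python sketch. It is not the platform's actual implementation: the `Tick` type, the `resolve_prediction` function, and the stop/target parameters are hypothetical names chosen for illustration. The only point it demonstrates is that ticks at or before the prediction time are never used for scoring.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class Tick(NamedTuple):
    time: datetime
    price: float


def resolve_prediction(predicted_at, direction, stop, target, ticks) -> Optional[str]:
    """Walk forward through ticks and report which level is reached first.

    direction is +1 for long, -1 for short (assumed convention).
    Returns "stop", "target", or None if neither level was reached.
    """
    for tick in sorted(ticks, key=lambda t: t.time):
        # Data the model could have seen is skipped, never scored.
        if tick.time <= predicted_at:
            continue
        if direction == 1:
            hit_stop, hit_target = tick.price <= stop, tick.price >= target
        else:
            hit_stop, hit_target = tick.price >= stop, tick.price <= target
        if hit_stop:
            return "stop"
        if hit_target:
            return "target"
    return None
```

In this sketch a tick that touches the target before the prediction was made has no effect on the result, which is the property that rules out look-ahead bias.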
Hit rate alone is an insufficient metric: a system with an 80 % hit rate still loses money if its average win is only a tenth of its average loss (a 1:10 reward-to-risk ratio), because 0.8 × 0.1 R − 0.2 × 1 R = −0.12 R per trade. Results are therefore normalized to R-multiples (profit or loss expressed in multiples of the initial risk), so models become comparable and any actual edge is clearly visible.
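As a rough illustration of R-multiple normalization, the Python sketch below computes the R-multiple of a single trade and the expectancy of a system. The `Trade` fields and function names are hypothetical, and the 80 % example is read here as winning 0.1 R per winning trade and losing 1 R per losing trade.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    entry: float
    stop: float      # initial stop-loss price, defines the initial risk
    exit: float
    direction: int   # +1 long, -1 short (assumed convention)


def r_multiple(trade: Trade) -> float:
    """Profit or loss expressed in multiples of the initial risk."""
    risk = abs(trade.entry - trade.stop)
    if risk == 0:
        raise ValueError("initial risk must be non-zero")
    return trade.direction * (trade.exit - trade.entry) / risk


def expectancy(hit_rate: float, avg_win_r: float, avg_loss_r: float) -> float:
    """Average R per trade; avg_loss_r is given as a positive number."""
    return hit_rate * avg_win_r - (1 - hit_rate) * avg_loss_r


# A long Gold trade risking 10 and gaining 20 is a +2 R trade.
gold = r_multiple(Trade(entry=2000.0, stop=1990.0, exit=2020.0, direction=1))

# 80 % hit rate, 0.1 R per win, 1 R per loss: negative expectancy.
edge = expectancy(0.8, 0.1, 1.0)
```

Because both values are expressed in units of initial risk, a Gold trade and a EUR/USD trade can be compared directly despite their different price scales.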
Models whose live performance falls below a predefined threshold over a defined window are de-weighted in the consensus vote. As soon as live performance recovers above the threshold, the model is automatically re-weighted. The mechanism is rule-based and not manually controlled.
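One way such a rule could look is sketched below in Python. The page does not publish the window length, the threshold, or the reduced weight, so the values used here (a 20-trade window, a threshold of 0 R average, a reduced weight of 0.25) and the `ModelWeight` class itself are purely illustrative assumptions.

```python
from collections import deque


class ModelWeight:
    """Rolling-window rule: de-weight a model while its average R is below threshold."""

    def __init__(self, window=20, threshold=0.0, reduced_weight=0.25):
        # All three defaults are illustrative assumptions, not published values.
        self.results = deque(maxlen=window)  # keeps only the most recent R-multiples
        self.threshold = threshold
        self.reduced_weight = reduced_weight

    def record(self, r: float) -> None:
        self.results.append(r)

    @property
    def status(self) -> str:
        # Assumption: a model stays active until the window is full.
        if len(self.results) < self.results.maxlen:
            return "active"
        average_r = sum(self.results) / len(self.results)
        return "active" if average_r >= self.threshold else "de-prioritized"

    @property
    def weight(self) -> float:
        return 1.0 if self.status == "active" else self.reduced_weight
```

Since status is recomputed from the rolling window on every read, recovery needs no manual step: once recent results lift the average back to the threshold, the weight returns to 1.0.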
Live market data for Gold (XAU/USD) and EUR/USD: Twelve Data (Free tier). Crypto market data: CoinGecko (Demo tier). Historical 1-minute OHLC data: Dukascopy.
Per AI model: number of evaluated trades, hit rate, average R-multiple, current status (active / de-prioritized), evaluation period. The evaluation is rule-based (walk-forward, R-multiple, thresholds) and not hand-picked.
A model prediction is evaluated only against data the model had not seen at prediction time. This eliminates look-ahead bias by design, so the evaluation reflects real-time trading conditions.
Hit rate alone is an insufficient metric. A system with an 80 % hit rate still loses money if its average win is only a tenth of its average loss (a 1:10 reward-to-risk ratio). R-multiples (profit or loss in multiples of the initial risk) make models and strategies fairly comparable.
A model is de-prioritized when its live performance falls below a predefined threshold over a defined window. Its weight in the consensus vote is then reduced. As soon as performance recovers above the threshold, the model is automatically re-weighted upward.
No, the displayed models are not hand-picked. The evaluation is rule-based (walk-forward, R-multiple, predefined thresholds) and computed automatically. There is no manual selection of which models are displayed.