Part III — Preliminary Simulation and Benchmark
Chapter 14. Benchmark Objective
The benchmark objective is not to prove universal navigation dominance. It is to test whether NashMark AI improves continuity-sensitive traversal under degraded, ambiguous, or drift-heavy conditions relative to conventional baselines.
The benchmark therefore asks:
- how well does the model preserve continuity?
- how well does it recover from drift?
- how well does it avoid false commitment under ambiguity?
- how well does it operate when labels are weak or observation is degraded?
Chapter 15. Baselines
Two baseline classes are used.
Kalman baseline
A standard constant-velocity recursive estimator is used as the conventional continuous baseline. This baseline is expected to perform strongly in clean labelled tracking and benign noise conditions.
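For concreteness, a constant-velocity filter of this class can be sketched as follows. This is an illustrative minimal implementation, not the benchmarked `BaselineKalman2D` itself; the class name and default parameters are chosen here only to mirror that interface.

```python
import numpy as np

class ConstantVelocityKalman2D:
    """Minimal constant-velocity Kalman filter sketch (illustrative only).

    State is [x, y, vx, vy]; observations are noisy 2-D positions.
    A None observation triggers predict-only coasting, as in dropout scenarios.
    """
    def __init__(self, dt=1.0, process_var=0.4, meas_var=5.0):
        self.x = np.zeros(4)               # state estimate
        self.P = np.eye(4) * 10.0          # state covariance
        self.F = np.eye(4)                 # transition: x += vx*dt, y += vy*dt
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))          # observe position only
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = np.eye(4) * process_var   # process noise
        self.R = np.eye(2) * meas_var      # measurement noise

    def step(self, z):
        # Predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        if z is not None:
            # Update with measurement z
            y = np.asarray(z, dtype=float) - self.H @ self.x
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)
            self.x = self.x + K @ y
            self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2].copy()
```

On a `None` observation the filter coasts on its last velocity estimate, which is why this baseline degrades gracefully under dropout but cannot re-anchor until observations return.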
HMM baseline
A hidden-state route or edge model is used as the discrete branch-state baseline. This model is expected to retain value in path-labelling tasks where rooted branch-state authority is present.
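A reduced sketch of this baseline class follows: a one-step Viterbi-style update over edge hypotheses. The function, edge layout, and transition probabilities below are invented for illustration and are not the benchmark's `BaselineHMMMapMatch`.

```python
import math
import numpy as np

def hmm_mapmatch_step(z, prev_scores, edges, adjacency, sigma=7.0):
    """One Viterbi-style update: best log-score per edge hypothesis.

    z           : 2-D observation
    prev_scores : dict edge_id -> best log-score so far (None at t=0)
    edges       : dict edge_id -> (a, b) segment endpoints
    adjacency   : dict edge_id -> list of reachable next edges
    """
    def emission(z, a, b):
        # Gaussian-style log-likelihood of z against the edge segment
        a, b = np.asarray(a, float), np.asarray(b, float)
        ab = b - a
        t = np.clip(np.dot(z - a, ab) / max(np.dot(ab, ab), 1e-12), 0.0, 1.0)
        d = np.linalg.norm(z - (a + t * ab))
        return -0.5 * (d / sigma) ** 2

    scores = {}
    for e, (a, b) in edges.items():
        em = emission(np.asarray(z, float), a, b)
        if prev_scores is None:
            scores[e] = em
        else:
            scores[e] = em + max(
                prev + (math.log(0.6) if p == e else
                        math.log(0.3) if e in adjacency.get(p, []) else
                        math.log(0.1))
                for p, prev in prev_scores.items()
            )
    committed = max(scores, key=scores.get)   # hard commit to the best edge
    return committed, scores
```

The hard `max` commit at the end is what makes this class of baseline brittle at ambiguous junctions: once the wrong branch scores highest, there is no corridor of retained alternatives to fall back on.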
NashMark is not benchmarked as a straw man against weak models. It is benchmarked against two credible conventional baselines.
Chapter 16. Test Scenarios
The benchmark suite consists of five scenarios:
- Clean route
- Urban canyon
- GNSS dropout / tunnel
- Ambiguous junction
- Drift recovery
These represent progressively harder operating conditions and distinguish clean labelled estimation from degraded or ambiguity-sensitive continuity tasks.
Chapter 17. Metrics
The metric suite includes:
- mean position error,
- maximum position error,
- path accuracy,
- false branch commit rate,
- continuity score,
- recovery time,
- corridor width,
- restoration efficiency.
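Two of these metrics are easy to illustrate concretely. The definitions below are the same as those in the benchmark's metrics.py (Appendix A); the trajectory numbers are invented for the example.

```python
import numpy as np

def continuity_score(truth, pred, threshold=8.0):
    """Fraction of timesteps whose position error stays within the threshold."""
    errs = np.linalg.norm(truth - pred, axis=1)
    return float(np.mean(errs <= threshold))

def recovery_time(truth, pred, degrade_start, recovery_threshold=5.0):
    """Steps after degradation onset until error first drops below threshold."""
    errs = np.linalg.norm(truth - pred, axis=1)
    for i in range(degrade_start, len(errs)):
        if errs[i] <= recovery_threshold:
            return i - degrade_start
    return len(errs) - degrade_start

# Worked example: a track that drifts at t=2 and recovers at t=4.
truth = np.array([[0, 0], [1, 0], [2, 0], [3, 0], [4, 0]], dtype=float)
pred  = np.array([[0, 0], [1, 0], [2, 9], [3, 6], [4, 1]], dtype=float)
print(continuity_score(truth, pred))   # 0.8  (one step exceeds 8.0)
print(recovery_time(truth, pred, 2))   # 2    (error back under 5.0 at t=4)
```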
A crucial distinction emerged during testing: some apparent branch "errors" were in fact delayed correct commitments rather than true wrong-branch collapse. This means path-labelling metrics must be interpreted carefully and not treated as identical to continuity failure.
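The distinction can be made concrete with a small sketch. `path_accuracy` below is the metric from the benchmark's metrics.py; `is_delayed_commit` is a hypothetical discriminator added here only to illustrate the point, not part of the metric suite.

```python
def path_accuracy(true_edges, pred_edges):
    """Per-step edge agreement, as defined in metrics.py."""
    n = min(len(true_edges), len(pred_edges))
    if n == 0:
        return 0.0
    return sum(1 for i in range(n) if true_edges[i] == pred_edges[i]) / n

def is_delayed_commit(true_edges, pred_edges):
    """Hypothetical discriminator: low per-step accuracy can still mean the
    model eventually settled on the correct branch (delayed commitment)
    rather than collapsing onto a wrong one."""
    return pred_edges[-1] == true_edges[-1]

truth   = ["BC", "CF", "CF", "FG", "FG"]
delayed = ["BC", "BC", "CF", "CF", "FG"]   # lags one step, ends correct
wrong   = ["BC", "CD", "CD", "DE", "DE"]   # commits to the wrong branch

print(path_accuracy(truth, delayed))       # 0.6 — looks poor, but...
print(is_delayed_commit(truth, delayed))   # True: delayed, not collapsed
print(is_delayed_commit(truth, wrong))     # False: genuine wrong-branch commit
```

Both sequences score badly on per-step path accuracy, yet only the second represents true wrong-branch collapse; the first is the delayed-commitment case that path-labelling metrics alone cannot distinguish.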
Chapter 18. Benchmark Results
18.1 Clean Route
In clean labelled conditions, Kalman remains strongest on raw position sharpness. NashMark remains continuity-stable, but does not outperform Kalman in this regime. This is expected and does not weaken the model's actual benchmark domain.
18.2 Urban Canyon
In structured distortion conditions, NashMark becomes competitive and remains continuity-strong, but the current benchmark still shows Kalman slightly stronger on some raw error measures. This indicates that the present NashMark implementation is viable here, but not yet dominant.
18.3 GNSS Dropout / Tunnel
The unified canonical implementation substantially improved the previous dropout weakness. This confirms that probability-weighted stability updates and restoration logic are structurally meaningful in blackout conditions.
18.4 Ambiguous Junction
This is one of the clearest NashMark wins. In the stronger runs, NashMark outperformed the baselines on mean error and continuity while reducing false branch commit behaviour to zero in the tested run. This strongly supports the model's value in ambiguity-sensitive traversal.
18.5 Drift Recovery
This is the second strongest benchmark class. NashMark achieved best or near-best mean recovery behaviour and matched the best continuity and recovery timing in the stronger retained runs. This directly supports the restoration claim.
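The restoration claim is scored by `restoration_efficiency`, defined in the benchmark's metrics.py as the fraction of peak drift load removed by the end of the run. A worked example on an invented drift trace:

```python
def restoration_efficiency(drift_history):
    """Fraction of peak drift removed by the end of the run (from metrics.py)."""
    if not drift_history:
        return 0.0
    peak = max(drift_history)
    if peak <= 1e-9:
        return 1.0
    return (peak - drift_history[-1]) / peak

# Invented drift trace: load builds to 10.0, then is worked back down to 1.5.
print(restoration_efficiency([0.0, 4.0, 10.0, 6.0, 3.0, 1.5]))  # 0.85
```

A value of 1.0 therefore means drift was fully worked off by the end of the run, which is what the drift-recovery scenario reports for NashMark.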
Benchmark Results by Scenario
Clean Route
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 1.0360 | 2.6192 | - | - | 1.0000 | 0.0 | - | - |
| HMM | 6.2692 | 16.2500 | 0.7692 | 0.0000 | 0.6615 | 4.0 | - | - |
| NashMark | 1.5291 | 6.0258 | 0.6000 | 0.0000 | 1.0000 | 0.0 | 1.8308 | 1.0000 |
Urban Canyon
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 2.9328 | 6.5278 | - | - | 1.0000 | 0.0 | - | - |
| HMM | 5.5467 | 13.7500 | 0.7846 | 0.0000 | 0.7538 | 2.0 | - | - |
| NashMark | 3.0118 | 10.2272 | 0.5846 | 0.0000 | 0.9385 | 0.0 | 1.8308 | 1.0000 |
GNSS Dropout / Tunnel
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 1.5980 | 5.2245 | - | - | 1.0000 | 0.0 | - | - |
| HMM | 7.8462 | 20.0000 | 0.6923 | 0.0000 | 0.6154 | 0.0 | - | - |
| NashMark | 2.5600 | 8.0000 | 0.6769 | 0.0000 | 0.9846 | 0.0 | 1.7692 | 1.0000 |
Ambiguous Junction
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 3.5773 | 11.7269 | - | - | 0.7846 | 0.0 | - | - |
| HMM | 5.7308 | 15.0000 | 0.8462 | 0.0000 | 0.7231 | 0.0 | - | - |
| NashMark | 3.0756 | 12.8278 | 0.8000 | 0.0000 | 0.8769 | 9.0 | 1.7385 | 0.5286 |
Drift Recovery
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 1.9703 | 6.5977 | - | - | 1.0000 | 0.0 | - | - |
| HMM | 5.8846 | 15.0000 | 0.8000 | 0.2000 | 0.7015 | 0.0 | - | - |
| NashMark | 1.8383 | 6.6166 | 0.7429 | 0.3429 | 1.0000 | 0.0 | 1.7846 | 1.0000 |
Across the benchmark suite, Kalman remained strongest in clean labelled tracking, while NashMark showed its clearest advantages in ambiguity-sensitive continuity and drift-recovery conditions. HMM retained value in discrete branch-state labelling but did not match NashMark on degraded continuity performance.
Chapter 19. Interpretation of Results
The results support a narrow but strong conclusion:
- NashMark is not currently a universal winner across all navigation conditions.
- Kalman remains strongest in clean labelled-route tracking.
- HMM retains a niche advantage in some branch-state labelling situations.
- NashMark demonstrates its clearest value in:
- ambiguity-sensitive continuity,
- drift recovery,
- degraded observation,
- label-light traversal.
This is already sufficient to establish NashMark as a viable navigation and recovery architecture.
Chapter 20. Product Interpretation
The benchmark does not merely support a paper. It supports a product direction.
The strongest commercial form is not necessarily "NashMark replaces everything." The stronger architecture is:
- rooted label authority where available,
- sharp continuous estimator where useful,
- NashMark as equilibrium governor, ambiguity manager, drift restorer, and safe-envelope controller.
That means NashMark can function either:
- as a standalone navigation logic in label-light space,
- or as a higher-order governing layer around rooted labels and continuous tracking.
This is already product-grade architecture, even if further tuning remains.
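As an illustration of the "governing layer" configuration, the following hypothetical sketch wraps a sharp continuous estimator in an ambiguity gate. The class, thresholds, and blending weights are invented for this example; only the Bayes-factor-style test against a sentinel threshold mirrors the public shell shown in Appendix A.

```python
import numpy as np

class GovernedEstimator:
    """Hypothetical hybrid: a sharp continuous estimator runs underneath,
    while a governor vetoes its output when the branch hypothesis set is
    ambiguous and substitutes a corridor-weighted target instead.
    All thresholds and weights here are illustrative."""

    def __init__(self, estimator, sentinel_theta=1.45):
        self.estimator = estimator          # e.g. a Kalman-style tracker
        self.sentinel_theta = sentinel_theta

    def step(self, z, gamma, corridor_target):
        """gamma: dict edge -> posterior prob; corridor_target: fallback point."""
        sharp = self.estimator.step(z)
        probs = sorted(gamma.values(), reverse=True)
        bf = float("inf") if len(probs) < 2 else probs[0] / max(probs[1], 1e-9)
        if bf >= self.sentinel_theta:
            return np.asarray(sharp, dtype=float)   # unambiguous: trust the sharp estimate
        # Ambiguous: blend toward the corridor target instead of committing.
        return 0.5 * np.asarray(sharp, dtype=float) + 0.5 * np.asarray(corridor_target, dtype=float)
```

The design point is that the sharp estimator is never discarded; it is simply overruled when the branch posterior is too flat to justify commitment.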
Chapter 21. Disclosure Boundary
This paper does not require disclosure of the sovereign engine internals. It is sufficient to publish:
- mathematical framing,
- benchmark structure,
- scenario logic,
- metrics,
- results,
- reduced demonstration code.
It is not necessary to publish:
- full internal refinement routines,
- proprietary threshold schedules,
- exact governance weighting strategies,
- or production implementation details.
Reduced Public Demonstration Script
The public disclosure does not require release of the full canonical NashMark engine. A reduced benchmark shell is sufficient to demonstrate the architecture, benchmark logic, and comparative result structure without exposing the full sovereign control core.
The public demonstrator should include:
- route-graph setup,
- scenario loading,
- Kalman baseline,
- HMM baseline,
- reduced NashMark demonstration model,
- metric calculation,
- JSON result output.
It should exclude:
- full equilibrium refinement internals,
- proprietary threshold scheduling,
- production governance weighting,
- internal optimisation routines,
- protected restoration logic.
A reduced public shell may therefore be represented as follows.
```python
from road_graph import RoadGraph
from baseline_kalman import BaselineKalman2D
from baseline_hmm_mapmatch import BaselineHMMMapMatch
from nashmark_demo import NashMarkDemo
from scenarios import make_observations, SCENARIO_ROUTES


def public_demo_run(scenario="ambiguous_junction", seed=104):
    graph = RoadGraph()
    route_edges = SCENARIO_ROUTES[scenario]
    kalman = BaselineKalman2D(dt=1.0, process_var=0.4, meas_var=5.0)
    hmm = BaselineHMMMapMatch(graph, sigma=7.0)
    nash = NashMarkDemo(graph, sigma=7.0)
    # Scenario generation, model stepping, and metric reporting
    # are handled in the reduced benchmark package.
```

This reduced demonstration shell is sufficient for public benchmarking and reproducibility of the comparative framework, while the full NashMark navigation engine and protected control internals remain withheld.
Chapter 22. Limitations
The benchmark remains synthetic. It is not yet:
- a certified deployment system,
- a hardware-integrated flight stack,
- or a complete commercial navigation product.
The current label-light setup also understates the likely branch-state performance of a label-rooted NashMark implementation. Further work is therefore expected to improve:
- rooted label integration,
- urban-canyon peak spike control,
- transition timing,
- and deployment efficiency.
Chapter 23. Conclusion
NashMark AI is now established as more than a conceptual framework. It is a benchmarked navigation and recovery architecture with strongest current evidence in ambiguity-sensitive continuity and drift-recovery tasks. It performs strongly even in relatively label-light conditions, which supports the view that its equilibrium-governed traversal logic has genuine independent value.
The results do not support a blanket claim of superiority across all navigation problems. They support a more exact and more durable claim: NashMark AI is a viable equilibrium-governed navigation model whose strongest current benchmarked value lies in degraded, ambiguous, and continuity-sensitive environments, and whose branch-state performance is expected to strengthen further when rooted label authority is internal to the model state.
Appendix A — Benchmark Files
benchmark_runner.py
```python
from pathlib import Path
from typing import Dict

import json
import numpy as np

from road_graph import RoadGraph
from baseline_kalman import BaselineKalman2D
from baseline_hmm_mapmatch import BaselineHMMMapMatch
from nashmark_nav import NashMarkNav
from scenarios import make_observations, SCENARIO_ROUTES, SCENARIO_AMBIGUOUS_EDGES
from metrics import (
    mean_position_error,
    max_position_error,
    path_accuracy,
    false_branch_commit_rate,
    continuity_score,
    recovery_time,
    corridor_width,
    restoration_efficiency,
)
from plot_results import plot_trajectories, plot_error_series


def run_benchmark(scenario: str, out_dir: str, seed: int = 42) -> Dict[str, Dict[str, float]]:
    rng = np.random.default_rng(seed)
    graph = RoadGraph()
    route_edges = SCENARIO_ROUTES[scenario]
    ambiguous_edges = SCENARIO_AMBIGUOUS_EDGES[scenario]

    truth_pos, truth_edges = graph.sample_route(route_edges, speed=1.2, dt=1.0)
    observations, degrade_start = make_observations(truth_pos, scenario, rng)

    kalman = BaselineKalman2D(dt=1.0, process_var=0.4, meas_var=5.0)
    hmm = BaselineHMMMapMatch(graph, sigma=7.0)
    nash = NashMarkNav(
        graph,
        sigma=7.0,
        corridor_tau=0.12,
        sentinel_theta=1.45,
        random_seed=seed,
    )

    pred_k, pred_h, pred_n = [], [], []
    edge_h, edge_n = [], []
    gamma_history, drift_history = [], []

    for z in observations:
        pk = kalman.step(None if z is None else np.asarray(z, dtype=float))
        pred_k.append(pk)
        eh, ph = hmm.step(None if z is None else np.asarray(z, dtype=float))
        pred_h.append(ph)
        edge_h.append(eh)
        en, pn, gamma, drift = nash.step(None if z is None else np.asarray(z, dtype=float))
        pred_n.append(pn)
        edge_n.append(en)
        gamma_history.append(gamma)
        drift_history.append(drift)

    pred_k = np.asarray(pred_k, dtype=float)
    pred_h = np.asarray(pred_h, dtype=float)
    pred_n = np.asarray(pred_n, dtype=float)

    # Pad edge histories so they align with the truth edge sequence.
    while len(edge_h) < len(truth_edges):
        edge_h.append(edge_h[-1] if edge_h else "AB")
    while len(edge_n) < len(truth_edges):
        edge_n.append(edge_n[-1] if edge_n else "AB")

    results = {
        "kalman": {
            "mean_position_error": mean_position_error(truth_pos, pred_k),
            "max_position_error": max_position_error(truth_pos, pred_k),
            "continuity_score": continuity_score(truth_pos, pred_k),
            "recovery_time": float(recovery_time(truth_pos, pred_k, degrade_start)),
        },
        "hmm": {
            "mean_position_error": mean_position_error(truth_pos, pred_h),
            "max_position_error": max_position_error(truth_pos, pred_h),
            "path_accuracy": path_accuracy(truth_edges, edge_h),
            "false_branch_commit_rate": false_branch_commit_rate(truth_edges, edge_h, ambiguous_edges),
            "continuity_score": continuity_score(truth_pos, pred_h),
            "recovery_time": float(recovery_time(truth_pos, pred_h, degrade_start)),
        },
        "nashmark": {
            "mean_position_error": mean_position_error(truth_pos, pred_n),
            "max_position_error": max_position_error(truth_pos, pred_n),
            "path_accuracy": path_accuracy(truth_edges, edge_n),
            "false_branch_commit_rate": false_branch_commit_rate(truth_edges, edge_n, ambiguous_edges),
            "continuity_score": continuity_score(truth_pos, pred_n),
            "recovery_time": float(recovery_time(truth_pos, pred_n, degrade_start)),
            "corridor_width": corridor_width(gamma_history, tau=0.20),
            "restoration_efficiency": restoration_efficiency(drift_history),
        },
    }

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    plot_trajectories(
        truth_pos,
        pred_k,
        pred_h,
        pred_n,
        title=f"{scenario.replace('_', ' ').title()}",
        out_path=str(out_path / "trajectories.png"),
    )
    plot_error_series(
        truth_pos,
        pred_k,
        pred_h,
        pred_n,
        title=f"{scenario.replace('_', ' ').title()} Errors",
        out_path=str(out_path / "errors.png"),
    )

    with open(out_path / "results.json", "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2)
    print(json.dumps(results, indent=2))
    return results
```

metrics.py
```python
from typing import Dict, List

import numpy as np


def mean_position_error(truth: np.ndarray, pred: np.ndarray) -> float:
    return float(np.mean(np.linalg.norm(truth - pred, axis=1)))


def max_position_error(truth: np.ndarray, pred: np.ndarray) -> float:
    return float(np.max(np.linalg.norm(truth - pred, axis=1)))


def path_accuracy(true_edges: List[str], pred_edges: List[str]) -> float:
    n = min(len(true_edges), len(pred_edges))
    if n == 0:
        return 0.0
    correct = sum(1 for i in range(n) if true_edges[i] == pred_edges[i])
    return float(correct / n)


def false_branch_commit_rate(true_edges: List[str], pred_edges: List[str], ambiguous_edges: List[str]) -> float:
    idxs = [i for i, e in enumerate(true_edges) if e in ambiguous_edges]
    if not idxs:
        return 0.0
    bad = sum(1 for i in idxs if pred_edges[i] != true_edges[i])
    return float(bad / len(idxs))


def continuity_score(truth: np.ndarray, pred: np.ndarray, threshold: float = 8.0) -> float:
    errs = np.linalg.norm(truth - pred, axis=1)
    return float(np.mean(errs <= threshold))


def recovery_time(
    truth: np.ndarray,
    pred: np.ndarray,
    degrade_start: int,
    recovery_threshold: float = 5.0
) -> int:
    errs = np.linalg.norm(truth - pred, axis=1)
    for i in range(degrade_start, len(errs)):
        if errs[i] <= recovery_threshold:
            return i - degrade_start
    return len(errs) - degrade_start


def corridor_width(gamma_history: List[Dict[str, float]], tau: float = 0.20) -> float:
    widths = []
    for gamma in gamma_history:
        widths.append(sum(1 for _, p in gamma.items() if p >= tau))
    return float(np.mean(widths)) if widths else 0.0


def restoration_efficiency(drift_history: List[float]) -> float:
    if not drift_history:
        return 0.0
    peak = max(drift_history)
    end = drift_history[-1]
    if peak <= 1e-9:
        return 1.0
    return float((peak - end) / peak)
```

scenarios.py
```python
from typing import Dict, List, Tuple

import numpy as np


def make_observations(
    truth: np.ndarray,
    scenario: str,
    rng: np.random.Generator
) -> Tuple[List[np.ndarray | None], int]:
    obs: List[np.ndarray | None] = []
    degrade_start = 0
    for t, p in enumerate(truth):
        z = p.copy()
        if scenario == "clean_route":
            z += rng.normal(0, 1.2, size=2)
        elif scenario == "urban_canyon":
            if 18 <= t <= 48:
                if t == 18:
                    degrade_start = t
                z += np.array([4.5, 2.0]) + rng.normal(0, 2.0, size=2)
            else:
                z += rng.normal(0, 1.8, size=2)
        elif scenario == "gnss_dropout_tunnel":
            if 24 <= t <= 42:
                if t == 24:
                    degrade_start = t
                obs.append(None)
                continue
            z += rng.normal(0, 1.5, size=2)
        elif scenario == "ambiguous_junction":
            if 30 <= t <= 45:
                if t == 30:
                    degrade_start = t
                z += np.array([0.0, 3.0]) + rng.normal(0, 2.6, size=2)
            else:
                z += rng.normal(0, 1.5, size=2)
        elif scenario == "drift_recovery":
            if 20 <= t <= 36:
                if t == 20:
                    degrade_start = t
                z += np.array([7.0, -5.0]) + rng.normal(0, 1.8, size=2)
            else:
                z += rng.normal(0, 1.5, size=2)
        else:
            z += rng.normal(0, 1.5, size=2)
        obs.append(z)
    return obs, degrade_start


SCENARIO_ROUTES: Dict[str, List[str]] = {
    "clean_route": ["AB", "BC", "CD", "DE"],
    "urban_canyon": ["AB", "BC", "CD", "DE"],
    "gnss_dropout_tunnel": ["AB", "BC", "CD", "DE"],
    "ambiguous_junction": ["AB", "BC", "CF", "FG"],
    "drift_recovery": ["AB", "BC", "CD", "DE"],
}

SCENARIO_AMBIGUOUS_EDGES: Dict[str, List[str]] = {
    "clean_route": ["CF", "FG"],
    "urban_canyon": ["CF", "FG"],
    "gnss_dropout_tunnel": ["CF", "FG"],
    "ambiguous_junction": ["CF", "FG", "CD", "DE"],
    "drift_recovery": ["CF", "FG"],
}
```

nashmark_demo.py
```python
import math
from typing import Dict, List, Tuple

import numpy as np

from road_graph import RoadGraph, point_to_segment_distance


class NashMarkDemo:
    """
    Public-shell NashMark navigation demonstrator.

    This is deliberately a reduced benchmark-facing version:
    - latent path inference
    - corridor retention
    - ambiguity-aware commitment
    - safe-envelope style gating
    - restoration dynamics

    It is not the full proprietary NashMark core.
    """

    def __init__(
        self,
        graph: RoadGraph,
        sigma: float = 7.0,
        corridor_tau: float = 0.05,
        sentinel_theta: float = 1.55,
        restoration_gain: float = 0.62,
    ):
        self.graph = graph
        self.sigma = sigma
        self.corridor_tau = corridor_tau
        self.sentinel_theta = sentinel_theta
        self.restoration_gain = restoration_gain
        self.delta_prev: Dict[str, float] | None = None
        self.gamma_prev: Dict[str, float] | None = None
        self.current_edge: str = "AB"
        self.current_pos: np.ndarray = self.graph.edge_midpoint("AB")
        self.corridor: List[str] = ["AB"]
        self.history_edges: List[str] = []
        self.t = 0
        self.coop = 1.0
        self.defect = 1.0
        self.mss = 0.5
        self.gov_stability = 0.5
        self.systemic_risk = 0.3
        self.safe_envelope = True
        self.drift_load = 0.0
        self.R = 0.0
        self.Dh = 1.0

    # ------------------------------------------------------------------
    # Core probabilistic helpers
    # ------------------------------------------------------------------
    def emission_logprob(self, z: np.ndarray, edge_id: str) -> float:
        edge = self.graph.edges[edge_id]
        d = point_to_segment_distance(
            z,
            np.array(edge.points[0], dtype=float),
            np.array(edge.points[1], dtype=float),
        )
        return -0.5 * (d / self.sigma) ** 2

    def transition_logprob(self, prev_edge: str, curr_edge: str) -> float:
        if prev_edge == curr_edge:
            return math.log(0.52)
        if curr_edge in self.graph.adjacency.get(prev_edge, []):
            return math.log(0.36)
        return math.log(0.12)

    def normalize_log_probs(self, log_probs: Dict[str, float]) -> Dict[str, float]:
        vals = np.array(list(log_probs.values()), dtype=float)
        m = float(np.max(vals))
        exps = np.exp(vals - m)
        denom = float(np.sum(exps))
        keys = list(log_probs.keys())
        return {k: float(exps[i] / denom) for i, k in enumerate(keys)}

    def project_point_to_edge(self, point: np.ndarray, edge_id: str) -> np.ndarray:
        edge = self.graph.edges[edge_id]
        a = np.array(edge.points[0], dtype=float)
        b = np.array(edge.points[1], dtype=float)
        ab = b - a
        denom = float(np.dot(ab, ab))
        if denom <= 1e-12:
            return a.copy()
        t = float(np.dot(point - a, ab) / denom)
        t = max(0.0, min(1.0, t))
        return a + t * ab

    def corridor_target(self, point: np.ndarray, corridor: List[str], gamma: Dict[str, float]) -> np.ndarray:
        pts = []
        ws = []
        for edge_id in corridor:
            proj = self.project_point_to_edge(point, edge_id)
            pts.append(proj)
            ws.append(max(gamma.get(edge_id, 0.0), 1e-6))
        pts_arr = np.stack(pts, axis=0)
        ws_arr = np.array(ws, dtype=float)
        ws_arr = ws_arr / np.sum(ws_arr)
        return np.sum(pts_arr * ws_arr[:, None], axis=0)

    # ------------------------------------------------------------------
    # Reduced public-shell dynamics
    # ------------------------------------------------------------------
    def update_public_dynamics(self, dominant_in_corridor: bool, pressure: float, corridor_size: int) -> None:
        self.t += 1
        if dominant_in_corridor:
            self.coop += 1.0
        else:
            self.defect += 1.0
        self.mss = self.coop / (self.coop + self.defect)
        # public-shell recovery / degradation curves
        self.R = float(np.clip(1.0 - math.exp(-self.t / 200.0) - 0.15 * pressure, 0.0, 1.0))
        self.Dh = float(np.clip(math.exp(-self.t / 350.0) + 0.25 * pressure, 0.0, 1.0))
        # governance / risk proxies
        corridor_penalty = min(1.0, max(0.0, (corridor_size - 1) / 3.0))
        self.gov_stability = float(np.clip(0.55 * self.mss + 0.30 * (1.0 - pressure) - 0.15 * corridor_penalty, 0.0, 1.0))
        self.systemic_risk = float(np.clip(0.65 * pressure + 0.35 * corridor_penalty, 0.0, 1.0))
        self.safe_envelope = (self.mss >= 0.70) and (self.systemic_risk <= 0.35)

    # ------------------------------------------------------------------
    # Step
    # ------------------------------------------------------------------
    def step(self, z: np.ndarray | None) -> Tuple[str, np.ndarray, Dict[str, float], float]:
        if z is None:
            gamma = self.gamma_prev or {self.current_edge: 1.0}
            corridor = self.corridor or [self.current_edge]
            target = self.corridor_target(self.current_pos, corridor, gamma)
            pressure = min(1.0, 0.25 + self.drift_load / 12.0)
            self.update_public_dynamics(self.current_edge in corridor, pressure, len(corridor))
            gain = float(np.clip(self.restoration_gain + 0.20 * self.R - 0.12 * self.Dh + 0.08 * self.gov_stability, 0.10, 0.90))
            self.current_pos = self.current_pos + gain * (target - self.current_pos)
            self.drift_load = max(
                0.0,
                0.80 * self.drift_load + self.Dh + pressure + self.systemic_risk - self.R - self.mss - self.gov_stability
            )
            self.history_edges.append(self.current_edge)
            return self.current_edge, self.current_pos.copy(), gamma, self.drift_load

        z = np.asarray(z, dtype=float)
        candidates = self.graph.candidate_edges(z)
        if self.delta_prev is None:
            delta = {e: self.emission_logprob(z, e) for e in candidates}
        else:
            delta = {}
            for curr in candidates:
                best = -1e18
                for prev, prev_score in self.delta_prev.items():
                    score = prev_score + self.transition_logprob(prev, curr) + self.emission_logprob(z, curr)
                    if score > best:
                        best = score
                delta[curr] = best
        gamma = self.normalize_log_probs(delta)
        corridor = [e for e, p in gamma.items() if p >= self.corridor_tau]
        if not corridor:
            corridor = [max(gamma.items(), key=lambda kv: kv[1])[0]]
        dominant_edge = max(gamma.items(), key=lambda kv: kv[1])[0]
        sorted_probs = sorted(gamma.values(), reverse=True)
        bf = float("inf") if len(sorted_probs) == 1 else float(sorted_probs[0] / max(sorted_probs[1], 1e-9))
        is_ambiguous = (bf < self.sentinel_theta) or (len(corridor) > 1)
        target = self.corridor_target(z, corridor, gamma)
        obs_drift = float(np.linalg.norm(z - target))
        pressure = obs_drift / (obs_drift + 5.0)
        self.update_public_dynamics(dominant_edge in corridor, pressure, len(corridor))
        self.drift_load = max(
            0.0,
            0.68 * self.drift_load + self.Dh + pressure + self.systemic_risk - self.R - self.mss - 0.85 * self.gov_stability
        )
        if (
            (not is_ambiguous)
            and bf >= self.sentinel_theta
            and self.gov_stability >= 0.45
            and self.safe_envelope
        ):
            self.current_edge = dominant_edge
        elif self.current_edge not in corridor:
            self.current_edge = dominant_edge
        if is_ambiguous:
            projected_current = target.copy()
            projected_target = target.copy()
            ambiguity_factor = 1.0
        else:
            projected_current = self.project_point_to_edge(self.current_pos, self.current_edge)
            projected_target = self.project_point_to_edge(target, self.current_edge)
            ambiguity_factor = 0.0
        obs_gain = float(np.clip(0.18 - 0.08 * ambiguity_factor, 0.06, 0.24))
        target_gain = float(np.clip(0.22 + 0.42 * ambiguity_factor + 0.20 * self.gov_stability, 0.20, 0.92))
        recovery_gain = float(np.clip(0.08 + 0.12 * self.R + 0.08 * self.gov_stability - 0.05 * self.Dh, 0.04, 0.35))
        self.current_pos = (
            self.current_pos
            + obs_gain * (z - self.current_pos)
            + target_gain * (projected_target - self.current_pos)
            + recovery_gain * (projected_current - self.current_pos)
        )
        if self.safe_envelope:
            self.current_pos = self.current_pos + 0.12 * (projected_current - self.current_pos)
        self.delta_prev = delta
        self.gamma_prev = gamma
        self.corridor = corridor
        self.history_edges.append(self.current_edge)
        return self.current_edge, self.current_pos.copy(), gamma, self.drift_load
```

Appendix B — Frozen Result Tables
Clean Route
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 1.0360 | 2.6192 | - | - | 1.0000 | 0.0 | - | - |
| HMM | 6.2692 | 16.2500 | 0.7692 | 0.0000 | 0.6615 | 4.0 | - | - |
| NashMark | 1.5291 | 6.0258 | 0.6000 | 0.0000 | 1.0000 | 0.0 | 1.8308 | 1.0000 |
Urban Canyon
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 2.9328 | 6.5278 | - | - | 1.0000 | 0.0 | - | - |
| HMM | 5.5467 | 13.7500 | 0.7846 | 0.0000 | 0.7538 | 2.0 | - | - |
| NashMark | 3.0118 | 10.2272 | 0.5846 | 0.0000 | 0.9385 | 0.0 | 1.8308 | 1.0000 |
GNSS Dropout / Tunnel
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 1.5980 | 5.2245 | - | - | 1.0000 | 0.0 | - | - |
| HMM | 7.8462 | 20.0000 | 0.6923 | 0.0000 | 0.6154 | 0.0 | - | - |
| NashMark | 2.5600 | 8.0000 | 0.6769 | 0.0000 | 0.9846 | 0.0 | 1.7692 | 1.0000 |
Ambiguous Junction
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 3.5773 | 11.7269 | - | - | 0.7846 | 0.0 | - | - |
| HMM | 5.7308 | 15.0000 | 0.8462 | 0.0000 | 0.7231 | 0.0 | - | - |
| NashMark | 3.0756 | 12.8278 | 0.8000 | 0.0000 | 0.8769 | 9.0 | 1.7385 | 0.5286 |
Drift Recovery
| Model | Mean Error | Max Error | Path Accuracy | False Branch Rate | Continuity | Recovery Time | Corridor Width | Restoration Efficiency |
|---|---|---|---|---|---|---|---|---|
| Kalman | 1.9703 | 6.5977 | - | - | 1.0000 | 0.0 | - | - |
| HMM | 5.8846 | 15.0000 | 0.8000 | 0.2000 | 0.7015 | 0.0 | - | - |
| NashMark | 1.8383 | 6.6166 | 0.7429 | 0.3429 | 1.0000 | 0.0 | 1.7846 | 1.0000 |