Snake API is the production endpoint behind snake.aws.monce.ai. It serves
deterministic, auditable, sub-millisecond article matching across a growing multi-tenant
fleet of production factories in the European glass industry. The system replaces what an LLM call would otherwise do
— mapping unstructured catalog text to a canonical article reference — with
a constructive SAT classifier (Snake) layered with a fuzzy fallback and an arbitrator that
merges both signals into a single probability-summing ranking. No GPU, no model API key,
no stochastic sampling. Same query, same answer, every time, with a full reasoning trail.
This document explains why this EC2 exists, what it does, and how the pieces fit together.
A factory's order processing pipeline receives free-form text describing glass articles: OCR'd PDFs, email body extracts, hand-typed quote requests, multilingual specifications. For every line item the system must answer:
Where num_article is the canonical reference in the factory's catalog. Each
factory has its own catalog (thousands of distinct articles) and its own client list
(hundreds to thousands of distinct buyers). The mapping must be:
/fix wizard) flow back
into snake_learning and trigger targeted rebuilds.

An LLM would handle this at 200–2000 ms per call and $0.001–$0.01 per call, non-deterministically, with no audit trail, and with monthly model deprecations breaking reproducibility. Not viable.
Snake is a SAT-ensembled bucketed multiclass classifier built on the Dana Theorem (Charles Dana, 2024): any indicator function over a finite discrete domain can be encoded as a SAT instance in polynomial time. Decision-tree bucketing reduces this to linear time per training run. Snake constructs a CNF formula directly from data — no backtracking, no exponential search, no NP-hardness penalty.
The algorithm:
oppose() literal that
distinguishes f from at least one member t ∈ C. Take the
disjunction over uncovered subsets to build a clause that excludes f's without
excluding members.

Construction cost per layer: O(L · n · b · m), where n = samples, b = bucket size (~125), and m = features. Linear in the data. Inference: O(L · clauses_in_bucket) per query, sub-millisecond in practice.
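The constructive idea can be illustrated with a toy, single-class, single-layer sketch. This is not Snake's actual code: the only literal type here is substring presence, the greedy covering order is an assumption, and the helper names `holds`, `build_clause`, `build_cnf` and `accepts` are hypothetical (only the `oppose` name comes from the text above). The point it shows is that each clause is built directly from data, with no search or backtracking.

```python
from typing import List, Tuple

# A literal is (substring, must_be_present). Snake's real test types are far
# richer; substring presence is an assumption made to keep the sketch short.
Literal = Tuple[str, bool]
Clause = List[Literal]


def holds(lit: Literal, s: str) -> bool:
    sub, present = lit
    return (sub in s) == present


def oppose(f: str, t: str) -> Literal:
    """Return a literal that is true on member t and false on non-member f."""
    for n in range(1, max(len(f), len(t)) + 1):
        for i in range(len(t) - n + 1):
            if t[i:i + n] not in f:
                return (t[i:i + n], True)   # t has it, f does not
        for i in range(len(f) - n + 1):
            if f[i:i + n] not in t:
                return (f[i:i + n], False)  # f has it, t does not
    raise ValueError("identical strings cannot be separated")


def accepts(cnf: List[Clause], s: str) -> bool:
    """A string belongs to the class if it satisfies every clause."""
    return all(any(holds(lit, s) for lit in clause) for clause in cnf)


def build_clause(f: str, members: List[str]) -> Clause:
    """Disjunction that every member satisfies and f violates."""
    clause: Clause = []
    uncovered = list(members)
    while uncovered:
        lit = oppose(f, uncovered[0])
        clause.append(lit)
        # Keep only the members this literal does not cover yet.
        uncovered = [t for t in uncovered if not holds(lit, t)]
    return clause


def build_cnf(members: List[str], non_members: List[str]) -> List[Clause]:
    cnf: List[Clause] = []
    for f in non_members:
        if accepts(cnf, f):  # f is not excluded yet, so add a clause for it
            cnf.append(build_clause(f, members))
    return cnf


members = ["44.2 silence", "44.2 sil", "442 silence"]
non_members = ["planitherm 4", "argon 16", "44.2 clair"]
cnf = build_cnf(members, non_members)
print(all(accepts(cnf, t) for t in members))      # members all pass
print(any(accepts(cnf, f) for f in non_members))  # non-members all fail
```

Every literal in a clause is false on the non-member it was built for, and every member satisfies at least one literal, so the resulting formula separates the two sets by construction.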
Snake v5.4.6 ships with 30 boolean test types (substring, structural, splits, charclass,
distance, positional, crypto, numeric, exact, affix, vowel ratio) and 7 oppose profiles
that weight them differently per data type. This deployment pins the
monce profile — CHR (charclass) + JAC (Jaccard distance) heavy, with
substring, startswith/endswith, structural and exact contributions, and no Levenshtein
— tuned for short multilingual glass-catalog strings at bucket=125.
A single Snake call gives us a probability distribution. The endpoint exposes that as a three-tier matching cascade:
| Tier | Method | Threshold | Latency | Purpose |
|---|---|---|---|---|
| 1 | Snake SAT vote | ≥ 80% (configurable) | ~3 ms | Production matches. 100% ≡ exact synonym. |
| 2 | Fuzzy (Levenshtein + bigram) | ≥ 50% | ~1 ms | OCR drift, missing chars, separator variants. |
| 3 | LLM (Claude Haiku) | ≥ 70% | ~400 ms | Semantic last-resort, optional. |
v5.4.6 dropped Snake's built-in synonym_hash as a separate path. The reason:
when a query is a known synonym, SAT returns confidence = 1.0 by construction
(every layer votes for the right class). 100% confidence is the exact-match case.
No separate hash lookup needed. Cleaner cascade, fewer code paths, same behavior.
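A minimal sketch of how such a cascade could be wired, assuming each tier is a callable that returns a mapping from article reference to confidence and that tiers are tried strictly in order. The function names, the result dictionary, and the sequential fall-through are illustrative assumptions; only the thresholds and the "100% means exact synonym" rule come from the table and text above.

```python
from typing import Callable, Dict, Optional, Tuple

Scores = Dict[str, float]  # num_article -> confidence in [0, 1]
Matcher = Callable[[str], Scores]


def best(scores: Scores) -> Tuple[Optional[str], float]:
    """Top-scoring article, or (None, 0.0) for an empty result."""
    if not scores:
        return None, 0.0
    article = max(scores, key=scores.get)
    return article, scores[article]


def cascade(
    query: str,
    snake: Matcher,
    fuzzy: Matcher,
    llm: Optional[Matcher] = None,  # tier 3 is optional
    snake_min: float = 0.80,
    fuzzy_min: float = 0.50,
    llm_min: float = 0.70,
) -> Dict[str, object]:
    tiers = [("snake", snake, snake_min), ("fuzzy", fuzzy, fuzzy_min)]
    if llm is not None:
        tiers.append(("llm", llm, llm_min))
    for tier_no, (name, matcher, threshold) in enumerate(tiers, start=1):
        article, confidence = best(matcher(query))
        if article is not None and confidence >= threshold:
            return {
                "tier": tier_no,
                "method": name,
                "num_article": article,
                "confidence": confidence,
                # A Snake confidence of 1.0 is treated as the exact-synonym case.
                "exact": name == "snake" and confidence == 1.0,
            }
    return {"tier": None, "method": None, "num_article": None,
            "confidence": 0.0, "exact": False}


# Stub matchers stand in for the real tiers.
result = cascade("44.2 silence", lambda q: {"A1": 0.4}, lambda q: {"B2": 0.6})
print(result["tier"], result["num_article"])
```

With these stubs the Snake tier falls below its 80% threshold, so the query falls through to the fuzzy tier.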
Snake gives a probability distribution that sums to 1.0. Fuzzy gives a similarity ranking, normalized to sum to 1.0 within the returned pool. Both signals are useful but have different failure modes: Snake misses OCR drift, fuzzy misses semantic intent.
The arbitrator merges them. For a query x with candidates from both sources, define:
where Z is a normalizing constant that restores Σ P_arb = 1.0. Articles voted for by both sources accumulate two contributions and naturally outrank single-source winners. This is the “Both wins” effect visible on /ui: a green-outlined chip, backed by both sources, outranks a candidate with higher confidence from a single source.
Full derivation and the proof that Σ = 1.0 holds independent of the relative magnitudes of the two pools: see /math.
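As a hedged illustration of the merge, the sketch below assumes the simplest reading of "probability-summing": each article's arbitrated score is its Snake probability plus its fuzzy probability, divided by Z, the total of all summed scores. Any weighting between the two pools is unknown here and omitted, and the function name `arbitrate` is hypothetical; see /math for the actual definition.

```python
from typing import Dict


def arbitrate(p_snake: Dict[str, float], p_fuzzy: Dict[str, float]) -> Dict[str, float]:
    """Sum per-article contributions from both pools, then renormalize."""
    combined: Dict[str, float] = {}
    for pool in (p_snake, p_fuzzy):
        for article, p in pool.items():
            combined[article] = combined.get(article, 0.0) + p
    z = sum(combined.values())  # normalizing constant
    if z == 0:
        return {}
    ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
    return {article: p / z for article, p in ranked}


p_snake = {"A": 0.6, "B": 0.4}  # sums to 1.0
p_fuzzy = {"B": 0.5, "C": 0.5}  # sums to 1.0
p_arb = arbitrate(p_snake, p_fuzzy)
print(next(iter(p_arb)))  # top-ranked article
```

In this example article A has the highest single-source confidence (0.6), yet B ranks first because it collects a contribution from both pools.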
An article belongs to one of six field buckets (verre, intercalaire, remplissage,
faconnage, croisillons, misc). We could route by keyword classifier on the OCR'd text.
We don't. Routing is driven directly by monce_db.articles.type_article_monce
— a curated French taxonomy maintained by the data team. The mapping is an
identity map: classify_article_type lowercases
type_article_monce and, if the result is one of the six buckets, returns it
unchanged; everything else (Service, Forme, Article pièce, Perfil, PVB,
Résine, …) falls into misc. No keyword heuristics, no
num-prefix rules — monce_db is the single source of truth.
| type_article_monce | Field bucket |
|---|---|
| Verre | verre |
| Intercalaire | intercalaire |
| Remplissage | remplissage |
| Façonnage | faconnage |
| Croisillons | croisillons |
| Misc, any other type, or empty | misc |
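A sketch of what this routing could look like. The function name `classify_article_type` and the six bucket names come from the text; the body is a guess at the behavior, not the deployed code. In particular, the accent folding is an assumption added so that "Façonnage" lands in `faconnage`, since lowercasing alone would yield "façonnage".

```python
import unicodedata
from typing import Optional

FIELD_BUCKETS = {
    "verre", "intercalaire", "remplissage",
    "faconnage", "croisillons", "misc",
}


def classify_article_type(type_article_monce: Optional[str]) -> str:
    """Map a curated type label to one of the six field buckets."""
    if not type_article_monce:
        return "misc"  # empty or missing type
    lowered = type_article_monce.strip().lower()
    # Assumption: strip combining marks so "façonnage" becomes "faconnage".
    decomposed = unicodedata.normalize("NFKD", lowered)
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return folded if folded in FIELD_BUCKETS else "misc"


for label in ("Verre", "Façonnage", "Article pièce", "PVB", ""):
    print(repr(label), "->", classify_article_type(label))
```

Because the function only normalizes and checks set membership, adding a bucket means changing one set, and no catalog text is ever inspected.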
Why this matters for non-French-language factories: their catalogs are in the local
language but type_article_monce is curated in a single canonical taxonomy. The
data team owns the mapping; Snake follows. Zero contamination of field models from
cross-language confusion. Verified clean across every tenant at deploy time.
A real order line is not a clean query — it's "2x 44.2 Silence/16 argon/4
planitherm 1200x800". Before Snake can match an article, that string must be split
into structured slots: quantity, dimensions, and each glass/intercalaire/gas component.
This parsing layer evolved through three generations:
The merge rule is deliberately asymmetric and reflects what each parser is good at:
quantity and dimensions always come from /random (structural regex extraction is
unbeatable on 1200x800), while glass components default to /random but fall
back to /understandable when structural confidence is low. Every selection is logged in
the response's XAI trail — you can see which parser won each slot and why. This is what
lifted field accuracy to 85.9% and the Glass AGI benchmark to 97.6%, up from
/understandable's standalone 92.9%.
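The asymmetric merge rule can be sketched as follows. The dictionary shapes, the `structural_confidence` key, the 0.5 cut-off, and the function name `merge_parses` are all hypothetical; only the rule itself (quantity and dimensions from /random, components from /random unless structural confidence is low, every choice logged) comes from the text.

```python
from typing import Any, Dict, Tuple


def merge_parses(
    random_parse: Dict[str, Any],
    understandable_parse: Dict[str, Any],
    min_confidence: float = 0.5,  # assumed threshold, not from the source
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Return (merged slots, trail explaining which parser won each slot)."""
    merged: Dict[str, Any] = {}
    trail: Dict[str, str] = {}

    # Quantity and dimensions always come from the structural parser.
    for slot in ("quantity", "dimensions"):
        merged[slot] = random_parse.get(slot)
        trail[slot] = "/random (structural extraction always wins this slot)"

    # Components default to /random and fall back when confidence is low.
    confidence = random_parse.get("structural_confidence", 0.0)
    if confidence >= min_confidence:
        merged["components"] = random_parse.get("components", [])
        trail["components"] = f"/random (structural confidence {confidence:.2f})"
    else:
        merged["components"] = understandable_parse.get("components", [])
        trail["components"] = (
            f"/understandable (structural confidence {confidence:.2f} "
            f"below {min_confidence:.2f})"
        )
    return merged, trail


# Hypothetical parser outputs for "2x 44.2 Silence/16 argon/4 planitherm 1200x800".
random_out = {"quantity": 2, "dimensions": (1200, 800),
              "components": ["44.2 Silence/16 argon/4 planitherm"],
              "structural_confidence": 0.3}
understandable_out = {"quantity": 2, "dimensions": (1200, 800),
                      "components": ["44.2 Silence", "16 argon", "4 planitherm"]}
merged, trail = merge_parses(random_out, understandable_out)
print(merged["components"])
print(trail["components"])
```

Returning the trail alongside the merged slots keeps the per-slot decision auditable, in the spirit of the XAI trail described above.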
The science is in /math. The reason this endpoint earns its keep day-to-day is operational:
/fix
wizard writes the correction back into snake_learning and triggers a targeted
rebuild. Today's correction is tomorrow's SAT clause. Hasna, the first operator on
this system besides its author, drove much of that loop's design.

The numbers that characterize the system are the ones that don't move with the customer roster: latency, throughput, and accuracy. They hold per tenant regardless of how many tenants are live.
| Metric | Value |
|---|---|
| Deployment | Multi-tenant, one isolated catalog + client model per factory |
| Catalog scale | Thousands of articles & tens of thousands of synonyms per tenant |
| Per-query latency (P50) | 3.7 ms |
| Throughput (single instance) | ~260 q/s sustained |
| Glass AGI benchmark | 97.6% |
| Field accuracy (/comprendre) | 85.9% |
| Cost per matched line | ~$3.7 × 10⁻⁹ (see /economics) |
The live request mix on a representative production day — article matching
(/query, /batch) dominates the real work, with client lookups
(/query_client) the second pillar. Health probes are continuous monitoring, not
user traffic:
Snake API is the deterministic substrate for every glass-industry workflow at Monce. Quote engines call it. OCR pipelines call it. Excel-import flows call it. Hasna's correction wizard writes back to it. The auto-onboarding cron polls it. The observability dashboard graphs it.
If this endpoint goes dark, every downstream tool degrades to LLM fallback at 100x the cost and 100x the latency, with no audit trail. That's the operational case.
The scientific case is in /math: a constructive proof that you don't need an LLM for structured matching. Snake builds the answer; LLMs sample it.
Charles Dana · Monce SAS · snake.aws.monce.ai · deployed 2026-05-20
Co-Authored-By: Claude (Anthropic)