Prowl Gateway is a pass-through proxy. A fraction of real agent traffic flows through it — chosen by the agent's SDK using a salt that Prowl signs and the vendor never sees. Latency, errors, and response shape are measured on actual production calls. Vendors can't game the sample, because the sample isn't theirs to pick.
Every other API benchmark in the agent ecosystem is a snapshot. Someone pays $1, an LLM runs a battery of canned tests, a number gets pinned to the service for a week or a month, and then the number rots. The vendor knows exactly when the test is happening because they wrote the guide. They can prepare for the exam, so the score never reflects what an agent actually feels at 3am on a Tuesday.
Three things are wrong with that model:
The fix is not to run benchmarks more often. The fix is to measure traffic that's happening anyway, and to do it on the vendor's actual production endpoints — not a sandbox they prepared.
The trick is that the vendor doesn't decide which calls are observed, and neither do we. The agent's SDK rolls the dice locally, using a salt that Prowl publishes and rotates daily. Here's the exact computation:
# The agent's SDK does this before every outbound call.
roll = sha256(service_id | agent_id | salt_id | salt)
bucket = first_32_bits(roll) / 2**32   # → float in [0, 1)
if bucket < service.sampling_rate:
    # route through Prowl (observable)
    POST proxy.prowl.world/{slug}/{path}
    X-Prowl-Salt-Id: {salt.id}
    X-Prowl-Sample-Decision: {roll.hex()}
else:
    # call vendor directly (invisible to Prowl)
    request(vendor_url, ...)
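The pseudocode above can be made concrete with the standard library. This is a minimal sketch, not the SDK's actual code: the function names are hypothetical, and it assumes the four fields are UTF-8 strings joined with a literal "|" and that the first 32 bits are read big-endian. The real byte encoding may differ.

```python
import hashlib


def sample_roll(service_id: str, agent_id: str, salt_id: str, salt: str) -> bytes:
    """SHA-256 over the four fields (assumed encoding: UTF-8, '|'-joined)."""
    material = "|".join([service_id, agent_id, salt_id, salt]).encode("utf-8")
    return hashlib.sha256(material).digest()


def bucket_of(roll: bytes) -> float:
    """Map the first 32 bits of the hash (assumed big-endian) to a float in [0, 1)."""
    return int.from_bytes(roll[:4], "big") / 2**32


def should_route_through_prowl(roll: bytes, sampling_rate: float) -> bool:
    """True when the call falls inside the sampled fraction."""
    return bucket_of(roll) < sampling_rate
```

Because the hash is deterministic, the same (service, agent, salt) triple always lands in the same bucket, which is what lets a second party recompute the decision later.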
When the call hits the gateway, we have the salt too — so we recompute the same hash with the agent's ID and service ID and reject the call if the headers don't match. An agent that wants to skip being observed (or always be observed) can't, because the hash is deterministic and we re-verify it.
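The gateway-side check might look roughly like the sketch below. It is illustrative only: `verify_sample_headers` and `known_salts` are hypothetical names, the "|"-joined UTF-8 encoding is an assumption, and the extra check that the bucket falls under the sampling rate is inferred from the statement that an agent cannot force itself to always be observed.

```python
import hashlib
import hmac


def expected_decision_hex(service_id: str, agent_id: str, salt_id: str, salt: str) -> str:
    # Assumed encoding: UTF-8 fields joined with "|".
    material = "|".join([service_id, agent_id, salt_id, salt]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def verify_sample_headers(headers: dict, service_id: str, agent_id: str,
                          known_salts: dict, sampling_rate: float) -> int:
    """Return 200 if the claimed sample decision checks out, else 400."""
    salt_id = headers.get("X-Prowl-Salt-Id", "")
    claimed = headers.get("X-Prowl-Sample-Decision", "")
    salt = known_salts.get(salt_id)
    if salt is None:
        return 400  # unknown or expired salt id
    expected = expected_decision_hex(service_id, agent_id, salt_id, salt)
    # Constant-time comparison of the recomputed hash against the header value.
    if not hmac.compare_digest(expected.encode("ascii"), claimed.lower().encode("utf-8")):
        return 400
    # Assumption: the gateway also confirms the roll is inside the sampled fraction.
    bucket = int(expected[:8], 16) / 2**32
    return 200 if bucket < sampling_rate else 400
```

Comparing bytes with `hmac.compare_digest` avoids leaking how many leading characters matched, though for a public hash this is a defensive habit rather than a requirement.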
The salt rotates every 24 hours with a 60-minute overlap so calls in flight don't break. Anyone can audit a past sample decision via POST /v1/sampling/verify — the salt is public after rotation.
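The overlap window can be expressed as a simple validity interval. The sketch below assumes each salt has an activation timestamp and stays acceptable until 60 minutes after its successor activates; the function name and that exact boundary behaviour are assumptions, not documented Prowl semantics.

```python
from datetime import datetime, timedelta, timezone

ROTATION = timedelta(hours=24)   # a new salt activates every 24 hours
OVERLAP = timedelta(minutes=60)  # the old salt stays valid this long afterwards


def salt_is_acceptable(activated_at: datetime, now: datetime) -> bool:
    """True from activation until 60 minutes past the next rotation (assumed bounds)."""
    return activated_at <= now < activated_at + ROTATION + OVERLAP


# Usage: 30 minutes after a rotation, both the old and the new salt are accepted.
t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
during_overlap = t0 + ROTATION + timedelta(minutes=30)
both_valid = (salt_is_acceptable(t0, during_overlap)
              and salt_is_acceptable(t0 + ROTATION, during_overlap))
```

A call that was signed with yesterday's salt just before rotation therefore still verifies if it arrives within the hour.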
sampled: The main mode. Agent's SDK rolls local dice; only a fraction of calls pass through. Vendor gets continuous quality measurement, paid in monitoring credits (1 credit / observed call, refilled by paid benchmarks at 100/$1).
x402_only: Every call requires an x402 payment proof from the agent at $0.01 each. Prowl takes 10%, vendor gets 90%. Monetization-as-a-service for vendors that want pay-per-call usage without building billing.
vault_only: Reserved for scoped vault tokens. Agents present short-lived credentials that Prowl translates into the vendor's real API key. M3+ territory; route reachable, policy conservative.
full: Every call is forwarded, every call is logged. Useful for early-vendor pilots and for debugging the auth translation. No sampling guarantees and no payment enforcement.
Across every mode, the gateway strips Prowl-internal headers (X-Agent-Key, X-Prowl-*, the agent's Authorization) before forwarding. The vendor sees its own injected credential and the request body. It never sees who the agent is.
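As an illustration of the stripping rule, the sketch below filters an inbound header map and injects the vendor credential. The function and parameter names are hypothetical; `vendor_header_name` and `vendor_header_prefix` mirror the `header_name` and `header_prefix` fields of `proxy_auth_translation`, and case-insensitive matching is an assumption.

```python
def build_outbound_headers(inbound: dict, vendor_header_name: str,
                           vendor_header_prefix: str, vendor_secret: str) -> dict:
    """Drop Prowl-internal and agent-identifying headers, then add the vendor credential."""
    outbound = {}
    for name, value in inbound.items():
        lowered = name.lower()  # header names are case-insensitive
        if lowered in ("x-agent-key", "authorization") or lowered.startswith("x-prowl-"):
            continue  # never forwarded to the vendor
        outbound[name] = value
    # The vendor sees only its own stored credential (illustrative injection).
    outbound[vendor_header_name] = vendor_header_prefix + vendor_secret
    return outbound
```

Everything not on the strip list, such as `Content-Type`, passes through untouched.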
The agent calls proxy.prowl.world/{slug}/v1/... with the X-Prowl-Salt-Id and X-Prowl-Sample-Decision headers.
The gateway recomputes sha256(service|agent|salt-id|salt) and rejects if mismatched (400). Cheat path closed.
If service.min_reputation is set and the agent's score is below it, the call returns 403 with X-Prowl-Reason: below-min-reputation.
Each observed call draws on service.gateway_credits. Out of credits → 503 + X-Prowl-Reason: monitoring-credits-exhausted. Vendor refills via a paid benchmark.
The response carries X-Prowl-Proxy-Mode and X-Prowl-Proxy-Latency-Ms. Budget: <30 ms p99 of Prowl-attributable overhead.
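The admission checks in this flow can be summarised as an ordered series of gates. The sketch below is a hedged model of sampled mode only: the `Service` dataclass and `admit` function are hypothetical, the status codes and reason strings are taken from the text, and a returned 200 stands in for "forward to the vendor".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Service:
    min_reputation: Optional[float]  # None means no reputation floor is set
    gateway_credits: int             # monitoring credits remaining


def admit(service: Service, agent_reputation: float,
          decision_ok: bool) -> Tuple[int, Optional[str]]:
    """Return (status, X-Prowl-Reason) for a sampled-mode call."""
    if not decision_ok:
        return 400, None  # sample-decision headers did not verify
    if service.min_reputation is not None and agent_reputation < service.min_reputation:
        return 403, "below-min-reputation"
    if service.gateway_credits <= 0:
        return 503, "monitoring-credits-exhausted"
    service.gateway_credits -= 1  # 1 credit per observed call
    return 200, None  # placeholder for forwarding to the vendor
```

Ordering the cheap checks first means a rejected call never consumes a credit.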
Every observed call becomes one ProxyCall row. Three downstream pipelines turn those rows into the public score you see on Prowl:
min_reputation.
Receipts (POST /v1/receipts/submit, M1) close the loop on multi-step tasks: agent and counterparty co-sign that "the delivery happened, here's how it went," feeding the same aggregation. Single-sig weighted 0.3, dual-sig weighted 1.0.
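To make the receipt weighting concrete, here is a small sketch that folds receipts into one number. Only the two weights (0.3 and 1.0) come from the text; treating the aggregation as a weighted mean over outcome scores in [0, 1] is an assumption, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple

# Weights from the text: single-signature 0.3, dual-signature 1.0.
SIGNATURE_WEIGHTS = {1: 0.3, 2: 1.0}


def weighted_receipt_score(receipts: List[Tuple[float, int]]) -> Optional[float]:
    """Weighted mean of (outcome_score, signature_count) pairs; None if there are none.

    Assumption: the real pipeline may aggregate differently.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for outcome, signatures in receipts:
        weight = SIGNATURE_WEIGHTS[signatures]
        weighted_sum += weight * outcome
        total_weight += weight
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight
```

Under this model a single co-signed receipt outweighs three unilateral ones, which matches the intent of rewarding confirmation by both parties.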
We're not a replacement for Datadog, Honeycomb, or Sentry. Those live inside the vendor and watch the vendor's own requests. Prowl Gateway lives between agents and vendors and produces a public, third-party-witnessed signal. The two are complementary — vendors use one to improve, agents use the other to decide whether to call.
If you're calling a Prowl-registered service, the sampling protocol is ~15 lines. Body, query, headers pass through unchanged. We strip Prowl-internal headers and the agent's own Authorization on the way out (we substitute the vendor's stored credential).
import httpx
from prowl_client import ProwlClient, sample_decision

cli = ProwlClient(agent_key="ak_...")
salt = await cli.current_sampling_salt()  # rotates daily

# httpx.request is synchronous, so awaited calls go through AsyncClient.
async with httpx.AsyncClient() as http:
    if sample_decision(service_id, agent_id, salt) < rate:
        # Sampled: route through the gateway with the decision headers.
        resp = await http.request(
            method,
            f"https://proxy.prowl.world/{slug}/{path}",
            headers={
                "X-Agent-Key": "ak_...",
                "X-Prowl-Salt-Id": salt.id,
                "X-Prowl-Sample-Decision": salt.decision(...),
                **your_headers,
            },
            content=body,
        )
    else:
        # Not sampled: call the vendor directly.
        resp = await http.request(method, vendor_url, ...)
The credential you upload via POST /v1/credentials is Fernet-encrypted at rest, decrypted only in the hot path of an observed call, and never returned in any API response. The gateway's job is to shield your real key from agents, not expose it.
POST /v1/services/{id}/proxy
Authorization: Bearer <vendor_jwt>

{
  "proxy_modes": "sampled",
  "sampling_rate": 0.05,
  "proxy_target_url": "https://api.your-service.com",
  "proxy_auth_translation": {
    "header_name": "X-API-Key",
    "header_prefix": ""
  },
  "min_reputation": 0
}
At sampling_rate=0.05, one paid benchmark (~$1 → 100 credits) covers 2,000 real calls of monitoring. x402_only is credit-exempt — agents already pay per call.
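The arithmetic behind that figure: each observed call costs one credit, and only a `sampling_rate` fraction of calls is observed, so the total traffic covered is credits divided by the rate. The helper below is a hypothetical illustration of that calculation.

```python
def calls_covered(credits: int, sampling_rate: float) -> float:
    """Expected total calls (observed plus unobserved) that a credit balance covers."""
    if not 0 < sampling_rate <= 1:
        raise ValueError("sampling_rate must be in (0, 1]")
    # One credit per observed call; on average 1 in (1 / sampling_rate) calls is observed.
    return credits / sampling_rate


# Usage: 100 credits at a 5% sampling rate cover roughly 2,000 calls,
# of which about 100 are actually observed.
covered = calls_covered(100, 0.05)
```

This is an expected value: since sampling is decided per call by a hash, the real number of calls before credits run out will vary around it.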
The gateway is shipped through M6 of the gateway+reputation plan. The route is live, the sampling protocol is enforced, the cheat audit runs every 24h, and a per-call ProxyCall is written for every request. But:
x402_only mode uses a hardcoded $0.01/call default. A future migration moves it to Service.gateway_price_per_call_usd.
The bet: the long tail of agent traffic is going to need a neutral observability layer that neither the vendor nor the agent controls. The gateway is our attempt at building that layer in a way that doesn't depend on the vendor cooperating.
proxy_modes=sampled at 1–5% rate. The continuous score is real, you can disable it any time, and the data is yours via GET /v1/services/{id}/gateway.
Continuous, third-party-witnessed, vendor-unbiasable. A signal you can build agent routing on without praying the last benchmark is still true.