
Stop Your AI Agents from Hallucinating Broken API Code
Today we'll be sharing why we were so excited to add Sideko to the Ontos accelerator.
Actually, no. We're going to let ChatGPT explain why your AI agents should use Sideko to generate API interaction code instead of using ChatGPT.
1 | Why spec-driven codegen is sturdier than prompt-driven codegen

2 | How an autonomous agent would wire Sideko into its loop
High-level flow:
spec → codegen → compile → runtime calls
The agent treats Sideko as just another tool it can shell-out to.
- 1. Spec discovery
The agent finds or is handed weather.yml.
- 2. SDK generation step
sideko sdk create --config sdk-config.yaml --lang typescript
- The CLI delivers:
  - index.ts with typed functions
  - models/ (schemas)
  - __tests__/
  - mock_server/
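Since the agent treats the generator as a shell tool, this whole step can be a single subprocess call. A minimal sketch, assuming a Node-based agent loop (the command mirrors the CLI invocation above; everything else is illustrative):

import { execSync } from "node:child_process";

// Regenerate the typed client from the current spec; inherit stdio so any
// CLI error surfaces directly in the agent's own logs.
execSync("sideko sdk create --config sdk-config.yaml --lang typescript", {
  stdio: "inherit",
});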
- 3. Compile & smoke test
npm i && npm run test # CI gate; fails fast if spec + code disagree
- 4. Runtime usage inside the agent
import { WeatherApi } from "./generated/weather";
const wx = new WeatherApi(process.env.WX_KEY);
const dubai = await wx.getCurrentWeather({ city: "Dubai" });
- 5. Self-healing on spec change
- Nightly cron → sideko api update --spec ./new-openapi.yml
- If breaking, agent opens a PR with the regenerated client + updated calls.
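One way that nightly job could look, sketched in the same Node runtime (the openPullRequest helper and branch naming are placeholders for whatever Git-host client the agent already uses):

import { execSync } from "node:child_process";

// Placeholder: wire this to your Git host's API in a real agent.
declare function openPullRequest(opts: { branch: string; title: string }): void;

function nightlySpecSync(): void {
  // Pull the latest spec into the generated client (command from the step above).
  execSync("sideko api update --spec ./new-openapi.yml", { stdio: "inherit" });
  try {
    // Same gate as step 3: if the regenerated client still compiles and tests pass, we're done.
    execSync("npm run test", { stdio: "inherit" });
  } catch {
    // Breaking change detected: hand the regenerated client + updated calls to a human.
    openPullRequest({
      branch: `spec-update-${new Date().toISOString().slice(0, 10)}`,
      title: "Regenerate SDK for upstream spec change",
    });
  }
}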
- 6. Observability hook
Because every call funnels through the SDK, you get a single place to add tracing, metrics, or policy enforcement (rate-limit, PII redaction, etc.).
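A sketch of that single choke point, reusing the generated client from step 4 (the withTelemetry wrapper is illustrative, not part of the SDK):

import { WeatherApi } from "./generated/weather";

// Wrap every SDK method so timing (or rate limiting, PII redaction, ...) lives in one place.
function withTelemetry<T extends object>(client: T, label: string): T {
  return new Proxy(client, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (typeof value !== "function") return value;
      return async (...args: unknown[]) => {
        const start = Date.now();
        try {
          return await value.apply(target, args);
        } finally {
          console.log(`[sdk] ${label}.${String(prop)} ${Date.now() - start}ms`);
        }
      };
    },
  });
}

const wx = withTelemetry(new WeatherApi(process.env.WX_KEY), "weather");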
3 | Why it matters for AI agents in particular
Deterministic building blocks
LLM reasoning is probabilistic; pairing it with deterministic, compile-checked SDK calls keeps the flaky part (planning) separate from the hard-guarantee part (execution).
Lower “hallucination blast-radius”
If the model decides “I should hit /currentWeather/v2”, the TypeScript compiler simply refuses—the agent gets an immediate signal to re-plan instead of a late-stage 404.
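For instance, reusing the wx client from section 2, a hallucinated method name never survives the build (the v2 name below is the hallucination being illustrated):

// The generated client exposes getCurrentWeather, not this invented variant,
// so tsc rejects the plan before any request is ever sent:
const dubai = await wx.getCurrentWeatherV2({ city: "Dubai" });
// error TS2339: Property 'getCurrentWeatherV2' does not exist on type 'WeatherApi'.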
Cheaper and faster feedback loop
Failed tsc or Jest runs cost milliseconds, not API calls and tokens. The agent can iterate internally before ever touching the network.
Safer autonomous refactors
When the agent upgrades a dependency or rotates credentials, it only has to edit the central WeatherApi constructor; scattered raw-fetch snippets would be brittle.
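Concretely, a credential rotation is a one-line change at the construction site (the new env var name is illustrative):

// Before: const wx = new WeatherApi(process.env.WX_KEY);
const wx = new WeatherApi(process.env.WX_KEY_2025);
// Every wx.getCurrentWeather(...) call site elsewhere in the agent stays untouched.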
Easier human hand-off
Sideko’s AST-style output is readable; a human developer can jump in, audit, and extend. That’s invaluable for red-team reviews or regulated domains.
4 | Understanding Sideko versus MCP
Sideko flips API use from “prompt & pray” to “spec → contract → code”.
For autonomous systems—where every unnecessary runtime failure becomes a branching logic rabbit-hole—having that rock-solid, testable contract is the cheapest insurance you can buy.
Sideko feeds SDK usage snippets directly to LLMs, in effect showing the LLM how to use an API with the least amount of boilerplate code possible. MCP, on the other hand, relies on tool calling which is a less robust way to give context to LLMs.
Sideko-style SDK snippets
- What the LLM sees
You paste (or auto-inject) a real, compilable call like
const dubai = await wx.getCurrentWeather({ city: "Dubai" });
- The pattern is concrete: the function name, param names, return shape.
- How the model reasons
It imitates sample code it just read—no need to re-infer endpoint paths or parameter schemas. If it tries something illegal, the TypeScript checker (or equivalent) catches it before runtime.
- Why it’s robust
The snippet is already valid code that a compiler will accept; the model’s “job” is mostly to fill in values and sequence calls, not design the call structure.
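A hypothetical way the auto-injection might look; the prompt text and variable names here are illustrative, only the snippet itself comes from the generated SDK:

// The known-good, compilable snippet is pasted verbatim into the context,
// so the model imitates working code rather than guessing endpoint paths.
const sdkSnippet =
  'const dubai = await wx.getCurrentWeather({ city: "Dubai" });';

const systemPrompt =
  "Call the weather API only through the generated SDK client `wx`.\n" +
  "Example of a valid call:\n" +
  sdkSnippet;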
MCP tool calling
- What the LLM sees
A JSON schema that says:
{
"name": "get_current_weather",
"parameters": { "type": "object", "properties": { "city": { "type": "string" } } }
}
- plus instructions like “call this with the JSON arguments”.
- How the model reasons
It must remember the function name verbatim, emit a correct JSON blob, and trust the server to translate that into a real API request. Errors surface only after the round-trip.
- Why it’s a bit flakier
The contract is enforced at runtime; if the model misspells a key or sends "Dubai,UAE" instead of "Dubai", the call still ships and then fails. There’s no compiler in front of the LLM to stop it.
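By contrast, a tool-calling dispatcher can only check the model's arguments when the call executes. A minimal sketch of that runtime-only validation (the dispatcher shape is hypothetical; wx is the SDK client from earlier):

async function dispatchTool(name: string, args: Record<string, unknown>) {
  if (name !== "get_current_weather") throw new Error(`unknown tool: ${name}`);
  if (typeof args.city !== "string") {
    // Nothing stopped the model from emitting a bad blob; we only learn here,
    // after the round-trip, instead of at compile time.
    throw new Error("get_current_weather: 'city' must be a string");
  }
  return wx.getCurrentWeather({ city: args.city });
}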
Key takeaway
Sideko lets you show, not tell: you hand the LLM a working code fragment that the host environment will actually compile and run. MCP tells, then hopes: the LLM is instructed how to assemble a JSON call, but correctness isn’t verified until after the request leaves the model. For autonomous agents that require tight guarantees and low-latency loops, embedding Sideko-generated snippets usually yields more dependable behavior; MCP’s generic tool interface shines when you need quick, model-agnostic connectivity and human-readable docs.