
Stop Your AI Agents from Hallucinating Broken API Code

Today we'll be sharing why we were so excited to add Sideko to the Ontos accelerator.

Actually, no. We're going to let ChatGPT explain why your AI agents should use Sideko to generate API interaction code instead of using ChatGPT.


1 | Why spec-driven codegen is sturdier than prompt-driven codegen



[Screenshot: ChatGPT's explanation of why spec-driven codegen is sturdier than prompt-driven codegen]


2 | How an autonomous agent would wire Sideko into its loop




High-level flow:

spec → codegen → compile → runtime calls

The agent treats Sideko as just another tool it can shell out to.


  1. Spec discovery

     Agent finds or is handed weather.yml.


  2. SDK generation step

     sideko sdk create --config sdk-config.yaml --lang typescript

     The CLI delivers:
       • index.ts with typed functions
       • models/ (schemas)
       • __tests__/ + mock_server/


  1. 3. Compile & smoke test
npm i && npm run test           # CI gate; fails fast if spec + code disagree
  1. 4. Runtime usage inside the agent
import { WeatherApi } from "./generated/weather";

const wx = new WeatherApi(process.env.WX_KEY);
const dubai = await wx.getCurrentWeather({ city: "Dubai" });
  5. Self-healing on spec change

     • Nightly cron → sideko api update --spec ./new-openapi.yml
     • If breaking, the agent opens a PR with the regenerated client + updated calls (a sketch of such a sync job follows this list).


  6. Observability hook

     Because every call funnels through the SDK, you get a single place to add tracing, metrics, or policy enforcement (rate-limit, PII redaction, etc.); the telemetry sketch below shows one shape this could take.
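
A minimal sketch of that nightly sync job, assuming a Node runner: only the sideko api update command comes from the flow above; the git branching and the gh CLI PR step are our assumptions.

// nightly-spec-sync.ts (sketch; the PR step via the gh CLI is an assumption)
import { execSync } from "node:child_process";

// Pull the latest spec and regenerate the client.
execSync("sideko api update --spec ./new-openapi.yml", { stdio: "inherit" });

// If regeneration changed any files, hand off to a PR for human review.
const dirty = execSync("git status --porcelain").toString().trim();
if (dirty) {
  execSync("git checkout -b chore/spec-sync", { stdio: "inherit" });
  execSync("git add -A", { stdio: "inherit" });
  execSync('git commit -m "chore: regenerate SDK from updated spec"', { stdio: "inherit" });
  execSync("gh pr create --fill", { stdio: "inherit" });
}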
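
And here is one shape the observability chokepoint could take; withTelemetry is our naming, not part of Sideko's output.

// telemetry.ts (sketch: one funnel point for tracing, metrics, or policy checks)
export async function withTelemetry<T>(name: string, call: () => Promise<T>): Promise<T> {
  const start = Date.now();
  try {
    return await call();
  } finally {
    // Swap console.log for your real tracer or metrics client.
    console.log(`[trace] ${name} finished in ${Date.now() - start}ms`);
  }
}

// Usage: every SDK call runs through the wrapper, so rate-limit or PII checks live in one place.
// const dubai = await withTelemetry("getCurrentWeather", () => wx.getCurrentWeather({ city: "Dubai" }));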


3 | Why it matters for AI agents in particular




Deterministic building blocks


LLM reasoning is probabilistic; pairing it with deterministic, compile-checked SDK calls keeps the flaky part (planning) separate from the hard-guarantee part (execution).
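
One way to picture that split (our sketch, reusing the generated client from section 2): the LLM only chooses from a closed, typed menu of actions, and everything past that choice is compile-checked.

import { WeatherApi } from "./generated/weather";

// The planner (LLM) emits one of these; it cannot invent an action the compiler doesn't know.
type AgentAction =
  | { kind: "getCurrentWeather"; city: string }
  | { kind: "noop" };

// Execution is deterministic: every branch is a typed SDK call.
async function execute(action: AgentAction, wx: WeatherApi) {
  switch (action.kind) {
    case "getCurrentWeather":
      return wx.getCurrentWeather({ city: action.city });
    case "noop":
      return null;
  }
}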

Lower “hallucination blast-radius”


If the model decides “I should hit /currentWeather/v2”, the TypeScript compiler simply refuses; the agent gets an immediate signal to re-plan instead of a late-stage 404.
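
For example, this line (with a hypothetical hallucinated method name) deliberately fails type-checking:

// tsc: Property 'getCurrentWeatherV2' does not exist on type 'WeatherApi'.
// The error lands at build time, so the agent re-plans before any request is sent.
const dubai = await wx.getCurrentWeatherV2({ city: "Dubai" });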

Cheaper and faster feedback loop


Failed tsc or Jest runs cost milliseconds, not API calls and tokens. The agent can iterate internally before ever touching the network.
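
A sketch of that inner gate, assuming the agent can shell out to the project's toolchain:

import { execSync } from "node:child_process";

// Type-check candidate code before spending tokens or network calls on it.
function passesTypeCheck(): boolean {
  try {
    execSync("npx tsc --noEmit", { stdio: "pipe" });
    return true;
  } catch {
    return false; // compile errors come back in milliseconds; re-plan and retry
  }
}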

Safer autonomous refactors


When the agent upgrades a dependency or rotates credentials, it only has to edit the central WeatherApi constructor; scattered raw-fetch snippets would be brittle.
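
For instance, a single construction point (makeWeatherApi is our naming):

import { WeatherApi } from "./generated/weather";

// Rotating WX_KEY, or changing how credentials load, touches only this function.
export function makeWeatherApi(): WeatherApi {
  const key = process.env.WX_KEY;
  if (!key) throw new Error("WX_KEY is not set");
  return new WeatherApi(key);
}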

Easier human hand-off


Sideko’s AST-style output is readable; a human developer can jump in, audit, and extend. That’s invaluable for red-team reviews or regulated domains.


4 | Understanding Sideko versus MCP



Sideko flips API use from “prompt & pray” to “spec → contract → code”.
For autonomous systems—where every unnecessary runtime failure becomes a branching logic rabbit-hole—having that rock-solid, testable contract is the cheapest insurance you can buy.

Sideko feeds SDK usage snippets directly to LLMs, in effect showing the LLM how to use an API with the least amount of boilerplate code possible. MCP, on the other hand, relies on tool calling, which is a less robust way to give context to LLMs.

Sideko-style SDK snippets

  • What the LLM sees
    You paste (or auto-inject) a real, compilable call like
const dubai = await wx.getCurrentWeather({ city: "Dubai" });
    The pattern is concrete: the function name, param names, return shape.
  • How the model reasons
    It imitates sample code it just read—no need to re-infer endpoint paths or parameter schemas. If it tries something illegal, the TypeScript checker (or equivalent) catches it before runtime.
  • Why it’s robust
    The snippet is already valid code that a compiler will accept; the model’s “job” is mostly to fill in values and sequence calls, not design the call structure.

MCP tool calling

  • What the LLM sees
    A JSON schema that says:
{
  "name": "get_current_weather",
  "parameters": { "type": "object", "properties": { "city": { "type": "string" } } }
}
    plus instructions like “call this with the JSON arguments”.
  • How the model reasons
    It must remember the function name verbatim, emit a correct JSON blob, and trust the server to translate that into a real API request. Errors surface only after the round-trip.
  • Why it’s a bit flakier
    The contract is enforced at runtime; if the model misspells a key or sends "Dubai,UAE" instead of "Dubai", the call still ships and then fails. There’s no compiler in front of the LLM to stop it.
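
To make that concrete, here is a hypothetical tool-call payload that nothing stops from leaving the model:

// "town" is a typo for "city", but the JSON is syntactically fine,
// so the mistake only surfaces after the round-trip to the server.
const toolCall = {
  name: "get_current_weather",
  arguments: { town: "Dubai" },
};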

Key takeaway

Sideko lets you show, not tell: you hand the LLM a working code fragment that the host environment will actually compile and run. MCP tells, then hopes: the LLM is instructed how to assemble a JSON call, but correctness isn’t verified until after the request leaves the model. For autonomous agents that require tight guarantees and low-latency loops, embedding Sideko-generated snippets usually yields more dependable behavior; MCP’s generic tool interface shines when you need quick, model-agnostic connectivity and human-readable docs.

So if you haven't heard of Sideko before and you're building in the agentic AI space, consider this your wake-up call. As ChatGPT put it so well, you don't need to prompt & pray.

Read their blog at https://sideko.dev/empathetic-engineer.
