Schema-Driven Interfaces for Humans and AIs
When operations are defined as typed schemas, each interface is just a rendering of the same source — UI controls for humans, tool descriptions for models. Adding a new interface is implementing a new renderer. Through dynamic discovery, this architecture eliminates maintenance drift and reduces AI context overhead by over 95% at scale.
Integrating AI into an application usually means building a second interface — an API wrapper, a set of MCP tools, a function-calling layer. This second interface reimplements validation, drifts out of sync with the UI, and burns thousands of tokens on static schemas before the model even begins to reason.
Lunar — a geometry workbench with 45+ operations — treats each interface as a rendering of the same schema. At scale, this cuts AI context overhead by over 95% compared to enumerating every tool upfront. The UI renders controls. MCP renders tool descriptions. Both feed one executor.
The MCP problem
The standard approach to MCP integration is one tool per operation. This creates problems at scale.
Each tool schema consumes 400–800 tokens. A standard three-server setup burns ~45,000 tokens of context before the agent starts working. Cursor hard-caps at 40 MCP tools total — anything beyond is silently dropped.
These problems compound. The workarounds — deferred loading, RAG-based tool selection, toolset flags — address symptoms. The root cause is architectural: operations are coupled to their interface.
Operations as schemas, interfaces as renderers
Most applications define operations inside their interface. The UI handler IS the boolean operation. The API endpoint IS the boolean operation. The MCP tool IS the boolean operation. Three implementations of the same thing.
Lunar defines operations independently — as typed schemas. Interfaces are renderers: they read the schema, present it in their format, gather inputs, and pass them to a shared executor. The operation doesn't know what rendered it.
operators.register({
  id: 'tf.boolean',
  label: 'Boolean',
  description: 'Perform a boolean operation (union, intersection, or difference) on two meshes.',
  category: 'cut',
  tags: ['boolean', 'csg', 'union', 'intersection', 'difference', 'subtract', 'combine'],
  docsUrl: 'https://trueform.polydera.com/ts/modules/cut#boolean-operations',
  inputs: [
    { name: 'meshA', label: 'Mesh A', type: 'mesh', description: 'First mesh' },
    { name: 'meshB', label: 'Mesh B', type: 'mesh', description: 'Second mesh' },
    { name: 'operation', label: 'Operation', type: 'string',
      description: 'Boolean operation type',
      enum: ['union', 'intersection', 'difference'], default: 'union' },
    { name: 'returnCurves', label: 'Return Curves', type: 'boolean',
      description: 'Include intersection curves', optional: true, default: false }
  ],
  outputs: [
    { name: 'mesh', label: 'Result', type: 'mesh',
      description: 'Boolean result mesh', primary: true },
    { name: 'labels', label: 'Labels', type: 'ndarray',
      description: 'Per-face region labels' },
    { name: 'curves', label: 'Curves', type: 'curves',
      description: 'Intersection curves',
      condition: { input: 'returnCurves', value: true } }
  ],
  async: async ({ meshA, meshB, operation, returnCurves }) => { /* dispatch */ }
})
This definition carries everything a renderer needs: typed inputs with labels, constraints, and defaults. Outputs with conditions. Tags for search. A docs URL for linking. The async function is used by the executor to dispatch the task to the engine. Each interface reads the same definition and renders it in its own format — the operation itself is written once.
The human interface
The schema drives every element of the UI. type maps to widget: number with min/max → slider, boolean → toggle, string with enum → dropdown. category places the operator in the sidebar. tags feed Cmd+K search — typing "csg" finds Boolean. label titles each control. docsUrl links the help icon to documentation. Operand inputs come from scene selection — the user selects nodes, the panel fills in operand slots by type.
No UI code per operation. Adding an operator to the registry adds it to the sidebar, makes it searchable, generates its control panel, and connects its help link. The panel renderer reads the schema and builds the interface.
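As a sketch, the type-to-widget mapping described above can be expressed as a single function. The type names here are illustrative, not Lunar's actual ones:

```typescript
// Hypothetical input-schema shape; mirrors the rules above:
// number with min/max → slider, boolean → toggle, string with enum → dropdown.
type InputSchema = {
  name: string;
  type: 'number' | 'boolean' | 'string' | 'mesh';
  min?: number;
  max?: number;
  enum?: string[];
};

function widgetFor(input: InputSchema): string {
  if (input.type === 'number' && input.min !== undefined && input.max !== undefined)
    return 'slider';
  if (input.type === 'boolean') return 'toggle';
  if (input.type === 'string' && input.enum) return 'dropdown';
  if (input.type === 'mesh') return 'operand-slot'; // filled from scene selection
  return 'text-field';
}

console.log(widgetFor({
  name: 'operation', type: 'string',
  enum: ['union', 'intersection', 'difference'],
})); // dropdown
```

Because the decision is pure schema inspection, the same function serves every operator in the registry — there is nothing operation-specific to write.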
The model interface
An AI model receives three tools: discover(), world_state(), and run().
Three tools, ~1,200 tokens of schema. The model discovers 45+ operations dynamically through discover() — they are not loaded at connection time.
Compact catalog
The first discover() call returns a compact listing:
{
  "cut": [
    { "id": "tf.boolean", "l": "Boolean", "n": 2, "p": ["operation"] },
    { "id": "tf.meshArrangements", "l": "Mesh Arrangements", "n": "N" },
    ...
  ],
  "geometry": [
    { "id": "tf.sphereMesh", "l": "Sphere", "p": ["radius", "stacks"] },
    { "id": "tf.boxMesh", "l": "Box", "p": ["width", "height", "depth"] },
    ...
  ],
  "scene": [
    { "id": "scene.add_mesh", "l": "Add Mesh", "p": ["label", "points", "faces"] },
    { "id": "scene.add_curves", "l": "Add Curves", "p": ["label", "points", "paths", "offsets"] },
    { "id": "scene.screenshot", "l": "Screenshot" },
    ...
  ],
  "camera": [
    { "id": "camera.describe", "l": "Describe Camera" },
    { "id": "camera.fit_to_nodes", "l": "Fit to Nodes", "p": ["nodeIds"] },
    ...
  ],
  ...
}
Organized by category. n = how many operands to pass. p = parameter names. The model sees the full catalog in one response without consuming context for 45+ full schemas.
When it needs to use an operator, it calls discover({ operatorIds: [...] }) for full schemas — types, constraints, defaults, and an _example showing the exact run() call. On-demand, not upfront.
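A minimal sketch of how such a compact listing could be derived from full registry entries. The field names l, n, and p follow the catalog above; the registry shape and the rule for distinguishing operands from parameters are assumptions:

```typescript
// Assumed registry entry shape (a subset of the registration schema).
type OperatorDef = {
  id: string;
  label: string;
  category: string;
  inputs: { name: string; type: string; optional?: boolean }[];
};

// Compact entry: l = label, n = operand count, p = required parameter names.
// Assumption: scene-node inputs (meshes, curves) are operands; the rest are parameters.
function compactEntry(op: OperatorDef): Record<string, unknown> {
  const operands = op.inputs.filter(i => i.type === 'mesh' || i.type === 'curves');
  const params = op.inputs
    .filter(i => !operands.includes(i) && !i.optional)
    .map(i => i.name);
  const entry: Record<string, unknown> = { id: op.id, l: op.label };
  if (operands.length > 0) entry.n = operands.length;
  if (params.length > 0) entry.p = params;
  return entry;
}

function compactCatalog(ops: OperatorDef[]): Record<string, unknown[]> {
  const byCategory: Record<string, unknown[]> = {};
  for (const op of ops) (byCategory[op.category] ??= []).push(compactEntry(op));
  return byCategory;
}
```

Applied to the tf.boolean definition above, this yields { "id": "tf.boolean", "l": "Boolean", "n": 2, "p": ["operation"] } — the optional returnCurves parameter is elided until the model requests the full schema.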
World state
world_state() is how the model observes the scene — the equivalent of the human looking at the viewport and the properties panel.
Called without arguments, it returns a compact overview:
{
  "summary": "2 meshes, 550319 faces, 1 selected",
  "nodes": [
    { "id": "Stanford Dragon-1", "label": "Stanford Dragon",
      "type": "mesh", "visible": true, "center": [-1.14, 1.39, 0.26],
      "children": ["shape-index-1"] },
    { "id": "shape-index-1", "label": "Shape Index",
      "type": "ndarray", "visible": true, "parentId": "Stanford Dragon-1" },
    { "id": "Stanford Bunny-1", "label": "Stanford Bunny",
      "type": "mesh", "visible": true, "center": [12.24, 1.26, 0.00],
      "children": ["shape-index-2", "boundary-edges-1"] },
    ...
  ]
}
Called with nodeIds, it returns full detail — the same data that drives the properties inspector:
{
  "nodes": [
    { "id": "Stanford Dragon-1", "type": "mesh",
      "children": ["shape-index-1"],
      "properties": {
        "faces": 480868, "vertices": 240428,
        "aabb": { "min": [-9.06, -4.20, -3.28], "max": [6.78, 6.98, 3.80] },
        "obb": { "center": [-1.81, 0.46, -0.25], "axes": [...], "extent": [16.14, 12.54, 6.88] }
      }
    },
    { "id": "shape-index-1", "type": "ndarray", "parentId": "Stanford Dragon-1",
      "properties": {
        "shape": [240428], "dtype": "float32",
        "min": -0.999, "max": 0.997, "mean": 0.171, "std": 0.482
      }
    }
  ]
}
Derived data — curvature fields, boundary edges, distance maps — lives as children of the mesh that produced it. The model sees the same scene graph the human sees: nodes, parent-child relationships, spatial properties. Two levels of detail — compact for orientation, detailed for spatial reasoning.
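The two detail levels can be sketched as one function over the scene graph — the compact view strips the properties payload, the detailed view returns it for the requested nodes. Node shape and function names here are assumptions, not Lunar's internals:

```typescript
// Hypothetical scene-node shape, following the world_state() responses above.
type SceneNode = {
  id: string;
  label: string;
  type: string;
  visible: boolean;
  center?: number[];
  parentId?: string;
  children?: string[];
  properties?: Record<string, unknown>;
};

function worldState(nodes: SceneNode[], nodeIds?: string[]) {
  if (!nodeIds) {
    // Compact overview: identity, hierarchy, and spatial anchors only.
    return { nodes: nodes.map(({ properties, ...rest }) => rest) };
  }
  // Detailed view: full properties for the requested nodes.
  return { nodes: nodes.filter(n => nodeIds.includes(n.id)) };
}
```

One data source, two projections — the same pattern the catalog uses for operators.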
Running operations
run() accepts an array of operations. Create a sphere, position it, boolean it — one tool call:
{
  "operations": [
    {
      "operatorId": "tf.sphereMesh",
      "params": { "radius": 3 }
    },
    {
      "operatorId": "scene.set_position",
      "nodeIds": ["sphere-1"],
      "params": { "xyz": [2, 4, 0] }
    },
    {
      "operatorId": "tf.boolean",
      "nodeIds": ["dragon-1", "sphere-1"],
      "params": { "operation": "difference" }
    }
  ]
}
Operations execute sequentially — each can reference nodes created by the previous. Fewer round-trips, fewer tokens, faster workflows.
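The sequential contract can be sketched as a simple loop — the executor awaits each operation before starting the next, so step N can reference nodes created in steps 1..N-1. The names and the execute callback are illustrative:

```typescript
// Shape of one entry in the run() request, per the example above.
type Operation = {
  operatorId: string;
  nodeIds?: string[];
  params?: Record<string, unknown>;
};

// Sketch of sequential batch execution. `execute` stands in for the real
// executor and is assumed to return the ids of nodes it created.
async function runBatch(
  operations: Operation[],
  execute: (op: Operation) => Promise<string[]>,
): Promise<string[]> {
  const created: string[] = [];
  for (const op of operations) {
    // Sequential await: the scene is updated before the next step runs,
    // so later operations can reference earlier results by node id.
    created.push(...(await execute(op)));
  }
  return created;
}
```

A parallel dispatch would be faster for independent operations, but would break the "boolean the sphere you just created" pattern that makes batching useful.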
Dynamic catalog
Because discover() is a tool call — not a static schema — the catalog updates as the application evolves. Add an operation to the registry, and the next discover() call includes it. No MCP server redeployment. No schema versioning. No stale descriptions.
The same applies to world_state() — scene state is read via a tool call, not MCP resources. This avoids the resource support gap: tools work across all clients.
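The liveness property falls out of discover() reading the registry at call time rather than at connection time. A minimal sketch, with a deliberately tiny registry shape:

```typescript
// Live registry: discover() reads it on every call, so a new registration
// is visible immediately — no redeployment, no schema versioning.
const registry = new Map<string, { id: string; label: string }>();

function discover(): string[] {
  return [...registry.keys()];
}

registry.set('tf.boolean', { id: 'tf.boolean', label: 'Boolean' });
console.log(discover()); // [ 'tf.boolean' ]

registry.set('tf.sphereMesh', { id: 'tf.sphereMesh', label: 'Sphere' });
console.log(discover()); // [ 'tf.boolean', 'tf.sphereMesh' ]
```

A static MCP tool list is a snapshot serialized at connection time; a discovery tool is a query against the running application.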
One executor, one application
The human interface gathers inputs from UI widgets and scene selection. The model interface gathers inputs from the run() request. Both produce the same shape — a params dict and a list of nodeIds — and from there, the path is identical. One executor resolves operands, validates against the schema, runs the operation, places outputs into the scene. If validation fails, the same error propagates back through whichever interface initiated it. One validation path, written once.
All actions — geometry, scene manipulation, camera control, data injection — are operations in the same registry. camera.describe returns the viewport's coordinate mapping. scene.screenshot captures what the human sees. scene.add_mesh injects raw vertex arrays into the scene. All discovered and executed the same way. There is one application. The model doesn't get a separate API. It gets the application.
Maintainability
No wrapper code. The schema is the documentation — discover() serves it directly.
If MCP is replaced by a different protocol, the application doesn't change — you implement a new renderer for the schema. The logic, the validation, the execution path all stay. The interface layer is disposable by design.
@article{polydera:schema-driven-interfaces-for-humans-and-ais,
  title={Schema-Driven Interfaces for Humans and AIs},
  author={Sajovic, {\v{Z}}iga},
  year={2026},
  url={https://polydera.com/algorithms/schema-driven-interfaces-for-humans-and-ais},
  organization={Polydera}
}