
A tool is a named operation the agent can invoke. The model sees a description and an input schema; the Engine routes the call to the right implementation. This page is the reference for adding a custom tool, plus the design rules that keep tools usable.

The tool definition

A tool definition has four required fields:
{
  "name": "find_recent_commits",
  "description": "Return the N most recent commits on the current branch.",
  "input_schema": {
    "type": "object",
    "properties": {
      "limit": {
        "type": "integer",
        "minimum": 1,
        "maximum": 100,
        "default": 10,
        "description": "Number of commits to return."
      }
    }
  },
  "implementation": {
    "kind": "platform" | "skill" | "mcp",
    "ref": "..."
  }
}
Field           Required  Purpose
name            yes       Unique identifier. Lowercase, snake_case or dotted.
description     yes       What the tool does and when to use it. The model uses this to decide.
input_schema    yes       JSON schema for the input. Validated before execution.
implementation  yes       How the Engine should run the call.

Implementation kinds

Platform

The tool is implemented in src/utility_directory/:
"implementation": {
  "kind": "platform",
  "ref": "git_log"
}
ref is the Python function name registered in platform_registry.py. Adding a new platform tool means writing the function, registering it, and exposing it in the catalog.
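As a minimal sketch (the register_tool decorator, module path, and signature here are assumptions for illustration, not the actual platform_registry.py API), a platform tool is an ordinary Python function that receives the validated input and returns a string or JSON-serializable value:

# src/utility_directory/git_log.py
# Hypothetical sketch: the registration helper below is assumed, not
# confirmed; check platform_registry.py for the real API.
import json
import subprocess

from platform_registry import register_tool  # assumed helper


@register_tool("git_log")  # must match the "ref" in the tool definition
def git_log(limit: int = 10) -> str:
    """Return the N most recent commits on the current branch."""
    out = subprocess.run(
        ["git", "log", f"-{limit}", "--pretty=format:%h|%an|%ad|%s", "--date=short"],
        capture_output=True, text=True, check=True,
    ).stdout
    commits = [
        dict(zip(["hash", "author", "date", "subject"], line.split("|", 3)))
        for line in out.splitlines()
    ]
    return json.dumps(commits)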

Skill

The tool is implemented as a skill — a prompt template plus optional helper scripts:
"implementation": {
  "kind": "skill",
  "ref": "summarize_long_text"
}
Skills are easier to add than platform tools because they’re data, not code. See Skills endpoints for how to create one.

MCP

The tool is exposed by a connected MCP server:
"implementation": {
  "kind": "mcp",
  "ref": {
    "server": "slack",
    "action": "send_message"
  }
}
You don’t usually write these by hand. When a user connects an MCP server, the Engine auto-registers its actions.

Where tools live

Platform tools and skills are loaded from the catalog directory (CATALOG_DIR). Layout:
catalog/
  tools/
    git_log.json
    git_status.json
    web_search.json
  skills/
    summarize_long_text/
      skill.json
      prompt.md
      script.py
  agents/
    coder/
      agent.json
      system-prompt.md
After editing the catalog, hot-reload with POST /admin/reload-catalog — no Engine restart needed. MCP tools are populated dynamically; they aren’t in the catalog.
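For example, assuming the Engine listens on localhost port 8000 (the host, port, and any auth headers are deployment-specific assumptions), the reload looks like this:

# Sketch: trigger a catalog hot-reload. The base URL is an assumption
# about your deployment, not a documented value.
import requests

resp = requests.post("http://localhost:8000/admin/reload-catalog")
resp.raise_for_status()  # a 2xx means the catalog was re-read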

Description rules

The description is the single most important field. The model uses it to decide when to call your tool.

Be specific

Bad:  "Search the web."
Good: "Search the web for general information using Brave Search.
       Returns a list of results with title, URL, and snippet. Use this
       for current events and general queries. For programming docs,
       prefer the docs_search tool."

Name inputs and outputs

Inputs:
- query: the search query, plain text.
- limit: how many results to return (default 10, max 50).

Returns: a JSON array of {title, url, snippet} objects.

Hint when to use

Use this when the user asks about something the model wouldn't know
from training (recent events, your company's internal data, current
pricing). Don't use it for general knowledge questions the model can
answer directly.

Hint when not to use

Do not use this for queries about specific user data — use the
memory_search tool instead.
This is what disambiguates similar tools. Without “do not use this for X” hints, the model picks the most prominent option.

Schema rules

Type every field

{"type": "string"} is the floor. Add format, pattern, minLength, maxLength, enum where they apply.

Use enums for closed sets

"severity": {
  "type": "string",
  "enum": ["low", "medium", "high"],
  "description": "Severity level."
}
The model picks from the list. Without an enum, you’ll see “Medium” and “medium” inconsistently.

Mark required fields

"required": ["channel", "text"]
The model treats unmarked fields as optional. Use required to express your contract.

Add description to every property

The description is documentation the model reads at call time:
"channel": {
  "type": "string",
  "description": "The Slack channel ID. Starts with C. Example: C0123ABCDEF."
}
A field without a description gets filled with whatever the model guesses.
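Putting the four schema rules together, here is a sketch of a complete input_schema for the send_message example used above, written as the Python dict it would be before serialization (the priority field and the channel pattern are hypothetical additions for illustration):

# Illustrative only: every field is typed, constrained, and described.
# "priority" is a hypothetical field included to show an enum in context.
input_schema = {
    "type": "object",
    "properties": {
        "channel": {
            "type": "string",
            "pattern": "^C[A-Z0-9]+$",  # assumed pattern, matching "Starts with C"
            "description": "The Slack channel ID. Starts with C. Example: C0123ABCDEF.",
        },
        "text": {
            "type": "string",
            "minLength": 1,
            "maxLength": 4000,
            "description": "Message body, plain text.",
        },
        "priority": {
            "type": "string",
            "enum": ["low", "medium", "high"],
            "default": "low",
            "description": "Delivery priority.",
        },
    },
    "required": ["channel", "text"],
}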

Output

There’s no output_schema. Tools return a string (or a JSON-serializable object that the Engine stringifies). The schema for output is whatever your description says it is.

Keep output small

Tools that return huge blobs hurt:
  • They eat context, which costs money and slows the model down.
  • They survive compaction less gracefully.
  • They overwhelm the model’s attention.
Cap output at a few KB. If your underlying source returns more, summarize or chunk at the tool layer.
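One way to enforce the cap at the tool layer (a sketch; the 4 KB budget and the truncation notice are arbitrary choices, not Engine requirements):

# Sketch: cap tool output at a fixed byte budget before returning it.
# MAX_BYTES and the truncation marker are arbitrary illustration choices.
MAX_BYTES = 4096

def cap_output(text: str, limit: int = MAX_BYTES) -> str:
    """Truncate oversized output and say so, instead of flooding context."""
    encoded = text.encode("utf-8")
    if len(encoded) <= limit:
        return text
    clipped = encoded[:limit].decode("utf-8", errors="ignore")
    return clipped + f"\n[truncated: output exceeded {limit} bytes]"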

Structured output beats prose

# bad
return f"Found 3 commits. Latest is by Alice on 2026-04-12 with hash 8c44ab1..."

# good
return json.dumps([
  {"hash": "8c44ab1", "author": "Alice", "date": "2026-04-12", "subject": "..."},
  ...
])
The model parses JSON cleanly. Prose has to be re-parsed and is more ambiguous.

Errors

When the tool fails, return a structured error:
return {
    "error": "rate_limited",
    "message": "Brave Search returned 429. Try again in 30 seconds.",
    "retry_after": 30
}
The Engine puts this in the tool_result.error field. The model can react — retry, switch tools, ask the user.
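In practice that means catching the failure inside the tool and translating it into that shape. A sketch, assuming a requests-based search tool (the endpoint URL and error names are illustrative, not documented values):

# Sketch: map an HTTP failure onto the structured error shape above.
# The endpoint URL and error taxonomy are illustrative assumptions.
import requests

def web_search(query: str, limit: int = 10):
    try:
        resp = requests.get(
            "https://api.search.example/v1/search",  # hypothetical endpoint
            params={"q": query, "count": limit},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["results"]
    except requests.HTTPError:
        if resp.status_code == 429:
            retry_after = int(resp.headers.get("Retry-After", 30))
            return {
                "error": "rate_limited",
                "message": f"Search backend returned 429. Try again in {retry_after} seconds.",
                "retry_after": retry_after,
            }
        return {"error": "http_error", "message": f"Backend returned {resp.status_code}."}
    except requests.RequestException as exc:
        return {"error": "network_error", "message": str(exc)}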

Versioning

Tools are part of the agent’s contract. Changing a tool’s name or schema is a breaking change. Patterns:
  • Adding a field. Backwards-compatible if it has a default.
  • Removing a field. Breaking. Add a new tool with the new schema; deprecate the old.
  • Renaming. Breaking. Treat as remove + add.
  • Changing semantics. Always breaking, even if the schema looks the same.
When you change a tool, run regression evals before merging.

Testing tools

Test tools against the agent, not just in isolation:
def test_find_recent_commits():
    response = engine.execute(
        message="What were the last 3 commits?",
        task_id="test-001",
    )
    tool_calls = [e for e in response if e["type"] == "tool_call"]
    assert any(c["tool"] == "find_recent_commits" for c in tool_calls)
The integration test catches description problems (the model didn’t pick the tool because the description was unclear) that unit tests miss.
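The inverse test is just as useful: assert the tool is not called for queries its description tells the model to avoid. A sketch, reusing the same engine fixture:

def test_find_recent_commits_not_called_for_general_knowledge():
    # Per the description rules, the model should answer this directly.
    response = engine.execute(
        message="What is a merge commit in git?",
        task_id="test-002",
    )
    tool_calls = [e for e in response if e["type"] == "tool_call"]
    assert not any(c["tool"] == "find_recent_commits" for c in tool_calls)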
