Prompt Caching Reference

JD.AI includes automatic prompt caching so stable, repeated prompt context can be reused across turns when supported by the active provider/model.

Supported providers

Current native support:

  • Anthropic API-key provider (/provider add anthropic)
  • Claude Code OAuth/session provider (claude auth login)

For unsupported providers, JD.AI leaves request behavior unchanged.

Runtime controls

Prompt caching is controlled with persisted /config keys:

Key               Allowed values  Default  Description
prompt_cache      on, off         on       Master toggle for automatic prompt caching
prompt_cache_ttl  5m, 1h          5m       Cache TTL used where the provider supports it

    /config get prompt_cache
    /config get prompt_cache_ttl
    /config set prompt_cache on
    /config set prompt_cache off
    /config set prompt_cache_ttl 5m
    /config set prompt_cache_ttl 1h

These settings are persisted in tui-settings.json under the JD.AI data root.
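
To make the allowed values and defaults concrete, here is a minimal Python sketch of how the two keys could be validated and read. It is not JD.AI's implementation: the function names `set_config` and `get_config` and the flat dictionary layout are assumptions made for illustration, and the actual structure of tui-settings.json is not described here.

```python
# Hypothetical validator for the two documented keys; not JD.AI's actual code.
ALLOWED_VALUES = {
    "prompt_cache": ("on", "off"),
    "prompt_cache_ttl": ("5m", "1h"),
}
DEFAULTS = {"prompt_cache": "on", "prompt_cache_ttl": "5m"}


def set_config(settings: dict, key: str, value: str) -> dict:
    """Return a copy of settings with key set, rejecting undocumented values."""
    if key not in ALLOWED_VALUES:
        raise KeyError(f"unknown config key: {key}")
    if value not in ALLOWED_VALUES[key]:
        allowed = ", ".join(ALLOWED_VALUES[key])
        raise ValueError(f"{key} must be one of: {allowed}")
    return {**settings, key: value}


def get_config(settings: dict, key: str) -> str:
    """Return the stored value, falling back to the documented default."""
    if key not in DEFAULTS:
        raise KeyError(f"unknown config key: {key}")
    return settings.get(key, DEFAULTS[key])
```

With an empty settings dictionary, `get_config` returns the documented defaults (`on` and `5m`), and a value such as `10m` is rejected because only `5m` and `1h` are listed as allowed.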

Auto-enable policy

JD.AI estimates prompt token count from chat history and only enables caching when context size is large enough to benefit:

  • Claude Sonnet / Opus families: >=1024 estimated tokens
  • Claude Haiku family: >=2048 estimated tokens

This avoids adding caching directives on short prompts where overhead can outweigh benefit.
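
The policy above can be sketched as a small decision function. This is an illustration only: the four-characters-per-token estimate, the substring matching on model names, and all function names are assumptions, since the document does not say how JD.AI estimates tokens or identifies model families.

```python
from typing import Optional, Sequence


def estimate_tokens(messages: Sequence[str]) -> int:
    # Assumption: roughly 4 characters per token. JD.AI's real estimator
    # is not specified in this reference.
    return sum(len(m) for m in messages) // 4


def min_cacheable_tokens(model: str) -> Optional[int]:
    """Return the documented threshold for a model family, or None."""
    name = model.lower()
    if "haiku" in name:
        return 2048
    if "sonnet" in name or "opus" in name:
        return 1024
    return None  # not a family covered by the policy


def should_enable_caching(
    model: str, messages: Sequence[str], prompt_cache: str = "on"
) -> bool:
    """Apply the master toggle first, then the per-family threshold."""
    if prompt_cache != "on":
        return False
    threshold = min_cacheable_tokens(model)
    if threshold is None:
        return False
    return estimate_tokens(messages) >= threshold
```

Under these assumptions, a 4096-character history estimates to 1024 tokens, which is enough for a Sonnet or Opus model but not for a Haiku model.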

Claude-native behavior

When prompt caching is enabled for Anthropic API-key or Claude Code OAuth sessions:

  • Requests are sent via Anthropic's native Messages API (Anthropic.SDK).
  • Prompt caching mode is set to automatic caching for tools/system.
  • Cache-control markers are applied at the last system and last tool cache breakpoints.
  • TTL is set from prompt_cache_ttl (5m or 1h).
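
As a rough illustration of the last two points, the sketch below marks the final system block and the final tool definition of a request with a cache-control entry carrying the configured TTL. It is written in Python rather than against Anthropic.SDK, the helper name `apply_cache_breakpoints` is hypothetical, and the `cache_control` shape (`type: ephemeral` plus `ttl`) reflects the public Messages API as commonly documented; treat it as an assumption about the wire format rather than a description of what JD.AI sends.

```python
import copy


def apply_cache_breakpoints(request: dict, ttl: str = "5m") -> dict:
    """Return a copy of request with cache markers on the last system
    block and the last tool definition (hypothetical helper)."""
    if ttl not in ("5m", "1h"):
        raise ValueError("ttl must be '5m' or '1h'")

    marked = copy.deepcopy(request)  # leave the caller's request untouched
    cache_control = {"type": "ephemeral", "ttl": ttl}

    system = marked.get("system")
    if isinstance(system, str) and system:
        # A plain-string system prompt is wrapped in a text block so that
        # it can carry a cache_control entry.
        system = [{"type": "text", "text": system}]
        marked["system"] = system
    if isinstance(system, list) and system:
        system[-1]["cache_control"] = dict(cache_control)

    tools = marked.get("tools")
    if tools:
        tools[-1]["cache_control"] = dict(cache_control)

    return marked
```

Only the last block in each group is marked, so everything before it falls inside the cached prefix, and earlier tools or system blocks need no marker of their own.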

Execution coverage

The same policy is applied across JD.AI execution surfaces:

  • Interactive turns (streaming + non-streaming)
  • Subagents and team orchestration executors
  • Model analysis workflows
  • Gateway agent turns
