# Prompt Caching Reference
JD.AI includes automatic prompt caching so stable, repeated prompt context can be reused across turns when supported by the active provider/model.
## Supported providers

Current native support:

- Anthropic API-key provider (`/provider add anthropic`)
- Claude Code OAuth/session provider (`claude auth login`)

For unsupported providers, JD.AI leaves request behavior unchanged.
## Runtime controls

Prompt caching is controlled with persisted `/config` keys:

| Key | Allowed values | Default | Description |
|---|---|---|---|
| `prompt_cache` | `on`, `off` | `on` | Master toggle for automatic prompt caching |
| `prompt_cache_ttl` | `5m`, `1h` | `5m` | Cache TTL used where the provider supports it |

- `/config get prompt_cache`
- `/config get prompt_cache_ttl`
- `/config set prompt_cache on`
- `/config set prompt_cache off`
- `/config set prompt_cache_ttl 5m`
- `/config set prompt_cache_ttl 1h`
These settings are persisted in `tui-settings.json` under the JD.AI data root.
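As a rough illustration, the persisted keys might appear in `tui-settings.json` as follows (the exact file layout is an assumption, not taken from the source):

```json
{
  "prompt_cache": "on",
  "prompt_cache_ttl": "5m"
}
```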
## Auto-enable policy

JD.AI estimates the prompt token count from chat history and enables caching only when the context is large enough to benefit:

- Claude Sonnet / Opus families: >= 1024 estimated tokens
- Claude Haiku family: >= 2048 estimated tokens

This avoids adding caching directives to short prompts, where the overhead can outweigh the benefit.
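The threshold policy above can be sketched as follows; the function name, family-matching logic, and model identifiers are illustrative assumptions, not JD.AI's actual implementation:

```python
# Per-family minimum estimated token counts before caching is enabled.
# Thresholds come from the policy above; substring matching on the model
# id is an assumption made for this sketch.
THRESHOLDS = {
    "haiku": 2048,
    "sonnet": 1024,
    "opus": 1024,
}

def should_enable_cache(model: str, estimated_tokens: int) -> bool:
    """Return True when the estimated prompt size justifies caching."""
    for family, minimum in THRESHOLDS.items():
        if family in model.lower():
            return estimated_tokens >= minimum
    # Unknown model family: leave the request unchanged.
    return False

print(should_enable_cache("claude-sonnet-4", 1500))  # True  (>= 1024)
print(should_enable_cache("claude-haiku-3", 1500))   # False (< 2048)
```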
## Claude-native behavior

When prompt caching is enabled for Anthropic API-key or Claude Code OAuth sessions:

- Requests are sent via Anthropic's native Messages API (`Anthropic.SDK`).
- Prompt caching mode is set to automatic caching for tools and system prompts.
- Cache control checkpoints are applied at the last system/tool cache breakpoints.
- The TTL is set from `prompt_cache_ttl` (`5m` or `1h`).
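For reference, a cache breakpoint in Anthropic's Messages API is expressed as a `cache_control` marker on a system or tool block. The source does not show the exact request JD.AI builds; an equivalent raw request body might look like this (model id and prompt text are illustrative):

```json
{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "Long, stable system prompt ...",
      "cache_control": { "type": "ephemeral", "ttl": "5m" }
    }
  ],
  "messages": [
    { "role": "user", "content": "Hello" }
  ]
}
```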
## Execution coverage
The same policy is applied across JD.AI execution surfaces:
- Interactive turns (streaming + non-streaming)
- Subagents and team orchestration executors
- Model analysis workflows
- Gateway agent turns