Namespace JD.AI.Core.LocalModels

Classes

GpuDetector

Detects available GPU backends for LLamaSharp inference at runtime.

LlamaInferenceEngine

Wraps LLamaSharp to provide an implementation of Semantic Kernel's Microsoft.SemanticKernel.ChatCompletion.IChatCompletionService. Manages model loading, unloading, and streaming inference.

LocalModelDetector

Detects locally available GGUF models and provides LLamaSharp-based inference.

LocalModelOptions

Options for configuring the LLamaSharp inference engine.

LocalModelRegistry

Manages the local model registry (JSON manifest) and coordinates model sources.

ModelDownloader

Shared download logic with progress, resume support, retry, and cancellation.

ModelMetadata

Metadata describing a local GGUF model.

ModelRegistry

JSON-serializable model registry manifest.

Enums

GpuBackend

GPU backend type detected or configured for inference.

ModelSourceKind

The source from which a model was obtained.

QuantizationType

GGUF quantization type parsed from the filename.