Namespace JD.AI.Core.LocalModels

Classes

GpuDetector: Detects available GPU backends for LLamaSharp inference at runtime.

LlamaInferenceEngine: Wraps LLamaSharp and exposes it through Semantic Kernel's Microsoft.SemanticKernel.ChatCompletion.IChatCompletionService interface. Manages model loading, unloading, and streaming inference.

LocalModelDetector: Detects locally available GGUF models and provides LLamaSharp-based inference.

LocalModelOptions: Options for configuring the LLamaSharp inference engine.

LocalModelRegistry: Manages the local model registry (JSON manifest) and coordinates model sources.

ModelDownloader: Shared download logic with progress, resume support, retry, and cancellation.

ModelMetadata: Metadata describing a local GGUF model.

ModelRegistry: JSON-serializable model registry manifest.

Enums

GpuBackend: GPU backend type detected or configured for inference.

ModelSourceKind: Where a model was sourced from.

QuantizationType: GGUF quantization type parsed from the filename.
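
As an illustration of what parsing a quantization type from a filename can look like, here is a minimal sketch. It is not the actual JD.AI implementation: the class name QuantizationSketch, the method TryParseQuantization, and the regular expression are all hypothetical. It returns the tag as a plain string instead of a QuantizationType value, because the enum's members are not listed here. The pattern assumes the common GGUF naming convention in which the quantization tag (for example Q4_K_M, Q8_0, IQ2_XS, or F16) appears as a separate token in the filename.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only; not part of JD.AI.Core.LocalModels.
public static class QuantizationSketch
{
    // Assumed pattern: matches tags such as Q4_K_M, Q8_0, IQ2_XS, F16, BF16, F32
    // when they are not glued to neighbouring letters or digits.
    private static readonly Regex QuantPattern = new Regex(
        @"(?<![A-Za-z0-9])(I?Q\d+_[A-Z0-9]+(?:_[A-Z0-9]+)*|BF16|F16|F32)(?![A-Za-z0-9])",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Returns true and the upper-cased tag when one is found in the filename;
    // otherwise returns false and an empty string.
    public static bool TryParseQuantization(string path, out string tag)
    {
        // Drop the directory and the ".gguf" extension before matching.
        string name = Path.GetFileNameWithoutExtension(path);
        Match match = QuantPattern.Match(name);

        tag = match.Success ? match.Value.ToUpperInvariant() : string.Empty;
        return match.Success;
    }
}
```

A caller could then map the returned tag onto an enum value, falling back to an "unknown" member when TryParseQuantization returns false, for example for a file named model.gguf that carries no tag.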