Class BatchEmbeddingPipeline

Namespace
JD.AI.Core.Memory
Assembly
JD.AI.Core.dll

Processes documents through a chunking → embedding → storage pipeline. Handles batching to stay within embedding provider rate limits.

public sealed class BatchEmbeddingPipeline
Inheritance
BatchEmbeddingPipeline

Constructors

BatchEmbeddingPipeline(IEmbeddingProvider, IVectorStore, int, ILogger?)

public BatchEmbeddingPipeline(IEmbeddingProvider embedder, IVectorStore store, int batchSize = 100, ILogger? logger = null)

Parameters

embedder IEmbeddingProvider

Embedding provider for vectorization.

store IVectorStore

Vector store for persistence.

batchSize int

Maximum number of entries sent to the embedding provider in a single batch. Defaults to 100.

logger ILogger

Optional logger.
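
The batchSize parameter caps how many entries go to the embedding provider at once. How the pipeline groups entries internally is not documented here, so the following is only a minimal sketch of the general idea: slicing a list into consecutive groups of at most batchSize items. BatchingSketch and InBatches are hypothetical names used for illustration and are not part of JD.AI.Core.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of JD.AI.Core.
public static class BatchingSketch
{
    // Yields consecutive slices of `items`, each holding at most `batchSize` elements.
    // Because this is an iterator method, the argument check runs when enumeration starts.
    public static IEnumerable<IReadOnlyList<T>> InBatches<T>(IReadOnlyList<T> items, int batchSize = 100)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = 0; start < items.Count; start += batchSize)
        {
            // The final batch may be smaller than batchSize.
            int count = Math.Min(batchSize, items.Count - start);
            var batch = new List<T>(count);
            for (int i = 0; i < count; i++)
                batch.Add(items[start + i]);
            yield return batch;
        }
    }
}
```

With the default of 100, a list of 250 entries would be split into batches of 100, 100, and 50. On .NET 6 and later, Enumerable.Chunk offers comparable grouping out of the box.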

Methods

IndexDocumentAsync(string, string, string?, string?, int, int, CancellationToken)

Indexes a document by chunking, embedding, and storing. Returns the number of chunks stored.

public Task<int> IndexDocumentAsync(string documentId, string content, string? source = null, string? category = null, int maxChunkChars = 1500, int overlapChars = 200, CancellationToken ct = default)

Parameters

documentId string
content string
source string
category string
maxChunkChars int
overlapChars int
ct CancellationToken

Returns

Task<int>
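
The maxChunkChars and overlapChars parameters suggest fixed-size character windows with overlap, but the chunking strategy itself is not documented here. The following is a minimal sketch of one common approach under that assumption, reusing the defaults from the signature above. ChunkingSketch and Chunk are hypothetical names and are not part of JD.AI.Core; the real pipeline may split on sentence or paragraph boundaries instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of JD.AI.Core.
public static class ChunkingSketch
{
    // Splits `content` into windows of at most `maxChunkChars` characters.
    // Each window starts (maxChunkChars - overlapChars) characters after the
    // previous one, so neighbouring chunks share `overlapChars` characters.
    public static List<string> Chunk(string content, int maxChunkChars = 1500, int overlapChars = 200)
    {
        if (maxChunkChars <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxChunkChars));
        if (overlapChars < 0 || overlapChars >= maxChunkChars)
            throw new ArgumentOutOfRangeException(nameof(overlapChars));

        var chunks = new List<string>();
        int step = maxChunkChars - overlapChars;

        for (int start = 0; start < content.Length; start += step)
        {
            int length = Math.Min(maxChunkChars, content.Length - start);
            chunks.Add(content.Substring(start, length));

            // Stop once a window has reached the end of the text, so the
            // tail is not emitted a second time as a shorter chunk.
            if (start + length >= content.Length)
                break;
        }

        return chunks;
    }
}
```

Under this assumed scheme, the value returned by IndexDocumentAsync would correspond to the number of such windows. The overlap keeps text that straddles a chunk boundary available in both neighbouring chunks, at the cost of embedding some characters twice.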

IndexDocumentsAsync(IReadOnlyList<(string Id, string Content, string? Source, string? Category)>, CancellationToken)

Indexes multiple documents in parallel batches.

public Task<int> IndexDocumentsAsync(IReadOnlyList<(string Id, string Content, string? Source, string? Category)> documents, CancellationToken ct = default)

Parameters

documents IReadOnlyList<(string Id, string Content, string Source, string Category)>
ct CancellationToken

Returns

Task<int>
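
The documents parameter is a list of named tuples, which can be awkward to build the first time. The following sketch shows one way to assemble such a list. DocumentListSketch, Build, and all identifiers, paths, and categories in it are made-up illustration values.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical helper, not part of JD.AI.Core.
public static class DocumentListSketch
{
    // Builds a list whose element type matches the `documents` parameter
    // of IndexDocumentsAsync. Source and Category may be null.
    public static IReadOnlyList<(string Id, string Content, string? Source, string? Category)> Build()
    {
        return new List<(string Id, string Content, string? Source, string? Category)>
        {
            ("doc-1", "First document text.", "notes/first.md", "notes"),
            ("doc-2", "Second document text.", null, null),
        };
    }
}
```

Given an already constructed BatchEmbeddingPipeline named pipeline, the list could then be passed as `await pipeline.IndexDocumentsAsync(DocumentListSketch.Build(), ct);`.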