Class TextChunker
Splits text into overlapping chunks suitable for embedding. Supports configurable chunk size and overlap to maintain context across chunk boundaries.
public static class TextChunker
- Inheritance
- object
- TextChunker
Fields
DefaultMaxChunkChars
Default maximum number of characters per chunk (a character-based approximation of a token limit).
public const int DefaultMaxChunkChars = 1500
Field Value
- int
DefaultOverlapChars
Default overlap, in characters, between consecutive chunks.
public const int DefaultOverlapChars = 200
Field Value
- int
Methods
Chunk(string, int, int)
Splits text into chunks with overlap, respecting paragraph and sentence boundaries.
public static IReadOnlyList<TextChunk> Chunk(string text, int maxChunkChars = 1500, int overlapChars = 200)
Parameters
text (string): The text to chunk.
maxChunkChars (int): Maximum characters per chunk.
overlapChars (int): Number of characters to overlap between chunks.
Returns
- IReadOnlyList<TextChunk>
List of text chunks with metadata.
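To illustrate how maxChunkChars and overlapChars interact, here is a minimal sketch of a fixed-window chunker. It is not the TextChunker implementation: the class NaiveChunker and its Split method are hypothetical names, it returns plain strings rather than TextChunk objects, and it ignores the paragraph and sentence boundaries that Chunk is documented to respect. It only shows the basic idea that each window starts (maxChunkChars - overlapChars) characters after the previous one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: a fixed-window chunker, not the library's TextChunker.
public static class NaiveChunker
{
    public static List<string> Split(string text, int maxChunkChars = 1500, int overlapChars = 200)
    {
        if (text is null) throw new ArgumentNullException(nameof(text));
        if (maxChunkChars <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunkChars));
        // The overlap must be smaller than the chunk size, or the window would never advance.
        if (overlapChars < 0 || overlapChars >= maxChunkChars)
            throw new ArgumentOutOfRangeException(nameof(overlapChars));

        var chunks = new List<string>();
        int stride = maxChunkChars - overlapChars; // how far the window advances each step

        for (int start = 0; start < text.Length; start += stride)
        {
            int length = Math.Min(maxChunkChars, text.Length - start);
            chunks.Add(text.Substring(start, length));

            // Stop once a window has reached the end of the text.
            if (start + length >= text.Length) break;
        }
        return chunks;
    }
}
```

Under these assumptions, the default values give a stride of 1300 characters, so a 4000-character input yields three windows starting at offsets 0, 1300, and 2600. A boundary-aware chunker such as the one documented here may produce chunks of different lengths and offsets.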