The Chunk Block splits a string into an array of strings based on token count. This is particularly useful for handling large text inputs that exceed the token limits of large language models (LLMs), or for truncating strings to a specific token count.
The model to use for tokenizing the text. Different models may tokenize text differently. Can be overridden by the “model” input if “Use Model Input” is enabled.
The percentage of overlap between consecutive chunks. For example, with a 50% overlap and 1000 tokens per chunk, each chunk will share approximately 500 tokens with the next chunk. This helps maintain context between chunks.
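The overlap logic above can be sketched in Python. This is a minimal illustration, not the block's actual implementation: it assumes the text has already been tokenized (a plain list of integers stands in for real tokenizer output), and it derives the stride between chunks from the overlap percentage.

```python
def chunk_tokens(tokens, tokens_per_chunk, overlap_pct):
    """Split a token list into chunks of tokens_per_chunk, where each
    chunk shares roughly overlap_pct percent of its tokens with the next."""
    # The stride between chunk starts shrinks as the overlap grows.
    step = max(1, int(tokens_per_chunk * (1 - overlap_pct / 100)))
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + tokens_per_chunk])
        if start + tokens_per_chunk >= len(tokens):
            break  # the final chunk reaches the end of the input
    return chunks

tokens = list(range(10))  # stand-in for real tokenizer output
chunks = chunk_tokens(tokens, tokens_per_chunk=4, overlap_pct=50)
# With 4 tokens per chunk and 50% overlap, the stride is 2, so each
# chunk shares its last 2 tokens with the start of the next chunk.
```

With 1000 tokens per chunk and 50% overlap, the same arithmetic gives a stride of 500 tokens, matching the ~500 shared tokens described above.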
Chunking is useful to avoid hitting token count limits in LLMs. You can split a long string into multiple chunks, process each chunk separately, and then combine the results.
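The split-process-combine workflow described above can be sketched as follows. This is a simplified illustration: whitespace-separated words stand in for model tokens, and `process` is a placeholder for whatever per-chunk step (e.g. an LLM call) you would run on each chunk.

```python
def split_text(text, max_tokens):
    # Whitespace words stand in for model tokens in this sketch;
    # a real implementation would use the model's tokenizer.
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

def process(chunk):
    # Placeholder for a per-chunk LLM call (e.g. summarization).
    return f"[{len(chunk.split())} tokens processed]"

text = " ".join(f"w{i}" for i in range(11))
chunks = split_text(text, max_tokens=4)       # 11 words -> chunks of 4, 4, 3
results = [process(c) for c in chunks]        # process each chunk separately
combined = " ".join(results)                  # then combine the results
```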
How does the overlap feature work?
The overlap percentage determines how much text is shared between consecutive chunks. For example, with 1000 tokens per chunk and 50% overlap, each chunk will share approximately 500 tokens with the next chunk. This helps maintain context and coherence between chunks.