how to make more embeddings? #12

Hi @LexiestLeszek, if I understand your question correctly, you're asking how to customize the size of the text chunks. If so, that's a great question. The library picks some defaults for this, but they are fully configurable through the TextSplitterProtocol. Here's how you can do it inside the PDFExample with my preferred method, the RecursiveTokenSplitter:

            let splitter = RecursiveTokenSplitter(withTokenizer: BertTokenizer())
-           let (splitText, _) = splitter.split(text: documentText)
+           let (splitText, _) = splitter.split(text: documentText, chunkSize: 100)
            chunks = splitText

This will set the splitter to try to make chunks up t…
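
To make the effect of a smaller chunkSize concrete, here's a rough, self-contained sketch. It is not the library's RecursiveTokenSplitter: `chunkByWords` is a hypothetical helper that treats whitespace-separated words as "tokens", whereas the real splitter counts tokens produced by a tokenizer such as BertTokenizer. Assuming each chunk gets embedded once, more chunks means more embeddings.

```swift
// Hypothetical helper for illustration only; this is NOT SimilaritySearchKit's
// RecursiveTokenSplitter. "Tokens" here are whitespace-separated words.
func chunkByWords(_ text: String, chunkSize: Int) -> [String] {
    precondition(chunkSize > 0, "chunkSize must be positive")
    let words = text.split(whereSeparator: { $0.isWhitespace })
    var chunks: [String] = []
    var start = 0
    // Walk through the words, taking up to chunkSize words per chunk.
    while start < words.count {
        let end = min(start + chunkSize, words.count)
        chunks.append(words[start..<end].joined(separator: " "))
        start = end
    }
    return chunks
}

// A stand-in document of 1000 words.
let documentText = Array(repeating: "word", count: 1000).joined(separator: " ")

let large = chunkByWords(documentText, chunkSize: 500)
let small = chunkByWords(documentText, chunkSize: 100)
print(large.count, small.count) // prints "2 10"
```

The same 1000-word text yields 2 chunks at a size of 500 and 10 chunks at a size of 100, so lowering chunkSize is the lever for getting more (and finer-grained) chunks out of the same document.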

Replies: 2 comments

Answer selected by ZachNagengast
This discussion was converted from issue #9 on June 28, 2023 21:58.