understanding of "Question Answering model using open-source LLM" #4071

I'll do my best to answer your questions based on my understanding.

  1. Yes, even if you have 2M files, you can use the same process: (1) create embeddings from the text, (2) perform a similarity search, and (3) pass the most relevant chunks of text as part of the model prompt. Using a vectorstore and a retriever makes this process very simple in LangChain.
  2. When to retrain (or, as most people would call it, fine-tune) an LLM is still an open research question, so there is no clear-cut answer here. I found this Medium post helpful in exploring the question.
  3. It is possible to fine-tune an LLM with your own data. You need to have the appropriate compute resourc…
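The three-step retrieval process in point 1 can be sketched without any framework. This is a toy, library-free illustration: the bag-of-words "embedding" and cosine similarity stand in for a real sentence-embedding model and vectorstore (which LangChain would normally provide), and the chunk texts are made up for the example.

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words token counts. A real pipeline would use
    # a sentence-embedding model behind a vectorstore instead.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(chunks, question, k=2):
    # (1) embed each chunk, (2) similarity-search against the question,
    # (3) return the most relevant chunks to splice into the prompt.
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)[:k]

chunks = [
    "The warranty covers parts and labor for two years.",
    "Our office is open Monday through Friday.",
    "Warranty claims require the original receipt.",
]
question = "What does the warranty cover?"
context = top_k_chunks(chunks, question)
prompt = ("Answer using only this context:\n"
          + "\n".join(context)
          + "\nQuestion: " + question)
```

With a vectorstore the same idea scales to millions of chunks, because the similarity search is done by an index rather than a linear scan as here.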

Answer selected by IamExperimenting