understanding of "Question Answering model using open-source LLM" #4071

I'll do my best to answer your questions based on my understanding.

  1. Yes, even if you have 2M files, you can use the same process: (1) create embeddings from the text, (2) perform a similarity search, and (3) pass the most relevant chunks of text as part of the model prompt. Using a vectorstore and a retriever makes this process very simple in LangChain.
  2. When to retrain (or, as most people would call it, fine-tune) an LLM is still an open research question, so there is no clear-cut answer here. I found this Medium post helpful in exploring the question.
  3. It is possible to fine-tune an LLM with your own data. You need to have the appropriate compute resourc…
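The three-step retrieval process in point 1 can be sketched without any framework. This is a toy, library-free illustration: the bag-of-words "embedding" and cosine similarity stand in for a real sentence-embedding model and vectorstore (which LangChain would normally provide), and the chunk texts are made up for the example.

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words token counts. A real pipeline would use
    # a sentence-embedding model behind a vectorstore instead.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(chunks, question, k=2):
    # (1) embed each chunk, (2) similarity-search against the question,
    # (3) return the most relevant chunks to splice into the prompt.
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)[:k]

chunks = [
    "The warranty covers parts and labor for two years.",
    "Our office is open Monday through Friday.",
    "Warranty claims require the original receipt.",
]
question = "What does the warranty cover?"
context = top_k_chunks(chunks, question)
prompt = ("Answer using only this context:\n"
          + "\n".join(context)
          + "\nQuestion: " + question)
```

With a vectorstore the same idea scales to millions of chunks, because the similarity search is done by an index rather than a linear scan as here.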

Answer selected by IamExperimenting