Hey @xinsu626! I'm here to help you with any bugs, questions, or contributions you have for the Llama Index repository. Let's squash those bugs together!
Yes, there is a duplicated for loop at line 127 in the keyval_docstore.py file that is causing a significant slowdown. The duplicated for loop is as follows:
for node in nodes:
    # NOTE: doc could already exist in the store, but we overwrite it
    if not allow_update and self.document_exists(node.node_id):
        raise ValueError(
            f"node_id {node.node_id} already exists. "
            "Set allow_update to True to overwrite."
        )
    ref_doc_info = None
    if isinstance(node, TextNode) and node.ref_doc_id is not None:
        ref_doc_info = self.get_ref_doc_info(node.ref_doc_id) or RefDocInfo()

    (
        node_kv_pair,
        metadata_kv_pair,
        ref_doc_kv_pair,
    ) = self._get_kv_pairs_for_insert(node, ref_doc_info, store_text)

    if node_kv_pair is not None:
        node_kv_pairs.append(node_kv_pair)
    if metadata_kv_pair is not None:
        metadata_kv_pairs.append(metadata_kv_pair)
    if ref_doc_kv_pair is not None:
        key = ref_doc_kv_pair[0]
        if key not in ref_doc_kv_pairs:
            ref_doc_kv_pairs[key] = []
        ref_doc_kv_pairs[key].append(ref_doc_kv_pair)
This loop appears twice in the add_documents method, which is unnecessary and could lead to performance issues [1].
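To make the cost concrete, here is a small self-contained sketch (toy names only, nothing below is taken from the repository) showing that a second copy of the loop doubles the per-node work, including the store lookups that document_exists and get_ref_doc_info perform in the snippet above:

    # Toy illustration only (not the llama_index implementation): each pass over
    # `nodes` does per-node work -- an existence check, a ref-doc lookup, and the
    # kv-pair construction -- so running the same loop body twice doubles all of it.

    class CountingStore:
        """Stand-in key-value store that counts lookups."""

        def __init__(self) -> None:
            self.lookups = 0

        def get(self, key: str):
            self.lookups += 1
            return None


    def collect_kv_pairs(nodes, store, loop_copies: int = 1):
        node_kv_pairs = []
        for _ in range(loop_copies):  # loop_copies=2 mimics the duplicated loop
            for node in nodes:
                store.get(node["ref_doc_id"])             # per-node store lookup
                node_kv_pairs.append((node["id"], node))  # kv pair built per node
        return node_kv_pairs


    nodes = [{"id": str(i), "ref_doc_id": "ref"} for i in range(10_000)]

    single, doubled = CountingStore(), CountingStore()
    collect_kv_pairs(nodes, single, loop_copies=1)
    collect_kv_pairs(nodes, doubled, loop_copies=2)
    print(single.lookups, doubled.lookups)  # 10000 vs. 20000 lookups

The fix itself is simply to keep a single copy of the loop in add_documents so the key-value pairs are collected in one pass before the batched writes.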
Bug Description
Hello, it looks like the for loop here is duplicated and is causing a significant slowdown: https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/storage/docstore/keyval_docstore.py#L127
Version
main branch
Steps to Reproduce
Please see the link in the Bug Description section.
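A rough timing sketch, assuming the in-memory SimpleDocumentStore (which goes through the same KVDocumentStore.add_documents code path); it is only meant to show where the time goes, not to pin down exact numbers:

    import time

    from llama_index.core import Document
    from llama_index.core.storage.docstore import SimpleDocumentStore

    # Build a batch of simple documents and time a single add_documents call.
    docs = [Document(text=f"document {i}") for i in range(10_000)]

    docstore = SimpleDocumentStore()
    start = time.perf_counter()
    docstore.add_documents(docs)
    print(f"add_documents took {time.perf_counter() - start:.2f}s for {len(docs)} docs")

With the duplicated loop removed, the same call should spend noticeably less time in the collection pass.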
Relevant Logs/Tracebacks
No response