Testing GraphRAG with other LLMs #321

alexchaomander · 2024-07-02T20:23:49Z

alexchaomander
Jul 2, 2024
Maintainer

The team primarily built GraphRAG with the GPT4-family of models and current prompts have been tested to work well with GPT4-o.

We'd love to see how GraphRAG works with open models like phi3, llama, mistral, etc.

Share your ideas, experiments, and experiences here!

pradhandebasish2046 · 2024-07-03T05:28:15Z

pradhandebasish2046
Jul 3, 2024

Yes I am also looking forward to see the performance with other models

0 replies

namin · 2024-07-08T00:43:04Z

namin
Jul 8, 2024

I got the Get Started guide working entirely locally on Apple Silicon.

Setup

Toolio

I used https://github.com/OoriData/Toolio as my OpenAI-compatible server for text completions, because it uses Apple's MLX and supports JSON schemas.

I started the server with:

toolio_server --model=mlx-community/Hermes-2-Theta-Llama-3-8B-4bit

I also used a model with a larger context to satisfy the community report of graphrag:

toolio_server --model=mlx-community/Llama-3-8B-Instruct-1048k-4bit

I changed the llm: api_base setting to http://127.0.0.1:8000/v1/.

open-text-embeddings

For embeddings, I used open-text-embeddings.

I started the server with:

PORT=8080 VERBOSE=1 MODEL=BAAI/bge-large-en python -m open.text.embeddings.server

I changed the embeddings: llm: api_base setting to http://127.0.0.1:8080/v1/.

With this, I was able to get the indexing. I think it took around 2 hours.

I was also able to run the global and local search queries, after resolving some tweaks which I describe next.

Issues

Enhancement: JSON Schema defined informally instead of formally

Toolio can work with JSON schemas and enforce them. I added the following schema:

MAP_SYSTEM_JSON = """
{
  "type": "object",
  "properties": {
    "points": {
      "type": "array",
      "contains": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "score": { "type": "number" },
        },
        "required": ["description", "score"],
      },
      "minContains": 5,
    },
  },
  "required": ["points"],
}
"""

And then added a "schema" property set to that in the "response_format" map_llm_params in ./query/structured_search/global_search/search.py.

I suspect whenever GraphRAG informally prompts for some JSON, the effectiveness with Toolio could be improved by passing the schema formally.

Some bug with `streaming=True`

I had some issue seeing any response when LLM generation used streaming=True, so I changed those in *_search/search.py to False.

`tiktoken` and non-OpenAI embeddings

Somehow, the open-text-embeddings server complains about the chunking using tiktoken, saying it expects a string not a list of numbers. Setting the token_encoder to None, and not using an encoder when None in chunk_text (in ./query/llm/text_utils.py) seems to have resolved the issue for me.

def chunk_text(
    text: str, max_tokens: int, token_encoder: tiktoken.Encoding | None = None
):
    """Chunk text by token length."""
    if token_encoder is None:
        tokens = text.split()
    else:
        tokens = token_encoder.encode(text)  # type: ignore
    chunk_iterator = batched(iter(tokens), max_tokens)
    yield from chunk_iterator

Issue: special tokens within JSON generation

I did run into an issue with the large context llama that special tokens appeared amidst the JSON causing decoding errors. I did a brutal cleanup as follows:

                search_response = search_response.replace("<|start_header_id|>assistant<|end_header_id|>", "")
                search_response = search_response.split('<|eot_id|>')[0]

But that was also a shame, because I discarded all the alternate responses.

What's next?

I haven't evaluated how well the system work -- my first victory was to get it running!

I might add a JSON schema for Toolio for each prompt that requires JSON generation.

I am not sure whether there's anything to push upstream at the moment: all the issues I hit could probably be solved by improving the OpenAI-compatible servers.

1 reply

streamside7 Jul 9, 2024

What an awesome post! Thanks!

xxWeiDG · 2024-07-10T02:16:42Z

xxWeiDG
Jul 10, 2024

It's work on chatglm4-9b-chat with xinference

0 replies

shaoqing404 · 2024-08-12T09:36:13Z

shaoqing404
Aug 12, 2024

I tried deepseek as a lower-level alternative to gpt4o, and mentioned in the issue that many utf8 and json problems (before 0.2.1) will not be encountered in this model

0 replies

timothymeyers · 2024-08-16T13:46:36Z

timothymeyers
Aug 16, 2024

I've been experimenting with @TheAiSingularity's repo here.

This is not an endorsement, but it seems to work well leveraging ollama locally on Apple Silicon to try out different models.

0 replies

l4b4r4b4b4 · 2024-08-20T15:52:07Z

l4b4r4b4b4
Aug 20, 2024

I am using a custom vLLM server fork incl. openai compatible tool & function calling as well as an embeddings endpoint using funtionary-3.1-small (llama3.1-8B) AWQ quant on my RTX 3080 and indexing as well as global search work amazingly well. Both with JSON=True and JSON=True.

Will try with bigger versions of the functionary model family in the near future...

Local search does not work. But did not look to deeply, why that is the case.

1 reply

shaoqing404 Aug 21, 2024

本地搜索的问题你可以尝试现在global中进行搜索，积累社区以后再回到local进行检索。你会有不一样的体验。

nileshtrivedi · 2024-09-10T09:24:11Z

nileshtrivedi
Sep 10, 2024

I think it would be great if GraphRAG project was built on top of another Microsoft project, Autogen. This way, interoperability with various LLM providers would need to be implemented only in one place.

Another approach is to use existing open-source LLM interoperability libraries like LiteLLM.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Testing GraphRAG with other LLMs #321

{{title}}

Replies: 7 comments 2 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Select a reply

Testing GraphRAG with other LLMs #321

alexchaomander Jul 2, 2024 Maintainer

Replies: 7 comments · 2 replies

Setup

Toolio

open-text-embeddings

Issues

Enhancement: JSON Schema defined informally instead of formally

Some bug with streaming=True

tiktoken and non-OpenAI embeddings

Issue: special tokens within JSON generation

What's next?

alexchaomander
Jul 2, 2024
Maintainer

Replies: 7 comments 2 replies

Some bug with `streaming=True`

`tiktoken` and non-OpenAI embeddings