kNNGen

Experimenting with some of the ideas in this paper:

Generalization through Memorization: Nearest Neighbor Language Models

Later, it might also incorporate ideas from this paper:

Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens
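
For orientation, here is a minimal sketch of the core idea from the kNN-LM paper: the model's next-token distribution is interpolated with a distribution built from nearest neighbors retrieved from a datastore. This is not this repository's code. The function names, the `temperature` parameter, and the default `lam` value are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def knn_distribution(neighbors, temperature=1.0):
    """Turn retrieved neighbors into a next-token distribution.

    `neighbors` is a list of (distance, next_token) pairs; closer
    neighbors get more weight via a softmax over negative distances.
    """
    if not neighbors:
        return {}
    # Subtract the smallest distance for numerical stability.
    min_d = min(d for d, _ in neighbors)
    weights = defaultdict(float)
    for d, token in neighbors:
        weights[token] += math.exp(-(d - min_d) / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}


def interpolate(p_lm, p_knn, lam=0.25):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm."""
    vocab = set(p_lm) | set(p_knn)
    return {
        token: lam * p_knn.get(token, 0.0) + (1 - lam) * p_lm.get(token, 0.0)
        for token in vocab
    }
```

For example, `interpolate(p_lm, knn_distribution(neighbors))` would give the final distribution for one decoding step, where `p_lm` is the base model's probabilities as a token-to-probability dict.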


Setup

Requires Docker.

Install and run Milvus in standalone mode using the official install script:

# Download the installation script
$ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh

# Start the Docker container
$ bash standalone_embed.sh start

Optionally, run milvus_test.py to verify that Milvus started correctly.
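
If you just want a quick check that the container is listening, something like the following sketch should work. It is separate from milvus_test.py and only tests that a TCP connection can be opened. It assumes Milvus is exposed on its default port 19530 on localhost; `milvus_port_open` is a hypothetical helper name.

```python
import socket


def milvus_port_open(host="localhost", port=19530, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    19530 is the default Milvus port; adjust it if your container
    maps a different one. This does not verify the Milvus API itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling `milvus_port_open()` after `bash standalone_embed.sh start` should return `True` once the container has finished starting.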

Create a .env file, and inside of it add your HuggingFace API token, like so:

HF_TOKEN=your_hugging_face_api_token_here

Or add the equivalent to your system's environment variables.
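
As an illustration of how the token might be picked up at runtime, here is a minimal standard-library sketch that prefers the environment variable and falls back to the .env file. The project itself may load the token differently (for example with a dotenv library); `load_hf_token` is a hypothetical helper, not part of this repository.

```python
import os
from pathlib import Path


def load_hf_token(env_path=".env"):
    """Return HF_TOKEN from the environment, else from a .env file, else None."""
    token = os.environ.get("HF_TOKEN")
    if token:
        return token

    path = Path(env_path)
    if not path.is_file():
        return None

    for line in path.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "HF_TOKEN":
            # Drop optional surrounding quotes.
            return value.strip().strip("'\"") or None
    return None
```

Keep the .env file out of version control so the token is not committed by accident.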