WebSummarizer is a LangChain Web Summarization project that aims to optimize your web searches based on a specific goal. Instead of just presenting you with a list of search results, the program employs AI to summarize content from multiple web sources. It then ranks and selects the summary that best aligns with your predefined search goal, giving you more targeted and useful information.
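The selection step described above can be pictured with a small sketch. It assumes each summary has already been given a numeric relevance score against the search goal. The `Summary` type, the `pick_best` name, and the score scale are illustrative assumptions, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Summary:
    """One summarized web page (hypothetical shape, for illustration only)."""
    url: str
    text: str
    score: float  # assumed: relevance to the search goal, higher is better


def pick_best(summaries: Sequence[Summary], min_score: float = 0.0) -> Optional[Summary]:
    """Return the highest-scoring summary at or above min_score, or None."""
    eligible = [s for s in summaries if s.score >= min_score]
    return max(eligible, key=lambda s: s.score, default=None)


candidates = [
    Summary("https://example.com/a", "General overview.", 4.0),
    Summary("https://example.com/b", "Directly answers the goal.", 9.0),
]
best = pick_best(candidates, min_score=5.0)
print(best.url if best else "no summary met the threshold")
```

Returning `None` when nothing clears the threshold lets the caller decide whether to widen the search or report that no source matched the goal.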
- M1 chip with 16GB RAM should suffice
- CUDA-compatible GPU is optional but can enhance performance
- At least 16GB RAM
Zephyr is a fine-tuned Mistral 7B model that is relatively lightweight, so even systems without a dedicated GPU should be able to run it without significant performance issues.
To utilize Google Search API through LangChain, the project requires you to have a Google account, an API key, and a Custom Search Engine ID (CSE ID).
- Visit the Google Cloud Console.
- Create a new project or select an existing one.
- Navigate to APIs & Services > Credentials.
- Click on Create Credentials and choose API Key.
- Visit Google Custom Search.
- Create a new search engine and configure it according to your needs.
- After creation, you will find your Custom Search Engine ID (CSE ID) on the setup page.
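One way to sanity-check the API key and CSE ID outside LangChain is to build a request to Google's Custom Search JSON API by hand. The sketch below only constructs the URL and does not send it. `build_search_url` is a hypothetical helper, not part of this project, and the placeholder credentials must be replaced with your own.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API.
CUSTOM_SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key: str, cse_id: str, query: str, n_results: int = 5) -> str:
    """Build a Custom Search request URL (hypothetical helper, for illustration)."""
    if not 1 <= n_results <= 10:
        # The API returns at most 10 results per request.
        raise ValueError("n_results must be between 1 and 10")
    params = {"key": api_key, "cx": cse_id, "q": query, "num": n_results}
    return f"{CUSTOM_SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url("your_api_key", "your_cse_id", "langchain summarization", 3)
print(url)
```

Opening the printed URL in a browser with real credentials should return a JSON response if both values are valid.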
- Visit the ollama GitHub repository.
- Follow the installation instructions provided in the README file.
Install the required packages by running the following command:
pip install -r requirements.txt
Create a .env file in the root directory of the project and add the following variables:
GOOGLE_CSE_ID=your_cse_id
GOOGLE_API_KEY=your_api_key
# Optionally, you can specify other environment variables like LLM_MODEL, MIN_SCORE, and PROMPT_TEMPLATE
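As a rough illustration of how these variables might be consumed, the sketch below validates the two required keys and applies fallbacks for two of the optional ones. It assumes the .env file has already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`). The `load_settings` helper and the fallback values `"zephyr"` and `0` are assumptions, and the project's real defaults may differ.

```python
import os
from typing import Dict, Mapping

REQUIRED_KEYS = ("GOOGLE_CSE_ID", "GOOGLE_API_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Collect settings from environment variables (illustrative helper)."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return {
        "google_cse_id": env["GOOGLE_CSE_ID"],
        "google_api_key": env["GOOGLE_API_KEY"],
        # Assumed fallbacks; the project's real defaults may differ.
        "llm_model": env.get("LLM_MODEL", "zephyr"),
        "min_score": float(env.get("MIN_SCORE", "0")),
    }


settings = load_settings({"GOOGLE_CSE_ID": "your_cse_id", "GOOGLE_API_KEY": "your_api_key"})
print(settings["llm_model"], settings["min_score"])
```

Failing early with a clear message when a required key is absent is usually easier to debug than an authentication error raised later by the search call.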
Execute the main program with the following command:
python main.py
- parse_string_to_json_v2(input_str: str) -> dict: Converts a string containing key-value pairs formatted as 'key="value"', possibly embedded within other text, into a dictionary.
- fetch_and_summarize(n_results: int, search_query: str, search_objective: str) -> Tuple[dict, str]: Conducts a web search based on the given query and number of results, summarizes relevant content using large language models, and ranks the best match according to the specificity of your search goal.
- main(n_results: int, search_query: str, search_objective: str) -> dict: Accepts search parameters as arguments and calls fetch_and_summarize() to perform the goal-based web search and summarization. Returns a dictionary containing the results.
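To make the key="value" parsing concrete, here is a minimal sketch of one way such extraction could work using a regular expression. It is not the project's implementation of parse_string_to_json_v2. The name `parse_key_value_pairs`, the tolerance for whitespace around `=`, and the sample input are assumptions made for illustration.

```python
import re

# Matches key="value" where key is a word and value contains no double quote.
PAIR_PATTERN = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_key_value_pairs(input_str: str) -> dict:
    """Extract every key="value" pair found anywhere in input_str."""
    return {key: value for key, value in PAIR_PATTERN.findall(input_str)}


raw = 'Sure! Here is the result: summary="LangChain chains LLM calls" score="8" Hope it helps.'
parsed = parse_key_value_pairs(raw)
print(parsed)
```

All values come back as strings, so a numeric field such as the score in this example would still need converting before it can be compared. This simple pattern also does not handle escaped quotes inside a value.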
To run the unit tests, navigate to the test directory and execute the test files using unittest. For example:
python -m unittest test_main.py
To run the functional tests, navigate to the test directory and execute the functional test file using unittest. For example:
python -m unittest test_functional.py
Note: the functional tests require a working internet connection and a valid GOOGLE_API_KEY and GOOGLE_CSE_ID in the .env file. Test results may vary depending on the search results Google returns at the time of testing.
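Because these tests depend on credentials, one option is to skip them automatically when the variables are not set. The sketch below shows that pattern with unittest's `skipUnless` decorator. The class name and the placeholder result are hypothetical, and the sketch assumes the .env values have already been exported into the environment.

```python
import os
import unittest

REQUIRED_KEYS = ("GOOGLE_API_KEY", "GOOGLE_CSE_ID")
HAVE_CREDENTIALS = all(os.getenv(key) for key in REQUIRED_KEYS)


@unittest.skipUnless(HAVE_CREDENTIALS, "GOOGLE_API_KEY and GOOGLE_CSE_ID are not set")
class FunctionalSmokeTest(unittest.TestCase):
    def test_result_shape(self):
        # Placeholder value: a real test would call main() and inspect its result.
        # Assert on structure rather than exact text, since live results change.
        result = {"summary": "placeholder"}
        self.assertIsInstance(result, dict)
        self.assertTrue(result)
```

Checking the shape of the result instead of its exact wording keeps the test stable even though the underlying search results change over time.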
For more details or issues, feel free to contact the maintainers.
By using this software, you are agreeing to the terms and conditions as defined by the license.