Langchain_Community: SQL LanguageParser #28430

Open: anushak18 wants to merge 18 commits into base: master

Conversation

anushak18

Description

(This PR has contributions from @khushiDesai, @ashvini8, and @ssumaiyaahmed).

This PR addresses Issue #11229, which requests SQL support in document parsing. The support is integrated into the generic TreeSitter parsing library, allowing LangChain users to split SQL codebases into smaller, manageable "documents."

This pull request adds a new SQLSegmenter class, which provides the SQL integration.

Issue

Issue #11229: Add support for a variety of languages to LanguageParser

Testing

We added a file, test_sql.py, with several tests to ensure that SQLSegmenter works as intended:

  • test_is_valid: checks SQL validity.
  • test_extract_functions_classes: checks extraction of individual SQL statements.
  • test_simplify_code: checks simplification of SQL code that contains comments.
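
For readers who want a feel for what these three tests exercise, here is a minimal, self-contained sketch. It is not the code in this PR: `ToySQLSegmenter` and `SAMPLE` are hypothetical names, and the toy class splits on semicolons with plain string handling instead of a TreeSitter parse. It only mirrors the three-method shape (`is_valid`, `extract_functions_classes`, `simplify_code`) that LangChain code segmenters expose, and the placeholder comment format in `simplify_code` is an assumption.

```python
import re
from typing import List

# Hypothetical sample input, used only for illustration.
SAMPLE = """-- users table
CREATE TABLE users (id INT, name TEXT);
SELECT name FROM users WHERE id = 1;
"""


class ToySQLSegmenter:
    """Hypothetical stand-in for SQLSegmenter (not the PR's implementation)."""

    def __init__(self, code: str) -> None:
        self.code = code

    def _statements(self) -> List[str]:
        # Drop "--" line comments, then split on ";". This is naive: a ";" or
        # "--" inside a string literal would be mishandled.
        no_comments = re.sub(r"--[^\n]*", "", self.code)
        return [s.strip() + ";" for s in no_comments.split(";") if s.strip()]

    def is_valid(self) -> bool:
        # Rough check: every statement starts with a known SQL keyword.
        keywords = ("SELECT", "INSERT", "UPDATE", "DELETE", "CREATE", "DROP", "ALTER")
        stmts = self._statements()
        return bool(stmts) and all(s.upper().startswith(keywords) for s in stmts)

    def extract_functions_classes(self) -> List[str]:
        # For SQL, the extracted "chunks" are individual statements.
        return self._statements()

    def simplify_code(self) -> str:
        # Replace each statement with a one-line placeholder comment.
        return "\n".join(
            f"-- Code for: {s.splitlines()[0]}" for s in self._statements()
        )


def test_is_valid() -> None:
    assert ToySQLSegmenter(SAMPLE).is_valid()
    assert not ToySQLSegmenter("not sql at all").is_valid()


def test_extract_functions_classes() -> None:
    assert ToySQLSegmenter(SAMPLE).extract_functions_classes() == [
        "CREATE TABLE users (id INT, name TEXT);",
        "SELECT name FROM users WHERE id = 1;",
    ]


def test_simplify_code() -> None:
    simplified = ToySQLSegmenter(SAMPLE).simplify_code()
    assert simplified.splitlines() == [
        "-- Code for: CREATE TABLE users (id INT, name TEXT);",
        "-- Code for: SELECT name FROM users WHERE id = 1;",
    ]
```

The test functions follow pytest naming, so saving the block to a file and running `pytest` on it should collect all three. A parser-based segmenter would handle cases this sketch gets wrong, such as semicolons inside string literals.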


vercel bot commented Nov 30, 2024


Name: langchain · Status: ✅ Ready · Updated (UTC): Nov 30, 2024 4:01pm

dosubot bot added the following labels on Nov 30, 2024:
  • size:L (This PR changes 100-499 lines, ignoring generated files.)
  • community (Related to langchain-community)
  • Ɑ: doc loader (Related to document loader module, not documentation)
Projects
Status: Needs support
4 participants