fix: handle KeyError when accessing rules in CleanProcessor.clean #10258

Merged 1 commit on Nov 5, 2024

@pinsily (Contributor) commented Nov 4, 2024

fix: handle KeyError when accessing rules in CleanProcessor.clean

Checklist

  • Please open an issue before creating a PR or link to an existing issue
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

Description

Fixes a KeyError raised when calling the /console/api/datasets/indexing-estimate endpoint. CleanProcessor.clean() looks up process_rule["rules"], but on this code path the argument passed in is already the rules data itself, so the "rules" key does not exist.

The fix changes CleanProcessor.clean() to use the passed rules argument directly instead of extracting it from a process_rule dictionary. This removes the KeyError while staying compatible with existing callers.
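
The failure mode described above can be illustrated with a minimal sketch. It is not the actual Dify implementation: `apply_rules`, `clean_before`, `clean_after`, and the `collapse_spaces` flag are hypothetical names chosen only to show why indexing `["rules"]` on an already-unwrapped rules dict raises, and how using the argument directly avoids it.

```python
from typing import Optional


def apply_rules(text: str, rules: dict) -> str:
    """Hypothetical stand-in for the real cleaning logic."""
    # "collapse_spaces" is an invented flag, used only for illustration.
    if rules.get("collapse_spaces"):
        text = " ".join(text.split())
    return text


def clean_before(text: str, process_rule: dict) -> str:
    # Pre-fix shape (assumed): expects a wrapper dict holding a "rules" key.
    # Raises KeyError when the caller passes the rules dict itself.
    return apply_rules(text, process_rule["rules"])


def clean_after(text: str, rules: Optional[dict]) -> str:
    # Post-fix shape (assumed): treats the argument as the rules themselves
    # and falls back to an empty dict when nothing is passed.
    return apply_rules(text, rules or {})


if __name__ == "__main__":
    rules = {"collapse_spaces": True}
    try:
        clean_before("a   b", rules)
    except KeyError as exc:
        print("KeyError:", exc)
    print(clean_after("a   b", rules))
```

The only behavioural difference in this sketch is where the `"rules"` lookup happens: the pre-fix version performs it inside the cleaner, the post-fix version leaves unwrapping to the caller.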

Fixes #<issue_number>

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

Testing Instructions

  1. API Route Test
# Test the indexing-estimate endpoint
response = client.post("/console/api/datasets/indexing-estimate", 
    json={
        "text": "sample text",
        "rules": {}
    }
)
assert response.status_code == 200
  2. CleanProcessor Test
# Test direct rules passing
result = CleanProcessor.clean("test text", rules={})
assert result == "expected output"

# Test backward compatibility
result = CleanProcessor.clean("test text", {"rules": {}})
assert result == "expected output"

# Test with None rules
result = CleanProcessor.clean("test text", rules=None)
assert result == "test text"
  3. Integration Test
  • Tested with both direct rules passing and process_rule object passing to ensure backward compatibility
  • Verified no KeyError occurs in either case
  • Confirmed text cleaning still works as expected
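
One way a caller could accept both argument shapes exercised in these tests is to normalize the input before cleaning. The helper below is a hedged sketch under that assumption; `resolve_rules` is a hypothetical name and is not part of this PR or of Dify.

```python
from typing import Any, Mapping, Optional


def resolve_rules(arg: Optional[Mapping[str, Any]]) -> dict:
    """Return a plain rules dict from either supported shape.

    Accepts None, a rules dict passed directly, or a wrapper dict of the
    form {"rules": {...}}. Hypothetical helper, for illustration only.
    """
    if not arg:
        # None or an empty dict: no rules to apply.
        return {}
    inner = arg.get("rules")
    if isinstance(inner, Mapping):
        # Wrapper shape: unwrap the nested rules.
        return dict(inner)
    # Direct shape: the argument already is the rules dict.
    return dict(arg)
```

A sketch like this never indexes `["rules"]` unconditionally, so neither shape can trigger a KeyError. It does assume that a real rules dict never contains its own top-level `"rules"` mapping.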

Please test these scenarios to verify the fix works as intended.

@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. 🐞 bug Something isn't working labels Nov 4, 2024
@crazywoola crazywoola requested a review from JohnJyong November 5, 2024 01:37
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 5, 2024
@JohnJyong JohnJyong merged commit 5f21d13 into langgenius:main Nov 5, 2024
6 checks passed
Scorpion1221 added a commit to yybht155/dify that referenced this pull request Nov 5, 2024
* commit '7f583ec1ac6e7fb5269ed696f40a96a8b6b6f5fc': (176 commits)
  chore: update version to 0.11.0 across all relevant files (langgenius#10278)
  fix: iteration none output error (langgenius#10295)
  fix(http_request): improve parameter initialization and reorganize tests (langgenius#10297)
  fix typo: writeOpner to writeOpener (langgenius#10290)
  fix: handle KeyError when accessing rules in CleanProcessor.clean (langgenius#10258)
  fix: borken faq url in CONTRIBUTING.md (langgenius#10275)
  feat: add xAI model provider (langgenius#10272)
  feat(model_runtime): add new model 'claude-3-5-haiku-20241022' (langgenius#10285)
  fix(model_runtime): fix wrong max_tokens for Claude 3.5 Haiku on Amazon Bedrock (langgenius#10286)
  feat(model): add validation for custom disclaimer length (langgenius#10287)
  fix(node): correct file property name in function switch (langgenius#10284)
  refactor the logic of refreshing access_token (langgenius#10068)
  chore: translate i18n files (langgenius#10273)
  Updates: Add mplfonts library for customizing matplotlib fonts and Va… (langgenius#9903)
  feat: Iteration node support parallel mode (langgenius#9493)
  fix(workflow):  handle else condition branch addition error in if-else node (langgenius#10257)
  feat(document_extractor): support tool file in document extractor (langgenius#10217)
  feat: support Claude 3.5 Haiku on Amazon Bedrock (langgenius#10265)
  refactor(parameter_extractor): implement custom error classes (langgenius#10260)
  fix: buitin tool aippt (langgenius#10234)
  ...

# Conflicts:
#	.github/workflows/build-push.yml
#	api/Dockerfile
#	api/core/workflow/nodes/code/code_node.py
Labels
🐞 bug Something isn't working lgtm This PR has been approved by a maintainer size:XS This PR changes 0-9 lines, ignoring generated files.