Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TrussServer supports request/repsonse #1148

Merged
merged 9 commits into from
Sep 19, 2024
Merged

TrussServer supports request/repsonse #1148

merged 9 commits into from
Sep 19, 2024

Conversation

marius-baseten
Copy link
Contributor

@marius-baseten marius-baseten commented Sep 17, 2024

🚀 What

Support access to requests object in truss model and allow returning response objects. This follows the internal proposal.

Includes:

  • Various load/predict-time validations of the model's methods' signatures. New ModelDefinitionError is raised.
  • Consolidate truss_server.py inference_server.py and model_wrapper.py: truss_server contains the actual server implementation, as an application entrypoint a super shallow main.py file is used.
  • Move a lot of runtime inspections into ModelDescriptor that is just initialized once and contains information whether methods are present and sync/async etc.

Side quests:

  • Fix warnings / deprecations in docker files.
  • Suppress pydantic v1/v2 warning, because we consciously have to support v1.
  • Fix/silence other warnings.
  • Replace intercept_exceptions function wrapper with contextmanager so it can be easier used on non-function code sections.
  • Filter traceback to only include frames from model.py and below - if present, otherwise complete stack.
  • Remove boilerplate form tests.
  • Use importlib.util.module_from_spec instead of importlib.import_module because it has side effects in unit/integration tests.
  • Fix skipped tests for model_wrapper loading. - Root cause for prior failure was import side effects (previous point).
  • Revive ignored test_model_wrapper_streaming_timeout that had missing @pytest.mark.anyio.
  • Unify the stream termination with a sentinel object (we had a few competing variants before).

This PR includes changes from #1103, which would be obsolete.

💻 How

🔬 Testing

@marius-baseten marius-baseten force-pushed the marius/requests branch 3 times, most recently from 2c10e41 to eafad4e Compare September 18, 2024 00:24
Copy link
Collaborator

@squidarth squidarth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is AWESOME. Very excited for these changes -- particularly enabling SSE (which we'll get on day 1), and the stacktrace change (this has been bothering me for a long time, glad we finally got around to it).

took a first pass w a couple questions, will take a closer look later

truss/templates/server/common/errors.py Outdated Show resolved Hide resolved
truss/templates/server/common/errors.py Outdated Show resolved Hide resolved
truss/templates/server/model_wrapper.py Outdated Show resolved Hide resolved
truss/templates/server/model_wrapper.py Show resolved Hide resolved
truss/tests/test_model_inference.py Show resolved Hide resolved
@@ -974,3 +874,275 @@ def enable_gpu_fn(conf):
# But make sure traces have parents at all.
assert len(user_parents) > 3
assert len(truss_parents) > 3


# Returning Response Objects ###########################################################
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thought -- do you think it makes sense to have an integration test for the stack trace we made in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was too lazy to lookup the traceback API and the initial implementation was totally bloated by chat GPT - I had this feeling that it could be simplified, now it's pretty straightforward.

But adding a test would be good too!

@marius-baseten marius-baseten force-pushed the marius/requests branch 5 times, most recently from aa764b8 to ef153e4 Compare September 19, 2024 19:16
Copy link
Collaborator

@squidarth squidarth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! not sure what these junit test report failures are

truss/tests/test_model_inference.py Show resolved Hide resolved
@marius-baseten marius-baseten merged commit b6f8959 into main Sep 19, 2024
15 checks passed
@marius-baseten marius-baseten deleted the marius/requests branch September 19, 2024 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants