Replies: 2 comments
-
Hi @sunnyl95, are you running this in Google Colab, a local Jupyter notebook, or a local terminal? Can you also open an issue for this to help track it? I will look into it and update you.
-
I'm running into this when using a fairly large model (gpt-j) from Hugging Face Transformers. I can see from my RAM usage that my prediction process exits (the model needs 40 GB of RAM, and RAM usage drops sharply right after I send the POST). I then get a "Service Unavailable" message back fairly quickly when doing a POST to /predict, faster than the prediction could plausibly complete. I'm running this on a local machine.
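For reference, a minimal sketch of the kind of request that triggers this, assuming a local dev server on port 5000 and a single text field in the payload (the URL, port, and payload shape are assumptions, not confirmed details of the service):

```python
import requests

# Sketch only: endpoint and payload are assumptions.
# Symptom described above: the worker process dies (likely OOM while
# serving gpt-j), and the POST returns 503 almost immediately
# instead of a prediction.
resp = requests.post(
    "http://127.0.0.1:5000/predict",
    json={"text": "Hello, world"},
)
print(resp.status_code)  # e.g. 503
print(resp.text)         # "Service Unavailable"
```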
-
Following the official quick-start tutorial (https://colab.research.google.com/github/bentoml/BentoML/blob/master/guides/quick-start/bentoml-quick-start-guide.ipynb#scrollTo=1OQ3Lkq2gX32), when I execute this step:

```python
# Start a dev model server to test out everything
iris_classifier_service.start_dev_server()
```

the output is as follows:
[2021-05-26 15:47:06,085] INFO - BentoService bundle 'IrisClassifier:20210526154705_50EE91' created at: /tmp/tmptlpbiag3
[2021-05-26 15:47:06,090] INFO - ======= starting dev server on port: 5001 =======
This does not match the output shown in the tutorial. When I then use requests to call the service and print the result, I get:

Service Unavailable

This suggests the service has not started successfully. Has anyone else encountered the same situation, or does anyone know how to solve it? Thank you very much!
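For context, here is a minimal sketch of the request I am making, assuming the Iris quick-start's JSON list-of-features input format and the port 5001 printed in the log above (the payload shape is an assumption based on the tutorial, not verified here):

```python
import requests

# Sketch: port 5001 comes from the dev-server log above; the payload is
# the Iris example's expected input, a list of 4-feature rows.
response = requests.post(
    "http://127.0.0.1:5001/predict",
    json=[[5.1, 3.5, 1.4, 0.2]],
)
print(response.status_code)  # 503 here, instead of a prediction
print(response.text)         # "Service Unavailable"
```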