pytorch/serve

how does `default_response_timeout` work?

Open

#2452 opened on Jul 8, 2023

Labels: documentation, good first issue, triaged

Description

📚 The doc issue

I set `default_response_timeout` to 4, i.e. 4 seconds. Roughly 4 seconds after the model starts loading, the worker fails with:

org.pytorch.serve.wlm.WorkerInitializationException: Backend worker did not respond in given time

My guess is that the worker gets killed because the model takes more than 4 seconds to load. Is there a way to set a larger initial delay, i.e. to differentiate these two scenarios:

  • allow the initial model load a timeout that is different from default_response_timeout
  • if the model doesn't respond within default_response_timeout after the initial load, kill the worker
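
To make the requested behavior concrete, here is a minimal sketch in Python of the two-phase timeout I have in mind. It is only an illustration of the idea, not how TorchServe implements worker timeouts (the server is written in Java). `WorkerTimeouts`, `startup_timeout_s`, `budget_for` and `should_kill_worker` are hypothetical names, and the 60-second startup budget is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass
class WorkerTimeouts:
    # Hypothetical setting: time budget for the initial model load.
    startup_timeout_s: float
    # Mirrors the role of default_response_timeout from this issue.
    response_timeout_s: float

    def budget_for(self, model_loaded: bool) -> float:
        """Return the timeout that applies in the worker's current phase."""
        if model_loaded:
            return self.response_timeout_s
        return self.startup_timeout_s


def should_kill_worker(timeouts: WorkerTimeouts,
                       model_loaded: bool,
                       elapsed_s: float) -> bool:
    """Kill the worker only when it exceeds the budget for its phase."""
    return elapsed_s > timeouts.budget_for(model_loaded)


# Assumed values: 60 s to load the model, 4 s per response afterwards.
timeouts = WorkerTimeouts(startup_timeout_s=60.0, response_timeout_s=4.0)
```

With these values, a worker that has spent 10 seconds loading the model would be left alone, while a loaded worker that takes 10 seconds to answer a request would be killed.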

Suggest a potential alternative/fix

No response
