pytorch/serve

Confused about Cumulative Inference Duration vs. PredictionTime

Open

#1,698 opened on Jun 20, 2022

help wanted, question

Description

📚 The doc issue

I am running a model on TorchServe and want to measure how long inference takes. With logging enabled, the logs show a value called PredictionTime: [image]

However, when I query the Metrics API, I get a value called "Cumulative Inference Duration": [image]

The two values are very different, so I am not sure which one I should use to measure the total inference time for my requests.
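One possible explanation, not confirmed in this issue, is that the Metrics API value is a running total over all requests while PredictionTime is logged per request. Under that assumption, dividing the cumulative duration by the request count gives a per-request average that can be compared with the logged values. The sketch below parses Prometheus-style text output. The metric names `ts_inference_latency_microseconds` and `ts_inference_requests_total`, the microsecond unit, and the sample values are assumptions for illustration; check them against your own Metrics API output.

```python
import re

# Hypothetical sample of Prometheus text output. The metric names, labels,
# and numbers are illustrative assumptions, not copied from a real server.
SAMPLE = """\
# HELP ts_inference_latency_microseconds Cumulative inference duration in microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{model_name="mymodel",model_version="default"} 1500000.0
# HELP ts_inference_requests_total Total number of inference requests.
# TYPE ts_inference_requests_total counter
ts_inference_requests_total{model_name="mymodel",model_version="default"} 30.0
"""

# Matches "<name>{optional labels} <value>" sample lines.
_SAMPLE_LINE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)")


def parse_metric(text, name):
    """Sum every sample of `name` found in Prometheus text output."""
    total = 0.0
    found = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_LINE.match(line)
        if match and match.group(1) == name:
            total += float(match.group(3))
            found = True
    if not found:
        raise KeyError(name)
    return total


def mean_latency_ms(
    text,
    latency_metric="ts_inference_latency_microseconds",  # assumed name
    count_metric="ts_inference_requests_total",  # assumed name
):
    """Average latency per request in milliseconds (cumulative / count)."""
    total_us = parse_metric(text, latency_metric)
    requests = parse_metric(text, count_metric)
    if requests == 0:
        return 0.0
    return total_us / requests / 1000.0


if __name__ == "__main__":
    print(f"mean latency: {mean_latency_ms(SAMPLE):.1f} ms")
```

To run this against a live server, replace `SAMPLE` with the text returned by the Metrics API endpoint (for example the body of a GET to `/metrics` on the metrics port of your deployment). Note that `parse_metric` sums across all label sets, so with several models loaded the average mixes them unless you filter by model first.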

By the way, the logs also contain a value called HandlerTime: [image]

What does it mean? Where can I find documentation on what each of these metrics means?

Thanks,

Suggest a potential alternative/fix

No response