Add retry on model loading. Expose option to set model retry count #308
Conversation
@@ -1282,6 +1282,12 @@ class PyServerOptions : public PyWrapper<struct TRITONSERVER_ServerOptions> {
        triton_object_, thread_count));
  }

  void SetModelLoadRetryCount(unsigned int retry_count)
Unrelated, but do you think we could have some kind of test that asserts the python bindings fully cover the C API?
I'm thinking long term, folks may not always remember to add the binding equivalent for any API changes, and it might be good to automate that check.
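For context on what the new binding likely looks like beyond the signature shown above, here is a minimal sketch that mirrors the existing SetModelLoadThreadCount wrapper. The C API name TRITONSERVER_ServerOptionsSetModelLoadRetryCount and the ThrowIfError helper are assumptions inferred from the surrounding code, not confirmed by this hunk.

  // Hypothetical sketch only; mirrors the neighboring setter wrappers.
  // Assumes the new C API entry point is named
  // TRITONSERVER_ServerOptionsSetModelLoadRetryCount and that ThrowIfError is
  // the error-handling helper used by the other PyServerOptions methods.
  void SetModelLoadRetryCount(unsigned int retry_count)
  {
    ThrowIfError(TRITONSERVER_ServerOptionsSetModelLoadRetryCount(
        triton_object_, retry_count));
  }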
    CreateModel(model_id, version, model_info, is_config_provided);
    // The model state is changed away from LOADING if the load failed,
    // so the model loaded successfully if the state is still LOADING.
    if (model_info->state_ == ModelReadyState::LOADING) {
Question: is there a pause between retry attempts, or is one needed?
Why would one be needed?
Typically you retry after some delay, instead of retrying immediately, to give a possible transient error time to resolve. I'm not familiar with the underlying issue reported here, though, so I'm mainly wondering whether that makes sense in this case.
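To make the suggestion concrete, here is a generic retry-with-delay sketch, not the PR's actual implementation; the helper name, the fixed delay, and the callable-based interface are all illustrative assumptions.

#include <chrono>
#include <functional>
#include <thread>

// Illustrative helper: invoke `attempt` up to 1 + max_retries times, sleeping
// `delay` between attempts so a transient failure has time to clear.
// Returns true as soon as an attempt reports success.
bool
RetryWithDelay(
    const std::function<bool()>& attempt, unsigned int max_retries,
    std::chrono::milliseconds delay)
{
  for (unsigned int i = 0; i <= max_retries; ++i) {
    if (attempt()) {
      return true;
    }
    if (i < max_retries) {
      std::this_thread::sleep_for(delay);
    }
  }
  return false;
}

In the hunk above, `attempt` would roughly correspond to calling CreateModel and then checking whether model_info->state_ is still ModelReadyState::LOADING.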
@@ -1978,6 +1978,15 @@ TRITONSERVER_DECLSPEC struct TRITONSERVER_Error*
TRITONSERVER_ServerOptionsSetModelLoadThreadCount(
    struct TRITONSERVER_ServerOptions* options, unsigned int thread_count);

/// Set the number of retries when loading a model in a server options object.
It would be good to mention what the default is.
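As an illustration, the documentation could spell out the default along these lines. The exact signature and the default of 0 (no retries) are assumptions for the sketch, not taken from this hunk.

/// Set the number of retries when loading a model in a server options object.
/// Default is 0, i.e. a failed model load is not retried.
/// (Signature and default shown here are assumptions for illustration.)
///
/// \param options The server options object.
/// \param retry_count The number of retries allowed for a failed load.
/// \return a TRITONSERVER_Error indicating success or failure.
TRITONSERVER_DECLSPEC struct TRITONSERVER_Error*
TRITONSERVER_ServerOptionsSetModelLoadRetryCount(
    struct TRITONSERVER_ServerOptions* options, unsigned int retry_count);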
  const double min_compute_capability;
  // The backend configuration settings specified on the command-line
- const triton::common::BackendCmdlineConfigMap& backend_cmdline_config_map_;
+ const triton::common::BackendCmdlineConfigMap& backend_cmdline_config_map;
  // The host policy setting used when loading models.
- const triton::common::HostPolicyCmdlineConfigMap& host_policy_map_;
+ const triton::common::HostPolicyCmdlineConfigMap& host_policy_map;
  // Number of threads to use for concurrently loading models
- const unsigned int model_load_thread_count_;
+ const unsigned int model_load_thread_count;
Nice refactoring!