Add retry on model loading. Expose option to set model retry count #308

Merged
merged 2 commits into from
Jan 5, 2024

Conversation

GuanLuo
Contributor

@GuanLuo GuanLuo commented Jan 3, 2024

No description provided.

@GuanLuo GuanLuo requested review from Tabrizian and kthui January 3, 2024 19:43
@@ -1282,6 +1282,12 @@ class PyServerOptions : public PyWrapper<struct TRITONSERVER_ServerOptions> {
triton_object_, thread_count));
}

void SetModelLoadRetryCount(unsigned int retry_count)
Contributor

Unrelated, but do you think we could have some kind of test that asserts the python bindings fully cover the C API?

I'm thinking long term, folks may not always remember to add the binding equivalent for any API changes, and it might be good to automate that check.
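One way such a coverage check could look, as a rough sketch only: scan the C header text for function names with a given prefix and report those whose suffix never appears as an identifier in the binding source. The helper name `MissingBindings` is hypothetical, and the naming convention (binding method name equals the C name minus its prefix) is an assumption drawn from the `SetModelLoadRetryCount` method in this diff, not a documented rule.

```cpp
#include <cassert>
#include <regex>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not part of Triton: returns the C API function names
// found in `header_text` that start with `prefix` and whose remaining suffix
// does not occur as an identifier anywhere in `binding_text`.
inline std::vector<std::string> MissingBindings(
    const std::string& header_text, const std::string& binding_text,
    const std::string& prefix)
{
  assert(!prefix.empty());
  const std::regex ident_re("[A-Za-z_][A-Za-z0-9_]*");

  // Collect every identifier that appears in the binding source.
  std::set<std::string> binding_idents;
  for (std::sregex_iterator
           it(binding_text.begin(), binding_text.end(), ident_re), end;
       it != end; ++it) {
    binding_idents.insert(it->str());
  }

  std::set<std::string> seen;
  std::vector<std::string> missing;
  for (std::sregex_iterator
           it(header_text.begin(), header_text.end(), ident_re), end;
       it != end; ++it) {
    const std::string name = it->str();
    // Keep only identifiers strictly longer than the prefix, so the bare
    // type name (for example the options struct itself) is skipped.
    if (name.size() <= prefix.size() ||
        name.compare(0, prefix.size(), prefix) != 0) {
      continue;
    }
    if (!seen.insert(name).second) {
      continue;  // already checked this function name
    }
    if (binding_idents.count(name.substr(prefix.size())) == 0) {
      missing.push_back(name);
    }
  }
  return missing;
}
```

A test could read the header and the binding source into strings, call the helper once per object prefix, and fail if the result is non-empty. Functions that are deliberately not bound one-to-one (constructors and destructors, for instance) would need an explicit allowlist.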

CreateModel(model_id, version, model_info, is_config_provided);
// Model state will be changed to NOT loading if failed to load,
// so the model is loaded if state is LOADING.
if (model_info->state_ == ModelReadyState::LOADING) {
Contributor

Question: is there a pause between retry attempts, and is one needed?

Contributor Author

Why would one be needed?

Contributor

@nnshah1 nnshah1 Jan 3, 2024

Typically you retry after some delay, rather than immediately, so that a possible transient error has time to resolve. I'm not familiar with the underlying issue reported here, though, so I'm mainly wondering whether a pause makes sense in this case.
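To make the two options in this thread concrete, here is a minimal sketch of a retry loop with an optional pause between attempts. It is not the code in this PR: `LoadWithRetry`, `RetryResult`, and the `delay` parameter are hypothetical, and treating the retry count as "one initial attempt plus up to N retries" is an assumption. A zero delay gives an immediate retry.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical result type for the sketch below.
struct RetryResult {
  bool loaded;            // true if some attempt succeeded
  unsigned int attempts;  // total number of attempts made
};

// Hypothetical helper, not Triton code: calls `load_once` until it returns
// true or until `retry_count` retries have been used after the first
// attempt. Sleeps for `delay` between attempts when `delay` is positive.
inline RetryResult LoadWithRetry(
    const std::function<bool()>& load_once, unsigned int retry_count,
    std::chrono::milliseconds delay = std::chrono::milliseconds(0))
{
  assert(load_once && "load_once must be callable");
  RetryResult result{false, 0};
  for (unsigned int attempt = 0;; ++attempt) {
    ++result.attempts;
    if (load_once()) {
      result.loaded = true;
      break;
    }
    if (attempt == retry_count) {
      break;  // retries exhausted, no further attempt follows
    }
    // Pause only when another attempt will follow.
    if (delay.count() > 0) {
      std::this_thread::sleep_for(delay);
    }
  }
  return result;
}
```

Whether a non-zero delay helps depends on the failure being retried: a pause only pays off if the error can resolve with time.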

@@ -1978,6 +1978,15 @@ TRITONSERVER_DECLSPEC struct TRITONSERVER_Error*
TRITONSERVER_ServerOptionsSetModelLoadThreadCount(
struct TRITONSERVER_ServerOptions* options, unsigned int thread_count);

/// Set the number of retry to load a model in a server options.
Member

It would be good to mention what the default is.

Comment on lines +56 to +62
const double min_compute_capability;
// The backend configuration settings specified on the command-line
const triton::common::BackendCmdlineConfigMap& backend_cmdline_config_map_;
const triton::common::BackendCmdlineConfigMap& backend_cmdline_config_map;
// The host policy setting used when loading models.
const triton::common::HostPolicyCmdlineConfigMap& host_policy_map_;
const triton::common::HostPolicyCmdlineConfigMap& host_policy_map;
// Number of the threads to use for concurrently loading models
const unsigned int model_load_thread_count_;
const unsigned int model_load_thread_count;
Contributor

Nice refactoring!

@GuanLuo GuanLuo merged commit 3b97b2f into main Jan 5, 2024
1 check passed
@GuanLuo GuanLuo deleted the gluo-reload branch January 5, 2024 22:47
nnshah1 pushed a commit that referenced this pull request Jan 11, 2024
)

* Group model repository files

* Expose option to set model retry count
nnshah1 added a commit that referenced this pull request Jan 11, 2024