Think Models

Model

The specification of the model to run. In general, a HuggingFace handle, such as mistralai/Ministral-8B-Instruct-2410

ModelInstance

A dedicated instance of a model, created by a user. This will be associated with a name, and user-specified options (e.g. size, arguments, ...)

The set of resources allocated for a specific instance, for instance 1-b200-27c-240g, with the most important of it being the GPU used (assigned in a static and isolated way, e.g. each GPU resource could be used by at most one model instance).

Shared Models

Shared Models are managed by evroc. With Shared Models you're up and running in no time with reliable and performant inference. All the Shared Models have OpenAI API compatible endpoints.

evroc User Documentation

Think Models

Model

ModelInstance

ModelInstance size

Shared Models