Using Models
Cortex’s Models API is compatible with OpenAI’s Models endpoint, which it forks for model management. In addition, Cortex exposes lower-level operations, such as downloading a model from a model hub and loading a model.
Model Operation
Model Operation allows you to pull, run, and stop models.
Run Model
curl --request POST \
  --url http://localhost:39281/v1/models/mistral/start \
  --header 'Content-Type: application/json' \
  --data '{
    "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant",
    "stop": [],
    "ngl": 4096,
    "ctx_len": 4096,
    "cpu_threads": 10,
    "n_batch": 2048,
    "caching_enabled": true,
    "grp_attn_n": 1,
    "grp_attn_w": 512,
    "mlock": false,
    "flash_attn": true,
    "cache_type": "f16",
    "use_mmap": true,
    "engine": "llamacpp"
  }'
cortex models run <model_id>
Stop Model
curl --request POST \
  --url http://localhost:39281/v1/models/mistral/stop
cortex models stop <model_id>
Pull Model
curl --request POST \
  --url http://localhost:39281/v1/models/mistral/pull
# Download a built-in model
cortex models pull mistral
# Download a specific variant
cortex models pull bartowski/Hermes-2-Theta-Llama-3-70B-GGUF
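The pull, start, and stop requests above share one URL pattern, so they can be wrapped in a small helper. The following is a minimal sketch and not part of Cortex: the function names `cortex_model_url` and `cortex_model_op` and the `CORTEX_BASE_URL` variable are hypothetical, the default address is the one used in the examples on this page, and it assumes all three operations live under the `/v1` prefix.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Cortex) wrapping the endpoints shown above.

# Base address; defaults to the host and port used in this page's examples.
CORTEX_BASE_URL="${CORTEX_BASE_URL:-http://localhost:39281}"

# Print the URL for a model operation. Only pull, start, and stop are accepted.
cortex_model_url() {
  model_id="$1"
  operation="$2"
  case "$operation" in
    pull|start|stop) ;;
    *) echo "unsupported operation: $operation" >&2; return 1 ;;
  esac
  printf '%s/v1/models/%s/%s\n' "$CORTEX_BASE_URL" "$model_id" "$operation"
}

# Send the POST request for the given model and operation.
# An optional third argument is a JSON body (for example, start parameters).
cortex_model_op() {
  url="$(cortex_model_url "$1" "$2")" || return 1
  if [ -n "${3:-}" ]; then
    curl --fail --silent --show-error --request POST \
      --url "$url" \
      --header 'Content-Type: application/json' \
      --data "$3"
  else
    curl --fail --silent --show-error --request POST --url "$url"
  fi
}
```

With a Cortex server running, `cortex_model_op mistral pull` followed by `cortex_model_op mistral start '{"ctx_len": 4096, "engine": "llamacpp"}'` would send the same kind of requests as the curl commands above, here with a shortened start body chosen only for illustration.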
Model Management
Model Management allows you to manage your local models, which can be found in ~users/user_name/cortex/models.
List Models
curl --request GET \
  --url http://localhost:39281/v1/models
cortex models list
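Because the Models API is described as compatible with OpenAI's Models endpoint, the list response presumably follows OpenAI's list shape, with each model carried as an object with an `id` field inside a `data` array. Under that assumption, the sketch below extracts the model IDs from a response using only `grep` and `sed`. The function name `cortex_model_ids` is hypothetical, and the pattern matching is deliberately naive: it would also pick up any nested `"id"` keys, so a real JSON parser is the safer choice for anything beyond a quick check.

```shell
#!/bin/sh
# Hypothetical helper: print the "id" of every model in a list response
# read from standard input. Assumes an OpenAI-style body such as
# {"object": "list", "data": [{"id": "mistral", ...}, ...]}.
cortex_model_ids() {
  # Step 1: isolate each "id": "value" pair on its own line.
  # Step 2: strip the key and the surrounding quotes, leaving the value.
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/^"id"[[:space:]]*:[[:space:]]*"//; s/"$//'
}
```

It could be combined with the request above as `curl --silent --request GET --url http://localhost:39281/v1/models | cortex_model_ids`.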
Get Model
curl --request GET \
  --url http://localhost:39281/v1/models/mistral
cortex models get <model_id>
Delete Model
curl --request DELETE \
  --url http://localhost:39281/v1/models/mistral
cortex models remove <model_id>
Update Model
curl --request PATCH \
  --url http://localhost:39281/v1/models/mistral \
  --header 'Content-Type: application/json' \
  --data '{}'
cortex models update
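The update request above sends an empty JSON object. As a sketch of what a non-empty update could look like, the helpers below build a one-field body and send it with PATCH. Two things here are assumptions rather than documented behavior: that the update endpoint accepts the same field names as the start request (such as `ctx_len`), and the function names `cortex_update_body` and `cortex_update_model`, which are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Cortex) for the PATCH request shown above.

# Print a JSON object with one key and one raw JSON value.
# The value is inserted as-is, so pass numbers and booleans bare (2048, true)
# and strings already quoted ('"f16"').
cortex_update_body() {
  printf '{"%s": %s}\n' "$1" "$2"
}

# Send the update for one field of one model.
# Usage: cortex_update_model <model_id> <field> <raw_json_value>
cortex_update_model() {
  curl --fail --silent --show-error --request PATCH \
    --url "${CORTEX_BASE_URL:-http://localhost:39281}/v1/models/$1" \
    --header 'Content-Type: application/json' \
    --data "$(cortex_update_body "$2" "$3")"
}
```

For instance, `cortex_update_model mistral ctx_len 2048` would send `{"ctx_len": 2048}` as the request body.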