Pricing
Base Model Pricing
Your use of base models on Anyscale Endpoints is billed per million tokens ($/M tokens). For more information on pricing and billing, including “what is a token”, see the FAQs.
Model | Price ($/M tokens) |
---|---|
Mistral-7B-Instruct-v0.1 | 0.15 |
Llama-2-7b-chat-hf | 0.15 |
Llama-3-8b-chat-hf | 0.15 |
gemma-7b-it | 0.15 |
NeuralHermes-2.5-Mistral-7B | 0.15 |
Llama-2-13b-chat-hf | 0.25 |
Mixtral-8x7B-Instruct-v0.1 | 0.50 |
Mixtral-8x22B-Instruct-v0.1 | 0.90 |
Llama-2-70b-chat-hf | 1.0 |
Llama-3-70b-chat-hf | 1.0 |
CodeLlama-70b-Instruct-hf | 1.0 |
thenlper-gte-large | 0.05 |
BAAI/bge-large-en-v1.5 | 0.05 |
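As a rough illustration of per-million-token billing, the sketch below estimates a charge from a token count. The function name `estimate_base_cost` and the dictionary are hypothetical helpers, not part of any Anyscale API. Only a few rows of the table are copied in, and the sketch assumes `total_tokens` is whatever token count your bill is based on (see the FAQs for how tokens are counted).

```python
# Prices in USD per million tokens, copied from a few rows of the table above.
# This dictionary is an illustrative helper, not an official price feed.
BASE_PRICE_PER_M_TOKENS = {
    "Mistral-7B-Instruct-v0.1": 0.15,
    "Llama-2-13b-chat-hf": 0.25,
    "Mixtral-8x7B-Instruct-v0.1": 0.50,
    "Llama-2-70b-chat-hf": 1.00,
}


def estimate_base_cost(model: str, total_tokens: int) -> float:
    """Return the estimated cost in USD for `total_tokens` tokens on `model`."""
    price_per_million = BASE_PRICE_PER_M_TOKENS[model]
    return price_per_million * total_tokens / 1_000_000


# 3M tokens on Mixtral-8x7B at $0.50 per million tokens.
print(estimate_base_cost("Mixtral-8x7B-Instruct-v0.1", 3_000_000))  # 1.5
```

Treat the result as an estimate only; the billed amount depends on how the service counts tokens for each request.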
Tip: Every new user receives $10 in free credit to get started.
Fine Tuning Pricing
Fine-tuning is billed at a fixed cost of $5 per run plus a per-million-token rate that depends on the model. For example, a fine-tuning job on Llama-2-13b-chat-hf with 10M tokens would cost $5 + ($2 × 10) = $25.
Model | Fixed Cost ($/run) | Price ($/M tokens) |
---|---|---|
Llama-2-13b-chat-hf | 5 | 2 |
Llama-2-70b-chat-hf | 5 | 4 |
mistralai/Mistral-7B-Instruct-v0.1 | 5 | 1 |
mistralai/Mixtral-8x7B-Instruct-v0.1 | 5 | 4 |
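The fixed-plus-variable formula can be sketched in a few lines of Python. The names `FINE_TUNING_PRICES` and `estimate_fine_tuning_cost` are hypothetical and exist only for this illustration. The sketch assumes `billed_tokens` is the token count the job is billed for; how that count is derived from a dataset is not specified here.

```python
# (fixed cost per run in USD, price in USD per million tokens),
# copied from the fine-tuning table above. Illustrative helper only.
FINE_TUNING_PRICES = {
    "Llama-2-13b-chat-hf": (5.0, 2.0),
    "Llama-2-70b-chat-hf": (5.0, 4.0),
    "mistralai/Mistral-7B-Instruct-v0.1": (5.0, 1.0),
    "mistralai/Mixtral-8x7B-Instruct-v0.1": (5.0, 4.0),
}


def estimate_fine_tuning_cost(model: str, billed_tokens: int) -> float:
    """Return fixed cost plus the per-token charge for one fine-tuning run."""
    fixed_cost, price_per_million = FINE_TUNING_PRICES[model]
    return fixed_cost + price_per_million * billed_tokens / 1_000_000


# Reproduces the worked example: $5 + ($2 x 10) = $25.
print(estimate_fine_tuning_cost("Llama-2-13b-chat-hf", 10_000_000))  # 25.0
```

Because the fixed cost applies per run, several small runs cost more in fixed fees than a single run over the same total number of tokens.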
Querying the fine-tuned models is billed on a $/million-tokens basis.
Model | Price ($/M tokens) |
---|---|
Llama-2-7b-chat-hf* | 0.25 |
Llama-2-13b-chat-hf | 0.50 |
Llama-2-70b-chat-hf | 2.00 |
mistralai/Mistral-7B-Instruct-v0.1 | 0.25 |
mistralai/Mixtral-8x7B-Instruct-v0.1 | 1.00 |
* We are moving meta-llama/Llama-2-7b-chat-hf to the legacy models list. Please use mistralai/Mistral-7B-Instruct-v0.1 for new fine-tuning jobs. You can still query existing fine-tuned models that are based on meta-llama/Llama-2-7b-chat-hf, but this support is temporary, and we recommend migrating to mistralai/Mistral-7B-Instruct-v0.1.