
Pricing

Base Model Pricing

Base models on Anyscale Endpoints are billed per million tokens ($/M tokens). For more information on pricing and billing, including what a token is, see the FAQs.

Model	Price ($/M tokens)
Mistral-7B-Instruct-v0.1	0.15
Llama-2-7b-chat-hf	0.15
Llama-3-8b-chat-hf	0.15
gemma-7b-it	0.15
NeuralHermes-2.5-Mistral-7B	0.15
Llama-2-13b-chat-hf	0.25
Mixtral-8x7B-Instruct-v0.1	0.50
Mixtral-8x22B-Instruct-v0.1	0.90
Llama-2-70b-chat-hf	1.0
Llama-3-70b-chat-hf	1.0
CodeLlama-70b-Instruct-hf	1.0
thenlper-gte-large	0.05
BAAI/bge-large-en-v1.5	0.05
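
As a rough illustration of the per-million-token arithmetic, the sketch below estimates the cost of a workload from a token count. The function and dictionary names are hypothetical, only a few rows of the table are copied in, and the token count is assumed to be whatever the service bills (see the FAQs for how tokens are counted).

```python
# Prices copied from the table above, in USD per million tokens.
# Only a subset of models is listed; the names here are illustrative.
BASE_PRICE_PER_M_TOKENS = {
    "Mistral-7B-Instruct-v0.1": 0.15,
    "Llama-2-13b-chat-hf": 0.25,
    "Mixtral-8x7B-Instruct-v0.1": 0.50,
    "Llama-2-70b-chat-hf": 1.0,
    "thenlper-gte-large": 0.05,
}


def estimate_base_cost(model: str, tokens: int) -> float:
    """Return the estimated USD cost of `tokens` billed tokens on `model`."""
    price = BASE_PRICE_PER_M_TOKENS[model]
    return price * tokens / 1_000_000


# 4M tokens on Mixtral-8x7B-Instruct-v0.1 at $0.50 per million tokens.
print(f"${estimate_base_cost('Mixtral-8x7B-Instruct-v0.1', 4_000_000):.2f}")  # prints $2.00
```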

Tip: Every new user receives $10 in free credit to get started.

Fine Tuning Pricing

Fine tuning is billed at a fixed cost of $5 per run plus a per-million-token price that depends on the model. For example, a fine-tuning job on Llama-2-13b-chat-hf with 10M tokens would cost $5 + $2 × 10 = $25.

Model	Fixed cost ($/run)	Price ($/M tokens)
Llama-2-13b-chat-hf	5	2
Llama-2-70b-chat-hf	5	4
mistralai/Mistral-7B-Instruct-v0.1	5	1
mistralai/Mixtral-8x7B-Instruct-v0.1	5	4
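
The fixed-plus-variable formula can be sketched as a small helper. The names below are hypothetical, and the token count is assumed to be the number of tokens the service bills for the run.

```python
# (fixed USD per run, USD per million tokens), copied from the table above.
FINE_TUNING_PRICING = {
    "Llama-2-13b-chat-hf": (5, 2),
    "Llama-2-70b-chat-hf": (5, 4),
    "mistralai/Mistral-7B-Instruct-v0.1": (5, 1),
    "mistralai/Mixtral-8x7B-Instruct-v0.1": (5, 4),
}


def estimate_fine_tuning_cost(model: str, tokens: int) -> float:
    """Return the estimated USD cost of one fine-tuning run."""
    fixed_cost, price_per_m = FINE_TUNING_PRICING[model]
    return fixed_cost + price_per_m * tokens / 1_000_000


# The worked example from the text: $5 + $2 x 10 = $25.
print(estimate_fine_tuning_cost("Llama-2-13b-chat-hf", 10_000_000))  # prints 25.0
```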

Querying the fine-tuned models is billed on a $/million-tokens basis.

Model	Price ($/M tokens)
Llama-2-7b-chat-hf*	0.25
Llama-2-13b-chat-hf	0.50
Llama-2-70b-chat-hf	2.00
mistralai/Mistral-7B-Instruct-v0.1	0.25
mistralai/Mixtral-8x7B-Instruct-v0.1	1.00
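
To budget for a fine-tuned model end to end, the training and query prices can be combined. This is a minimal sketch with hypothetical names; it assumes the fixed cost applies once per run and that query tokens are billed at the fine-tuned rate in the table above.

```python
# (fixed USD per run, training USD per M tokens, query USD per M tokens),
# copied from the fine-tuning and fine-tuned query tables.
FINE_TUNED_PRICING = {
    "Llama-2-13b-chat-hf": (5, 2, 0.50),
    "Llama-2-70b-chat-hf": (5, 4, 2.00),
    "mistralai/Mistral-7B-Instruct-v0.1": (5, 1, 0.25),
    "mistralai/Mixtral-8x7B-Instruct-v0.1": (5, 4, 1.00),
}


def estimate_total_cost(model: str, training_tokens: int,
                        query_tokens: int, runs: int = 1) -> float:
    """Return the estimated USD cost of `runs` fine-tuning runs plus querying."""
    fixed_cost, train_price, query_price = FINE_TUNED_PRICING[model]
    training = runs * (fixed_cost + train_price * training_tokens / 1_000_000)
    querying = query_price * query_tokens / 1_000_000
    return training + querying


# One 10M-token run on Mistral-7B, then 20M query tokens: 5 + 10 + 5.
print(estimate_total_cost("mistralai/Mistral-7B-Instruct-v0.1",
                          10_000_000, 20_000_000))  # prints 20.0
```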

* meta-llama/Llama-2-7b-chat-hf is moving to the legacy models list. Use mistralai/Mistral-7B-Instruct-v0.1 for new fine-tuning jobs. Existing fine-tuned models based on meta-llama/Llama-2-7b-chat-hf can still be queried, but only temporarily, so we recommend migrating to mistralai/Mistral-7B-Instruct-v0.1.
