vGPUs | vCPUs | Memory in GiB |
---|---|---|
- | 4 | 16 |
- | 8 | 32 |
1xL40S | 16 | 232 |
2xL40S | 32 | 464 |
1xH100 | 16 | 232 |
2xH100 | 32 | 464 |
4xH100 | 64 | 928 |
1xA100 | 16 | 232 |
2xA100 | 32 | 464 |
4xA100 | 64 | 928 |
1xRTX-4000 | 10 | 40 |
2xRTX-4000 | 20 | 80 |
4xRTX-4000 | 40 | 160 |
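The flavor table can also be useful as a programmatic lookup, for example when sizing a deployment. A minimal sketch in Python (the flavor names and the helper are illustrative, not part of any official API):

```python
# Illustrative encoding of the flavor table above (not an official API).
FLAVORS = [
    # (vGPUs, vCPUs, memory_gib); None marks the CPU-only flavors
    (None,         4,  16),
    (None,         8,  32),
    ("1xRTX-4000", 10, 40),
    ("2xRTX-4000", 20, 80),
    ("4xRTX-4000", 40, 160),
    ("1xL40S",     16, 232),
    ("2xL40S",     32, 464),
    ("1xH100",     16, 232),
    ("2xH100",     32, 464),
    ("4xH100",     64, 928),
    ("1xA100",     16, 232),
    ("2xA100",     32, 464),
    ("4xA100",     64, 928),
]

def smallest_flavor(min_memory_gib):
    """Return the (vgpus, vcpus, memory_gib) tuple with the least memory
    that still satisfies the requirement, or None if no flavor fits."""
    candidates = [f for f in FLAVORS if f[2] >= min_memory_gib]
    return min(candidates, key=lambda f: f[2]) if candidates else None
```

Note that several flavors share the same vCPU and memory shape (e.g. all single-GPU L40S, H100, and A100 flavors provide 16 vCPUs and 232 GiB), so selecting by memory alone may return any of the matching GPU types.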
Model | Description |
---|---|
aya-expanse-32b | An open-weight research release of a model with highly advanced multilingual capabilities. |
ByteDance/SDXL-Lightning | A fast text-to-image generation model distilled from Stable Diffusion XL, capable of producing high-quality images in very few inference steps. |
FLUX.1-dev | A 12-billion-parameter rectified flow transformer designed for text-to-image generation. |
FLUX.1-schnell | A 12-billion-parameter rectified flow transformer designed for rapid text-to-image generation. |
Llama-3.1-8B-Instruct | A multilingual large language model developed by Meta. |
Llama-3.1-Nemotron-70B-Instruct | A large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses to user queries. |
Llama-3.2-1B-Instruct | A multilingual large language model developed by Meta. |
Llama-3.3-70B-Instruct | A tuned text-only model optimized for multilingual dialogue use cases. |
Marco-o1 | A model focused on disciplines with standard answers, such as mathematics, physics, and coding. |
Mistral-7B-Instruct-v0.3 | An instruct fine-tuned version of the Mistral-7B-v0.3 base model, designed to generate coherent and contextually relevant text. |
Mistral-Nemo-Instruct-2407 | An instruct fine-tuned version of the Mistral-Nemo-Base-2407. |
Mistral-Small-Instruct-2409 | A 22-billion-parameter large language model developed by Mistral AI. |
Pixtral-12B-2409 | A multimodal model with 12 billion parameters plus a 400-million-parameter vision encoder. |
Qwen2.5-14B-Instruct-GPTQ-Int8 | An 8-bit GPTQ-quantized instruct model from the latest generation of Qwen language models. |
Qwen2.5-7B-Instruct | The latest generation of Qwen language models. |
Qwen2.5-Coder-32B-Instruct | Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models. |
Qwen2-VL-7B-Instruct | The latest version of the Qwen-VL model, enhanced to set new standards in visual understanding and interactivity. |
QwQ-32B-Preview | An experimental research model developed by the Qwen team focused on advancing AI reasoning capabilities. |
SDXL-Lightning Gradio | A high-performance variant of the Stable Diffusion XL text-to-image generation model designed for speed and efficiency. |
stable-cascade | A compositional generative model that refines outputs through a staged pipeline. |
stable-diffusion-3.5-large | An advanced text-to-image generation model featuring 8.1 billion parameters. |
stable-diffusion-3.5-large-turbo | A distilled version of the Stable Diffusion 3.5 large model. |
stable-diffusion-xl | A state-of-the-art text-to-image generation model with an enhanced UNet backbone and dual text encoders for improved detail and prompt adherence. |
whisper-large-v3 | A state-of-the-art model for automatic speech recognition (ASR) and speech translation. |
whisper-large-v3-turbo | A fine-tuned version of a pruned Whisper large-v3. |