Validated Alibaba Models
You can import large language models from Hugging Face and OCI Object Storage buckets into OCI Generative AI, create endpoints for those models, and use them in the Generative AI service.
The Alibaba Qwen model family features advanced multilingual and multimodal capabilities. For model cards on Hugging Face, see the links in the following tables.
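As a sketch of the last step (calling an imported model through its endpoint), the request body below follows the generic chat format of the Generative AI inference Chat operation. The OCIDs are placeholders, and the exact field names are assumptions on my part; check them against the current API reference before use.

```python
import json

# Illustrative request body for the Generative AI inference Chat operation,
# using the GENERIC API format against a dedicated endpoint that hosts an
# imported model. OCIDs are placeholders; field names are assumptions to
# verify against the current API reference.
chat_details = {
    "compartmentId": "ocid1.compartment.oc1..example",            # placeholder
    "servingMode": {
        "servingType": "DEDICATED",
        "endpointId": "ocid1.generativeaiendpoint.oc1..example",  # placeholder
    },
    "chatRequest": {
        "apiFormat": "GENERIC",
        "messages": [
            {
                "role": "USER",
                "content": [{"type": "TEXT", "text": "Hello, Qwen!"}],
            }
        ],
        "maxTokens": 256,
    },
}

print(json.dumps(chat_details, indent=2))
```

In practice you would send this body with an OCI SDK or a signed REST call rather than printing it; the snippet only shows the request shape.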
Qwen Image
| Hugging Face Model ID | Model Capability | Recommended Dedicated AI Cluster Unit Shape |
|---|---|---|
| Qwen/Qwen-Image | TEXT_TO_IMAGE | A100_80G_X1 |
| Qwen/Qwen-Image-Edit | IMAGE_TEXT_TO_IMAGE | A100_80G_X1 |
| Qwen/Qwen-Image-2512 | TEXT_TO_IMAGE | A100_80G_X1 |
| Qwen/Qwen-Image-Edit-2511 | IMAGE_TEXT_TO_IMAGE | A100_80G_X1 |
| Qwen/Qwen-Image-Edit-2509 | IMAGE_TEXT_TO_IMAGE | A100_80G_X1 |
Note
- `response_format: "url"` doesn't work and returns an HTTP 400 bad request error.
- `n` (number of images): only `0` or `1` work.
- Streaming isn't validated.
- Non-standard image sizes might be rounded (for example, `999x999` → `992x992`) instead of returning an HTTP 400 error (unlike the OpenAI API).
- Transparency might not work because of model limitations.
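The hard limits above can be caught client-side before a request is sent. The helper below is a hypothetical pre-flight check based only on the validated behavior listed in this note; the parameter names follow the OpenAI-style image request that the note compares against.

```python
def preflight_image_params(params: dict) -> dict:
    """Hypothetical pre-flight check for an OpenAI-style image request,
    reflecting only the validated behavior listed in the note above."""
    checked = dict(params)

    # response_format "url" returns HTTP 400, so require base64 output.
    if checked.get("response_format") == "url":
        raise ValueError('response_format "url" is not supported; use "b64_json"')

    # Only n = 0 or n = 1 are validated.
    if checked.get("n", 1) not in (0, 1):
        raise ValueError("only n=0 or n=1 is validated for this model")

    # Note: non-standard sizes may be silently rounded by the service
    # (e.g. 999x999 -> 992x992) rather than rejected, so don't assume the
    # returned image matches the requested size exactly.
    return checked
```

A request that passes this check can still hit model-level limits (for example, transparency), so treat it as a first filter, not a guarantee.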
QwQ (Reasoning)
| Hugging Face Model ID | Model Capability | Recommended Dedicated AI Cluster Unit Shape |
|---|---|---|
| Qwen/QwQ-32B | TEXT_TO_TEXT | A100_80G_X2 |
Qwen 3
| Hugging Face Model ID | Model Capability | Recommended Dedicated AI Cluster Unit Shape |
|---|---|---|
| Qwen/Qwen3-Embedding-0.6B | EMBEDDING | A10_X1 |
| Qwen/Qwen3-Embedding-4B | EMBEDDING | A10_X2 |
| Qwen/Qwen3-Embedding-8B | EMBEDDING | A100_80G_X1 |
| Qwen/Qwen3-0.6B | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen3-1.7B | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen3-4B | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen3-8B | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen3-14B | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen3-32B | TEXT_TO_TEXT | A100_80G_X2 |
| Qwen/Qwen3-4B-Instruct-2507 | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | TEXT_TO_TEXT | A100_80G_X2 |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | TEXT_TO_TEXT | H100_X8 |
| Qwen/Qwen3-VL-30B-A3B-Instruct | IMAGE_TEXT_TO_TEXT | H100_X2 |
| Qwen/Qwen3-VL-235B-A22B-Instruct | IMAGE_TEXT_TO_TEXT | H100_X8 |
Qwen 2.5
| Hugging Face Model ID | Model Capability | Recommended Dedicated AI Cluster Unit Shape |
|---|---|---|
| Qwen/Qwen2.5-Coder-32B-Instruct | TEXT_TO_TEXT | A100_80G_X2 |
| Qwen/Qwen2.5-0.5B-Instruct | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2.5-1.5B-Instruct | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2.5-3B-Instruct | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2.5-7B-Instruct | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2.5-14B-Instruct | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2.5-32B-Instruct | TEXT_TO_TEXT | A100_80G_X2 |
| Qwen/Qwen2.5-72B-Instruct | TEXT_TO_TEXT | A100_80G_X4 |
| Qwen/Qwen2.5-VL-3B-Instruct | IMAGE_TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2.5-VL-7B-Instruct | IMAGE_TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2.5-VL-32B-Instruct | IMAGE_TEXT_TO_TEXT | A100_80G_X2 |
| Qwen/Qwen2.5-VL-72B-Instruct | IMAGE_TEXT_TO_TEXT | A100_80G_X4 |
Qwen 2
| Hugging Face Model ID | Model Capability | Recommended Dedicated AI Cluster Unit Shape |
|---|---|---|
| Qwen/Qwen2-0.5B-Instruct | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2-1.5B-Instruct | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2-7B-Instruct | TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2-72B-Instruct | TEXT_TO_TEXT | A100_80G_X4 |
| Qwen/Qwen2-VL-2B-Instruct | IMAGE_TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2-VL-7B-Instruct | IMAGE_TEXT_TO_TEXT | A100_80G_X1 |
| Qwen/Qwen2-VL-72B-Instruct | IMAGE_TEXT_TO_TEXT | A100_80G_X4 |
Important
- Although you can import any chat, embedding, or fine-tuned model validated through Open Model Engine (with the vLLM or SGLang runtime), only the models explicitly listed on this page are validated for this model family. Unlisted models might have compatibility issues, so we recommend testing any unlisted model before production use. Learn about OCI Generative AI Imported Model Architecture.
- For imported models, you can use the native context length specified by the model provider. However, the effective maximum context length is limited by the hardware that you select for the hosting dedicated AI cluster in OCI Generative AI. To take full advantage of a model's native context length, you might need to provision more hardware resources.
- Use fine-tuned models only if they match the validated base model's transformer version and have a parameter count within ±10% of the original.
- For available hardware and steps on how to deploy the imported models, see Managing Imported Models.
- If the validated unit shape isn't available in the region, select a higher-tier option. For example, if A100 isn't available, select H100.
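Two of the rules above are easy to make concrete. In the sketch below, the ±10% parameter-count check comes straight from the list, while the tier ordering in the shape fallback is a hypothetical encoding of the single A100 → H100 example and isn't an official ranking.

```python
def within_tolerance(base_params: int, tuned_params: int, tol: float = 0.10) -> bool:
    """True when the fine-tuned model's parameter count is within +/-10%
    of the validated base model's, per the guidance above."""
    return abs(tuned_params - base_params) <= tol * base_params

# Hypothetical tier order, lowest to highest: only the A100 -> H100 fallback
# is stated in the docs; the rest of the ordering is illustrative.
SHAPE_TIERS = ["A10_X1", "A10_X2", "A100_80G_X1", "A100_80G_X2",
               "A100_80G_X4", "H100_X2", "H100_X8"]

def pick_shape(recommended: str, available: set) -> str:
    """Prefer the recommended shape; otherwise fall back to a higher tier,
    as in the A100 -> H100 example above."""
    start = SHAPE_TIERS.index(recommended)
    for shape in SHAPE_TIERS[start:]:  # recommended first, then higher tiers
        if shape in available:
            return shape
    raise LookupError("no suitable shape available in this region")
```

For example, a 34B fine-tune of a 32B base is within tolerance (6.25%), while a 36B fine-tune is not (12.5%).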