# Models

<mark style="color:$primary;">Explore all models available in nexos.ai. If you have a specific model in mind, we suggest using search, as the full list covers 99+ models.</mark>

## nexos.ai Recommended Models

We recommend a select group of models to guarantee full feature support and the best result quality. These models give you access to nexos.ai's complete capabilities, including **document generation, deep research, and tool integrations** (Slack, Jira, Confluence, GitHub, and more), with the best possible performance and stability.

<table data-view="cards"><thead><tr><th align="center"></th><th></th><th></th><th></th><th></th><th></th></tr></thead><tbody><tr><td align="center"><strong>Anthropic</strong></td><td><ul><li>Claude Sonnet 4.5</li></ul></td><td><ul><li>Claude Haiku 4.5</li></ul></td><td><ul><li>Claude Sonnet 4.6</li></ul></td><td><ul><li>Claude Opus 4.7</li></ul></td><td><ul><li>Claude Opus 4.8</li></ul></td></tr><tr><td align="center"><strong>Google</strong></td><td><ul><li>Gemini 3 Flash Preview</li></ul></td><td><ul><li>Gemini 3.1 Pro Preview</li></ul></td><td></td><td></td><td></td></tr><tr><td align="center"><strong>OpenAI</strong></td><td><ul><li>GPT 5.3 Instant</li></ul></td><td><ul><li>GPT 5.4</li></ul></td><td><ul><li>GPT 5.4 mini</li></ul></td><td><ul><li>GPT-OSS 120b</li></ul></td><td><ul><li>GPT 5.5</li></ul></td></tr></tbody></table>

## All Models

<details>

<summary>Annotations</summary>

<table><thead><tr><th width="241.03125"></th><th></th></tr></thead><tbody><tr><td><strong>Model</strong></td><td>Name of the model variant (includes version or size).</td></tr><tr><td><strong>Hosted</strong></td><td>Platform where the model is served (e.g. Azure, Bedrock, nexos.ai).</td></tr><tr><td><strong>Developer</strong></td><td>Organization that created the model (e.g. OpenAI, Meta, Google).</td></tr><tr><td><strong>Region</strong></td><td>Deployment location. <strong>EU</strong> = Europe, <strong>US</strong> = United States, <strong>OTHER</strong> = global or unspecified region.</td></tr><tr><td><strong>Info used for model training</strong></td><td>Indicates whether user inputs and model outputs are used to improve or retrain the model.</td></tr><tr><td><strong>Data Retention Period (days)</strong></td><td>Number of days that user inputs and model outputs are stored by the provider before automatic deletion.</td></tr><tr><td><strong>Input cost per 1M tokens</strong></td><td>Price for 1 million tokens sent to the model as input.</td></tr><tr><td><strong>Output cost per 1M tokens</strong></td><td>Price for 1 million tokens generated by the model as output.</td></tr><tr><td><strong>Cost per image</strong></td><td>Price per image generated by the model.</td></tr><tr><td><strong>Long context boundary</strong></td><td>Token threshold beyond which long-context pricing rates apply.</td></tr><tr><td><strong>Long context input cost per 1M tokens</strong></td><td>Price for 1 million input tokens when context length exceeds the boundary.</td></tr><tr><td><strong>Long context output cost per 1M tokens</strong></td><td>Price for 1 million output tokens when context length exceeds the boundary.</td></tr><tr><td><strong>Cache creation cost per 1M tokens</strong></td><td>Price for storing 1 million tokens in prompt cache.</td></tr><tr><td><strong>Cached input cost per 1M tokens</strong></td><td>Price for 1 million input tokens retrieved from prompt cache.</td></tr><tr><td><strong>Long context cache creation cost per 1M tokens</strong></td><td>Price for storing 1 million tokens in prompt cache when context length exceeds the boundary.</td></tr><tr><td><strong>Long context cached input cost per 1M tokens</strong></td><td>Price for 1 million input tokens retrieved from prompt cache when context length exceeds the boundary.</td></tr><tr><td><strong>Input cost per audio second</strong></td><td>Price for every second of audio input processed by the model.</td></tr></tbody></table>

</details>
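
The per-1M-token fields described above can be combined into a rough per-request estimate. The following is a minimal sketch, not nexos.ai's billing logic: `Pricing` and `estimate_cost` are hypothetical names, and the sketch assumes that cached input tokens and cache creation tokens are counted separately from regular input tokens. The sample rates are the Nova Pro (Bedrock, EU) values listed on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in USD per 1 million tokens (hypothetical helper type)."""
    input_per_1m: float
    output_per_1m: float
    cached_input_per_1m: float = 0.0
    cache_creation_per_1m: float = 0.0


def estimate_cost(
    pricing: Pricing,
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,
    cache_creation_tokens: int = 0,
) -> float:
    """Estimate the cost of one request in USD.

    Assumption: each token is billed in exactly one bucket
    (regular input, cached input, cache creation, or output).
    """
    total = (
        input_tokens * pricing.input_per_1m
        + output_tokens * pricing.output_per_1m
        + cached_input_tokens * pricing.cached_input_per_1m
        + cache_creation_tokens * pricing.cache_creation_per_1m
    )
    return total / 1_000_000


# Rates copied from the Nova Pro (Bedrock, EU) entry on this page.
NOVA_PRO_EU = Pricing(
    input_per_1m=1.05,
    output_per_1m=4.2,
    cached_input_per_1m=0.2625,
)

cost = estimate_cost(
    NOVA_PRO_EU,
    input_tokens=10_000,
    output_tokens=2_000,
    cached_input_tokens=40_000,
)
print(f"${cost:.4f}")  # prints $0.0294
```

Treat the result as an estimate only; providers may round or meter tokens differently.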

{% hint style="info" %}
You may enable or disable large language models (LLMs) at your discretion and are solely responsible for such choice. Information about the LLMs below is sourced from publicly available terms and may change without notice. We strive to provide accurate information but make no guarantees. We encourage you to review official provider terms to verify all details.
{% endhint %}

<details>

<summary>Hosted by nexos.ai</summary>

**OpenAI**

|                              |              |
| ---------------------------- | ------------ |
| Model                        | GPT-OSS 120b |
| Hosted                       | nexos.ai     |
| Developer                    | OpenAI       |
| Region                       | EU           |
| Info used for model training | No           |
| Data Retention Period (days) | 0            |
| Input cost per 1M tokens     | $0.8         |
| Output cost per 1M tokens    | $1.6         |

</details>

<details>

<summary>Amazon</summary>

|                                 |           |
| ------------------------------- | --------- |
| Model                           | Nova Lite |
| Hosted                          | Bedrock   |
| Developer                       | Amazon    |
| Region                          | EU        |
| Info used for model training    | No        |
| Data Retention Period (days)    | 0         |
| Input cost per 1M tokens        | $0.078    |
| Output cost per 1M tokens       | $0.312    |
| Cached input cost per 1M tokens | $0.0195   |

|                                 |          |
| ------------------------------- | -------- |
| Model                           | Nova Pro |
| Hosted                          | Bedrock  |
| Developer                       | Amazon   |
| Region                          | EU       |
| Info used for model training    | No       |
| Data Retention Period (days)    | 0        |
| Input cost per 1M tokens        | $1.05    |
| Output cost per 1M tokens       | $4.2     |
| Cached input cost per 1M tokens | $0.2625  |

|                              |             |
| ---------------------------- | ----------- |
| Model                        | Nova 2 Lite |
| Hosted                       | Bedrock     |
| Developer                    | Amazon      |
| Region                       | EU          |
| Info used for model training | No          |
| Data Retention Period (days) | 0           |
| Input cost per 1M tokens     | $0.43       |
| Output cost per 1M tokens    | $3.597      |

|                                 |           |
| ------------------------------- | --------- |
| Model                           | Nova Lite |
| Hosted                          | Bedrock   |
| Developer                       | Amazon    |
| Region                          | US        |
| Info used for model training    | No        |
| Data Retention Period (days)    | 0         |
| Input cost per 1M tokens        | $0.06     |
| Output cost per 1M tokens       | $0.24     |
| Cached input cost per 1M tokens | $0.015    |

|                                 |          |
| ------------------------------- | -------- |
| Model                           | Nova Pro |
| Hosted                          | Bedrock  |
| Developer                       | Amazon   |
| Region                          | US       |
| Info used for model training    | No       |
| Data Retention Period (days)    | 0        |
| Input cost per 1M tokens        | $0.8     |
| Output cost per 1M tokens       | $3.2     |
| Cached input cost per 1M tokens | $0.2     |

|                              |              |
| ---------------------------- | ------------ |
| Model                        | Nova Premier |
| Hosted                       | Bedrock      |
| Developer                    | Amazon       |
| Region                       | US           |
| Info used for model training | No           |
| Data Retention Period (days) | 0            |
| Input cost per 1M tokens     | $2.5         |
| Output cost per 1M tokens    | $12.5        |

|                              |             |
| ---------------------------- | ----------- |
| Model                        | Nova 2 Lite |
| Hosted                       | Bedrock     |
| Developer                    | Amazon      |
| Region                       | US          |
| Info used for model training | No          |
| Data Retention Period (days) | 0           |
| Input cost per 1M tokens     | $0.33       |
| Output cost per 1M tokens    | $2.75       |

</details>

<details>

<summary>Anthropic</summary>

|                                                |                   |
| ---------------------------------------------- | ----------------- |
| Model                                          | Claude Sonnet 4.5 |
| Hosted                                         | Vertex            |
| Developer                                      | Anthropic         |
| Region                                         | EU                |
| Info used for model training                   | No                |
| Data Retention Period (days)                   | 1                 |
| Input cost per 1M tokens                       | $3.3              |
| Output cost per 1M tokens                      | $16.5             |
| Cache creation cost per 1M tokens              | $4.13             |
| Cached input cost per 1M tokens                | $0.33             |
| Long context boundary                          | 200,000           |
| Long context input cost per 1M tokens          | $6.6              |
| Long context output cost per 1M tokens         | $24.75            |
| Long context cache creation cost per 1M tokens | $8.25             |
| Long context cached input cost per 1M tokens   | $0.66             |

|                                   |                  |
| --------------------------------- | ---------------- |
| Model                             | Claude Haiku 4.5 |
| Hosted                            | Vertex           |
| Developer                         | Anthropic        |
| Region                            | EU               |
| Info used for model training      | No               |
| Data Retention Period (days)      | 1                |
| Input cost per 1M tokens          | $1.1             |
| Output cost per 1M tokens         | $5.5             |
| Cache creation cost per 1M tokens | $1.375           |
| Cached input cost per 1M tokens   | $0.11            |

|                                   |                 |
| --------------------------------- | --------------- |
| Model                             | Claude Opus 4.5 |
| Hosted                            | Vertex          |
| Developer                         | Anthropic       |
| Region                            | EU              |
| Info used for model training      | No              |
| Data Retention Period (days)      | 1               |
| Input cost per 1M tokens          | $5.5            |
| Output cost per 1M tokens         | $27.5           |
| Cache creation cost per 1M tokens | $6.875          |
| Cached input cost per 1M tokens   | $0.55           |

|                                   |                   |
| --------------------------------- | ----------------- |
| Model                             | Claude Sonnet 4.5 |
| Hosted                            | Anthropic         |
| Developer                         | Anthropic         |
| Region                            | US                |
| Info used for model training      | No                |
| Data Retention Period (days)      | 30                |
| Input cost per 1M tokens          | $3                |
| Output cost per 1M tokens         | $15               |
| Cache creation cost per 1M tokens | $3.75             |
| Cached input cost per 1M tokens   | $0.3              |

|                                                |                   |
| ---------------------------------------------- | ----------------- |
| Model                                          | Claude Sonnet 4.5 |
| Hosted                                         | Vertex            |
| Developer                                      | Anthropic         |
| Region                                         | OTHER             |
| Info used for model training                   | No                |
| Data Retention Period (days)                   | 1                 |
| Input cost per 1M tokens                       | $3                |
| Output cost per 1M tokens                      | $15               |
| Cache creation cost per 1M tokens              | $3.75             |
| Cached input cost per 1M tokens                | $0.3              |
| Long context boundary                          | 200,000           |
| Long context input cost per 1M tokens          | $6                |
| Long context output cost per 1M tokens         | $22.5             |
| Long context cache creation cost per 1M tokens | $7.5              |
| Long context cached input cost per 1M tokens   | $0.6              |

|                                   |                  |
| --------------------------------- | ---------------- |
| Model                             | Claude Haiku 4.5 |
| Hosted                            | Vertex           |
| Developer                         | Anthropic        |
| Region                            | US               |
| Info used for model training      | No               |
| Data Retention Period (days)      | 1                |
| Input cost per 1M tokens          | $1.1             |
| Output cost per 1M tokens         | $5.5             |
| Cache creation cost per 1M tokens | $1.375           |
| Cached input cost per 1M tokens   | $0.11            |

|                                   |                  |
| --------------------------------- | ---------------- |
| Model                             | Claude Haiku 4.5 |
| Hosted                            | Anthropic        |
| Developer                         | Anthropic        |
| Region                            | US               |
| Info used for model training      | No               |
| Data Retention Period (days)      | 30               |
| Input cost per 1M tokens          | $1               |
| Output cost per 1M tokens         | $5               |
| Cache creation cost per 1M tokens | $1.25            |
| Cached input cost per 1M tokens   | $0.1             |

|                                   |                 |
| --------------------------------- | --------------- |
| Model                             | Claude Opus 4.5 |
| Hosted                            | Vertex          |
| Developer                         | Anthropic       |
| Region                            | OTHER           |
| Info used for model training      | No              |
| Data Retention Period (days)      | 1               |
| Input cost per 1M tokens          | $5              |
| Output cost per 1M tokens         | $25             |
| Cache creation cost per 1M tokens | $6.25           |
| Cached input cost per 1M tokens   | $0.5            |

|                                                |                 |
| ---------------------------------------------- | --------------- |
| Model                                          | Claude Opus 4.6 |
| Hosted                                         | Vertex          |
| Developer                                      | Anthropic       |
| Region                                         | EU              |
| Info used for model training                   | No              |
| Data Retention Period (days)                   | 1               |
| Input cost per 1M tokens                       | $5.5            |
| Output cost per 1M tokens                      | $27.5           |
| Cache creation cost per 1M tokens              | $6.875          |
| Cached input cost per 1M tokens                | $0.55           |
| Long context boundary                          | 200,000         |
| Long context input cost per 1M tokens          | $5.5            |
| Long context output cost per 1M tokens         | $27.5           |
| Long context cache creation cost per 1M tokens | $6.875          |
| Long context cached input cost per 1M tokens   | $0.55           |

|                                                |                   |
| ---------------------------------------------- | ----------------- |
| Model                                          | Claude Sonnet 4.6 |
| Hosted                                         | Vertex            |
| Developer                                      | Anthropic         |
| Region                                         | EU                |
| Info used for model training                   | No                |
| Data Retention Period (days)                   | 1                 |
| Input cost per 1M tokens                       | $3.3              |
| Output cost per 1M tokens                      | $16.5             |
| Cache creation cost per 1M tokens              | $4.13             |
| Cached input cost per 1M tokens                | $0.33             |
| Long context boundary                          | 200,000           |
| Long context input cost per 1M tokens          | $3.3              |
| Long context output cost per 1M tokens         | $16.5             |
| Long context cache creation cost per 1M tokens | $4.13             |
| Long context cached input cost per 1M tokens   | $0.33             |

|                                                |                   |
| ---------------------------------------------- | ----------------- |
| Model                                          | Claude Sonnet 4.6 |
| Hosted                                         | Vertex            |
| Developer                                      | Anthropic         |
| Region                                         | OTHER             |
| Info used for model training                   | No                |
| Data Retention Period (days)                   | 1                 |
| Input cost per 1M tokens                       | $3                |
| Output cost per 1M tokens                      | $15               |
| Cache creation cost per 1M tokens              | $3.75             |
| Cached input cost per 1M tokens                | $0.3              |
| Long context boundary                          | 200,000           |
| Long context input cost per 1M tokens          | $3                |
| Long context output cost per 1M tokens         | $15               |
| Long context cache creation cost per 1M tokens | $3.75             |
| Long context cached input cost per 1M tokens   | $0.3              |

|                                                |                 |
| ---------------------------------------------- | --------------- |
| Model                                          | Claude Opus 4.7 |
| Hosted                                         | Vertex          |
| Developer                                      | Anthropic       |
| Region                                         | EU              |
| Info used for model training                   | No              |
| Data Retention Period (days)                   | 1               |
| Input cost per 1M tokens                       | $5.5            |
| Output cost per 1M tokens                      | $27.5           |
| Cache creation cost per 1M tokens              | $6.75           |
| Cached input cost per 1M tokens                | $0.55           |
| Long context boundary                          | 200,000         |
| Long context input cost per 1M tokens          | $5.5            |
| Long context output cost per 1M tokens         | $27.5           |
| Long context cache creation cost per 1M tokens | $6.75           |
| Long context cached input cost per 1M tokens   | $0.55           |

|                                                |                 |
| ---------------------------------------------- | --------------- |
| Model                                          | Claude Opus 4.8 |
| Hosted                                         | Vertex          |
| Developer                                      | Anthropic       |
| Region                                         | EU              |
| Info used for model training                   | No              |
| Data Retention Period (days)                   | 1               |
| Input cost per 1M tokens                       | $5.5            |
| Output cost per 1M tokens                      | $27.5           |
| Cache creation cost per 1M tokens              | $6.875          |
| Cached input cost per 1M tokens                | $0.55           |
| Long context boundary                          | 200,000         |
| Long context input cost per 1M tokens          | $5.5            |
| Long context output cost per 1M tokens         | $27.5           |
| Long context cache creation cost per 1M tokens | $6.875          |
| Long context cached input cost per 1M tokens   | $0.55           |

|                                                |                 |
| ---------------------------------------------- | --------------- |
| Model                                          | Claude Opus 4.8 |
| Hosted                                         | Vertex          |
| Developer                                      | Anthropic       |
| Region                                         | OTHER           |
| Info used for model training                   | No              |
| Data Retention Period (days)                   | 1               |
| Input cost per 1M tokens                       | $5              |
| Output cost per 1M tokens                      | $25             |
| Cache creation cost per 1M tokens              | $6.25           |
| Cached input cost per 1M tokens                | $0.5            |
| Long context boundary                          | 200,000         |
| Long context input cost per 1M tokens          | $5              |
| Long context output cost per 1M tokens         | $25             |
| Long context cache creation cost per 1M tokens | $6.25           |
| Long context cached input cost per 1M tokens   | $0.5            |

</details>
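
Several entries above list a long context boundary with separate long-context rates. The sketch below shows one way the switch could work, using the Claude Sonnet 4.5 (Vertex, EU) values from this page. It is an illustration under stated assumptions, not the provider's billing logic: `TieredPricing` and `estimate_tiered_cost` are hypothetical names, context length is approximated by the input token count, and the whole request is assumed to move to long-context rates once that count exceeds the boundary. Check the provider's terms for the exact rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredPricing:
    """Prices in USD per 1 million tokens (hypothetical helper type)."""
    input_per_1m: float
    output_per_1m: float
    long_context_boundary: int
    long_input_per_1m: float
    long_output_per_1m: float


def estimate_tiered_cost(
    pricing: TieredPricing, input_tokens: int, output_tokens: int
) -> float:
    """Estimate the cost of one request in USD.

    Assumption: if input_tokens exceeds the boundary, all input and
    output tokens of the request are billed at long-context rates.
    """
    is_long = input_tokens > pricing.long_context_boundary
    input_rate = pricing.long_input_per_1m if is_long else pricing.input_per_1m
    output_rate = pricing.long_output_per_1m if is_long else pricing.output_per_1m
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Rates copied from the Claude Sonnet 4.5 (Vertex, EU) entry on this page.
SONNET_45_EU = TieredPricing(
    input_per_1m=3.3,
    output_per_1m=16.5,
    long_context_boundary=200_000,
    long_input_per_1m=6.6,
    long_output_per_1m=24.75,
)

below = estimate_tiered_cost(SONNET_45_EU, input_tokens=150_000, output_tokens=4_000)
above = estimate_tiered_cost(SONNET_45_EU, input_tokens=250_000, output_tokens=4_000)
print(f"${below:.3f}")  # prints $0.561
print(f"${above:.3f}")  # prints $1.749
```

For entries whose long-context rates equal the standard rates, such as Claude Sonnet 4.6, both branches give the same result.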

<details>

<summary>Black Forest Labs</summary>

|                              |                   |
| ---------------------------- | ----------------- |
| Model                        | FLUX 1.1 Pro      |
| Hosted                       | Azure             |
| Developer                    | Black Forest Labs |
| Region                       | EU                |
| Info used for model training | No                |
| Data Retention Period (days) | 0                 |
| Cost per image               | $0.044            |

|                              |                   |
| ---------------------------- | ----------------- |
| Model                        | FLUX 1.1 Pro      |
| Hosted                       | Azure             |
| Developer                    | Black Forest Labs |
| Region                       | US                |
| Info used for model training | No                |
| Data Retention Period (days) | 0                 |
| Cost per image               | $0.044            |

</details>

<details>

<summary>Cohere</summary>

|                              |                  |
| ---------------------------- | ---------------- |
| Model                        | Cohere Command A |
| Hosted                       | Azure            |
| Developer                    | Cohere           |
| Region                       | OTHER            |
| Info used for model training | No               |
| Data Retention Period (days) | 0                |
| Input cost per 1M tokens     | $2.5             |
| Output cost per 1M tokens    | $10              |

</details>

<details>

<summary>DeepSeek</summary>

|                              |                |
| ---------------------------- | -------------- |
| Model                        | DeepSeek 4 Pro |
| Hosted                       | Azure          |
| Developer                    | DeepSeek       |
| Region                       | OTHER          |
| Info used for model training | No             |
| Data Retention Period (days) | 0              |
| Input cost per 1M tokens     | $1.74          |
| Output cost per 1M tokens    | $3.48          |

|                                 |                   |
| ------------------------------- | ----------------- |
| Model                           | DeepSeek V4 Flash |
| Hosted                          | Fireworks AI      |
| Developer                       | DeepSeek          |
| Region                          | OTHER             |
| Info used for model training    | No                |
| Data Retention Period (days)    | 0                 |
| Input cost per 1M tokens        | $0.14             |
| Output cost per 1M tokens       | $0.28             |
| Cached input cost per 1M tokens | $0.03             |

|                                 |                 |
| ------------------------------- | --------------- |
| Model                           | DeepSeek V4 Pro |
| Hosted                          | Fireworks AI    |
| Developer                       | DeepSeek        |
| Region                          | OTHER           |
| Info used for model training    | No              |
| Data Retention Period (days)    | 0               |
| Input cost per 1M tokens        | $1.74           |
| Output cost per 1M tokens       | $3.48           |
| Cached input cost per 1M tokens | $0.14           |

</details>

<details>

<summary>Google</summary>

|                                              |                       |
| -------------------------------------------- | --------------------- |
| Model                                        | Gemini 2.5 Flash Lite |
| Hosted                                       | Vertex                |
| Developer                                    | Google                |
| Region                                       | EU                    |
| Info used for model training                 | No                    |
| Data Retention Period (days)                 | 1                     |
| Input cost per 1M tokens                     | $0.1                  |
| Output cost per 1M tokens                    | $0.4                  |
| Cached input cost per 1M tokens              | $0.01                 |
| Long context boundary                        | 200,000               |
| Long context input cost per 1M tokens        | $0.1                  |
| Long context output cost per 1M tokens       | $0.4                  |
| Long context cached input cost per 1M tokens | $0.01                 |

|                                              |                |
| -------------------------------------------- | -------------- |
| Model                                        | Gemini 2.5 Pro |
| Hosted                                       | Vertex         |
| Developer                                    | Google         |
| Region                                       | EU             |
| Info used for model training                 | No             |
| Data Retention Period (days)                 | 1              |
| Input cost per 1M tokens                     | $1.25          |
| Output cost per 1M tokens                    | $10            |
| Cached input cost per 1M tokens              | $0.13          |
| Long context boundary                        | 200,000        |
| Long context input cost per 1M tokens        | $2.5           |
| Long context output cost per 1M tokens       | $15            |
| Long context cached input cost per 1M tokens | $0.25          |

|                                              |                  |
| -------------------------------------------- | ---------------- |
| Model                                        | Gemini 2.5 Flash |
| Hosted                                       | Vertex           |
| Developer                                    | Google           |
| Region                                       | EU               |
| Info used for model training                 | No               |
| Data Retention Period (days)                 | 1                |
| Input cost per 1M tokens                     | $0.3             |
| Output cost per 1M tokens                    | $2.5             |
| Cached input cost per 1M tokens              | $0.03            |
| Long context boundary                        | 200,000          |
| Long context input cost per 1M tokens        | $0.3             |
| Long context output cost per 1M tokens       | $2.5             |
| Long context cached input cost per 1M tokens | $0.03            |

|                              |          |
| ---------------------------- | -------- |
| Model                        | Imagen 4 |
| Hosted                       | Vertex   |
| Developer                    | Google   |
| Region                       | EU       |
| Info used for model training | No       |
| Data Retention Period (days) | 1        |
| Cost per image               | $0.04    |

|                              |               |
| ---------------------------- | ------------- |
| Model                        | Imagen 4 Fast |
| Hosted                       | Vertex        |
| Developer                    | Google        |
| Region                       | EU            |
| Info used for model training | No            |
| Data Retention Period (days) | 1             |
| Cost per image               | $0.02         |

|                              |                |
| ---------------------------- | -------------- |
| Model                        | Imagen 4 Ultra |
| Hosted                       | Vertex         |
| Developer                    | Google         |
| Region                       | EU             |
| Info used for model training | No             |
| Data Retention Period (days) | 1              |
| Cost per image               | $0.06          |

|                                              |                       |
| -------------------------------------------- | --------------------- |
| Model                                        | Gemini 2.5 Flash Lite |
| Hosted                                       | Vertex                |
| Developer                                    | Google                |
| Region                                       | OTHER                 |
| Info used for model training                 | No                    |
| Data Retention Period (days)                 | 1                     |
| Input cost per 1M tokens                     | $0.1                  |
| Output cost per 1M tokens                    | $0.4                  |
| Cached input cost per 1M tokens              | $0.01                 |
| Long context boundary                        | 200,000               |
| Long context input cost per 1M tokens        | $0.1                  |
| Long context output cost per 1M tokens       | $0.4                  |
| Long context cached input cost per 1M tokens | $0.01                 |

|                                              |                  |
| -------------------------------------------- | ---------------- |
| Model                                        | Gemini 2.5 Flash |
| Hosted                                       | Vertex           |
| Developer                                    | Google           |
| Region                                       | OTHER            |
| Info used for model training                 | No               |
| Data Retention Period (days)                 | 1                |
| Input cost per 1M tokens                     | $0.3             |
| Output cost per 1M tokens                    | $2.5             |
| Cached input cost per 1M tokens              | $0.03            |
| Long context boundary                        | 200,000          |
| Long context input cost per 1M tokens        | $0.3             |
| Long context output cost per 1M tokens       | $2.5             |
| Long context cached input cost per 1M tokens | $0.03            |

|                                              |                |
| -------------------------------------------- | -------------- |
| Model                                        | Gemini 2.5 Pro |
| Hosted                                       | Vertex         |
| Developer                                    | Google         |
| Region                                       | OTHER          |
| Info used for model training                 | No             |
| Data Retention Period (days)                 | 1              |
| Input cost per 1M tokens                     | $1.25          |
| Output cost per 1M tokens                    | $10            |
| Cached input cost per 1M tokens              | $0.13          |
| Long context boundary                        | 200,000        |
| Long context input cost per 1M tokens        | $2.5           |
| Long context output cost per 1M tokens       | $15            |
| Long context cached input cost per 1M tokens | $0.25          |

|                                              |                        |
| -------------------------------------------- | ---------------------- |
| Model                                        | Gemini 3 Flash Preview |
| Hosted                                       | Vertex                 |
| Developer                                    | Google                 |
| Region                                       | OTHER                  |
| Info used for model training                 | No                     |
| Data Retention Period (days)                 | 1                      |
| Input cost per 1M tokens                     | $0.5                   |
| Output cost per 1M tokens                    | $3                     |
| Cached input cost per 1M tokens              | $0.05                  |
| Long context boundary                        | 200,000                |
| Long context input cost per 1M tokens        | $0.5                   |
| Long context output cost per 1M tokens       | $3                     |
| Long context cached input cost per 1M tokens | $0.05                  |

|                                              |                |
| -------------------------------------------- | -------------- |
| Model                                        | Gemini 2.5 Pro |
| Hosted                                       | Vertex         |
| Developer                                    | Google         |
| Region                                       | US             |
| Info used for model training                 | No             |
| Data Retention Period (days)                 | 1              |
| Input cost per 1M tokens                     | $1.25          |
| Output cost per 1M tokens                    | $10            |
| Cached input cost per 1M tokens              | $0.13          |
| Long context boundary                        | 200,000        |
| Long context input cost per 1M tokens        | $2.5           |
| Long context output cost per 1M tokens       | $15            |
| Long context cached input cost per 1M tokens | $0.25          |

|                                              |                  |
| -------------------------------------------- | ---------------- |
| Model                                        | Gemini 2.5 Flash |
| Hosted                                       | Vertex           |
| Developer                                    | Google           |
| Region                                       | US               |
| Info used for model training                 | No               |
| Data Retention Period (days)                 | 1                |
| Input cost per 1M tokens                     | $0.3             |
| Output cost per 1M tokens                    | $2.5             |
| Cached input cost per 1M tokens              | $0.03            |
| Long context boundary                        | 200,000          |
| Long context input cost per 1M tokens        | $0.3             |
| Long context output cost per 1M tokens       | $2.5             |
| Long context cached input cost per 1M tokens | $0.03            |

|                                              |                       |
| -------------------------------------------- | --------------------- |
| Model                                        | Gemini 2.5 Flash Lite |
| Hosted                                       | Vertex                |
| Developer                                    | Google                |
| Region                                       | US                    |
| Info used for model training                 | No                    |
| Data Retention Period (days)                 | 1                     |
| Input cost per 1M tokens                     | $0.1                  |
| Output cost per 1M tokens                    | $0.4                  |
| Cached input cost per 1M tokens              | $0.01                 |
| Long context boundary                        | 200,000               |
| Long context input cost per 1M tokens        | $0.1                  |
| Long context output cost per 1M tokens       | $0.4                  |
| Long context cached input cost per 1M tokens | $0.01                 |

|                              |          |
| ---------------------------- | -------- |
| Model                        | Imagen 4 |
| Hosted                       | Vertex   |
| Developer                    | Google   |
| Region                       | US       |
| Info used for model training | No       |
| Data Retention Period (days) | 1        |
| Cost per image               | $0.04    |

|                              |               |
| ---------------------------- | ------------- |
| Model                        | Imagen 4 Fast |
| Hosted                       | Vertex        |
| Developer                    | Google        |
| Region                       | US            |
| Info used for model training | No            |
| Data Retention Period (days) | 1             |
| Cost per image               | $0.02         |

|                              |                |
| ---------------------------- | -------------- |
| Model                        | Imagen 4 Ultra |
| Hosted                       | Vertex         |
| Developer                    | Google         |
| Region                       | US             |
| Info used for model training | No             |
| Data Retention Period (days) | 1              |
| Cost per image               | $0.06          |

|                                              |                        |
| -------------------------------------------- | ---------------------- |
| Model                                        | Gemini 3.1 Pro Preview |
| Hosted                                       | Vertex                 |
| Developer                                    | Google                 |
| Region                                       | OTHER                  |
| Info used for model training                 | No                     |
| Data Retention Period (days)                 | 1                      |
| Input cost per 1M tokens                     | $2                     |
| Output cost per 1M tokens                    | $12                    |
| Cached input cost per 1M tokens              | $0.2                   |
| Long context boundary                        | 200,000                |
| Long context input cost per 1M tokens        | $4                     |
| Long context output cost per 1M tokens       | $18                    |
| Long context cached input cost per 1M tokens | $0.4                   |

|                                              |                       |
| -------------------------------------------- | --------------------- |
| Model                                        | Gemini 3.1 Flash Lite |
| Hosted                                       | Vertex                |
| Developer                                    | Google                |
| Region                                       | OTHER                 |
| Info used for model training                 | No                    |
| Data Retention Period (days)                 | 1                     |
| Input cost per 1M tokens                     | $0.25                 |
| Output cost per 1M tokens                    | $1.5                  |
| Cached input cost per 1M tokens              | $0.03                 |
| Long context boundary                        | 200,000               |
| Long context input cost per 1M tokens        | $0.25                 |
| Long context output cost per 1M tokens       | $1.5                  |
| Long context cached input cost per 1M tokens | $0.03                 |

|                                              |                       |
| -------------------------------------------- | --------------------- |
| Model                                        | Gemini 3.1 Flash Lite |
| Hosted                                       | Vertex                |
| Developer                                    | Google                |
| Region                                       | EU                    |
| Info used for model training                 | No                    |
| Data Retention Period (days)                 | 1                     |
| Input cost per 1M tokens                     | $0.25                 |
| Output cost per 1M tokens                    | $1.5                  |
| Cached input cost per 1M tokens              | $0.025                |
| Long context boundary                        | 200,000               |
| Long context input cost per 1M tokens        | $0.25                 |
| Long context output cost per 1M tokens       | $1.5                  |
| Long context cached input cost per 1M tokens | $0.025                |

|                                              |                  |
| -------------------------------------------- | ---------------- |
| Model                                        | Gemini 3.5 Flash |
| Hosted                                       | Vertex           |
| Developer                                    | Google           |
| Region                                       | EU               |
| Info used for model training                 | No               |
| Data Retention Period (days)                 | 1                |
| Input cost per 1M tokens                     | $1.5             |
| Output cost per 1M tokens                    | $9               |
| Cached input cost per 1M tokens              | $0.15            |
| Long context boundary                        | 200,000          |
| Long context input cost per 1M tokens        | $1.5             |
| Long context output cost per 1M tokens       | $9               |
| Long context cached input cost per 1M tokens | $0.15            |

|                                              |                  |
| -------------------------------------------- | ---------------- |
| Model                                        | Gemini 3.5 Flash |
| Hosted                                       | Vertex           |
| Developer                                    | Google           |
| Region                                       | OTHER            |
| Info used for model training                 | No               |
| Data Retention Period (days)                 | 1                |
| Input cost per 1M tokens                     | $1.5             |
| Output cost per 1M tokens                    | $9               |
| Cached input cost per 1M tokens              | $0.15            |
| Long context boundary                        | 200,000          |
| Long context input cost per 1M tokens        | $1.5             |
| Long context output cost per 1M tokens       | $9               |
| Long context cached input cost per 1M tokens | $0.15            |

</details>
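
As a rough illustration of how the long-context rows interact with the standard rates, the sketch below estimates the cost of a single request using the Gemini 2.5 Pro (Vertex) figures listed above. It assumes that once the input exceeds the long context boundary, the whole request is billed at the long-context rates; that billing rule, the `Pricing` class, and the `estimate_cost` helper are illustrative assumptions rather than part of nexos.ai, and cached-input pricing is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical container for one pricing table (USD per 1M tokens)."""
    input_per_m: float
    output_per_m: float
    long_boundary: int
    long_input_per_m: float
    long_output_per_m: float


# Values copied from the Gemini 2.5 Pro (Vertex) table.
GEMINI_2_5_PRO = Pricing(
    input_per_m=1.25,
    output_per_m=10.0,
    long_boundary=200_000,
    long_input_per_m=2.5,
    long_output_per_m=15.0,
)


def estimate_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: if the input exceeds the boundary, every token in the
    request is charged at the long-context rates.
    """
    is_long = input_tokens > p.long_boundary
    in_rate = p.long_input_per_m if is_long else p.input_per_m
    out_rate = p.long_output_per_m if is_long else p.output_per_m
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # Below the boundary: standard rates apply.
    print(estimate_cost(GEMINI_2_5_PRO, 100_000, 2_000))
    # Above the boundary: long-context rates apply.
    print(estimate_cost(GEMINI_2_5_PRO, 300_000, 2_000))
```

For models whose long-context rates equal their standard rates (for example Gemini 2.5 Flash), the boundary has no effect on the estimate.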

<details>

<summary>MiniMax</summary>

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | MiniMax M2.7 |
| Hosted                          | Fireworks AI |
| Developer                       | MiniMax      |
| Region                          | OTHER        |
| Info used for model training    | No           |
| Data Retention Period (days)    | 0            |
| Input cost per 1M tokens        | $0.3         |
| Output cost per 1M tokens       | $1.2         |
| Cached input cost per 1M tokens | $0.06        |

</details>

<details>

<summary>Mistral AI</summary>

|                              |                  |
| ---------------------------- | ---------------- |
| Model                        | Mistral Medium 3 |
| Hosted                       | Mistral AI       |
| Developer                    | Mistral AI       |
| Region                       | EU               |
| Info used for model training | No               |
| Data Retention Period (days) | 0                |
| Input cost per 1M tokens     | $0.4             |
| Output cost per 1M tokens    | $2               |

|                              |            |
| ---------------------------- | ---------- |
| Model                        | Codestral  |
| Hosted                       | Mistral AI |
| Developer                    | Mistral AI |
| Region                       | EU         |
| Info used for model training | No         |
| Data Retention Period (days) | 0          |
| Input cost per 1M tokens     | $0.3       |
| Output cost per 1M tokens    | $0.9       |

|                              |                    |
| ---------------------------- | ------------------ |
| Model                        | Mistral Medium 3.1 |
| Hosted                       | Mistral AI         |
| Developer                    | Mistral AI         |
| Region                       | EU                 |
| Info used for model training | No                 |
| Data Retention Period (days) | 0                  |
| Input cost per 1M tokens     | $0.4               |
| Output cost per 1M tokens    | $2                 |

|                              |                 |
| ---------------------------- | --------------- |
| Model                        | Mistral Large 3 |
| Hosted                       | Mistral AI      |
| Developer                    | Mistral AI      |
| Region                       | EU              |
| Info used for model training | No              |
| Data Retention Period (days) | 0               |
| Input cost per 1M tokens     | $0.5            |
| Output cost per 1M tokens    | $1.5            |

|                              |            |
| ---------------------------- | ---------- |
| Model                        | Devstral 2 |
| Hosted                       | Mistral AI |
| Developer                    | Mistral AI |
| Region                       | EU         |
| Info used for model training | No         |
| Data Retention Period (days) | 0          |
| Input cost per 1M tokens     | $0.4       |
| Output cost per 1M tokens    | $2         |

|                              |                 |
| ---------------------------- | --------------- |
| Model                        | Mistral Small 4 |
| Hosted                       | Mistral AI      |
| Developer                    | Mistral AI      |
| Region                       | EU              |
| Info used for model training | No              |
| Data Retention Period (days) | 0               |
| Input cost per 1M tokens     | $0.15           |
| Output cost per 1M tokens    | $0.6            |

|                              |                    |
| ---------------------------- | ------------------ |
| Model                        | Mistral Medium 3.5 |
| Hosted                       | Mistral AI         |
| Developer                    | Mistral AI         |
| Region                       | EU                 |
| Info used for model training | No                 |
| Data Retention Period (days) | 0                  |
| Input cost per 1M tokens     | $1.5               |
| Output cost per 1M tokens    | $7.5               |

</details>

<details>

<summary>Moonshot AI</summary>

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | Kimi K2.5    |
| Hosted                          | Fireworks AI |
| Developer                       | Moonshot AI  |
| Region                          | OTHER        |
| Info used for model training    | No           |
| Data Retention Period (days)    | 0            |
| Input cost per 1M tokens        | $0.6         |
| Output cost per 1M tokens       | $3           |
| Cached input cost per 1M tokens | $0.1         |

|                                 |             |
| ------------------------------- | ----------- |
| Model                           | Kimi K2.6   |
| Hosted                          | Azure       |
| Developer                       | Moonshot AI |
| Region                          | OTHER       |
| Info used for model training    | No          |
| Data Retention Period (days)    | 0           |
| Input cost per 1M tokens        | $0.95       |
| Output cost per 1M tokens       | $4          |
| Cached input cost per 1M tokens | $0.16       |

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | Kimi K2.6    |
| Hosted                          | Fireworks AI |
| Developer                       | Moonshot AI  |
| Region                          | OTHER        |
| Info used for model training    | No           |
| Data Retention Period (days)    | 0            |
| Input cost per 1M tokens        | $0.95        |
| Output cost per 1M tokens       | $4           |
| Cached input cost per 1M tokens | $0.16        |

</details>

<details>

<summary>OpenAI</summary>

|                                 |         |
| ------------------------------- | ------- |
| Model                           | GPT 4.1 |
| Hosted                          | Azure   |
| Developer                       | OpenAI  |
| Region                          | EU      |
| Info used for model training    | No      |
| Data Retention Period (days)    | 0       |
| Input cost per 1M tokens        | $2.2    |
| Output cost per 1M tokens       | $8.8    |
| Cached input cost per 1M tokens | $0.55   |

|                                 |        |
| ------------------------------- | ------ |
| Model                           | GPT 4o |
| Hosted                          | Azure  |
| Developer                       | OpenAI |
| Region                          | EU     |
| Info used for model training    | No     |
| Data Retention Period (days)    | 0      |
| Input cost per 1M tokens        | $2.75  |
| Output cost per 1M tokens       | $11    |
| Cached input cost per 1M tokens | $1.375 |

|                                 |            |
| ------------------------------- | ---------- |
| Model                           | GPT 5 mini |
| Hosted                          | Azure      |
| Developer                       | OpenAI     |
| Region                          | EU         |
| Info used for model training    | No         |
| Data Retention Period (days)    | 0          |
| Input cost per 1M tokens        | $0.28      |
| Output cost per 1M tokens       | $2.2       |
| Cached input cost per 1M tokens | $0.03      |

|                                 |            |
| ------------------------------- | ---------- |
| Model                           | GPT 5 nano |
| Hosted                          | Azure      |
| Developer                       | OpenAI     |
| Region                          | EU         |
| Info used for model training    | No         |
| Data Retention Period (days)    | 0          |
| Input cost per 1M tokens        | $0.06      |
| Output cost per 1M tokens       | $0.44      |
| Cached input cost per 1M tokens | $0.01      |

|                                 |        |
| ------------------------------- | ------ |
| Model                           | GPT 5  |
| Hosted                          | Azure  |
| Developer                       | OpenAI |
| Region                          | EU     |
| Info used for model training    | No     |
| Data Retention Period (days)    | 0      |
| Input cost per 1M tokens        | $1.38  |
| Output cost per 1M tokens       | $11    |
| Cached input cost per 1M tokens | $0.14  |

|                              |                        |
| ---------------------------- | ---------------------- |
| Model                        | Text Embedding 3 Large |
| Hosted                       | Azure                  |
| Developer                    | OpenAI                 |
| Region                       | EU                     |
| Info used for model training | No                     |
| Data Retention Period (days) | 0                      |
| Input cost per 1M tokens     | $0.143                 |

|                                 |                  |
| ------------------------------- | ---------------- |
| Model                           | GPT Image 1 mini |
| Hosted                          | Azure            |
| Developer                       | OpenAI           |
| Region                          | OTHER            |
| Info used for model training    | No               |
| Data Retention Period (days)    | 0                |
| Input cost per 1M tokens        | $2               |
| Output cost per 1M tokens       | $8               |
| Cached input cost per 1M tokens | $0.2             |

|                                 |         |
| ------------------------------- | ------- |
| Model                           | GPT 5.1 |
| Hosted                          | OpenAI  |
| Developer                       | OpenAI  |
| Region                          | OTHER   |
| Info used for model training    | No      |
| Data Retention Period (days)    | 30      |
| Input cost per 1M tokens        | $1.25   |
| Output cost per 1M tokens       | $10     |
| Cached input cost per 1M tokens | $0.125  |

|                                 |         |
| ------------------------------- | ------- |
| Model                           | GPT 5.1 |
| Hosted                          | Azure   |
| Developer                       | OpenAI  |
| Region                          | EU      |
| Info used for model training    | No      |
| Data Retention Period (days)    | 0       |
| Input cost per 1M tokens        | $1.38   |
| Output cost per 1M tokens       | $11     |
| Cached input cost per 1M tokens | $0.14   |

|                              |                  |
| ---------------------------- | ---------------- |
| Model                        | Text to Speech 1 |
| Hosted                       | OpenAI           |
| Developer                    | OpenAI           |
| Region                       | OTHER            |
| Info used for model training | No               |
| Data Retention Period (days) | 30               |

|                                 |         |
| ------------------------------- | ------- |
| Model                           | GPT 5.2 |
| Hosted                          | OpenAI  |
| Developer                       | OpenAI  |
| Region                          | OTHER   |
| Info used for model training    | No      |
| Data Retention Period (days)    | 30      |
| Input cost per 1M tokens        | $1.75   |
| Output cost per 1M tokens       | $14     |
| Cached input cost per 1M tokens | $0.175  |

|                                 |                   |
| ------------------------------- | ----------------- |
| Model                           | GPT 5.1 Codex Max |
| Hosted                          | OpenAI            |
| Developer                       | OpenAI            |
| Region                          | OTHER             |
| Info used for model training    | No                |
| Data Retention Period (days)    | 30                |
| Input cost per 1M tokens        | $1.25             |
| Output cost per 1M tokens       | $10               |
| Cached input cost per 1M tokens | $0.125            |

|                                 |                    |
| ------------------------------- | ------------------ |
| Model                           | GPT 5.1 Codex mini |
| Hosted                          | OpenAI             |
| Developer                       | OpenAI             |
| Region                          | OTHER              |
| Info used for model training    | No                 |
| Data Retention Period (days)    | 30                 |
| Input cost per 1M tokens        | $0.25              |
| Output cost per 1M tokens       | $2                 |
| Cached input cost per 1M tokens | $0.025             |

|                              |             |
| ---------------------------- | ----------- |
| Model                        | GPT 5.2 Pro |
| Hosted                       | OpenAI      |
| Developer                    | OpenAI      |
| Region                       | OTHER       |
| Info used for model training | No          |
| Data Retention Period (days) | 30          |
| Input cost per 1M tokens     | $21         |
| Output cost per 1M tokens    | $168        |

|                                 |               |
| ------------------------------- | ------------- |
| Model                           | GPT 5.1 Codex |
| Hosted                          | OpenAI        |
| Developer                       | OpenAI        |
| Region                          | OTHER         |
| Info used for model training    | No            |
| Data Retention Period (days)    | 30            |
| Input cost per 1M tokens        | $1.25         |
| Output cost per 1M tokens       | $10           |
| Cached input cost per 1M tokens | $0.125        |

|                                 |               |
| ------------------------------- | ------------- |
| Model                           | GPT Image 1.5 |
| Hosted                          | Azure         |
| Developer                       | OpenAI        |
| Region                          | OTHER         |
| Info used for model training    | No            |
| Data Retention Period (days)    | 0             |
| Input cost per 1M tokens        | $5            |
| Output cost per 1M tokens       | $32           |
| Cached input cost per 1M tokens | $1.25         |

|                                 |         |
| ------------------------------- | ------- |
| Model                           | GPT 5.2 |
| Hosted                          | Azure   |
| Developer                       | OpenAI  |
| Region                          | OTHER   |
| Info used for model training    | No      |
| Data Retention Period (days)    | 0       |
| Input cost per 1M tokens        | $1.75   |
| Output cost per 1M tokens       | $14     |
| Cached input cost per 1M tokens | $0.175  |

|                              |             |
| ---------------------------- | ----------- |
| Model                        | Whisper 1   |
| Hosted                       | Azure       |
| Developer                    | OpenAI      |
| Region                       | EU          |
| Info used for model training | No          |
| Data Retention Period (days) | 0           |
| Input cost per audio second  | $0.00012222 |

|                              |                  |
| ---------------------------- | ---------------- |
| Model                        | Text to Speech 1 |
| Hosted                       | Azure            |
| Developer                    | OpenAI           |
| Region                       | EU               |
| Info used for model training | No               |
| Data Retention Period (days) | 0                |

|                                 |         |
| ------------------------------- | ------- |
| Model                           | GPT 4.1 |
| Hosted                          | OpenAI  |
| Developer                       | OpenAI  |
| Region                          | OTHER   |
| Info used for model training    | No      |
| Data Retention Period (days)    | 30      |
| Input cost per 1M tokens        | $2      |
| Output cost per 1M tokens       | $8      |
| Cached input cost per 1M tokens | $0.5    |

|                              |                        |
| ---------------------------- | ---------------------- |
| Model                        | Text Embedding 3 Large |
| Hosted                       | OpenAI                 |
| Developer                    | OpenAI                 |
| Region                       | OTHER                  |
| Info used for model training | No                     |
| Data Retention Period (days) | 30                     |
| Input cost per 1M tokens     | $0.13                  |

|                              |                        |
| ---------------------------- | ---------------------- |
| Model                        | Text Embedding 3 Small |
| Hosted                       | OpenAI                 |
| Developer                    | OpenAI                 |
| Region                       | OTHER                  |
| Info used for model training | No                     |
| Data Retention Period (days) | 30                     |
| Input cost per 1M tokens     | $0.02                  |

|                              |           |
| ---------------------------- | --------- |
| Model                        | Whisper 1 |
| Hosted                       | OpenAI    |
| Developer                    | OpenAI    |
| Region                       | OTHER     |
| Info used for model training | No        |
| Data Retention Period (days) | 30        |
| Input cost per audio second  | $0.0001   |

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | GPT 4.1 mini |
| Hosted                          | OpenAI       |
| Developer                       | OpenAI       |
| Region                          | OTHER        |
| Info used for model training    | No           |
| Data Retention Period (days)    | 30           |
| Input cost per 1M tokens        | $0.4         |
| Output cost per 1M tokens       | $1.6         |
| Cached input cost per 1M tokens | $0.1         |

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | GPT 4.1 nano |
| Hosted                          | OpenAI       |
| Developer                       | OpenAI       |
| Region                          | OTHER        |
| Info used for model training    | No           |
| Data Retention Period (days)    | 30           |
| Input cost per 1M tokens        | $0.1         |
| Output cost per 1M tokens       | $0.4         |
| Cached input cost per 1M tokens | $0.025       |

|                                 |             |
| ------------------------------- | ----------- |
| Model                           | GPT OSS 20B |
| Hosted                          | Groq        |
| Developer                       | OpenAI      |
| Region                          | OTHER       |
| Info used for model training    | No          |
| Data Retention Period (days)    | 0           |
| Input cost per 1M tokens        | $0.075      |
| Output cost per 1M tokens       | $0.3        |
| Cached input cost per 1M tokens | $0.037      |

|                                 |        |
| ------------------------------- | ------ |
| Model                           | GPT 5  |
| Hosted                          | OpenAI |
| Developer                       | OpenAI |
| Region                          | OTHER  |
| Info used for model training    | No     |
| Data Retention Period (days)    | 30     |
| Input cost per 1M tokens        | $1.25  |
| Output cost per 1M tokens       | $10    |
| Cached input cost per 1M tokens | $0.125 |

|                                 |            |
| ------------------------------- | ---------- |
| Model                           | GPT 5 mini |
| Hosted                          | OpenAI     |
| Developer                       | OpenAI     |
| Region                          | OTHER      |
| Info used for model training    | No         |
| Data Retention Period (days)    | 30         |
| Input cost per 1M tokens        | $0.25      |
| Output cost per 1M tokens       | $2         |
| Cached input cost per 1M tokens | $0.025     |

|                                 |            |
| ------------------------------- | ---------- |
| Model                           | GPT 5 nano |
| Hosted                          | OpenAI     |
| Developer                       | OpenAI     |
| Region                          | OTHER      |
| Info used for model training    | No         |
| Data Retention Period (days)    | 30         |
| Input cost per 1M tokens        | $0.05      |
| Output cost per 1M tokens       | $0.4       |
| Cached input cost per 1M tokens | $0.005     |

|                                 |            |
| ------------------------------- | ---------- |
| Model                           | GPT 5 mini |
| Hosted                          | Azure      |
| Developer                       | OpenAI     |
| Region                          | US         |
| Info used for model training    | No         |
| Data Retention Period (days)    | 0          |
| Input cost per 1M tokens        | $0.28      |
| Output cost per 1M tokens       | $2.2       |
| Cached input cost per 1M tokens | $0.03      |

|                                 |         |
| ------------------------------- | ------- |
| Model                           | GPT 4.1 |
| Hosted                          | Azure   |
| Developer                       | OpenAI  |
| Region                          | US      |
| Info used for model training    | No      |
| Data Retention Period (days)    | 0       |
| Input cost per 1M tokens        | $2.2    |
| Output cost per 1M tokens       | $8.8    |
| Cached input cost per 1M tokens | $0.55   |

|                              |              |
| ---------------------------- | ------------ |
| Model                        | GPT OSS 120B |
| Hosted                       | Azure        |
| Developer                    | OpenAI       |
| Region                       | OTHER        |
| Info used for model training | No           |
| Data Retention Period (days) | 0            |
| Input cost per 1M tokens     | $0.15        |
| Output cost per 1M tokens    | $0.6         |

|                                 |        |
| ------------------------------- | ------ |
| Model                           | GPT 5  |
| Hosted                          | Azure  |
| Developer                       | OpenAI |
| Region                          | OTHER  |
| Info used for model training    | No     |
| Data Retention Period (days)    | 0      |
| Input cost per 1M tokens        | $1.25  |
| Output cost per 1M tokens       | $10    |
| Cached input cost per 1M tokens | $0.13  |

|                              |                        |
| ---------------------------- | ---------------------- |
| Model                        | Text Embedding 3 Large |
| Hosted                       | Azure                  |
| Developer                    | OpenAI                 |
| Region                       | US                     |
| Info used for model training | No                     |
| Data Retention Period (days) | 0                      |
| Input cost per 1M tokens     | $0.143                 |

|                                 |         |
| ------------------------------- | ------- |
| Model                           | GPT 5.1 |
| Hosted                          | Azure   |
| Developer                       | OpenAI  |
| Region                          | US      |
| Info used for model training    | No      |
| Data Retention Period (days)    | 0       |
| Input cost per 1M tokens        | $1.38   |
| Output cost per 1M tokens       | $11     |
| Cached input cost per 1M tokens | $0.14   |

|                                 |            |
| ------------------------------- | ---------- |
| Model                           | GPT 5 nano |
| Hosted                          | Azure      |
| Developer                       | OpenAI     |
| Region                          | US         |
| Info used for model training    | No         |
| Data Retention Period (days)    | 0          |
| Input cost per 1M tokens        | $0.06      |
| Output cost per 1M tokens       | $0.44      |
| Cached input cost per 1M tokens | $0.01      |

|                                 |        |
| ------------------------------- | ------ |
| Model                           | GPT 4o |
| Hosted                          | Azure  |
| Developer                       | OpenAI |
| Region                          | US     |
| Info used for model training    | No     |
| Data Retention Period (days)    | 0      |
| Input cost per 1M tokens        | $2.75  |
| Output cost per 1M tokens       | $11    |
| Cached input cost per 1M tokens | $1.375 |

|                              |             |
| ---------------------------- | ----------- |
| Model                        | Whisper 1   |
| Hosted                       | Azure       |
| Developer                    | OpenAI      |
| Region                       | US          |
| Info used for model training | No          |
| Data Retention Period (days) | 0           |
| Input cost per audio second  | $0.00012222 |

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | GPT 4.1 mini |
| Hosted                          | Azure        |
| Developer                       | OpenAI       |
| Region                          | EU           |
| Info used for model training    | No           |
| Data Retention Period (days)    | 0            |
| Input cost per 1M tokens        | $0.44        |
| Output cost per 1M tokens       | $1.76        |
| Cached input cost per 1M tokens | $0.11        |

|                                 |               |
| ------------------------------- | ------------- |
| Model                           | GPT 5.3 Codex |
| Hosted                          | OpenAI        |
| Developer                       | OpenAI        |
| Region                          | OTHER         |
| Info used for model training    | No            |
| Data Retention Period (days)    | 30            |
| Input cost per 1M tokens        | $1.75         |
| Output cost per 1M tokens       | $14           |
| Cached input cost per 1M tokens | $0.175        |

|                                 |                 |
| ------------------------------- | --------------- |
| Model                           | GPT 5.3 Instant |
| Hosted                          | OpenAI          |
| Developer                       | OpenAI          |
| Region                          | OTHER           |
| Info used for model training    | No              |
| Data Retention Period (days)    | 30              |
| Input cost per 1M tokens        | $1.75           |
| Output cost per 1M tokens       | $14             |
| Cached input cost per 1M tokens | $0.175          |

|                                 |                 |
| ------------------------------- | --------------- |
| Model                           | GPT 5.3 Instant |
| Hosted                          | Azure           |
| Developer                       | OpenAI          |
| Region                          | OTHER           |
| Info used for model training    | No              |
| Data Retention Period (days)    | 0               |
| Input cost per 1M tokens        | $1.75           |
| Output cost per 1M tokens       | $14             |
| Cached input cost per 1M tokens | $0.175          |

|                                              |         |
| -------------------------------------------- | ------- |
| Model                                        | GPT 5.4 |
| Hosted                                       | Azure   |
| Developer                                    | OpenAI  |
| Region                                       | OTHER   |
| Info used for model training                 | No      |
| Data Retention Period (days)                 | 0       |
| Input cost per 1M tokens                     | $2.5    |
| Output cost per 1M tokens                    | $15     |
| Cached input cost per 1M tokens              | $0.25   |
| Long context boundary                        | 272,000 |
| Long context input cost per 1M tokens        | $5      |
| Long context output cost per 1M tokens       | $22.5   |
| Long context cached input cost per 1M tokens | $0.5    |
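
As a rough illustration of how the long-context rows combine with the standard rates, the sketch below estimates the cost of a single request using the GPT 5.4 (Azure, OTHER) values from the table above. The names `Rates` and `estimate_cost` are hypothetical. The sketch also assumes that once the input token count exceeds the boundary, the long-context rates apply to the whole request, and that cached tokens are a subset of the input tokens. This page does not confirm either behavior, so treat the result as an estimate only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Prices in USD per 1M tokens."""
    input: float
    output: float
    cached_input: float


# Values copied from the GPT 5.4 (Azure, OTHER) table above.
STANDARD = Rates(input=2.5, output=15.0, cached_input=0.25)
LONG_CONTEXT = Rates(input=5.0, output=22.5, cached_input=0.5)
LONG_CONTEXT_BOUNDARY = 272_000  # tokens


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must not be negative")

    # ASSUMPTION: the boundary is compared against the total input tokens and,
    # once exceeded, long-context rates apply to the entire request.
    rates = LONG_CONTEXT if input_tokens > LONG_CONTEXT_BOUNDARY else STANDARD

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * rates.input
        + cached_tokens * rates.cached_input
        + output_tokens * rates.output
    )
    return total / 1_000_000


if __name__ == "__main__":
    print(f"${estimate_cost(100_000, 10_000):.4f}")           # below the boundary
    print(f"${estimate_cost(300_000, 10_000, 100_000):.4f}")  # above the boundary
```

To estimate costs for another model, replace the three constants with the values from its table. Models without long-context rows can reuse `STANDARD` for both rate sets.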

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | GPT 5.4 nano |
| Hosted                          | Azure        |
| Developer                       | OpenAI       |
| Region                          | OTHER        |
| Info used for model training    | No           |
| Data Retention Period (days)    | 0            |
| Input cost per 1M tokens        | $0.2         |
| Output cost per 1M tokens       | $1.25        |
| Cached input cost per 1M tokens | $0.02        |

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | GPT 5.4 mini |
| Hosted                          | Azure        |
| Developer                       | OpenAI       |
| Region                          | OTHER        |
| Info used for model training    | No           |
| Data Retention Period (days)    | 0            |
| Input cost per 1M tokens        | $0.75        |
| Output cost per 1M tokens       | $4.5         |
| Cached input cost per 1M tokens | $0.075       |

|                              |              |
| ---------------------------- | ------------ |
| Model                        | GPT-OSS 120b |
| Hosted                       | nexos.ai     |
| Developer                    | OpenAI       |
| Region                       | EU           |
| Info used for model training | No           |
| Data Retention Period (days) | 0            |
| Input cost per 1M tokens     | $0.8         |
| Output cost per 1M tokens    | $1.6         |

|                                              |         |
| -------------------------------------------- | ------- |
| Model                                        | GPT 5.5 |
| Hosted                                       | Azure   |
| Developer                                    | OpenAI  |
| Region                                       | OTHER   |
| Info used for model training                 | No      |
| Data Retention Period (days)                 | 0       |
| Input cost per 1M tokens                     | $5      |
| Output cost per 1M tokens                    | $30     |
| Cached input cost per 1M tokens              | $0.5    |
| Long context boundary                        | 272,000 |
| Long context input cost per 1M tokens        | $10     |
| Long context output cost per 1M tokens       | $45     |
| Long context cached input cost per 1M tokens | $1      |

|                                              |         |
| -------------------------------------------- | ------- |
| Model                                        | GPT 5.5 |
| Hosted                                       | Azure   |
| Developer                                    | OpenAI  |
| Region                                       | EU      |
| Info used for model training                 | No      |
| Data Retention Period (days)                 | 0       |
| Input cost per 1M tokens                     | $5.5    |
| Output cost per 1M tokens                    | $33     |
| Cached input cost per 1M tokens              | $0.55   |
| Long context boundary                        | 272,000 |
| Long context input cost per 1M tokens        | $11     |
| Long context output cost per 1M tokens       | $49.5   |
| Long context cached input cost per 1M tokens | $1.1    |

|                                 |             |
| ------------------------------- | ----------- |
| Model                           | GPT Image 2 |
| Hosted                          | Azure       |
| Developer                       | OpenAI      |
| Region                          | OTHER       |
| Info used for model training    | No          |
| Data Retention Period (days)    | 0           |
| Input cost per 1M tokens        | $5          |
| Output cost per 1M tokens       | $30         |
| Cached input cost per 1M tokens | $1.25       |

|                                              |         |
| -------------------------------------------- | ------- |
| Model                                        | GPT 5.4 |
| Hosted                                       | Azure   |
| Developer                                    | OpenAI  |
| Region                                       | EU      |
| Info used for model training                 | No      |
| Data Retention Period (days)                 | 0       |
| Input cost per 1M tokens                     | $2.75   |
| Output cost per 1M tokens                    | $16.5   |
| Cached input cost per 1M tokens              | $0.28   |
| Long context boundary                        | 272,000 |
| Long context input cost per 1M tokens        | $5.5    |
| Long context output cost per 1M tokens       | $22.5   |
| Long context cached input cost per 1M tokens | $0.56   |

|                                 |                    |
| ------------------------------- | ------------------ |
| Model                           | GPT Instant Latest |
| Hosted                          | Azure              |
| Developer                       | OpenAI             |
| Region                          | OTHER              |
| Info used for model training    | No                 |
| Data Retention Period (days)    | 0                  |
| Input cost per 1M tokens        | $5                 |
| Output cost per 1M tokens       | $30                |
| Cached input cost per 1M tokens | $0.5               |

|                              |                        |
| ---------------------------- | ---------------------- |
| Model                        | Text Embedding 3 Small |
| Hosted                       | Azure                  |
| Developer                    | OpenAI                 |
| Region                       | OTHER                  |
| Info used for model training | No                     |
| Data Retention Period (days) | 0                      |
| Input cost per 1M tokens     | $0.02                  |

</details>

<details>

<summary>StepFun</summary>

|                                 |                |
| ------------------------------- | -------------- |
| Model                           | Step 3.5 Flash |
| Hosted                          | DeepInfra      |
| Developer                       | StepFun        |
| Region                          | OTHER          |
| Info used for model training    | No             |
| Data Retention Period (days)    | 0              |
| Input cost per 1M tokens        | $0.1           |
| Output cost per 1M tokens       | $0.3           |
| Cached input cost per 1M tokens | $0.02          |

</details>

<details>

<summary>Zhipu AI</summary>

|                                 |              |
| ------------------------------- | ------------ |
| Model                           | GLM 5.1      |
| Hosted                          | Fireworks AI |
| Developer                       | Zhipu AI     |
| Region                          | OTHER        |
| Info used for model training    | No           |
| Data Retention Period (days)    | 0            |
| Input cost per 1M tokens        | $1.4         |
| Output cost per 1M tokens       | $4.4         |
| Cached input cost per 1M tokens | $0.26        |

</details>

<details>

<summary>xAI</summary>

{% hint style="info" %}
Under xAI’s Agreement, you retain ownership of your input and output, but xAI is granted a license to use your content to provide the services and to enforce its policies, including safety, compliance, and moderation. xAI may also create de-identified data from your use of the services, which it owns and may use to improve or develop its products, and some user content may be retained temporarily for legal or moderation purposes. Although xAI is contractually restricted from using your content for model training, xAI alone defines the scope of de-identified data use and policy enforcement, which may not fully align with conservative data handling expectations.
{% endhint %}

|                                              |                     |
| -------------------------------------------- | ------------------- |
| Model                                        | Grok 4.20 Reasoning |
| Hosted                                       | xAI                 |
| Developer                                    | xAI                 |
| Region                                       | EU                  |
| Info used for model training                 | —                   |
| Data Retention Period (days)                 | —                   |
| Input cost per 1M tokens                     | $1.25               |
| Output cost per 1M tokens                    | $2.5                |
| Cached input cost per 1M tokens              | $0.2                |
| Long context boundary                        | 200,000             |
| Long context input cost per 1M tokens        | $2.5                |
| Long context output cost per 1M tokens       | $5                  |
| Long context cached input cost per 1M tokens | $0.4                |

|                                              |           |
| -------------------------------------------- | --------- |
| Model                                        | Grok 4.20 |
| Hosted                                       | xAI       |
| Developer                                    | xAI       |
| Region                                       | EU        |
| Info used for model training                 | —         |
| Data Retention Period (days)                 | —         |
| Input cost per 1M tokens                     | $1.25     |
| Output cost per 1M tokens                    | $2.5      |
| Cached input cost per 1M tokens              | $0.2      |
| Long context boundary                        | 200,000   |
| Long context input cost per 1M tokens        | $2.5      |
| Long context output cost per 1M tokens       | $5        |
| Long context cached input cost per 1M tokens | $0.4      |

|                                              |          |
| -------------------------------------------- | -------- |
| Model                                        | Grok 4.3 |
| Hosted                                       | xAI      |
| Developer                                    | xAI      |
| Region                                       | EU       |
| Info used for model training                 | —        |
| Data Retention Period (days)                 | —        |
| Input cost per 1M tokens                     | $1.25    |
| Output cost per 1M tokens                    | $2.5     |
| Cached input cost per 1M tokens              | $0.2     |
| Long context boundary                        | 200,000  |
| Long context input cost per 1M tokens        | $2.5     |
| Long context output cost per 1M tokens       | $5       |
| Long context cached input cost per 1M tokens | $0.4     |

|                                              |          |
| -------------------------------------------- | -------- |
| Model                                        | Grok 4.3 |
| Hosted                                       | xAI      |
| Developer                                    | xAI      |
| Region                                       | OTHER    |
| Info used for model training                 | —        |
| Data Retention Period (days)                 | —        |
| Input cost per 1M tokens                     | $1.25    |
| Output cost per 1M tokens                    | $2.5     |
| Cached input cost per 1M tokens              | $0.2     |
| Long context boundary                        | 200,000  |
| Long context input cost per 1M tokens        | $2.5     |
| Long context output cost per 1M tokens       | $5       |
| Long context cached input cost per 1M tokens | $0.4     |

|                                              |                |
| -------------------------------------------- | -------------- |
| Model                                        | Grok Build 0.1 |
| Hosted                                       | xAI            |
| Developer                                    | xAI            |
| Region                                       | EU             |
| Info used for model training                 | —              |
| Data Retention Period (days)                 | —              |
| Input cost per 1M tokens                     | $1             |
| Output cost per 1M tokens                    | $2             |
| Cached input cost per 1M tokens              | $0.2           |
| Long context boundary                        | 200,000        |
| Long context input cost per 1M tokens        | $2             |
| Long context output cost per 1M tokens       | $4             |
| Long context cached input cost per 1M tokens | $0.4           |

</details>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.nexos.ai/models/readme.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
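
A minimal Python sketch of this request, using only the standard library, might look like the following. The helper names `build_ask_url` and `ask` are hypothetical, and decoding the response as UTF-8 text is an assumption, since this page does not specify the response format. The question is percent-encoded because natural-language text usually contains spaces and punctuation that are not valid in a raw query string.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Page URL taken from the GET example above.
PAGE_URL = "https://docs.nexos.ai/models/readme.md"


def build_ask_url(question: str, page_url: str = PAGE_URL) -> str:
    """Return the page URL with the question percent-encoded in `ask`."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{page_url}?{urlencode({'ask': question}, quote_via=quote)}"


def ask(question: str, timeout: float = 30.0) -> str:
    """Send the GET request and return the response body.

    ASSUMPTION: the body is UTF-8 text; the format is not documented here.
    """
    with urlopen(build_ask_url(question), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(build_ask_url("What is the data retention period for GPT 5 on Azure?"))
```

Keeping URL construction separate from the network call makes it possible to check the encoded URL without sending a request.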
