# GPT Image 2 API Guide

This guide describes how to call `gpt-image-2` through sub2api or any OpenAI-compatible gateway.

Default examples use:

```text
BASE_URL=https://claude.omniclaw.store/v1
API_KEY=<sub2api API key generated from the /keys page>
```

Do not use ChatGPT OAuth tokens from `.codex/auth.json` as API keys.

## Quick Summary

- Direct image generation: call `POST /v1/images/generations` with `model: "gpt-image-2"`.
- Image editing: call `POST /v1/images/edits` with multipart `image[]` files and an optional `mask`.
- Agent/Codex workflows: keep the main model as a text/agent model such as `gpt-5.5`, then call image generation through the Responses API `image_generation` tool.
- Do not use `gpt-image-2` as the Codex main model.
- `gpt-image-2` normally returns base64 image data at `data[0].b64_json`.
- `3840x2160` 4K output works but is high-latency and high-cost; use 180-300 second timeouts for production.

## Official Capability Summary

`gpt-image-2` is an image generation and editing model with text input, image input, and image output support.

Model aliases:

```text
gpt-image-2
gpt-image-2-2026-04-21
```

Supported API surfaces:

```text
/v1/images/generations
/v1/images/edits
/v1/responses   # via image_generation tool
```

Official references:

- https://developers.openai.com/api/docs/models/gpt-image-2
- https://developers.openai.com/api/docs/guides/image-generation
- https://developers.openai.com/api/reference/resources/images

## Authentication

```bash
export BASE_URL="https://claude.omniclaw.store/v1"
export API_KEY="sk-..."
```

JSON requests require:

```http
Authorization: Bearer $API_KEY
Content-Type: application/json
```

For multipart image edits, do not set `Content-Type` by hand. Let `curl -F` or the SDK set it so the multipart boundary is generated correctly.

## Image Generation

### Minimal Request

```bash
curl -sS "$BASE_URL/images/generations" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A compact Apple-style dashboard UI, clean white background",
    "size": "1024x1024",
    "quality": "medium",
    "output_format": "png",
    "n": 1
  }' > image.json
```

Decode the response:

```bash
jq -r '.data[0].b64_json' image.json | base64 --decode > image.png
```

### 4K Request

```bash
curl -sS "$BASE_URL/images/generations" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  --max-time 300 \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A modern product poster, cinematic lighting, premium realistic photography",
    "size": "3840x2160",
    "quality": "medium",
    "output_format": "png",
    "n": 1
  }' > image-4k.json
```

Production recommendation: first validate prompts at `1024x1024` or `1536x1024`, then rerun the validated prompt at `3840x2160`. The `4K + high` combination can be slow and expensive.

## Generation Parameters

| Parameter | Type | Recommended value | Notes |
|---|---|---|---|
| `model` | string | `gpt-image-2` | Required. The snapshot `gpt-image-2-2026-04-21` is also valid. |
| `prompt` | string | detailed natural language | Required. Include subject, environment, camera, style, lighting, and constraints. |
| `n` | number | `1` | Number of images. Prefer single-image requests for retry and billing attribution. |
| `size` | string | `1024x1024`, `1536x1024`, `3840x2160` | Flexible sizes are supported when they satisfy the model constraints. |
| `quality` | string | `low`, `medium`, `high`, `auto` | Use `low` for drafts, `medium` for normal output, `high` for final assets. |
| `output_format` | string | `png`, `jpeg`, `webp` | Default is usually `png`; use `jpeg` for latency-sensitive outputs. |
| `output_compression` | number | `0-100` | Only applies to `jpeg` and `webp`. |
| `background` | string | `auto`, `opaque` | `gpt-image-2` currently does not support `transparent`. |
| `moderation` | string | `auto`, `low` | Adjusts filtering level but does not bypass safety policy. |
| `stream` | boolean | `false` | Enables SSE image streaming. |
| `partial_images` | number | `0-3` | Streaming only; partial images increase output token cost. |
| `user` | string | end-user ID | Useful for audit and abuse monitoring. |
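
The constraints in the table can be checked locally before a request is sent. The following is a minimal sketch, not part of sub2api or the OpenAI SDK: `build_generation_payload` is a hypothetical helper name, and the allowed-value sets simply mirror the table above.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the parameter table above.
ALLOWED_QUALITY = {"low", "medium", "high", "auto"}
ALLOWED_FORMATS = {"png", "jpeg", "webp"}
ALLOWED_BACKGROUND = {"auto", "opaque"}  # "transparent" is not supported


def build_generation_payload(
    prompt: str,
    size: str = "1024x1024",
    quality: str = "medium",
    output_format: str = "png",
    output_compression: Optional[int] = None,
    background: str = "auto",
    n: int = 1,
) -> Dict[str, Any]:
    """Hypothetical helper: validate options, then return the JSON body."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if quality not in ALLOWED_QUALITY:
        raise ValueError(f"unsupported quality: {quality}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    if background not in ALLOWED_BACKGROUND:
        raise ValueError(f"unsupported background: {background}")

    payload: Dict[str, Any] = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "output_format": output_format,
        "background": background,
        "n": n,
    }
    if output_compression is not None:
        # Compression only applies to lossy-capable formats.
        if output_format not in {"jpeg", "webp"}:
            raise ValueError("output_compression only applies to jpeg and webp")
        if not 0 <= output_compression <= 100:
            raise ValueError("output_compression must be between 0 and 100")
        payload["output_compression"] = output_compression
    return payload
```

The returned dictionary can be passed as the JSON body of `POST /v1/images/generations`. Rejecting bad combinations locally avoids paying a round trip for a `400 invalid_request_error`.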

## Size Constraints

`size` can be `auto` or a valid `widthxheight` value:

- Maximum edge length is `3840px`.
- Width and height must both be multiples of `16px`.
- Long edge to short edge ratio must be at most `3:1`.
- Total pixels must be between `655,360` and `8,294,400`.
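
The four rules above can be sketched as a local check. `is_valid_size` is a hypothetical helper name; the numeric limits are the ones listed in this section.

```python
MAX_EDGE = 3840
EDGE_MULTIPLE = 16
MAX_ASPECT_RATIO = 3
MIN_PIXELS = 655_360
MAX_PIXELS = 8_294_400


def is_valid_size(size: str) -> bool:
    """Return True if `size` is "auto" or satisfies every listed constraint."""
    if size == "auto":
        return True
    width_text, separator, height_text = size.partition("x")
    if not separator or not width_text.isdecimal() or not height_text.isdecimal():
        return False
    width, height = int(width_text), int(height_text)
    long_edge, short_edge = max(width, height), min(width, height)
    if long_edge > MAX_EDGE:
        return False
    if width % EDGE_MULTIPLE or height % EDGE_MULTIPLE:
        return False
    # Long edge may be at most 3x the short edge.
    if long_edge > MAX_ASPECT_RATIO * short_edge:
        return False
    return MIN_PIXELS <= width * height <= MAX_PIXELS
```

For example, `3840x2160` passes because it is exactly 8,294,400 pixels, while `1000x1000` fails the multiple-of-16 rule and `512x512` falls below the minimum pixel count.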

Common values:

```text
1024x1024
1536x1024
1024x1536
2048x2048
2048x1152
3840x2160
2160x3840
auto
```

Treat outputs larger than `2560x1440` as experimental high-pixel workloads with higher latency, higher cost, and higher failure probability.

## Response Shape

Typical response:

```json
{
  "created": 1770000000,
  "background": "opaque",
  "data": [
    {
      "b64_json": "...",
      "revised_prompt": "..."
    }
  ],
  "model": "gpt-image-2",
  "output_format": "png",
  "quality": "medium",
  "size": "1024x1024",
  "usage": {
    "input_tokens": 43,
    "input_tokens_details": {
      "image_tokens": 0,
      "text_tokens": 43
    },
    "output_tokens": 196,
    "output_tokens_details": {
      "image_tokens": 196,
      "text_tokens": 0
    },
    "total_tokens": 239
  }
}
```

Production systems should store:

- `model`
- `size`
- `quality`
- `output_format`
- `usage.total_tokens`
- `usage.input_tokens`
- `usage.output_tokens`
- latency
- upstream account, group, user, and key identifiers
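
A minimal sketch of turning a response into a storable record follows. It assumes the response has the shape shown above; `build_usage_record`, `latency_seconds`, and `key_id` are hypothetical names, and a real system would add its own account, group, and user identifiers.

```python
from typing import Any, Dict


def build_usage_record(
    response: Dict[str, Any], latency_seconds: float, key_id: str
) -> Dict[str, Any]:
    """Flatten the fields worth storing; missing fields become None."""
    usage = response.get("usage") or {}
    return {
        "model": response.get("model"),
        "size": response.get("size"),
        "quality": response.get("quality"),
        "output_format": response.get("output_format"),
        "total_tokens": usage.get("total_tokens"),
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
        "latency_seconds": latency_seconds,  # measured by the caller
        "key_id": key_id,  # hypothetical identifier for the API key used
    }
```

The base64 image data is deliberately left out of the record, so the row stays small enough for a billing or audit table.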

## Image Editing

### Single-image Edit

```bash
curl -sS "$BASE_URL/images/edits" \
  -H "Authorization: Bearer $API_KEY" \
  -F "model=gpt-image-2" \
  -F "image[]=@input.png" \
  -F "prompt=Replace the sofa with a minimalist white lounge chair" \
  -F "size=1024x1024" \
  -F "quality=medium" \
  -F "output_format=png" \
  > edit.json
```

### Masked Local Edit

```bash
curl -sS "$BASE_URL/images/edits" \
  -H "Authorization: Bearer $API_KEY" \
  -F "model=gpt-image-2" \
  -F "image[]=@input.png" \
  -F "mask=@mask.png" \
  -F "prompt=Change only the transparent masked region into a glass button" \
  -F "size=1024x1024" \
  -F "quality=medium" \
  > edit-mask.json
```

Mask requirements:

- `image` and `mask` must have the same format and dimensions.
- Files must be under 50MB.
- `mask` must include an alpha channel.
- Do not pass `input_fidelity` for `gpt-image-2`; the model processes image inputs at high fidelity by default.
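
The first three requirements can be checked before upload. The sketch below uses only the standard library and assumes both files are PNGs; `read_png_header` and `check_mask` are hypothetical helper names, and "50MB" is interpreted as 50 * 1024 * 1024 bytes, which is an assumption. It reads the PNG `IHDR` chunk, where color types 4 and 6 carry an alpha channel. Palette PNGs that store transparency in a `tRNS` chunk are not detected by this check.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumed meaning of "50MB"
ALPHA_COLOR_TYPES = (4, 6)  # grayscale+alpha, RGBA


def read_png_header(data: bytes) -> Tuple[int, int, int]:
    """Return (width, height, color_type) from the IHDR chunk of PNG bytes."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height, data[25]


def check_mask(image_bytes: bytes, mask_bytes: bytes) -> None:
    """Raise ValueError if the image/mask pair violates the listed rules."""
    for name, blob in (("image", image_bytes), ("mask", mask_bytes)):
        if len(blob) >= MAX_FILE_BYTES:
            raise ValueError(f"{name} must be under 50MB")
    image_width, image_height, _ = read_png_header(image_bytes)
    mask_width, mask_height, mask_color_type = read_png_header(mask_bytes)
    if (image_width, image_height) != (mask_width, mask_height):
        raise ValueError("image and mask must have the same dimensions")
    if mask_color_type not in ALPHA_COLOR_TYPES:
        raise ValueError("mask must include an alpha channel")
```

In practice the bytes would come from `open("input.png", "rb").read()` and `open("mask.png", "rb").read()`.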

## Responses API With `image_generation`

Use this when an agent should reason about the task before generating an image. The main model should be a text/agent model, such as `gpt-5.5`.

```bash
curl -sS "$BASE_URL/responses" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "input": "Generate a clean product poster for an AI proxy service.",
    "tools": [
      {
        "type": "image_generation",
        "quality": "medium",
        "size": "1536x1024",
        "output_format": "png"
      }
    ]
  }' > response-image.json
```

Important:

- `model` is the main reasoning model, not `gpt-image-2`.
- The `image_generation` tool performs the image work.
- sub2api may inject the image tool for official Codex clients, but application calls should pass it explicitly.
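
The saved `response-image.json` does not use the `data[0].b64_json` shape of the Images API. The sketch below assumes the gateway passes through the OpenAI Responses API output format, where each image tool result is an `output` item of type `image_generation_call` with base64 data in `result`. `extract_generated_images` is a hypothetical helper name.

```python
import base64
from typing import Any, Dict, List


def extract_generated_images(response: Dict[str, Any]) -> List[bytes]:
    """Collect decoded image bytes from image_generation_call output items."""
    images: List[bytes] = []
    for item in response.get("output", []):
        # Assumed shape: {"type": "image_generation_call", "result": "<base64>"}
        if item.get("type") == "image_generation_call" and item.get("result"):
            images.append(base64.b64decode(item["result"]))
    return images
```

An empty list means the main model answered without calling the image tool, which is worth logging rather than treating as a transport error.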

## Streaming Images

The Images API supports SSE streaming:

```bash
curl -N "$BASE_URL/images/generations" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A futuristic city skyline at sunrise",
    "stream": true,
    "partial_images": 2,
    "size": "1536x1024",
    "quality": "medium"
  }'
```

Events:

```text
image_generation.partial_image
image_generation.completed
```

`partial_images` can be `0-3`. Each partial image adds output token cost.
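
A minimal sketch of consuming the stream follows. It assumes standard SSE framing (`event:` and `data:` lines, with a blank line ending each event) and a JSON object in each `data:` payload; the exact payload fields depend on the upstream and are not confirmed by this guide. `iter_sse_events` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event_name = ""
    data_parts: List[str] = []
    for raw_line in lines:
        line = raw_line.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line terminates the current event.
            if data_parts:
                data = "\n".join(data_parts)
                if data == "[DONE]":  # sentinel some gateways append
                    return
                payload = json.loads(data)
                yield event_name or payload.get("type", ""), payload
            event_name, data_parts = "", []
```

A caller would typically overwrite a preview on each `image_generation.partial_image` event and persist the final file on `image_generation.completed`.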

## SDK Examples

### Node.js

```ts
import fs from "node:fs";
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.API_KEY,
  baseURL: process.env.BASE_URL ?? "https://claude.omniclaw.store/v1",
});

const result = await client.images.generate({
  model: "gpt-image-2",
  prompt: "A premium product poster for an AI service",
  size: "1536x1024",
  quality: "medium",
  output_format: "png",
  n: 1,
});

const b64 = result.data?.[0]?.b64_json;
if (!b64) throw new Error("No image returned");
fs.writeFileSync("image.png", Buffer.from(b64, "base64"));
```

### Python

```py
import base64
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["API_KEY"],
    base_url=os.environ.get("BASE_URL", "https://claude.omniclaw.store/v1"),
)

result = client.images.generate(
    model="gpt-image-2",
    prompt="A premium product poster for an AI service",
    size="1536x1024",
    quality="medium",
    output_format="png",
    n=1,
)

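# Guard against an empty response before decoding.
b64 = result.data[0].b64_json if result.data else None
if not b64:
    raise RuntimeError("No image returned")
with open("image.png", "wb") as f:
    f.write(base64.b64decode(b64))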
```

## Production Dispatch

- Routing: prefer plus/team/pro OpenAI OAuth accounts for image workloads.
- Timeout: use 120 seconds for normal images and 300 seconds for 4K.
- Retry: retry only transient network failures and 502/503/504 responses, and keep the retry count low.
- Concurrency: 4K output produces many image tokens; use low per-account concurrency. Standard 1024 images can use higher concurrency.
- Billing: record `usage` and charge based on input and output tokens. 4K can produce far more output tokens than 1024 images.
- Latency: use `jpeg` and `quality: low` for drafts or latency-sensitive previews.
- Fallback: if `4K/high` fails, retry `4K/medium`; if that still fails, generate `1536x1024/medium` and upscale separately.
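
The timeout, retry, and fallback bullets can be sketched as three small policy functions. All names here are hypothetical, the default of two attempts is an illustrative choice for "low retry counts", and the portrait fallback to `1024x1536` is an assumption that mirrors the landscape rule.

```python
from typing import List, Optional, Tuple

RETRYABLE_STATUS = {502, 503, 504}
# 4K size -> lower-resolution fallback (portrait mapping is an assumption).
FOUR_K_FALLBACK = {"3840x2160": "1536x1024", "2160x3840": "1024x1536"}


def timeout_seconds(size: str) -> int:
    """120 seconds for normal images, 300 seconds for 4K."""
    return 300 if size in FOUR_K_FALLBACK else 120


def should_retry(status: Optional[int], attempt: int, max_attempts: int = 2) -> bool:
    """`status` is None for a transient network failure with no HTTP response."""
    if attempt >= max_attempts:
        return False
    return status is None or status in RETRYABLE_STATUS


def fallback_plan(size: str, quality: str) -> List[Tuple[str, str]]:
    """Return the ordered (size, quality) attempts for one request."""
    plan = [(size, quality)]
    if size in FOUR_K_FALLBACK:
        if quality == "high":
            plan.append((size, "medium"))
        # Last resort: smaller image, to be upscaled separately.
        plan.append((FOUR_K_FALLBACK[size], "medium"))
    return plan
```

Keeping these as pure functions makes the policy easy to unit test separately from the HTTP client that applies it.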

## Common Errors

| Symptom | Likely cause | Action |
|---|---|---|
| `401 INVALID_API_KEY` | Key is not a sub2api key or is disabled/deleted | Generate a new key from `/keys` |
| `400 invalid_request_error` | Incompatible params such as transparent background or invalid size | Check `size`, `background`, and `quality` |
| `429 usage_limit_reached` | Upstream account reached the limit for its usage window | Switch to another plus/team/pro account or wait for the window to reset |
| `502 Upstream request failed` | Upstream did not return image data, network failed, or content was refused | Inspect server logs, simplify prompt, lower quality or size |
| Request takes over 2 minutes | High pixels or complex prompt | Increase timeout, use streaming, or test lower resolution first |
| `/v1/models` does not show `gpt-image-2` | Codex/text model list is not the Images API capability list | Call `/v1/images/generations` directly |

## Safety Boundary

Filter clearly disallowed content before sending requests, especially:

- Sexualized minors or young-looking subjects
- Non-consensual sexual content, coercion, or sexual violence
- Explicit nudity or graphic sexual activity
- Illegal, hateful, or extremely violent content

For safe romantic scenes, explicitly constrain prompts with terms such as adult, non-explicit, no nudity, and fully clothed.