feat: initialize OmniClaw skills registry

2026-04-24 01:12:20 -07:00
commit 789bb38e69
16 changed files with 1713 additions and 0 deletions
--- a/apis/sub2api/gpt-image-2.en.md
+++ b/apis/sub2api/gpt-image-2.en.md
@@ -0,0 +1,370 @@
+# GPT Image 2 API Guide
+
+This guide describes how to call `gpt-image-2` through sub2api or any OpenAI-compatible gateway.
+
+Default examples use:
+
+```text
+BASE_URL=https://claude.omniclaw.store/v1
+API_KEY=<sub2api API key generated from the /keys page>
+```
+
+Do not use ChatGPT OAuth tokens from `.codex/auth.json` as API keys.
+
+## Quick Summary
+
+- Direct image generation: call `POST /v1/images/generations` with `model: "gpt-image-2"`.
+- Image editing: call `POST /v1/images/edits` with multipart `image[]` files and an optional `mask`.
+- Agent/Codex workflows: keep the main model as a text/agent model such as `gpt-5.5`, then call image generation through the Responses API `image_generation` tool.
+- Do not use `gpt-image-2` as the Codex main model.
+- `gpt-image-2` normally returns base64 image data at `data[0].b64_json`.
+- `3840x2160` 4K output works but is high-latency and high-cost; use 180-300 second timeouts for production.
+
+## Official Capability Summary
+
+`gpt-image-2` is an image generation and editing model with text input, image input, and image output support.
+
+Model aliases:
+
+```text
+gpt-image-2
+gpt-image-2-2026-04-21
+```
+
+Supported API surfaces:
+
+```text
+/v1/images/generations
+/v1/images/edits
+/v1/responses   # via image_generation tool
+```
+
+Official references:
+
+- https://developers.openai.com/api/docs/models/gpt-image-2
+- https://developers.openai.com/api/docs/guides/image-generation
+- https://developers.openai.com/api/reference/resources/images
+
+## Authentication
+
+```bash
+export BASE_URL="https://claude.omniclaw.store/v1"
+export API_KEY="sk-..."
+```
+
+JSON requests require:
+
+```http
+Authorization: Bearer $API_KEY
+Content-Type: application/json
+```
+
+For multipart image edits, let `curl -F` or the SDK set `Content-Type`.
+
+## Image Generation
+
+### Minimal Request
+
+```bash
+curl -sS "$BASE_URL/images/generations" \
+  -H "Authorization: Bearer $API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-image-2",
+    "prompt": "A compact Apple-style dashboard UI, clean white background",
+    "size": "1024x1024",
+    "quality": "medium",
+    "output_format": "png",
+    "n": 1
+  }' > image.json
+```
+
+Decode the response:
+
+```bash
+jq -r '.data[0].b64_json' image.json | base64 --decode > image.png
+```
+
+### 4K Request
+
+```bash
+curl -sS "$BASE_URL/images/generations" \
+  -H "Authorization: Bearer $API_KEY" \
+  -H "Content-Type: application/json" \
+  --max-time 300 \
+  -d '{
+    "model": "gpt-image-2",
+    "prompt": "A modern product poster, cinematic lighting, premium realistic photography",
+    "size": "3840x2160",
+    "quality": "medium",
+    "output_format": "png",
+    "n": 1
+  }' > image-4k.json
+```
+
+Production recommendation: first validate prompts with `1024x1024` or `1536x1024`, then rerun the request at `3840x2160`. `4K + high` can be slow and expensive.
+
+## Generation Parameters
+
+| Parameter | Type | Recommended value | Notes |
+|---|---|---|---|
+| `model` | string | `gpt-image-2` | Required. The snapshot `gpt-image-2-2026-04-21` is also valid. |
+| `prompt` | string | detailed natural language | Required. Include subject, environment, camera, style, lighting, and constraints. |
+| `n` | number | `1` | Number of images. Prefer single-image requests for retry and billing attribution. |
+| `size` | string | `1024x1024`, `1536x1024`, `3840x2160` | Flexible sizes are supported when they satisfy the model constraints. |
+| `quality` | string | `low`, `medium`, `high`, `auto` | Use `low` for drafts, `medium` for normal output, `high` for final assets. |
+| `output_format` | string | `png`, `jpeg`, `webp` | Default is usually `png`; use `jpeg` for latency-sensitive outputs. |
+| `output_compression` | number | `0-100` | Only applies to `jpeg` and `webp`. |
+| `background` | string | `auto`, `opaque` | `gpt-image-2` currently does not support `transparent`. |
+| `moderation` | string | `auto`, `low` | Adjusts filtering level but does not bypass safety policy. |
+| `stream` | boolean | `false` | Set to `true` to enable SSE image streaming. |
+| `partial_images` | number | `0-3` | Streaming only; partial images increase output token cost. |
+| `user` | string | end-user ID | Useful for audit and abuse monitoring. |
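+
+The constraints in the table can be checked before a request leaves the client. The following is a minimal sketch, not part of sub2api or the OpenAI SDK: `build_generation_payload` is a hypothetical helper, and turning on `stream` automatically whenever `partial_images` is above zero is an assumption made here for convenience.
+
+```python
+def build_generation_payload(prompt, size="1024x1024", quality="medium",
+                             output_format="png", output_compression=None,
+                             background="auto", partial_images=0):
+    """Build a JSON body for POST /v1/images/generations (hypothetical helper)."""
+    if quality not in {"low", "medium", "high", "auto"}:
+        raise ValueError(f"unsupported quality: {quality}")
+    if output_format not in {"png", "jpeg", "webp"}:
+        raise ValueError(f"unsupported output_format: {output_format}")
+    if background not in {"auto", "opaque"}:
+        # The guide states that gpt-image-2 does not support "transparent".
+        raise ValueError(f"unsupported background: {background}")
+    if not 0 <= partial_images <= 3:
+        raise ValueError("partial_images must be between 0 and 3")
+
+    payload = {
+        "model": "gpt-image-2",
+        "prompt": prompt,
+        "size": size,
+        "quality": quality,
+        "output_format": output_format,
+        "background": background,
+        "n": 1,  # single-image requests simplify retries and billing
+    }
+    if output_compression is not None:
+        # Compression only applies to jpeg and webp output.
+        if output_format == "png":
+            raise ValueError("output_compression requires jpeg or webp")
+        if not 0 <= output_compression <= 100:
+            raise ValueError("output_compression must be between 0 and 100")
+        payload["output_compression"] = output_compression
+    if partial_images > 0:
+        # Assumption: partial images are only meaningful with streaming on.
+        payload["stream"] = True
+        payload["partial_images"] = partial_images
+    return payload
+```
+
+Rejecting bad combinations locally avoids spending a round trip on a `400 invalid_request_error`.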
+
+## Size Constraints
+
+`size` can be `auto` or a valid `widthxheight` value:
+
+- Maximum edge length is `3840px`.
+- Width and height must both be multiples of `16px`.
+- Long edge to short edge ratio must be at most `3:1`.
+- Total pixels must be between `655,360` and `8,294,400`.
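+
+The four rules above can be combined into a single client-side check. This is a sketch that assumes the rules are exactly as listed; `is_valid_size` is a hypothetical helper name, and the gateway may enforce additional limits.
+
+```python
+def is_valid_size(size: str) -> bool:
+    """Return True if `size` is "auto" or satisfies the listed constraints."""
+    if size == "auto":
+        return True
+    try:
+        width_text, height_text = size.split("x")
+        width, height = int(width_text), int(height_text)
+    except ValueError:
+        return False  # not in WIDTHxHEIGHT form
+    if width <= 0 or height <= 0:
+        return False
+    long_edge, short_edge = max(width, height), min(width, height)
+    if long_edge > 3840:                  # maximum edge length
+        return False
+    if width % 16 or height % 16:         # both edges multiples of 16
+        return False
+    if long_edge > 3 * short_edge:        # aspect ratio at most 3:1
+        return False
+    return 655_360 <= width * height <= 8_294_400  # total pixel budget
+```
+
+For example, `3840x2160` passes at exactly the upper pixel bound of 8,294,400, while `3840x3840` respects the edge limit but exceeds the pixel budget.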
+
+Common values:
+
+```text
+1024x1024
+1536x1024
+1024x1536
+2048x2048
+2048x1152
+3840x2160
+2160x3840
+auto
+```
+
+Treat outputs larger than `2560x1440` as experimental high-pixel workloads with higher latency, higher cost, and higher failure probability.
+
+## Response Shape
+
+Typical response:
+
+```json
+{
+  "created": 1770000000,
+  "background": "opaque",
+  "data": [
+    {
+      "b64_json": "...",
+      "revised_prompt": "..."
+    }
+  ],
+  "model": "gpt-image-2",
+  "output_format": "png",
+  "quality": "medium",
+  "size": "1024x1024",
+  "usage": {
+    "input_tokens": 43,
+    "input_tokens_details": {
+      "image_tokens": 0,
+      "text_tokens": 43
+    },
+    "output_tokens": 196,
+    "output_tokens_details": {
+      "image_tokens": 196,
+      "text_tokens": 0
+    },
+    "total_tokens": 239
+  }
+}
+```
+
+Production systems should store:
+
+- `model`
+- `size`
+- `quality`
+- `output_format`
+- `usage.total_tokens`
+- `usage.input_tokens`
+- `usage.output_tokens`
+- latency
+- upstream account, group, user, and key identifiers
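+
+One way to capture these fields is to flatten the response into a single record per request. The sketch below assumes the response shape shown above; `usage_record` and its `key_id` argument are hypothetical names, and the account, group, and user identifiers would be added the same way from whatever the gateway exposes.
+
+```python
+def usage_record(response: dict, latency_seconds: float, key_id: str) -> dict:
+    """Flatten an Images API response into one audit/billing row."""
+    usage = response.get("usage") or {}  # tolerate responses without usage
+    return {
+        "model": response.get("model"),
+        "size": response.get("size"),
+        "quality": response.get("quality"),
+        "output_format": response.get("output_format"),
+        "total_tokens": usage.get("total_tokens"),
+        "input_tokens": usage.get("input_tokens"),
+        "output_tokens": usage.get("output_tokens"),
+        "latency_seconds": round(latency_seconds, 3),
+        "key_id": key_id,  # hypothetical identifier for the calling key
+    }
+```
+
+The base64 image payload is deliberately left out of the record so that log rows stay small.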
+
+## Image Editing
+
+### Single-image Edit
+
+```bash
+curl -sS "$BASE_URL/images/edits" \
+  -H "Authorization: Bearer $API_KEY" \
+  -F "model=gpt-image-2" \
+  -F "image[]=@input.png" \
+  -F "prompt=Replace the sofa with a minimalist white lounge chair" \
+  -F "size=1024x1024" \
+  -F "quality=medium" \
+  -F "output_format=png" \
+  > edit.json
+```
+
+### Masked Local Edit
+
+```bash
+curl -sS "$BASE_URL/images/edits" \
+  -H "Authorization: Bearer $API_KEY" \
+  -F "model=gpt-image-2" \
+  -F "image[]=@input.png" \
+  -F "mask=@mask.png" \
+  -F "prompt=Change only the transparent masked region into a glass button" \
+  -F "size=1024x1024" \
+  -F "quality=medium" \
+  > edit-mask.json
+```
+
+Mask requirements:
+
+- `image` and `mask` must have the same format and dimensions.
+- Files must be under 50MB.
+- `mask` must include an alpha channel.
+- Do not pass `input_fidelity` for `gpt-image-2`; the model processes image inputs at high fidelity by default.
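+
+The first three requirements can be verified locally before uploading. The sketch below handles PNG input only and reads just the fixed-position IHDR header with the standard library. It is an illustration under stated assumptions: `check_mask_pair` is a hypothetical helper, "50MB" is read as 50 * 1024 * 1024 bytes, and only PNG color types 4 and 6 (explicit alpha) count as having an alpha channel, so palette images with a `tRNS` chunk are reported as lacking one.
+
+```python
+import struct
+
+PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
+MAX_BYTES = 50 * 1024 * 1024  # assumption: 50MB means 50 MiB
+
+def read_png_header(data: bytes):
+    """Return (width, height, color_type) from a PNG's IHDR chunk."""
+    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
+        raise ValueError("not a PNG file")
+    # IHDR layout: width (4 bytes), height (4), bit depth (1), color type (1)
+    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
+    return width, height, color_type
+
+def check_mask_pair(image: bytes, mask: bytes) -> list:
+    """Return a list of problems; an empty list means the pair looks usable."""
+    problems = []
+    if len(image) >= MAX_BYTES or len(mask) >= MAX_BYTES:
+        problems.append("files must be under 50MB")
+    image_w, image_h, _ = read_png_header(image)
+    mask_w, mask_h, mask_color = read_png_header(mask)
+    if (image_w, image_h) != (mask_w, mask_h):
+        problems.append("image and mask dimensions differ")
+    if mask_color not in (4, 6):  # 4 = grayscale+alpha, 6 = RGBA
+        problems.append("mask has no alpha channel")
+    return problems
+```
+
+Because both files must parse as PNG, a non-PNG input raises `ValueError` instead of returning a problem list.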
+
+## Responses API With `image_generation`
+
+Use this when an agent should reason about the task before generating an image. The main model should be a text/agent model, such as `gpt-5.5`.
+
+```bash
+curl -sS "$BASE_URL/responses" \
+  -H "Authorization: Bearer $API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-5.5",
+    "input": "Generate a clean product poster for an AI proxy service.",
+    "tools": [
+      {
+        "type": "image_generation",
+        "quality": "medium",
+        "size": "1536x1024",
+        "output_format": "png"
+      }
+    ]
+  }' > response-image.json
+```
+
+Important:
+
+- `model` is the main reasoning model, not `gpt-image-2`.
+- The `image_generation` tool performs the image work.
+- sub2api may inject the image tool for official Codex clients, but application calls should pass it explicitly.
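+
+The saved `response-image.json` still has to be unpacked. The sketch below assumes the gateway passes through the upstream Responses API shape, in which generated images appear in `output` as items of type `image_generation_call` with the base64 data in `result`. Verify this against an actual response from your deployment; `extract_images` is a hypothetical helper.
+
+```python
+import base64
+
+def extract_images(response: dict) -> list:
+    """Return decoded image bytes for every image_generation_call item."""
+    images = []
+    for item in response.get("output", []):
+        # Assumed item shape: {"type": "image_generation_call", "result": "<base64>"}
+        if item.get("type") == "image_generation_call" and item.get("result"):
+            images.append(base64.b64decode(item["result"]))
+    return images
+```
+
+Message and reasoning items in `output` are skipped, so an empty list means the agent answered without calling the image tool.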
+
+## Streaming Images
+
+The Images API supports SSE streaming:
+
+```bash
+curl -N "$BASE_URL/images/generations" \
+  -H "Authorization: Bearer $API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-image-2",
+    "prompt": "A futuristic city skyline at sunrise",
+    "stream": true,
+    "partial_images": 2,
+    "size": "1536x1024",
+    "quality": "medium"
+  }'
+```
+
+Events:
+
+```text
+image_generation.partial_image
+image_generation.completed
+```
+
+`partial_images` can be `0-3`. Each partial image adds output token cost.
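+
+A client consuming the stream needs to group SSE lines into events. The following sketch uses only the standard library and assumes each event's `data:` field is a JSON object carrying a `type` field with one of the event names above; the exact payload fields should be confirmed against real gateway output. `iter_image_events` is a hypothetical helper that accepts any iterable of text lines.
+
+```python
+import json
+
+def iter_image_events(lines):
+    """Yield (event_type, payload) pairs from an iterable of SSE text lines."""
+    data_lines = []
+
+    def flush():
+        body = "\n".join(data_lines)
+        data_lines.clear()
+        if not body or body == "[DONE]":  # tolerate an optional end marker
+            return None
+        payload = json.loads(body)
+        return payload.get("type"), payload
+
+    for raw in lines:
+        line = raw.rstrip("\r\n")
+        if line.startswith("data:"):
+            data_lines.append(line[5:].lstrip())
+        elif line == "":
+            # A blank line terminates the current event.
+            event = flush()
+            if event:
+                yield event
+        # "event:", "id:", and comment lines are ignored in this sketch.
+    event = flush()  # handle a stream that ends without a trailing blank line
+    if event:
+        yield event
+```
+
+A caller would typically show each `image_generation.partial_image` payload as a preview and keep only the `image_generation.completed` payload as the final result.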
+
+## SDK Examples
+
+### Node.js
+
+```ts
+import fs from "node:fs";
+import OpenAI from "openai";
+
+const client = new OpenAI({
+  apiKey: process.env.API_KEY,
+  baseURL: process.env.BASE_URL ?? "https://claude.omniclaw.store/v1",
+});
+
+const result = await client.images.generate({
+  model: "gpt-image-2",
+  prompt: "A premium product poster for an AI service",
+  size: "1536x1024",
+  quality: "medium",
+  output_format: "png",
+  n: 1,
+});
+
+const b64 = result.data?.[0]?.b64_json;
+if (!b64) throw new Error("No image returned");
+fs.writeFileSync("image.png", Buffer.from(b64, "base64"));
+```
+
+### Python
+
+```py
+import base64
+import os
+from openai import OpenAI
+
+client = OpenAI(
+    api_key=os.environ["API_KEY"],
+    base_url=os.environ.get("BASE_URL", "https://claude.omniclaw.store/v1"),
+)
+
+result = client.images.generate(
+    model="gpt-image-2",
+    prompt="A premium product poster for an AI service",
+    size="1536x1024",
+    quality="medium",
+    output_format="png",
+    n=1,
+)
+
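+# Fail loudly if the gateway returned no image data, mirroring the Node.js example.
+b64 = result.data[0].b64_json
+if not b64:
+    raise RuntimeError("No image returned")
+with open("image.png", "wb") as f:
+    f.write(base64.b64decode(b64))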
+```
+
+## Production Dispatch
+
+- Routing: prefer plus/team/pro OpenAI OAuth accounts for image workloads.
+- Timeout: use 120 seconds for normal images and 300 seconds for 4K.
+- Retry: only retry transient network failures and 502/503/504 with low retry counts.
+- Concurrency: 4K output produces many image tokens; use low per-account concurrency. Standard `1024x1024` images can use higher concurrency.
+- Billing: record `usage` and charge based on input and output tokens. 4K can produce far more output tokens than `1024x1024` images.
+- Latency: use `jpeg` and `quality: low` for drafts or latency-sensitive previews.
+- Fallback: if `4K/high` fails, retry `4K/medium`; if that still fails, generate `1536x1024/medium` and upscale separately.
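+
+The timeout, retry, and fallback rules above can be expressed as small pure functions that a dispatcher calls around each request. This is a sketch under stated assumptions: the function names are hypothetical, "low retry counts" is taken as at most two attempts, and only the landscape `3840x2160` size is treated as 4K.
+
+```python
+RETRYABLE_STATUS = {502, 503, 504}
+FOUR_K = "3840x2160"
+
+def timeout_seconds(size: str) -> int:
+    """120 seconds for normal images, 300 seconds for 4K."""
+    return 300 if size == FOUR_K else 120
+
+def should_retry(status, attempt: int, max_attempts: int = 2) -> bool:
+    """`status` is the HTTP status, or None for a transient network failure."""
+    if attempt >= max_attempts:  # assumption: "low retry counts" means 2
+        return False
+    return status is None or status in RETRYABLE_STATUS
+
+def fallback_plan(size: str, quality: str) -> list:
+    """Ordered (size, quality) attempts, starting with the requested one."""
+    plan = [(size, quality)]
+    if size == FOUR_K:
+        if quality == "high":
+            plan.append((FOUR_K, "medium"))
+        # Last resort: generate a smaller image and upscale it separately.
+        plan.append(("1536x1024", "medium"))
+    return plan
+```
+
+Keeping these decisions free of network code makes them easy to unit test, and errors such as 400, 401, and 429 fall through `should_retry` as non-retryable.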
+
+## Common Errors
+
+| Symptom | Likely cause | Action |
+|---|---|---|
+| `401 INVALID_API_KEY` | Key is not a sub2api key or is disabled/deleted | Generate a new key from `/keys` |
+| `400 invalid_request_error` | Incompatible params such as transparent background or invalid size | Check `size`, `background`, and `quality` |
+| `429 usage_limit_reached` | Upstream account usage window hit | Switch to another plus/team/pro account or wait for reset |
+| `502 Upstream request failed` | Upstream did not return image data, network failed, or content was refused | Inspect server logs, simplify prompt, lower quality or size |
+| Request takes over 2 minutes | High pixel count or complex prompt | Increase timeout, use streaming, or test lower resolution first |
+| `/v1/models` does not show `gpt-image-2` | Codex/text model list is not the Images API capability list | Call `/v1/images/generations` directly |
+
+## Safety Boundary
+
+Filter clearly disallowed content before sending requests, especially:
+
+- Sexualized minors or young-looking subjects
+- Non-consensual sexual content, coercion, or sexual violence
+- Explicit nudity or graphic sexual activity
+- Illegal, hateful, or extremely violent content
+
+For safe romantic scenes, explicitly constrain prompts with terms such as adult, non-explicit, no nudity, and fully clothed.
+