# Qwen Image Skill for OpenClaw

This repository provides an OpenClaw-compatible image generation skill backed by a configurable Qwen-compatible image API. It follows the same basic contract as the ComfyUI reference skill:

- `SKILL.md` defines how OpenClaw should discover and call the skill.
- `scripts/registry.py` exposes the available workflow and parameters.
- `scripts/qwen_image_client.py` executes the actual image generation request and saves images locally.

## Features

- Configurable `base_url`, `model`, and `api_key`
- Natural-language image generation through a single workflow: `qwen/text-to-image`
- OpenClaw-friendly registry output for parameter discovery
- Local image download and storage under `./outputs`
- Compatible with OpenAI-style `images/generations` APIs that return `b64_json` or image URLs

## Project Structure

```text
qwen-image-skill/
├── SKILL.md
├── README.md
├── config.example.json
├── requirements.txt
├── outputs/
│   └── .gitkeep
└── scripts/
    ├── registry.py
    ├── qwen_image_client.py
    └── shared/
        ├── __init__.py
        └── config.py
```

## Installation

Install this repository into your OpenClaw skills directory, for example:

```bash
cd ~/.openclaw/workspace/skills
git clone qwen-image-skill-openclaw
cd qwen-image-skill-openclaw
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
cp config.example.json config.json
```

## Configuration

Edit `config.json`:

```json
{
  "provider": {
    "name": "qwen-compatible",
    "base_url": "https://api-inference.modelscope.cn/v1",
    "api_key": "YOUR_QWEN_API_KEY",
    "model": "qwen-image"
  },
  "generation": {
    "output_dir": "./outputs",
    "timeout_seconds": 300,
    "default_size": "1024x1024",
    "default_n": 1,
    "default_response_format": "b64_json",
    "default_quality": "standard"
  }
}
```

Notes:

- `base_url` defaults to `https://api-inference.modelscope.cn/v1`. You only need to override it if you are using a different compatible gateway.
- `base_url` should point to the API root, not the full `/images/generations` endpoint; the client appends that path itself when needed.
- `model` is fully configurable, so you can switch to a newer Qwen image model later without code changes.
- `api_key` is read from `config.json`. Support for environment variables can be added separately if you prefer that later.

## Verify

List the registered workflow:

```bash
python scripts/registry.py list --agent
```

Run a test generation:

```bash
python scripts/qwen_image_client.py \
  --workflow qwen/text-to-image \
  --args '{"prompt":"A cinematic portrait of a white cat astronaut on the moon","size":"1024x1024"}'
```

Expected success output:

```json
{
  "status": "success",
  "run_id": "...",
  "model": "qwen-image",
  "images": [
    "./outputs/..._1.png"
  ]
}
```

## API Compatibility Assumption

This implementation targets OpenAI-style image generation APIs exposed by Qwen-compatible providers. It supports two modes:

- Synchronous providers that return `data[].b64_json` or `data[].url` directly from `POST /images/generations`
- ModelScope-style asynchronous providers that require the `X-ModelScope-Async-Mode: true` header and polling `GET /tasks/` with `X-ModelScope-Task-Type: image_generation`

For the default ModelScope endpoint, the client automatically detects the synchronous-call rejection and retries in async mode.
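The sync-then-async fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual `qwen_image_client.py` implementation: the function name, the injectable `post` callable, and the return shapes are assumptions; only the endpoint path and the `X-ModelScope-Async-Mode` header come from the behavior documented here. Task polling is intentionally omitted.

```python
import json
from urllib import request as urlrequest


def generate_image(base_url, api_key, model, prompt, *, size="1024x1024", post=None):
    """Try a synchronous /images/generations call; on rejection, retry in
    ModelScope async mode. `post(url, headers, body)` must return a
    (status_code, parsed_json) tuple; a urllib-based default is provided.
    """
    def _default_post(url, headers, body):
        req = urlrequest.Request(
            url,
            data=json.dumps(body).encode(),
            headers={**headers, "Content-Type": "application/json"},
        )
        with urlrequest.urlopen(req) as resp:
            return resp.status, json.loads(resp.read())

    post = post or _default_post
    url = base_url.rstrip("/") + "/images/generations"
    headers = {"Authorization": f"Bearer {api_key}"}
    body = {"model": model, "prompt": prompt, "size": size}

    # First attempt: synchronous providers return data[].b64_json or data[].url.
    status, payload = post(url, headers, body)
    if status == 200 and payload.get("data"):
        return payload

    # Rejection detected: retry with the async header and hand the task
    # payload back to the caller, which polls the tasks endpoint.
    headers["X-ModelScope-Async-Mode"] = "true"
    status, payload = post(url, headers, body)
    return payload
```

Injecting `post` keeps the control flow testable without network access; the real client can pass its own HTTP layer with timeouts from `config.json`.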
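As a separate illustration, the `generation` defaults from the Configuration section can be merged over a user's `config.json` along these lines. `load_config` and `DEFAULT_GENERATION` are hypothetical names for this sketch, not the actual `scripts/shared/config.py` interface; only the default values themselves come from `config.example.json` above.

```python
import json
from pathlib import Path

# Defaults mirror the "generation" block in config.example.json.
DEFAULT_GENERATION = {
    "output_dir": "./outputs",
    "timeout_seconds": 300,
    "default_size": "1024x1024",
    "default_n": 1,
    "default_response_format": "b64_json",
    "default_quality": "standard",
}


def load_config(path="config.json"):
    """Read config.json and fill in any generation settings the user omitted."""
    cfg = json.loads(Path(path).read_text())
    merged = dict(DEFAULT_GENERATION)
    merged.update(cfg.get("generation", {}))
    cfg["generation"] = merged
    return cfg
```

With this merge, a user-supplied `config.json` only needs the keys it overrides; everything else falls back to the documented defaults.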