Metadata-Version: 2.4
Name: diffuse-mlx
Version: 0.1.2
Summary: Stable Diffusion image generation for Apple Silicon, powered by MLX
Project-URL: Homepage, https://github.com/aiamblichus/diffuse-mlx
Project-URL: Repository, https://github.com/aiamblichus/diffuse-mlx
Project-URL: Issues, https://github.com/aiamblichus/diffuse-mlx/issues
License: MIT
Keywords: apple-silicon,image-generation,mlx,stable-diffusion
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: MacOS X
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Graphics
Requires-Python: >=3.10
Requires-Dist: huggingface-hub
Requires-Dist: mlx
Requires-Dist: numpy
Requires-Dist: pillow
Requires-Dist: regex
Requires-Dist: tqdm
Requires-Dist: typer
Description-Content-Type: text/markdown

# diffuse-mlx

<div align="center">

![diffuse-mlx banner](https://raw.githubusercontent.com/aiamblichus/diffuse-mlx/main/images/cathedral.png)

MLX-powered Stable Diffusion CLI for Apple Silicon. Fast, memory-efficient, and able to run quantized, with no CUDA required. It is a touched-up version of [Apple's mlx-examples](https://github.com/ml-explore/mlx-examples/tree/main/stable_diffusion), adapted for compatibility with SD 1.5.

</div>

---

## Install

### No install — run directly with uvx

```bash
uvx diffuse-mlx generate "a red fox in a snowy forest"
```

### Permanent install with uv tool

```bash
uv tool install diffuse-mlx
diffuse-mlx generate "a red fox in a snowy forest"
```

---

## Quick start

### SDXL Turbo (default — blazing fast, 2 steps)

```bash
diffuse-mlx generate "a red fox in a snowy forest, cinematic lighting, 8k"
```

### Stable Diffusion 2.1

```bash
diffuse-mlx generate "a red fox in a snowy forest" --model sd --steps 30
```

### DALL-E 2 Finetune (painterly surrealist quality)

```bash
diffuse-mlx generate \
  "Victorian botanist cataloguing impossible flowers that are also doors, gouache illustration, soft candlelight, muted palette, highly detailed, trending on artstation" \
  --model dalle2 -q
```

---

## The DALL-E 2 Finetune

[snwy/SD1.5-DALLE-2](https://huggingface.co/snwy/SD1.5-DALLE-2) on HuggingFace is an SD 1.5 model finetuned on DALL-E 2 outputs. The training data gives it a distinctive painterly, surrealist quality — images come out soft, dreamlike, and compositionally unusual in ways that feel closer to illustration than photorealism.

**Tips for best results:**

- Use `-q` (quantized) — strongly recommended on 8 GB devices, and the quality loss is negligible.
- Write prompts in the old-school comma-separated style: `"subject, style, mood, lighting, medium"`.
- Lean into the surreal: this model shines with imaginative, painterly subjects rather than photorealistic ones.
- Default cfg (7.5) and steps (50) are already tuned for it; no need to adjust unless experimenting.

```bash
diffuse-mlx generate \
  "clockwork cathedral assembled from musical instruments, choral light, baroque architecture, concept art" \
  --model dalle2 -q --n-images 4
```
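
If you build prompts programmatically, the comma-separated style is easy to assemble from fragments. The helper below is a minimal sketch; `compose_prompt` is a hypothetical name and is not part of diffuse-mlx.

```python
def compose_prompt(*parts: str) -> str:
    """Join non-empty prompt fragments into one comma-separated prompt."""
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = compose_prompt(
    "clockwork cathedral assembled from musical instruments",  # subject
    "concept art",                                             # style
    "",                                                        # mood (empty, so skipped)
    "choral light",                                            # lighting
)
print(prompt)
```

The resulting string can be passed as the `PROMPT` argument to `diffuse-mlx generate`.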

---

## Models

| Alias    | HuggingFace repo                                                                            | Description                                             |
|----------|---------------------------------------------------------------------------------------------|---------------------------------------------------------|
| `sdxl`   | [stabilityai/sdxl-turbo](https://huggingface.co/stabilityai/sdxl-turbo)                    | SDXL Turbo — distilled model, 2-step inference, default |
| `sd`     | [stabilityai/stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) | SD 2.1 Base — solid all-rounder, 50 steps      |
| `dalle2` | [snwy/SD1.5-DALLE-2](https://huggingface.co/snwy/SD1.5-DALLE-2)                            | SD 1.5 finetuned on DALL-E 2 outputs — painterly, surreal |

---

## Memory and quantization

On Apple Silicon Macs with 8 GB unified memory, run with `-q` (quantize) to keep memory usage manageable:

```bash
diffuse-mlx generate "your prompt here" --model sd -q
```

The `-q` flag applies 8-bit quantization to the UNet and to the linear layers of the text encoder(s). The quality impact is minimal for most prompts. SDXL Turbo is already fast and light, so `-q` helps most with SD 2.1 and the DALL-E 2 finetune.

On devices with 16 GB or more you can skip `-q`. If you notice any numerical issues, add `--no-float16` to run in full float32 precision.
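
As a rough guide to why precision matters on small-memory machines, the sketch below estimates the raw weight footprint at different bit widths. The parameter count is a hypothetical round number, not a measurement of any model listed here, and the estimate ignores activations, the VAE, and the per-group metadata that quantized formats carry.

```python
def weight_footprint_gib(n_params: int, bits_per_weight: int) -> float:
    """Approximate size of the raw weights in GiB (no quantization metadata)."""
    return n_params * bits_per_weight / 8 / 1024**3


# Hypothetical model size, chosen only to make the arithmetic easy to follow.
N_PARAMS = 1_000_000_000

for label, bits in [("float32", 32), ("float16", 16), ("8-bit", 8)]:
    print(f"{label:>8}: {weight_footprint_gib(N_PARAMS, bits):.2f} GiB")
```

Each halving of the bit width halves the weight footprint, which is why `-q` matters most for the larger step counts and models on 8 GB machines.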

---

## Image-to-image (`img2img`)

Transform an existing image guided by a text prompt:

```bash
diffuse-mlx img2img photo.jpg "oil painting of a harbour at dusk, impressionist style" \
  --model sd --strength 0.75 -q
```

**`--strength`** controls how far the output departs from the original image; the higher the value, the less of the input is preserved:
- `0.0` — output is identical to the input (no change)
- `1.0` — input image is completely ignored, purely text-driven
- `0.75` — a good starting point: retains composition and colours while applying the style

Lower strength values work well for style transfer; higher values for more drastic transformations.
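
One common way img2img pipelines implement strength is to add noise to the input partway and then run only the matching fraction of the denoising schedule. The sketch below illustrates that convention. It is an assumption about typical behaviour rather than a description of this tool's internals, and `effective_steps` is a hypothetical helper.

```python
def effective_steps(num_steps: int, strength: float) -> int:
    """Number of denoising steps actually run, under the assumed convention."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    # Only the last `strength` fraction of the schedule is executed.
    return int(num_steps * strength)


for s in (0.0, 0.5, 0.75, 1.0):
    print(f"strength={s:<4} -> {effective_steps(50, s)} of 50 steps")
```

Under this convention, low strength values also finish faster, since fewer denoising steps are executed.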

---

## All options

### `diffuse-mlx generate`

```
diffuse-mlx generate PROMPT [OPTIONS]

  --model           [sdxl|sd|dalle2]  default: sdxl
  --n-images        INT               Number of images to generate (default: 4)
  --steps           INT               Diffusion steps (default: 2 for sdxl, 50 for others)
  --cfg             FLOAT             Guidance weight (default: 0.0 for sdxl, 7.5 for others)
  --negative-prompt TEXT              Negative prompt (default: "")
  --n-rows          INT               Grid rows in output image (default: 1)
  --decoding-batch-size INT           VAE decoding batch size (default: 1)
  --no-float16                        Use float32 instead of float16
  -q, --quantize                      Quantize model weights
  --preload-models                    Preload all weights before generation
  --output          PATH              Output file (default: out.png)
  --seed            INT               Random seed for reproducibility
  -v, --verbose                       Print peak memory usage
```
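
For scripted runs, the flags listed above can be assembled from Python. The sketch below builds one `generate` command per seed so each result lands in its own file. `build_generate_command`, `run_seeds`, and the `out_<seed>.png` naming scheme are hypothetical, and actually running the commands assumes `diffuse-mlx` is on your `PATH`.

```python
import shlex
import subprocess


def build_generate_command(prompt, seed, output, model="sdxl", quantize=False):
    """Return the argv list for one `diffuse-mlx generate` call."""
    cmd = ["diffuse-mlx", "generate", prompt,
           "--model", model, "--seed", str(seed), "--output", output]
    if quantize:
        cmd.append("-q")
    return cmd


def run_seeds(prompt, seeds, dry_run=True, **kwargs):
    """Generate once per seed; with dry_run=True, only print the commands."""
    for seed in seeds:
        cmd = build_generate_command(prompt, seed, f"out_{seed}.png", **kwargs)
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if the CLI exits non-zero


# Dry run: prints the commands without invoking the CLI.
run_seeds("a red fox in a snowy forest", seeds=[1, 2], model="sd", quantize=True)
```

Pass `dry_run=False` to execute the commands instead of printing them.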

### `diffuse-mlx img2img`

```
diffuse-mlx img2img IMAGE PROMPT [OPTIONS]

  --model           [sdxl|sd|dalle2]  default: sdxl
  --strength        FLOAT             Transformation strength 0.0–1.0 (default: 0.9)
  --n-images        INT               Number of images to generate (default: 4)
  --steps           INT               Diffusion steps
  --cfg             FLOAT             Guidance weight
  --negative-prompt TEXT              Negative prompt
  --no-float16                        Use float32
  -q, --quantize                      Quantize model weights
  --preload-models                    Preload all weights
  --output          PATH              Output file (default: out.png)
  --seed            INT               Random seed
  -v, --verbose                       Print peak memory usage
```

---

## See also

If you're interested in running **Flux** models locally on Apple Silicon, check out [mflux](https://github.com/filipstrand/mflux) — a similar project that brings the Flux family of models to MLX with a comparable CLI experience.

---

## Credits

Core MLX implementation ported from [Apple's mlx-examples](https://github.com/ml-explore/mlx-examples) (Copyright © Apple Inc.). The Stable Diffusion library files are reproduced verbatim under the terms of the original Apple MIT license.

The DALL-E 2 finetune model (`snwy/SD1.5-DALLE-2`) is by [snwy on HuggingFace](https://huggingface.co/snwy/SD1.5-DALLE-2). Check the model card for its license terms.
