Metadata-Version: 2.1
Name: auto1111sdk
Version: 0.0.5
Summary: SDK for Automatic 1111.
Author: Auto1111 SDK
Author-email: saketh.kotamraju@gmail.com
Keywords: python,Automatic 1111,Stable Diffusion Web UI,image generation,stable diffusion,civit ai
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: GitPython ==3.1.32
Requires-Dist: Pillow ==9.5.0
Requires-Dist: accelerate ==0.21.0
Requires-Dist: basicsr ==1.4.2
Requires-Dist: blendmodes ==2022
Requires-Dist: clean-fid ==0.1.35
Requires-Dist: einops ==0.4.1
Requires-Dist: fastapi ==0.94.0
Requires-Dist: gfpgan ==1.3.8
Requires-Dist: gradio ==3.41.2
Requires-Dist: httpcore ==0.15
Requires-Dist: inflection ==0.5.1
Requires-Dist: jsonmerge ==1.8.0
Requires-Dist: kornia ==0.6.7
Requires-Dist: lark ==1.1.2
Requires-Dist: numpy ==1.23.5
Requires-Dist: omegaconf ==2.2.3
Requires-Dist: open-clip-torch ==2.20.0
Requires-Dist: piexif ==1.1.3
Requires-Dist: psutil ==5.9.5
Requires-Dist: pytorch-lightning ==1.9.4
Requires-Dist: realesrgan ==0.3.0
Requires-Dist: resize-right ==0.0.2
Requires-Dist: safetensors ==0.3.1
Requires-Dist: scikit-image ==0.21.0
Requires-Dist: timm ==0.9.2
Requires-Dist: tomesd ==0.1.3
Requires-Dist: torch
Requires-Dist: torchdiffeq ==0.2.3
Requires-Dist: torchsde ==0.2.6
Requires-Dist: transformers ==4.30.2
Requires-Dist: httpx ==0.24.1
Requires-Dist: clip


# Auto 1111 SDK/Python Client

Auto 1111 SDK is a lightweight Python library for generating, upscaling, and editing images with diffusion models. It is designed as a modular Python client that encapsulates the main features of the [Automatic 1111 Stable Diffusion Web UI](https://github.com/AUTOMATIC1111/stable-diffusion-webui). Auto 1111 SDK currently offers three core features:

- State-of-the-art diffusion pipelines that run inference in just a few lines of code. The pipelines currently support Text-to-Image, Image-to-Image, Inpainting, Outpainting, and Stable Diffusion Upscale. They accept the same parameters as the [Stable Diffusion Web UI](https://github.com/AUTOMATIC1111/stable-diffusion-webui), so you can easily replicate creations from the Web UI in the SDK.
- Upscaling pipelines that run inference for any ESRGAN or Real-ESRGAN upscaler in a few lines of code.
- An integration with Civit AI to directly download models from the website.

Join our [Discord!!](https://discord.gg/S7wRQqt6QV)

## Installation

We recommend installing Auto 1111 SDK from PyPI in a virtual environment, such as venv or Conda.

```bash
pip3 install auto1111sdk
```

## Quickstart

Generating images with Auto 1111 SDK is straightforward. A single pipeline handles Text-to-Image, Image-to-Image, Inpainting, Outpainting, and Stable Diffusion Upscale. Because you do not need a separate pipeline object for each operation, as some other solutions require, this saves a significant amount of RAM.

```python
from auto1111sdk import StableDiffusionPipeline

# Load the model once; the same pipeline object serves every generation mode.
pipe = StableDiffusionPipeline("<Path to your local safetensors or checkpoint file>")

prompt = "a picture of a brown dog"
output = pipe.generate_txt2img(prompt=prompt, height=1024, width=768, steps=10)

# The result is indexable; save the first generated image.
output[0].save("image.png")
```

## Documentation

More detailed examples and documentation for Auto 1111 SDK are available [here](https://flush-ai.gitbook.io/automatic-1111-sdk/).
For a detailed comparison between Auto 1111 SDK and Hugging Face Diffusers, read [this](https://flush-ai.gitbook.io/automatic-1111-sdk/auto-1111-sdk-vs-huggingface-diffusers).


## Features
- Original txt2img and img2img modes
- Real-ESRGAN and ESRGAN upscaling (compatible with any `.pth` file)
- Outpainting
- Inpainting
- Stable Diffusion Upscale
- Attention: specify parts of the text that the model should pay more attention to
    - a man in a `((tuxedo))` - will pay more attention to tuxedo
    - a man in a `(tuxedo:1.21)` - alternative syntax
    - select text and press `Ctrl+Up` or `Ctrl+Down` (or `Command+Up` or `Command+Down` on macOS) to automatically adjust attention to the selected text (code contributed by an anonymous user)
- Composable Diffusion: a way to use multiple prompts at once
    - separate prompts using uppercase `AND`
    - also supports weights for prompts: `a cat :1.2 AND a dog AND a penguin :2.2`
- Works with a variety of samplers
- Download models directly from Civit AI, as well as Real-ESRGAN checkpoints
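
The attention and Composable Diffusion syntax listed above can be illustrated with a small, self-contained sketch. It is not the SDK's own prompt parser: the helper names are hypothetical, and the 1.1 multiplier per pair of parentheses is an assumption carried over from the Stable Diffusion Web UI convention, which is consistent with `((tuxedo))` and `(tuxedo:1.21)` being listed as alternatives.

```python
import re

# Assumption: each pair of parentheses multiplies attention by 1.1,
# following the Stable Diffusion Web UI prompt convention.
PAREN_FACTOR = 1.1


def nested_attention_weight(depth: int) -> float:
    """Weight implied by wrapping a term in `depth` pairs of parentheses."""
    return round(PAREN_FACTOR ** depth, 4)


# Hypothetical pattern: subprompt text with an optional trailing ":<number>".
_WEIGHT_RE = re.compile(r"^(.*?)(?:\s*:\s*([-+]?(?:\d+\.?\d*|\.\d+)))?\s*$")


def split_composable_prompt(prompt: str) -> list[tuple[str, float]]:
    """Split on uppercase AND and read each part's weight (default 1.0)."""
    parts = []
    for chunk in re.split(r"\bAND\b", prompt):
        match = _WEIGHT_RE.match(chunk.strip())
        text, weight = match.group(1), match.group(2)
        parts.append((text.strip(), float(weight) if weight else 1.0))
    return parts


print(nested_attention_weight(2))
# 1.21
print(split_composable_prompt("a cat :1.2 AND a dog AND a penguin :2.2"))
# [('a cat', 1.2), ('a dog', 1.0), ('a penguin', 2.2)]
```

Under that assumption, two pairs of parentheses give 1.1 × 1.1 = 1.21, and a subprompt with no explicit weight defaults to 1.0.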

## Contributing

Auto 1111 SDK is continuously evolving, and we appreciate community involvement. We welcome all forms of contribution: bug reports, feature requests, and code contributions.

Report bugs and request features by opening an issue on GitHub.
Contribute to the project by forking/cloning the repository and submitting a pull request with your changes.


## Credits
Licenses for borrowed code can be found in the `Settings -> Licenses` screen and in the `html/licenses.html` file.

- Automatic 1111 Stable Diffusion Web UI - https://github.com/AUTOMATIC1111/stable-diffusion-webui
- Stable Diffusion - https://github.com/Stability-AI/stablediffusion, https://github.com/CompVis/taming-transformers
- k-diffusion - https://github.com/crowsonkb/k-diffusion.git
- ESRGAN - https://github.com/xinntao/ESRGAN
- MiDaS - https://github.com/isl-org/MiDaS
- Ideas for optimizations - https://github.com/basujindal/stable-diffusion
- Cross Attention layer optimization - Doggettx - https://github.com/Doggettx/stable-diffusion, original idea for prompt editing.
- Cross Attention layer optimization - InvokeAI, lstein - https://github.com/invoke-ai/InvokeAI (originally http://github.com/lstein/stable-diffusion)
- Sub-quadratic Cross Attention layer optimization - Alex Birch (https://github.com/Birch-san/diffusers/pull/1), Amin Rezaei (https://github.com/AminRezaei0x443/memory-efficient-attention)
- Textual Inversion - Rinon Gal - https://github.com/rinongal/textual_inversion (we're not using his code, but we are using his ideas).
- Idea for SD upscale - https://github.com/jquesnelle/txt2imghd
- Noise generation for outpainting mk2 - https://github.com/parlance-zz/g-diffuser-bot
- CLIP interrogator idea and borrowing some code - https://github.com/pharmapsychotic/clip-interrogator
- Idea for Composable Diffusion - https://github.com/energy-based-model/Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch
- xformers - https://github.com/facebookresearch/xformers
- Sampling in float32 precision from a float16 UNet - marunine for the idea, Birch-san for the example Diffusers implementation (https://github.com/Birch-san/diffusers-play/tree/92feee6)
