Use Stable Diffusion openvino with poetry

0xkoji - Sep 2 '22 - Dev Community

Photo by Andrea De Santis on Unsplash

If you use social media, you have probably seen images generated by machine learning models recently.

DALLE 2

https://openai.com/dall-e-2/
You can use DALLE 2 for free, but you may need to wait a month or more for access.

More recently, another model was released: Stable Diffusion. It is pretty similar to DALLE 2. You give it a text prompt and some parameters, and it generates a pretty nice image. You can use Stable Diffusion without waiting a month, which is super nice, right? However, it requires a GPU. If you don't have a GPU or can't get access to one, you are probably 😭 (What am I supposed to do?)

About Stable Diffusion

https://stability.ai/blog/stable-diffusion-public-release

CompVis / stable-diffusion

A latent text-to-image diffusion model

Stable Diffusion

Stable Diffusion was made possible thanks to a collaboration with Stability AI and Runway and builds upon our previous work:

High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach* Andreas Blattmann* Dominik Lorenz, Patrick Esser, Björn Ommer
CVPR '22 Oral | GitHub | arXiv | Project page

Stable Diffusion is a latent text-to-image diffusion model. Thanks to a generous compute donation from Stability AI and support from LAION, we were able to train a Latent Diffusion Model on 512x512 images from a subset of the LAION-5B database. Similar to Google's Imagen, this model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. See this section below and the model card.

Requirements

A suitable…

In that case, you can try stable_diffusion.openvino. You don't need a GPU to run it!

stable_diffusion.openvino

Implementation of Text-To-Image generation using Stable Diffusion on Intel CPU or GPU.

Requirements

  • Linux, Windows, MacOS
  • Python <= 3.9.0
  • CPU or GPU compatible with OpenVINO.
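
Before installing anything, it may help to confirm that your interpreter fits the Python requirement above. The following is only a minimal sketch: it assumes "Python <= 3.9.0" can be read loosely as "Python 3.9 or older", and the names `MAX_MINOR` and `is_supported` are made up for this example.

```python
import sys

# Assumption: "Python <= 3.9.0" is read loosely as "Python 3.x with x <= 9".
# MAX_MINOR is a name invented for this sketch, not something from the README.
MAX_MINOR = 9


def is_supported(version_info=None):
    """Return True if the (major, minor, ...) version is Python 3.9 or older."""
    major, minor = (version_info or sys.version_info)[:2]
    return major == 3 and minor <= MAX_MINOR


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    if is_supported():
        print(f"Python {current} should fit the stated requirement")
    else:
        print(f"Python {current} is outside the range; consider a 3.8 or 3.9 interpreter")
```

If the check fails, switching to an older interpreter first is probably easier than debugging the install errors later.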

Install requirements

  • Set up and update PIP to the highest version
  • Install OpenVINO™ Development Tools 2022.3.0 release with PyPI
  • Download requirements
python -m pip install --upgrade pip
pip install openvino-dev[onnx,pytorch]==2022.3.0
pip install -r requirements.txt

Generate image from text description

usage: demo.py [-h] [--model MODEL] [--device DEVICE] [--seed SEED] [--beta-start BETA_START] [--beta-end BETA_END] [--beta-schedule BETA_SCHEDULE]
               [--num-inference-steps NUM_INFERENCE_STEPS] [--guidance-scale GUIDANCE_SCALE] [--eta ETA] [--tokenizer TOKENIZER] [--prompt PROMPT] [--params-from PARAMS_FROM]
               [--init-image INIT_IMAGE] [--strength STRENGTH] [--mask MASK] [--output OUTPUT]
optional arguments:
  -h, --help            show this help message and exit
  --model MODEL         model name
  --device DEVICE       inference device [CPU, GPU]
  --seed SEED           random seed for generating consistent images per prompt
  --beta-start BETA_START
                        LMSDiscreteScheduler::beta_start
  --beta-end BETA_END   LMSDiscreteScheduler::beta_end
  --beta-schedule BETA_SCHEDULE
                        LMSDiscreteScheduler::beta_schedule
  --num-inference-steps NUM_INFERENCE_STEPS
                        num inference steps
  --guidance-scale GUIDANCE_SCALE
                        guidance scale
  --eta ETA

The README is very straightforward, so you probably won't have any trouble running demo.py or trying the Python script for Streamlit.
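
To get a feel for how the flags in the usage text fit together, here is a small sketch that builds a demo.py command line. It only uses flags that appear in the usage output (`--prompt`, `--seed`, `--num-inference-steps`, `--output`). The helper name `build_demo_command` and the example values (seed 42, 32 steps, `nyc.png`) are my own placeholders, not defaults from the repository.

```python
import shlex
import sys


def build_demo_command(prompt, seed=None, steps=None, output=None):
    """Build an argument list for demo.py from flags shown in its usage text."""
    cmd = [sys.executable, "demo.py", "--prompt", prompt]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if steps is not None:
        cmd += ["--num-inference-steps", str(steps)]
    if output is not None:
        cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    # Example values are illustrative only.
    cmd = build_demo_command("cyberpunk New York City", seed=42, steps=32, output="nyc.png")
    # Print a shell-safe version of the command (the prompt gets quoted).
    print(shlex.join(cmd))
    # To actually run it from inside the cloned repository:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list means the prompt never needs manual shell quoting, which is handy for prompts with spaces or quotes.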

However, you might run into issues if you already manage Python with a version manager, Anaconda, or something similar.

In that case, you can use poetry to avoid messing things up and keep your Python dev environment clean.

Install poetry

There are 2 ways to install poetry.

  1. using pip
  2. using curl

Installation
https://python-poetry.org/docs/#installation

Create a project folder

$ poetry new poetry-stable-diffusion

Install packages

$ poetry add package_name@package_version

However, you don't need to add the packages one by one. You can use the following pyproject.toml, which I have already tested.

In this case, I used Python 3.8.12.
If you don't have Python 3.8, I highly recommend installing it with [pyenv](https://github.com/pyenv/pyenv).

[tool.poetry]
name = "stablediffusion"
version = "0.1.0"
description = "test Stable Diffusion"
authors = ["koji"]

[tool.poetry.dependencies]
python = "^3.8"
numpy = "1.19.5"
transformers = "4.16.2"
diffusers = "0.2.4"
tqdm = "4.64.0"
openvino = "2022.1.0"
huggingface-hub = "0.9.0"
streamlit = "1.12.0"
watchdog = "2.1.9"
opencv-python = "4.5.2.54"
scipy = "1.6.1"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

To set up the environment, all you need to do is run one command!

$ poetry install
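
If you want to double-check that the environment really matches the pins, a sketch like the one below compares installed versions against a few entries copied from the pyproject.toml above. It uses `importlib.metadata`, which ships with Python 3.8. The helper name `find_mismatches` is hypothetical, and I am assuming each package reports exactly the pinned version string, which may not hold for every package.

```python
from importlib import metadata

# A subset of the pins from the pyproject.toml above, copied by hand.
PINNED = {
    "numpy": "1.19.5",
    "transformers": "4.16.2",
    "diffusers": "0.2.4",
    "openvino": "2022.1.0",
    "streamlit": "1.12.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that does not match.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    if not problems:
        print("All checked packages match the pinned versions")
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Save it under any name you like (for example a hypothetical check_env.py) and run it with `poetry run python check_env.py` so it inspects the poetry environment rather than your system Python.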

Clone repo

$ git clone https://github.com/bes-dev/stable_diffusion.openvino.git
$ cd stable_diffusion.openvino

Run demo.py

$ poetry run python demo.py --prompt "cyberpunk New York City"

generated image

cyberpunk new york city

Generating an image takes a few minutes (around 3 minutes in my case).
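
If you are curious how long generation takes on your own machine, you could wrap the command in a small timer. This is just a sketch: `timed_run` is a name I made up, and the command in the example is a stand-in so the snippet runs anywhere. Swap in the demo.py call to measure the real thing.

```python
import subprocess
import sys
import time


def timed_run(cmd):
    """Run a command, wait for it to finish, and return (exit_code, seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd)
    return completed.returncode, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in command; to time the real run, replace it with something like
    # ["poetry", "run", "python", "demo.py", "--prompt", "cyberpunk New York City"]
    code, seconds = timed_run([sys.executable, "-c", "print('done')"])
    print(f"exit code {code}, took {seconds:.1f} s")
```

Timings will vary a lot with the CPU, so treat my 3 minutes as a rough reference only.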

My Mac spec

$ system_profiler SPHardwareDataType
Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro16,1
      Processor Name: 8-Core Intel Core i9
      Processor Speed: 2.3 GHz
      Number of Processors: 1
      Total Number of Cores: 8
      L2 Cache (per Core): 256 KB
      L3 Cache: 16 MB
      Hyper-Threading Technology: Enabled
      Memory: 16 GB
      System Firmware Version: 1916.0.28.0.0 (iBridge: 20.16.365.5.4,0)
      OS Loader Version: 564.40.2.0.1~4
      Serial Number (system): C02CP2ESMD6Q
      Hardware UUID: FFCE331E-4543-5DBE-8F98-E329E0A69F91
      Provisioning UDID: FFCE331E-4543-5DBE-8F98-E329E0A69F91
      Activation Lock Status: Disabled