Photo by Andrea De Santis on Unsplash
If you use social media, you have probably seen images generated by machine learning models recently.
DALLE 2
https://openai.com/dall-e-2/
You can use DALLE 2 for free, but you may have to wait a month or more to get access.
More recently, another model was released: Stable Diffusion. It is pretty similar to DALLE 2. If you give it a text prompt and some parameters, it generates a pretty nice image. You can use Stable Diffusion without waiting a month, which is super nice, right? However, it requires a GPU. If you don't have a GPU or can't get access to one, you are probably 😭 (What am I supposed to do?)
About Stable Diffusion
https://stability.ai/blog/stable-diffusion-public-release
A latent text-to-image diffusion model
In that case, you can try stable_diffusion.openvino. You don't need a GPU to run it!!!
stable_diffusion.openvino
Implementation of Text-To-Image generation using Stable Diffusion on Intel CPU or GPU.
Requirements
Linux, Windows, MacOS
Python <= 3.9.0
CPU or GPU compatible with OpenVINO.
Install requirements
Set up and update PIP to the highest version
Install OpenVINO™ Development Tools 2022.3.0 release with PyPI
Download requirements
python -m pip install --upgrade pip
pip install openvino-dev[onnx,pytorch]==2022.3.0
pip install -r requirements.txt
Generate image from text description
usage: demo.py [-h] [--model MODEL] [--device DEVICE] [--seed SEED] [--beta-start BETA_START] [--beta-end BETA_END] [--beta-schedule BETA_SCHEDULE]
[--num-inference-steps NUM_INFERENCE_STEPS] [--guidance-scale GUIDANCE_SCALE] [--eta ETA] [--tokenizer TOKENIZER] [--prompt PROMPT] [--params-from PARAMS_FROM]
[--init-image INIT_IMAGE] [--strength STRENGTH] [--mask MASK] [--output OUTPUT]
optional arguments:
-h, --help show this help message and exit
--model MODEL model name
--device DEVICE inference device [CPU, GPU]
--seed SEED random seed for generating consistent images per prompt
--beta-start BETA_START
LMSDiscreteScheduler::beta_start
--beta-end BETA_END LMSDiscreteScheduler::beta_end
--beta-schedule BETA_SCHEDULE
LMSDiscreteScheduler::beta_schedule
--num-inference-steps NUM_INFERENCE_STEPS
num inference steps
--guidance-scale GUIDANCE_SCALE
guidance scale
--eta ETA …
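If you want to try several of these options without retyping the command each time, a small helper can assemble the argument list for you. This is only a sketch: the flag names are copied from the help text above, but `build_demo_command` is a hypothetical helper of mine and not part of the repo, and it does not validate the values you pass.

```python
import shlex
from typing import Dict, List

# Flag names copied from the demo.py help text above.
KNOWN_FLAGS = {
    "model", "device", "seed", "beta-start", "beta-end", "beta-schedule",
    "num-inference-steps", "guidance-scale", "eta", "tokenizer", "prompt",
    "params-from", "init-image", "strength", "mask", "output",
}


def build_demo_command(options: Dict[str, object],
                       script: str = "demo.py",
                       python: str = "python") -> List[str]:
    """Build an argv list for demo.py (hypothetical helper, not in the repo)."""
    unknown = set(options) - KNOWN_FLAGS
    if unknown:
        # Refuse flags that the help text does not list.
        raise ValueError(f"flags not in the demo.py help text: {sorted(unknown)}")
    argv = [python, script]
    for name, value in options.items():
        argv += [f"--{name}", str(value)]
    return argv


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    cmd = build_demo_command({"prompt": "cyberpunk New York City", "seed": 42})
    print(shlex.join(cmd))
```

Printing the command first is a cheap way to double check the flags before you spend a few minutes on a run.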
The README is very straightforward, so you probably won't have any issues running demo.py or trying the Python script for Streamlit.
However, you might run into issues if you already manage Python with a version manager, Anaconda, or something similar.
In that case, you can use Poetry to avoid messing things up and keep your Python dev environment clean.
install poetry
There are 2 ways to install poetry.
using pip
using curl
Installation
https://python-poetry.org/docs/#installation
Create a project folder
$ poetry new poetry-stable-diffusion
Install packages
$ poetry add package_name@package_version
However, you don't need to add the packages one by one. You can use the following pyproject.toml, which I have already tested.
In this case, I used Python 3.8.12.
If you don't have Python 3.8, I highly recommend installing it with [pyenv](https://github.com/pyenv/pyenv).
[tool.poetry]
name = "stablediffusion"
version = "0.1.0"
description = "test Stable Diffusion"
authors = [ "koji" ]
[tool.poetry.dependencies]
python = "^3.8"
numpy = "1.19.5"
transformers = "4.16.2"
diffusers = "0.2.4"
tqdm = "4.64.0"
openvino = "2022.1.0"
huggingface-hub = "0.9.0"
streamlit = "1.12.0"
watchdog = "2.1.9"
opencv-python = "4.5.2.54"
scipy = "1.6.1"
[tool.poetry.dev-dependencies]
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
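Before running the install, it can help to confirm which Python you are actually on. The sketch below compares the interpreter version with the two limits mentioned in this post: `python = "^3.8"` from the pyproject.toml above and "Python <= 3.9.0" from the README. The function name `classify_python` and the three labels are my own assumptions for illustration, not anything Poetry or the repo provides.

```python
import sys
from typing import Tuple

MIN_VERSION = (3, 8, 0)   # lower bound implied by python = "^3.8"
MAX_VERSION = (3, 9, 0)   # upper bound listed in the README requirements


def classify_python(version: Tuple[int, int, int]) -> str:
    """Return 'too-old', 'newer-than-readme', or 'ok' (hypothetical labels)."""
    if version < MIN_VERSION:
        return "too-old"
    if version > MAX_VERSION:
        return "newer-than-readme"
    return "ok"


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    print(current, classify_python(current))
```

With the 3.8.12 interpreter I used, this reports `ok`. A newer interpreter may still work, but I have not tested it.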
All you need to do to set up the environment is run one command!
$ poetry install
Clone repo
$ git clone https://github.com/bes-dev/stable_diffusion.openvino.git
$ cd stable_diffusion.openvino
Run demo.py
$ poetry run python demo.py --prompt "cyberpunk New York City"
generated image
Generating an image takes a few minutes (around 3 minutes in my case).
my mac spec
$ system_profiler SPHardwareDataType
Hardware:
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro16,1
Processor Name: 8-Core Intel Core i9
Processor Speed: 2.3 GHz
Number of Processors: 1
Total Number of Cores: 8
L2 Cache (per Core): 256 KB
L3 Cache: 16 MB
Hyper-Threading Technology: Enabled
Memory: 16 GB
System Firmware Version: 1916.0.28.0.0 (iBridge: 20.16.365.5.4,0)
OS Loader Version: 564.40.2.0.1~4
Serial Number (system): C02CP2ESMD6Q
Hardware UUID: FFCE331E-4543-5DBE-8F98-E329E0A69F91
Provisioning UDID: FFCE331E-4543-5DBE-8F98-E329E0A69F91
Activation Lock Status: Disabled
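If you want to compare your machine with the roughly 3 minutes I saw, you can time the command yourself. This is a minimal sketch that wraps any command with a wall-clock timer. `timed_run` is a hypothetical helper name, and the example at the bottom uses a harmless stand-in command instead of demo.py so that it runs anywhere.

```python
import subprocess
import sys
import time
from typing import List, Tuple


def timed_run(argv: List[str]) -> Tuple[int, float]:
    """Run a command and return (exit code, elapsed seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


if __name__ == "__main__":
    # Stand-in command. To time a real run, replace the list with:
    # ["poetry", "run", "python", "demo.py", "--prompt", "cyberpunk New York City"]
    code, seconds = timed_run([sys.executable, "-c", "print('hello')"])
    print(f"exit code {code}, took {seconds:.2f} s")
```

The measured time includes loading the model, so the number covers the whole command and not only the diffusion steps.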