How to use Stable Diffusion

Want to learn Stable Diffusion AI? This beginner’s guide is for newbies with zero experience with Stable Diffusion or other AI image generators. You will get an overview of Stable Diffusion and some basic useful tips.

This is the first part of the beginner’s guide series.
Read part 2: Prompt building.
Read part 3: Inpainting.
Read part 4: Models.

Table of Contents

  • What is Stable Diffusion?
  • How to use Stable Diffusion?
  • What’s the advantage of Stable Diffusion?
  • Is Stable Diffusion AI Free?
  • What Can Stable Diffusion Do?
    • 1. Generate images from text
    • 2. Generate an image from another image
    • 3. Photo Editing
    • 4. Make videos
  • How do you use Stable Diffusion AI?
    • Online generator
    • Advanced GUI
  • How to build a good prompt?
  • Rules of thumb for building good prompts
    • Be detailed and specific
    • Use powerful keywords
  • What are those parameters, and should I change them?
  • How many images should I generate?
  • Common ways to fix defects in images
    • Face Restoration
    • Fixing small artifacts with inpainting
  • What are custom models?
    • Which model should I use?
    • How to train a new model?
  • Negative prompts
  • How to make large prints with Stable Diffusion?
  • How to control image composition?
    • Image-to-image
    • ControlNet
    • Regional prompting
    • Depth-to-image
  • Generating specific subjects
    • Realistic people
    • Animals
  • What is unstable diffusion?
  • Next Step

What is Stable Diffusion?

Stable Diffusion AI is a latent diffusion model for generating AI images. The images can be photorealistic, like those captured by a camera, or in an artistic style as if produced by a professional artist.

The best part is that it is free – you can run it on your PC.

How to use Stable Diffusion?

You need to give it a prompt that describes an image. For example:

gingerbread house, diorama, in focus, white background, toast, crunch cereal

Stable Diffusion turns this prompt into images that match the description.


You can generate as many variations as you want from the same prompt.
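If you are curious what this looks like in code, below is a minimal text-to-image sketch using Hugging Face's diffusers library (an assumption: the article itself uses online generators and GUIs, not diffusers; the model name is a widely used public v1.5 checkpoint).

```python
# Minimal text-to-image sketch with the diffusers library (assumes a CUDA GPU).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # a widely used v1.5 base model
    torch_dtype=torch.float16,
).to("cuda")

prompt = "gingerbread house, diorama, in focus, white background, toast, crunch cereal"

# Each seed gives a different variation of the same prompt.
for seed in (1, 2, 3):
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"gingerbread_{seed}.png")
```

Fixing the seed makes a run reproducible; leaving it random gives endless variations.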

What’s the advantage of Stable Diffusion?

There are similar text-to-image generation services, such as DALL·E and Midjourney. Why Stable Diffusion? The advantages of Stable Diffusion AI are:

  • Open-source: Many enthusiasts have created free tools and models.
  • Designed for low-power computers: It’s free or cheap to run.

Is Stable Diffusion AI Free?

Stable Diffusion is free to use when running on your own Windows or Mac machines. An online service will likely cost you a modest fee because someone needs to provide you with the hardware to run on.

What Can Stable Diffusion Do?

1. Generate images from text

The most basic usage of Stable Diffusion is text-to-image (txt2img). Some of the styles you can generate include:

  • Anime style
  • Photorealistic style (learn how to generate realistic people and realistic street humans)
  • Landscape
  • Fantasy
  • Artistic style
  • Animals (learn how to generate animals)


2. Generate an image from another image

Image-to-image (img2img) transforms one image to another using Stable Diffusion AI.

For example, I transformed my drawing of an apple into a photorealistic one. (Tutorial)

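In code, the same idea looks roughly like this with diffusers (a sketch under assumptions: the file names are placeholders, and the strength value is illustrative):

```python
# Image-to-image sketch: transform a rough drawing into a photo-style image.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("apple_drawing.png").convert("RGB").resize((512, 512))

# strength controls how far the output may drift from the input:
# low values stay close to the drawing, high values follow the prompt more.
result = pipe(
    prompt="a photo of a red apple, studio lighting, high detail",
    image=init_image,
    strength=0.6,
).images[0]
result.save("apple_photo.png")
```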

3. Photo Editing

You can use inpainting to regenerate part of an AI-generated or real image. It is similar to Photoshop's Generative Fill feature, but free.

4. Make videos

There are two main ways to make videos with Stable Diffusion: (1) from a text prompt and (2) from another video.

Deforum is a popular way to make a video from a text prompt. You have probably seen one of these videos on social media.

The second way is to stylize a video using Stable Diffusion. See the video-to-video tutorial.


This is a more advanced topic. It is best to master text-to-image and image-to-image before diving into it.

How do you use Stable Diffusion AI?

Online generator

For absolute beginners, I recommend using a free online generator. You can start generating without the hassle of setting things up.

Advanced GUI

The downside of free online generators is that the functionalities are pretty limited.

Use a more advanced GUI (Graphical User Interface) if you've outgrown them. A whole array of tools is at your disposal. To name a few:

  • Advanced prompting techniques
  • Regenerating a small part of an image with inpainting
  • Generating images based on an input image (image-to-image)
  • Editing an image by giving an instruction

AUTOMATIC1111 is a popular choice. See the Quick Start Guide for setting it up on a Google Colab cloud server. Running it on your own PC is also a good option if you have the right hardware. See the install guides for Windows and Mac.

How to build a good prompt?

There’s a lot to learn to craft a good prompt. But the basic is to describe your subject in as much detail as possible. Make sure to include powerful keywords to define the style.

Using a prompt generator is a great way to learn a step-by-step process and important keywords. It is essential for beginners to learn a set of powerful keywords and their expected effects. This is like learning vocabulary for a new language. You can also find a short list of keywords and notes here.

A shortcut to generating high-quality images is to reuse existing prompts. Head to the prompt collection, pick an image you like, and steal the prompt! The downside is that you may not understand why it generates high-quality images. Read the notes and change the prompt to see the effect.

Alternatively, use image collection sites like PlaygroundAI. Pick an image you like and remix the prompt. But finding a high-quality prompt that way can be like finding a needle in a haystack.

Treat the prompt as a starting point. Modify it to suit your needs.

Rules of thumb for building good prompts

Two rules: (1) Be detailed and specific, and (2) use powerful keywords.

Be detailed and specific

Although AI advances in leaps and bounds, Stable Diffusion still cannot read your mind. You need to describe your image in as much detail as possible.

Let’s say you want to generate a picture of a woman in a street scene. A simplistic prompt

a woman on street

gives you an image of, well, a grandma. You may not have wanted a grandma, but it technically matches your prompt. You cannot blame Stable Diffusion…

So instead, you should write more.

a young lady, brown eyes, highlights in hair, smile, wearing stylish business casual attire, sitting outside, quiet city street, rim lighting

The difference is drastic. So work on your prompt-building skills!

Use powerful keywords

Some keywords are more powerful than others. Examples are

  • Celebrity names (e.g. Emma Watson)
  • Artist names (e.g. van Gogh)
  • Art medium (e.g. illustration, painting, photograph)

Using them carefully can steer the image in the direction you want.

You can learn more about prompt building and example keywords in the basics of building prompts.

Want to cheat? Just as with homework, you can use ChatGPT to generate prompts!

What are those parameters, and should I change them?

Most online generators allow you to change a limited set of parameters. Below are some important ones:

  • Image size: The size of the output image. The standard size is 512×512 pixels. Changing it to portrait or landscape size can have a big impact on the image. For example, use portrait size to generate a full-body image.
  • Sampling steps: Use at least 20 steps. Increase if you see a blurry image.
  • CFG scale: Typical value is 7. Increase if you want the image to follow the prompt more.
  • Seed value: -1 generates a random image. Specify a value if you want the same image.

See recommendations for other settings.
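For readers who script their generations, here is how those four parameters map onto a diffusers call (a sketch; the prompt and values are illustrative, using the suggested defaults above):

```python
# The four parameters above, as arguments to a diffusers pipeline call.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a young lady, smile, stylish business casual attire, quiet city street",
    width=512, height=768,   # portrait size helps full-body images
    num_inference_steps=20,  # sampling steps: use at least 20
    guidance_scale=7,        # CFG scale: 7 is a typical value
    generator=torch.Generator("cuda").manual_seed(42),  # fixed seed; omit for random
).images[0]
image.save("lady.png")
```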

How many images should I generate?

You should always generate multiple images when testing a prompt.

When making big changes to a prompt, I generate 2–4 images at a time to speed up the search. When making small changes, I generate 4 at a time to increase the chance of seeing something usable.

Some prompts only work half of the time or less, so don't write off a prompt based on a single image.
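In a script, generating several candidates per prompt is a single argument (a sketch with diffusers; the prompt is illustrative):

```python
# Generate four candidate images for one prompt in a single call.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

images = pipe(
    "a wolf in a snowy forest, detailed fur, moonlight",
    num_images_per_prompt=4,  # four variations, as suggested above
).images
for i, im in enumerate(images):
    im.save(f"candidate_{i}.png")
```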

Common ways to fix defects in images

When you see stunning AI images shared on social media, there’s a good chance they have undergone a series of post-processing steps. We will go over some of them in this section.

Face Restoration

It’s well-known in the AI artist community that Stable Diffusion is not good at generating faces. Very often, the faces generated have artifacts.

AI artists often use models trained specifically to restore faces, for example CodeFormer, which the AUTOMATIC1111 GUI supports out of the box. See how to turn it on.

Did you know there's an updated VAE for the v1.4 and v1.5 models that fixes eyes? Check out how to install a VAE.

Fixing small artifacts with inpainting

It is difficult to get the image you want on the first try. A better approach is to generate an image with good composition. Then repair the defects with inpainting.

Inpainting regenerates only the part of the image you mask, leaving the rest intact. Using the original prompt for inpainting works about 90% of the time.

There are other techniques to fix things. Read more about fixing common issues.
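If you prefer scripting to the GUI, inpainting looks like this with diffusers (a sketch; the file names are placeholders, and the mask should be white where the image is to be regenerated):

```python
# Inpainting sketch: regenerate only the masked region of an image.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("portrait.png").convert("RGB").resize((512, 512))
mask = Image.open("defect_mask.png").convert("RGB").resize((512, 512))

# Reusing the original prompt, as suggested above, works most of the time.
fixed = pipe(
    prompt="a young lady, smile, quiet city street, rim lighting",
    image=image,
    mask_image=mask,
).images[0]
fixed.save("portrait_fixed.png")
```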

What are custom models?

The official models released by Stability AI and their partners are called base models. Some examples of base models are Stable Diffusion 1.4, 1.5, 2.0, and 2.1.

Custom models are trained from the base models. Currently, most of the models are trained from v1.4 or v1.5. They are trained with additional data for generating images of particular styles or objects.

The sky is the limit when it comes to custom models. They can be anime style, Disney style, or even the style of another AI. You name it.

The same prompt rendered by five different models can produce five distinctly different images.

It is also easy to merge two models to create a style in between.
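If you work in code, a custom model downloaded as a single checkpoint file can be loaded directly (a sketch; the file name is a placeholder for whatever model you downloaded, and from_single_file requires a reasonably recent diffusers version):

```python
# Load a community custom model from a single .safetensors checkpoint.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "my_custom_model.safetensors",  # placeholder: a checkpoint from a model site
    torch_dtype=torch.float16,
).to("cuda")
image = pipe("a castle in the clouds, anime style").images[0]
image.save("castle.png")
```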

Which model should I use?

Stick with the base models if you are starting out. There is plenty to learn and play with to keep you busy for months.

The three main versions of Stable Diffusion are v1, v2, and Stable Diffusion XL (SDXL).

  • v1 models are 1.4 and 1.5.
  • v2 models are 2.0 and 2.1.
  • SDXL 1.0

You may think you should start with the newer v2 models, but people are still trying to figure out how to use them, and images from v2 are not necessarily better than v1's.

A series of SDXL models has been released: SDXL beta, SDXL 0.9, and the latest, SDXL 1.0.

I recommend using the v1.5 and SDXL 1.0 models if you are new to Stable Diffusion.
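If you script your generations, switching from v1.5 to SDXL is mostly a matter of the pipeline class and resolution (a sketch with diffusers; SDXL's native resolution is 1024×1024 rather than 512×512):

```python
# Text-to-image with SDXL 1.0 instead of a v1.5 model.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a lighthouse at dusk, oil painting",
    width=1024, height=1024,  # SDXL is trained at 1024x1024
).images[0]
image.save("lighthouse.png")
```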

How to train a new model?

An advantage of using Stable Diffusion is that you have total control of the model. You can create your own model with a unique style if you want. There are two main ways to train models: (1) Dreambooth and (2) embedding.

Dreambooth is considered more powerful because it fine-tunes the weight of the whole model. Embeddings leave the model untouched but find keywords to describe the new subject or style.

You can experiment with the Colab notebook in the dreambooth article.

Negative prompts

You put what you want to see in the prompt. You put what you don’t want to see in the negative prompt. Not all Stable Diffusion services support negative prompts. But it is valuable for v1 models and a must for v2 models. It doesn’t hurt for a beginner to use a universal negative prompt. Read more about negative prompts:

  • How does negative prompt work?
  • How to use negative prompts?
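In code, a negative prompt is just one more argument (a sketch with diffusers; the negative keywords shown are a common community starting point, not an official list):

```python
# Passing a negative prompt alongside the regular prompt.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "portrait photo of a woman, natural lighting",
    negative_prompt="deformed, disfigured, extra limbs, blurry, low quality",
).images[0]
image.save("portrait.png")
```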

How to make large prints with Stable Diffusion?

Stable Diffusion’s native resolution is 512×512 pixels for v1 models. You should NOT generate images with width and height that deviates too much from 512 pixels. Use the following size settings to generate the initial image.

  • Landscape image: Set the height to 512 pixels and the width higher, e.g. 768 pixels (3:2 aspect ratio).
  • Portrait image: Set the width to 512 pixels and the height higher, e.g. 768 pixels (2:3 aspect ratio).

If you set the initial width and height too high, you will see duplicate subjects.

The next step is to upscale the image. The free AUTOMATIC1111 GUI comes with some popular AI upscalers.

  • Read this tutorial for a beginner’s guide to AI upscalers.
  • Read this tutorial for more advanced usage.
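The workflow above uses AUTOMATIC1111's built-in upscalers. As a rough scripted equivalent, here is a sketch using the Stable Diffusion x4 upscaler via diffusers (an assumption: it is only one of several upscaling options, and large inputs need a lot of VRAM):

```python
# Generate a 768x512 landscape image, then upscale it 4x for printing.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionUpscalePipeline

txt2img = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
low_res = txt2img("mountain lake at sunrise, golden hour", width=768, height=512).images[0]

upscaler = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")
large = upscaler(prompt="mountain lake at sunrise", image=low_res).images[0]
large.save("large_print.png")  # 3072x2048 after 4x upscaling
```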

How to control image composition?

Stable Diffusion technology is rapidly improving, and there are a few ways to control the composition of an image.

Image-to-image

You can ask Stable Diffusion to roughly follow an input image when generating a new one. This is called image-to-image. For example, an input image of an eagle can be used to generate a dragon, with the composition of the output following the input.


ControlNet

ControlNet similarly uses an input image to direct the output, but it can extract specific information, for example human poses. You can use ControlNet to copy a human pose from an input image.


In addition to human poses, ControlNet can extract other information, such as outlines.
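Scripted, the pose-copying workflow looks like this (a sketch; it assumes the diffusers and controlnet_aux packages, and the lllyasviel checkpoints shown are popular community models rather than the only choice):

```python
# Copy a human pose from a reference photo with ControlNet (OpenPose).
import torch
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

# Extract the pose skeleton from the reference image.
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose = openpose(Image.open("reference_photo.png"))

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The generated figure follows the extracted pose.
image = pipe("an astronaut dancing, photorealistic", image=pose).images[0]
image.save("posed_astronaut.png")
```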

Regional prompting

You can specify prompts for certain parts of images using an extension called Regional Prompter. This technique is very helpful for drawing objects only in certain parts of the image.

For example, you can place a wolf in the bottom left corner and skulls in the bottom right corner.

Read the Regional Prompter tutorial to learn how to use it.

Depth-to-image

Depth-to-image is another way to control composition through an input image. It detects the foreground and background of the input image, and the output image follows the same foreground and background.

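In code, depth-to-image has its own pipeline (a sketch with diffusers; stabilityai/stable-diffusion-2-depth is the official depth model, and the file name is a placeholder):

```python
# Depth-to-image: keep the input's foreground/background layout, change the content.
import torch
from diffusers import StableDiffusionDepth2ImgPipeline
from PIL import Image

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth", torch_dtype=torch.float16
).to("cuda")

init = Image.open("room_photo.png").convert("RGB")
image = pipe(
    prompt="a cozy cabin interior, warm light",
    image=init,
    strength=0.7,  # how strongly to repaint the content
).images[0]
image.save("cabin.png")
```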

Generating specific subjects

Realistic people

You can use Stable Diffusion to generate photo-style realistic people.

It comes down to using the right prompt and a special model trained to produce photorealistic humans. Learn more in the tutorial for generating realistic people.

Animals

Animals are popular subjects among Stable Diffusion users.

Read the tutorial for generating animals to learn how.

What is unstable diffusion?

Unstable Diffusion is a company that develops Stable Diffusion models for AI p*rn. They made headlines when their Kickstarter fundraising campaign got shut down. So far, they have not released any models publicly.

The company is not related to Stability AI, the company that released Stable Diffusion AI.

Next Step

So, you have completed the first tutorial of the Beginner’s Guide!

Check out the Stable Diffusion Course for a step-by-step guided course.

Or continue to part 2 below.

This is part 1 of the beginner’s guide series.
Read part 2: Prompt building.
Read part 3: Inpainting.
Read part 4: Models.


FAQs

How does Stable Diffusion art work?

As a diffusion model, Stable Diffusion differs from many other image generation models. In principle, diffusion models use Gaussian noise to encode an image. Then, they use a noise predictor together with a reverse diffusion process to recreate the image.

Can you make NSFW with Stable Diffusion?

Since Stable Diffusion does not produce NSFW content by default, users must use creative methods to bypass the limitations. Sadly, it's not as simple as engineering the right prompt for the task, as the limitations on Stable Diffusion will kick in every time.

Are you allowed to sell Stable Diffusion art?

Some users claim that only paid users can sell the images under their own copyright, while free users can only generate art for personal use and still have to give credit to Stable Diffusion.

Can I train Stable Diffusion with my own images?

Yes. You don't need a coding background: with only a few lines of code, or a ready-made notebook, you can train Stable Diffusion with your own images and generate art that's uniquely yours.

How many steps should I use for Stable Diffusion?

Some noise is removed with each step, resulting in a higher-quality image over time. The repetition stops when the desired number of steps completes. Around 25 sampling steps are usually enough to achieve high-quality images. Using more may produce a slightly different picture, but not necessarily better quality.

Is using Stable Diffusion legal?

Yes, using Stable Diffusion AI for commercial purposes is legal as long as you have the necessary permissions and licenses.

Will Stable Diffusion be banned?

Civitai temporarily banned Stable Diffusion 3. Due to ambiguity in the licensing terms associated with Stable Diffusion 3 (SD3), Civitai temporarily banned the use of all SD3-based models and all models or LoRAs trained on content generated from SD3-based models, including utilities such as ControlNet.

Does DALL·E 2 allow NSFW?

Some well-known AI art generators, such as Midjourney, Stable Diffusion, and DALL·E 2, have NSFW content filters. However, some generators will still produce adult art and images on request.

Can you use Stable Diffusion images commercially?

One of the most appealing aspects of Stable Diffusion is its open-source nature. The original Stable Diffusion model, commonly referred to as Stable Diffusion 1.5 or SD1.5, was released in October 2022 under the CreativeML OpenRAIL-M License, allowing for both commercial and non-commercial use.

What image settings are needed for Stable Diffusion?

Let's start with the two basic ones. Aspect ratio: the default is 1:1, but you can also select 7:4, 3:2, 4:3, 5:4, 4:5, 3:4, and 2:3 if you want a wider or taller image. Image count: you can generate anywhere between one and ten images for each prompt.

How does Stable Diffusion get images?

Stable Diffusion first changes the text prompt into a text representation, numerical values that summarize the prompt. The text representation is used to generate an image representation, which summarizes an image depicted in the text prompt. This image representation is then upscaled into a high-resolution image.

How does the Stable Diffusion model work?

Stable Diffusion uses latent images encoded from its training data. Given an image z0, the diffusion algorithm progressively adds noise to produce a noisy image zt, where t is the number of times noise has been added. When t is large enough, the image approximates pure noise.
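As a concrete illustration, the forward noising step this answer paraphrases can be written in a few lines (an assumption: the FAQ gives no formula, so this uses the standard DDPM parameterization, where abar_t shrinks toward 0 as t grows):

```python
# Standard DDPM forward process: z_t = sqrt(abar_t)*z0 + sqrt(1 - abar_t)*noise.
import torch

def add_noise(z0: torch.Tensor, abar_t: float) -> torch.Tensor:
    """Noise a (latent) image z0 to step t, given the cumulative alpha abar_t."""
    noise = torch.randn_like(z0)
    return (abar_t ** 0.5) * z0 + ((1.0 - abar_t) ** 0.5) * noise

# As abar_t approaches 0 (large t), the result approximates pure noise.
zt = add_noise(torch.zeros(4, 64, 64), abar_t=0.01)
```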

How does Stable Diffusion animation work?

By starting each frame from an intermediate output, the model generates samples that are similar to each other, resulting in a smoother animation. Finally, the generated samples are interpolated with Google's FILM (Frame Interpolation for Large Motion) for extra smoothness.

Does Stable Diffusion use copyrighted images?

Stable Diffusion's training set contains many copyrighted images, mixed together so thoroughly that any plagiarism in the output is non-obvious. But consider removing one image from the training set: that shouldn't affect the output much, right? Maybe some prompt produces a slightly different image. Now remove another image, and another…
