Where to Start with AI Art in 2024

AI image generators continue to evolve at a rapid pace, and with such an ever-changing landscape, knowing where to start can be difficult.

Below is a rundown of the most popular options, along with considerations to keep in mind when deciding which is best for the project at hand.

The Current State of AI Generators as of March 2024

[Image: whimsical illustration of an AI robot at a computer, generated by OpenAI’s DALL·E]

The options below are broken down into free services, paid services, Stable Diffusion derivatives, and API services.

Free Services

If you want to generate an image quickly using a free service, multiple options are available, but quality and customization may vary.

Note: Rate-limiting is likely to be in place for these services.

Paid Services

There are many different paid services available. Many offer free generations for a limited trial period, which can be useful for testing output quality before committing.

| Service | Model Selection | Base Plan Cost per Month |
| --- | --- | --- |
| Adobe Firefly | No | $5 |
| Blue Willow | Yes (Limited) | $10 – $99 |
| ChatGPT Plus | No | $20 |
| ClipDrop | Yes (Limited) | $0 – $9 |
| CrAIyon | No | $5 – $20 |
| Deep AI | Yes (Limited) | $5 |
| Dream | No | $10 |
| DreamStudio | Yes (Limited) | Varies (Credits) |
| GetImg | Yes | $12 – $49 |
| Leonardo AI | Yes | $12 – $60 |
| Lexica AI | Yes (Limited) | $10 – $60 |
| Midjourney | Yes (Limited) | $10 – $120 |
| Mage.Space | Yes | $0 – $30 |
| Night Cafe | Yes | $6 – $50 |
| OpenArt | Yes | $12 – $56 |
| Playground AI | No | $15 – $45 |
| Starry AI | No | $5 – $20 /wk |
“Yes (Limited)” typically means a choice among versions of the provider’s own proprietary models; for example, Midjourney offers v5, v6, niji, etc.

Considerations for Selecting a Service

Before selecting a service, think about the following:

  • Style Strength & Model Selection: Services may be stronger at one type of output than another. If possible, review the “members gallery” or “community” section of the service to see what others are creating and whether those results align with your needs. Models specially trained for certain styles will produce better outputs in those styles.
  • Licensing: If you plan to use the outputs for commercial purposes, make sure the terms are clear and that you own the outputs.
  • Performance: Services may have different tiers of performance, and slower generation can bottleneck your workflow. Additionally, some companies may allow rolling over unused credits month to month.
  • Interface Capabilities: Text-to-image prompting is standard across all services. However, simple editing, remixing, outpainting, uploading reference images, and other features may or may not be available.
  • Resolution: Outputs may be limited to 512×512, 512×768, 1024×1024, etc. If you are using the outputs for print, make sure the resolution is high enough or that the service provides upscaling (see the worked example after this list).
  • Prompting: Since services run different models, the way you prompt them will differ. Member galleries and community sections can help you understand how to prompt a given model for effective results.
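
A quick worked example on print resolution: at a typical print density of 300 DPI, a 1024×1024 output covers only about 3.4 × 3.4 inches (1024 ÷ 300 ≈ 3.41), so an 8 × 10 inch print would need roughly 2400 × 3000 pixels. Plan on upscaling for anything larger than a small photo.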

Stable Diffusion Models (And The Many Fine-Tuned Variants)

Stable Diffusion is a series of open-source models released by Stability AI. Over the past couple of years, many different versions have been made available; here’s an article going over the notable differences between them.

While you can download the original weights for the various models, thousands of fine-tuned models are freely available online. These often offer better outputs as they have been trained on specialized datasets.

However, given the open-source nature, user-created fine-tuned models are often designed for very narrow use cases, including anime, photorealism, animals, explicit content, and more.

Note: Running diffusion models locally is best performed on a computer with a GPU.

Apps & Interfaces

Many interfaces expose quite a number of adjustable settings, such as the sampler, step count, guidance scale, and seed.

While this may overwhelm beginners, spending time learning these settings will give you a broader understanding of what the models can do. The knowledge gained carries over to other models and services, making it a valuable investment in your skills.

Here are some ways to generate images using a Stable Diffusion model:

Note: You’ll need to download a diffusion model (next section) to use these interfaces.
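
If you eventually move from a point-and-click interface to scripting, the same settings show up as code parameters. Here is a minimal text-to-image sketch using Hugging Face’s Diffusers library; the checkpoint ID and prompt are only examples, and any downloaded Stable Diffusion model can be substituted:

```python
# Minimal text-to-image sketch with Hugging Face's diffusers library.
# The checkpoint ID and prompt are illustrative; substitute any
# Stable Diffusion model you have downloaded or can pull from the Hub.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # a dedicated GPU is strongly recommended

image = pipe(
    prompt="illustration, ai robot at a computer, whimsical",
    negative_prompt="blurry, low quality",  # steer away from common artifacts
    num_inference_steps=30,                 # more steps: slower, often cleaner
    guidance_scale=7.5,                     # how strictly to follow the prompt
    generator=torch.Generator("cuda").manual_seed(42),  # fixed seed = reproducible
).images[0]
image.save("robot.png")
```

Changing only the seed while holding the other settings fixed is the simplest way to explore variations of a single composition.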

Models

Stability AI, Playground, and PixArt are among the more popular companies and organizations that have released models to the open-source community; these releases have spawned thousands of fine-tuned models freely available for research and/or commercial purposes.

Here’s where to find models:

  • Civitai – Contains SFW/NSFW content, proceed with caution
  • Hugging Face – Research-related models
  • Modelverse – Research-related models
  • Tensor.Art – Contains SFW/NSFW content, proceed with caution

Software-as-a-Service (SaaS)

Run Diffusion and Think Diffusion both offer an all-in-one package of popular app interfaces and model selection. Within minutes, you can have an enterprise-grade GPU running your favorite model. These can be a good fit for creators who have planned out a project locally and then want to offload the heavy lifting to a cloud-based service.

Cloud-Based GPU Services

Cloud-based GPU services offer access to consumer and enterprise-level GPUs at a fraction of the cost of owning one. While these services are designed primarily for developers, templates exist that offer one-click deployment of Stable Diffusion apps.

These services are particularly useful for fine-tuning models, training LoRAs, and other advanced tasks that require significant processing power rather than one-off image generation.
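
As one concrete payoff: a LoRA trained on a rented GPU can later be applied locally at generation time. Below is a minimal sketch using the Diffusers library, assuming a hypothetical LoRA file; the base checkpoint is a public example:

```python
# Minimal sketch of applying a trained LoRA with diffusers.
# The LoRA directory and filename below are placeholders for weights
# you trained yourself (e.g., on a rented cloud GPU).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Attach the LoRA weights to the pipeline for this session
pipe.load_lora_weights("path/to/lora_dir", weight_name="my_style_lora.safetensors")

image = pipe("portrait in my custom style", num_inference_steps=30).images[0]
image.save("lora_output.png")
```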

API Services

These enterprise services are a common choice for businesses that want an API to integrate image generation into their websites, apps, or other services.
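
To illustrate how small the integration surface can be, here is a hedged example using OpenAI’s Images API (other providers expose similar REST endpoints; the prompt is only an example):

```python
# Hedged example: generating an image through OpenAI's Images API.
# Requires the `openai` package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY automatically

response = client.images.generate(
    model="dall-e-3",
    prompt="illustration, ai robot at a computer, whimsical",
    size="1024x1024",
    n=1,  # DALL-E 3 generates one image per request
)
print(response.data[0].url)  # temporary URL of the generated image
```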

Note for Power Users and Developers: Hugging Face has a well-documented process on using the Diffusers library for inference and training.

What Makes One Service or Model Different from Another?

Nearly all modern AI art tools rely on the same underlying technology: diffusion models.

These generative models are trained on expansive image datasets, typically starting with LAION-5B, which contains 5.85 billion CLIP-filtered image-text pairs in total, often alongside other datasets.

Once trained, the model understands concepts and how visual elements go together. Given a prompt, the model generates an image representative of that prompt.
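
For readers who want the underlying math, here is the standard denoising diffusion (DDPM) formulation, a sketch rather than any one service’s exact recipe. Training gradually corrupts each image $x_0$ with Gaussian noise over $T$ steps,

$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right),$$

and a network $\epsilon_\theta$ is trained to predict the noise that was added:

$$\mathcal{L} = \mathbb{E}_{x_0,\,\epsilon,\,t}\left[\lVert \epsilon - \epsilon_\theta(x_t, t)\rVert^2\right].$$

Generation then runs the learned denoising in reverse: start from pure noise and repeatedly denoise, conditioned on the text prompt.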

AssemblyAI put together a good overview of the process if you want to understand this at a deeper level.

Knowing this is important because what truly sets one service or model apart from others is how the model was trained and which datasets were used. Due to copyright and privacy concerns, the training data is not typically disclosed. Therefore, as a user, the most effective way to know which service or model will work best for you is to look at the quality of the outputs and whether they align with your needs.

Understanding the Limitations of AI Art Tools

While progress is accelerating, AI art tools still have limitations in what they can achieve. Even state-of-the-art tools struggle with hands, legible text, some verbs (actions), interactions between multiple characters, reflections, and more.

To mitigate these limitations, careful prompting, basic photo editing, and more advanced techniques, including LoRAs and ControlNets, can help achieve the desired output; a short ControlNet sketch follows.
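
For instance, a ControlNet lets you pin down composition (edges, poses, depth maps) so the model fills in detail within constraints you supply. Here is a minimal sketch with the Diffusers library; the model IDs are public example checkpoints and the input image path is a placeholder:

```python
# Minimal sketch: constraining composition with a Canny-edge ControlNet.
# Model IDs are public example checkpoints; the edge-map path is a placeholder.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# An edge map (white lines on black) that fixes the layout of the output
edges = load_image("hand_pose_edges.png")  # placeholder input image

image = pipe(
    "a detailed photograph of a hand holding a pen",
    image=edges,
    num_inference_steps=30,
).images[0]
image.save("controlled.png")
```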

Near-Future Developments

Notable advances in early 2024 include better prompt alignment, faster generation, character consistency, and multi-subject prompts.

The release of Stable Diffusion 3 looks to address many of these limitations, which should further advance the capabilities of many tools and services. The full research paper on Stable Diffusion 3 is available here for further reading.

Additionally, Midjourney recently released consistent character generation, which has many real-world applications, including storybooks, web comics, etc.
