Spring AI ImageModel Example: Generate Image from Text

This Spring tutorial discusses the basics of Spring AI APIs for image generation using OpenAI’s DALL-E and Stability AI with examples.


Generative AI models such as DALL-E, Stable Diffusion (used by Stability AI), Midjourney, Imagen (by Google), GauGAN (by Nvidia), and Pixray can generate images from a supplied text prompt. Spring AI has built-in support for text-to-image generation using the following providers:

  • DALL-E (by OpenAI)
  • DALL-E (by Azure OpenAI)
  • Stable Diffusion (by Stability)
  • CogView (by ZhiPuAI)
  • Stable Diffusion XL (by QianFan)

This tutorial discusses the basics of Spring AI image generation and demonstrates its usage with simple-to-follow examples.

1. Maven

We start by adding the required dependencies to the project. To add support for the OpenAI APIs, we add the spring-ai-openai-spring-boot-starter dependency. With this dependency on the classpath, an OpenAiImageModel is automatically configured as the ImageModel bean. To disable the autoconfiguration, set the property ‘spring.ai.openai.image.enabled‘ to ‘false‘.

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

To add support for the Stability APIs, we add the spring-ai-stability-ai-spring-boot-starter dependency. It autoconfigures the ImageModel bean with an instance of StabilityAiImageModel. To disable the autoconfiguration, set the property ‘spring.ai.stabilityai.image.enabled‘ to ‘false‘.

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-stability-ai-spring-boot-starter</artifactId>
</dependency>

2. Image Generation API

The following are the core classes and interfaces that Spring AI uses for text-to-image generation:

  • ImageModel: A functional interface with a single method, ‘call(imagePrompt)’, that returns an ImageResponse.
  • ImageMessage: Encapsulates the text to use and the weight that the text should have in influencing the generated image.
  • ImagePrompt: Encapsulates a list of ImageMessage objects and optional request options.
  • ImageOptions: Encapsulates the optional request options to be passed to the image generation model.
  • ImageResponse: Holds the AI model’s output, containing one or more ImageGeneration objects resulting from a single prompt.
  • ImageGeneration: Represents the output response and related metadata for a single result.
  • ImageGenerationMetadata: Represents the metadata associated with a single ImageGeneration.
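Because ImageModel is a functional interface, a lambda can stand in for a real provider, which is convenient in unit tests. The sketch below illustrates that idea with simplified, hypothetical stand-in types that only mirror the shapes described above. They are not the Spring AI classes, and the names StubImageModelDemo, stub, and firstUrl are assumptions made for illustration.

```java
import java.util.List;

// Simplified stand-ins that mirror the shapes described above.
// These are NOT the Spring AI classes; the names are reused only for illustration.
record Image(String url, String b64Json) {}
record ImageGeneration(Image output) {}
record ImagePrompt(String text) {}
record ImageResponse(List<ImageGeneration> results) {
    // Convenience accessor for the first (often the only) generation
    ImageGeneration result() { return results.get(0); }
}

@FunctionalInterface
interface ImageModel {
    ImageResponse call(ImagePrompt prompt);
}

final class StubImageModelDemo {

    // A lambda satisfies the single-method interface; no provider is contacted
    static ImageModel stub(String fixedUrl) {
        return prompt -> new ImageResponse(
            List.of(new ImageGeneration(new Image(fixedUrl, null))));
    }

    // Walks response -> generation -> image -> URL, like the real call chain
    static String firstUrl(ImageModel model, String text) {
        return model.call(new ImagePrompt(text)).result().output().url();
    }

    public static void main(String[] args) {
        ImageModel model = stub("https://example.test/cat.png");
        System.out.println(firstUrl(model, "A cat chasing a mouse"));
    }
}
```

With the real library, the same pattern should apply: code that depends only on the ImageModel interface can be tested without an API key.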

The general syntax for using an ImageModel to generate an image from a text prompt is:

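// OpenAiImageOptions carries OpenAI-specific options such as quality;
// the portable ImageOptionsBuilder covers only the common options
ImageResponse response = imageModel.call(
  new ImagePrompt("A cat chasing a mouse",
	  OpenAiImageOptions.builder()
	    .withQuality("hd")
	    .withN(1)
	    .withHeight(1024)
	    .withWidth(1024)
	    .build())
);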

3. Creating ImageModel

Among the classes that implement the ImageModel interface, this tutorial uses two:

  • OpenAiImageModel: For generating images using OpenAI’s DALL-E.
  • StabilityAiImageModel: For generating images using Stability.

We can define the ‘ImageModel‘ bean in an application configuration class, creating an instance of either OpenAiImageModel or StabilityAiImageModel based on the project’s needs.

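import org.springframework.ai.image.ImageModel;
import org.springframework.ai.openai.OpenAiImageModel;
import org.springframework.ai.openai.api.OpenAiImageApi;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AppConfiguration {

	// Define only one ImageModel bean, as per requirements

	// For OpenAI
	@Bean
	ImageModel imageModel(@Value("${spring.ai.openai.api-key}") String apiKey) {
	  return new OpenAiImageModel(new OpenAiImageApi(apiKey));
	}

	// For Stability (also import StabilityAiImageModel and StabilityAiApi)
	/*@Bean
	ImageModel imageModel(@Value("${spring.ai.stabilityai.api-key}") String apiKey) {
	  return new StabilityAiImageModel(new StabilityAiApi(apiKey));
	}*/
}

The API key is read from the application.properties file, which in turn takes its value from an environment variable. This keeps the key out of the source code and improves the application’s security.

spring.ai.openai.api-key=${OPENAI_API_KEY}
# OR
spring.ai.stabilityai.api-key=${STABILITY_API_KEY}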
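To make the placeholder mechanism concrete, here is a minimal plain-Java sketch of how a ‘${NAME}‘ or ‘${NAME:default}‘ placeholder can be resolved against a map of environment variables. It only imitates the behavior and is not Spring’s actual implementation; the class name PlaceholderResolver and its resolve method are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that imitates ${NAME} / ${NAME:default} resolution.
// It is an illustration only, not Spring's property resolver.
final class PlaceholderResolver {

    // Group 1 is the variable name, optional group 2 is the default value
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");

    static String resolve(String value, Map<String, String> env) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String fallback = matcher.group(2); // null when no default is given
            String resolved = env.getOrDefault(name, fallback);
            if (resolved == null) {
                // Fail fast instead of starting with an empty API key
                throw new IllegalStateException("Missing environment variable: " + name);
            }
            matcher.appendReplacement(out, Matcher.quoteReplacement(resolved));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Reads the real environment; throws if OPENAI_API_KEY is not set
        System.out.println(resolve("${OPENAI_API_KEY}", System.getenv()).length());
    }
}
```

Failing fast on a missing variable is a design choice in this sketch: a startup error is usually easier to diagnose than an authentication failure on the first request.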

4. Image Generation Controller

Next, we write a REST controller that accepts the user’s prompt and responds with the URL of the generated image. The API consumer can then use the image URL to download the image in their application or browser.

In the following example, we generate an image using OpenAI’s image generation API. With the DALL-E 3 model, OpenAI generates a single image per request.

import org.springframework.ai.image.*;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OpenAiImageController {

  private final ImageModel imageModel;

  public OpenAiImageController(ImageModel imageModel) {
    this.imageModel = imageModel;
  }

  @GetMapping("/image-gen")
  public String imageGen(@RequestParam String message) {

    ImageOptions options = ImageOptionsBuilder.builder()
        .withModel("dall-e-3")
        .withHeight(1024)
        .withWidth(1024)
        .build();

    ImagePrompt imagePrompt = new ImagePrompt(message, options);
    ImageResponse response = imageModel.call(imagePrompt);
    String imageUrl = response.getResult().getOutput().getUrl();

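    // A @RestController writes the return value to the response body,
    // so return the plain URL (a "redirect:" prefix would be sent as literal text)
    return imageUrl;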
  }
}

The same code can be used for the Stability API as well. We only need to configure StabilityAiImageModel in place of OpenAiImageModel, as discussed in the previous section, and remove or replace the OpenAI-specific model name ‘dall-e-3‘ in the options.
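One caveat worth checking against the Spring AI version in use: Stability’s API typically returns the image as Base64-encoded data rather than a hosted URL, in which case getOutput().getB64Json() would hold the payload and getUrl() may be empty. Assuming that is the case, the following sketch shows how such a payload could be decoded and written to a file using only the JDK. The class name Base64ImageWriter and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper for providers that return Base64 data instead of a URL
final class Base64ImageWriter {

    // Turns the Base64 text into raw image bytes
    static byte[] decode(String b64Json) {
        return Base64.getDecoder().decode(b64Json);
    }

    // Writes the decoded bytes to the target path and returns that path
    static Path write(String b64Json, Path target) throws IOException {
        return Files.write(target, decode(b64Json));
    }

    public static void main(String[] args) throws IOException {
        // "AQID" is a tiny placeholder payload (bytes 1, 2, 3), not a real image
        Path file = write("AQID", Files.createTempFile("generated-", ".png"));
        System.out.println("Wrote " + Files.size(file) + " bytes to " + file);
    }
}
```

In a controller, the decoded bytes could be returned directly with an image content type instead of being written to disk.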

5. Base URL Property

By default, the endpoint URL for image generation comes from the default values of the properties spring.ai.openai.image.base-url and spring.ai.stabilityai.image.base-url. To change the endpoint URL, we can override these properties:

spring.ai.openai.image.base-url=https://api.openai.com

# OR

spring.ai.stabilityai.image.base-url=https://api.stability.ai/v1

Several other image generation properties govern the interaction between our application and the OpenAI and Stability APIs. You can read more about them in the official docs.

6. Demo

Let us test the image generation API by sending an input prompt and verifying the generated image’s content. The prompt we are sending is: A cat chasing a mouse.
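The request can be sent from a browser, from curl, or from code. As a sketch, the following JDK-only client builds the request URI with the prompt URL-encoded and reads the response body. It assumes the application runs locally on port 8080 (the Spring Boot default) and that the endpoint returns the image URL as plain text; the class name ImageGenClient and its methods are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the /image-gen endpoint defined in the controller
final class ImageGenClient {

    // Encodes the prompt so that spaces and special characters survive the query string
    static URI imageGenUri(String baseUrl, String prompt) {
        String encoded = URLEncoder.encode(prompt, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/image-gen?message=" + encoded);
    }

    // Sends the GET request and returns the response body (expected: the image URL)
    static String fetchImageUrl(HttpClient client, URI uri)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status: " + response.statusCode());
        }
        return response.body().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumes the application is running on localhost:8080
        URI uri = imageGenUri("http://localhost:8080", "A cat chasing a mouse");
        System.out.println(fetchImageUrl(HttpClient.newHttpClient(), uri));
    }
}
```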

As expected, we get the URL of the generated image. Paste the URL into a browser window to verify the image.

Great, we have successfully generated an image from the input prompt using Spring AI’s OpenAI image generation API.

7. Conclusion

This short Spring AI text-to-image example discussed the basics of the image generation API for OpenAI and Stability AI. In the demo, we created an image from an input prompt and verified the generated image.

Happy Learning !!

