Abstract
The Gemini API now generates images through Gemini 2.0 Flash Experimental and Imagen 3. This report introduces how to evolve generated images within a conversation using the Gemini API with Google Apps Script.
Introduction
Recently, image generation was added to the Gemini API through Gemini 2.0 Flash Experimental and Imagen 3. I have already reported a simple sample script for generating images using the Gemini API with Google Apps Script. Ref In practice, you might want to evolve the generated images within a conversation. In this report, I introduce a sample script demonstrating this using Google Apps Script.
Flow
The flow of this approach is as follows.
The prompts are given to Gemini in order within a single chat session. The answer to the first prompt becomes part of the conversation history for the second prompt, and the answer to the second prompt is created by incorporating the answer to the first. In this way, the generated images evolve step by step.
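To illustrate the idea, here is a minimal conceptual sketch. The helper askGemini_ is hypothetical and only stands for a call that sends the accumulated history to the Gemini API; the actual script used in this report is shown in section "4. Script" below.

// Conceptual sketch only: each answer is appended to the history so that
// the next prompt is answered in the context of all previous turns.
// askGemini_ is a hypothetical helper that sends "history" to the Gemini API.
function conceptualFlow_(prompts) {
  const history = [];
  prompts.forEach((prompt) => {
    history.push({ role: "user", parts: [{ text: prompt }] });
    const answerParts = askGemini_(history); // hypothetical API call
    history.push({ role: "model", parts: answerParts });
  });
  return history;
}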
Usage
1. Get API key
In order to use the scripts in this report, please obtain your API key. Ref This API key is used to access the Gemini API.
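As an aside, instead of hard-coding the API key in the script, you can store it in the Script Properties of the Google Apps Script project and read it with PropertiesService. This is just a suggested pattern; the property name GEMINI_API_KEY is an arbitrary example.

// Example: read the API key from the Script Properties instead of hard-coding it.
// The property "GEMINI_API_KEY" must be set in advance under
// "Project Settings" -> "Script Properties". The property name is arbitrary.
function getApiKey_() {
  const apiKey = PropertiesService.getScriptProperties().getProperty("GEMINI_API_KEY");
  if (!apiKey) {
    throw new Error("Please set GEMINI_API_KEY in the Script Properties.");
  }
  return apiKey;
}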
2. Google Apps Script project
Please create a Google Apps Script project. You can use either the standalone script type or the container-bound script type. Ref
3. Install library
This script uses GeminiWithFiles, a Google Apps Script library for using the Gemini API with Google Apps Script. Please install it. You can see how to install it here.
4. Script
Please copy and paste the following scripts to the script editor of the created Google Apps Script project.
This is a sample script. Of course, you can modify the prompt texts in the following script to suit your situation.
/**
 * ### Description
 * Chat with the Gemini API using prompts that return text and images.
 *
 * @param {String} apiKey API key for Gemini API.
 * @param {Array} prompts Prompts.
 * @return {void}
 */
function chat_(apiKey, prompts) {
  const g = new GeminiWithFiles.geminiWithFiles({
    apiKey,
    model: "models/gemini-2.0-flash-exp",
    generationConfig: { responseModalities: ["TEXT", "IMAGE"] },
  });
  prompts.forEach((q, i) => {
    // Each call to chat() reuses the conversation history of the same instance.
    const res = g.chat({ parts: [{ text: q }], role: "user" });
    const imageObj = res.candidates[0].content.parts.find((e) => e.inlineData);
    if (imageObj) {
      console.log("Image was created.");
      // Convert the returned base64 data to a blob and save it to Google Drive.
      const imageBlob = Utilities.newBlob(
        Utilities.base64Decode(imageObj.inlineData.data),
        imageObj.inlineData.mimeType
      );
      DriveApp.createFile(imageBlob.setName(`${i + 1}_${q}`));
    } else {
      console.log("Image was not created.");
      console.log(res.candidates[0].content.parts[0].text);
    }
  });
}
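In this script, the generated images are saved to the root folder of Google Drive by DriveApp.createFile. If you want to collect them in a specific folder instead, you could replace that line with something like the following sketch. The folder ID is a placeholder that you would need to set.

// Sketch: save the generated image into a specific folder instead of the root folder.
// "###folderId###" is a placeholder for your own folder ID.
const folder = DriveApp.getFolderById("###folderId###");
folder.createFile(imageBlob.setName(`${i + 1}_${q}`));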
5. Sample 1
In order to test the 1st sample, please copy and paste the following script to the script editor and set your API key.
function sample1() {
  const apiKey = "###"; // Please set your API key.
  const prompts = [
    "What is Google Apps Script? This is my 1st question.",
    "What is Google documents? This is my 2nd question.",
    "What is Google spreadsheets? This is my 3rd question.",
    "What is Google slides? This is my 4th question.",
    "What is my 2nd question?"
  ];
  chat_(apiKey, prompts);
}
When this script is run, the answer to "What is my 2nd question?" is Your second question was: "What is Google documents?". This result demonstrates that the conversation history is correctly carried across the prompts in the chat.
6. Sample 2
In order to test the 2nd sample, please copy and paste the following script to the script editor and set your API key.
function sample2() {
  const apiKey = "###"; // Please set your API key.
  const prompts = [
    "Create an image of a clean whiteboard.",
    "Add an illustration of an apple drawn with a whiteboard marker to the upper left of the whiteboard. Don't let it stick out from the whiteboard.",
    "Add an illustration of an orange drawn with a whiteboard marker to the upper right of the whiteboard. Don't let it stick out from the whiteboard.",
    "Add an illustration of a banana drawn with a whiteboard marker to the bottom left of the whiteboard. Don't let it stick out from the whiteboard.",
    "Add an illustration of a kiwi drawn with a whiteboard marker to the bottom right of the whiteboard. Don't let it stick out from the whiteboard.",
  ];
  chat_(apiKey, prompts);
}
When this script is run, the following images are generated in the root folder of Google Drive. From these images, you can see that each image is correlated with the previous ones and evolves according to the prompts. This is because the chat keeps the conversation history with the Gemini API.
On the other hand, when the images are generated from the same prompts without the chat, the generated images are as follows. You can see that the images are not correlated with each other; each one reflects only its own prompt.
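For reference, the following is a minimal sketch of how the "without the chat" comparison could be run, assuming the same GeminiWithFiles library. A new client instance is created for every prompt, so no conversation history is kept between turns. This sketch is only for illustration and is not part of the samples above.

// Sketch of the "without chat" comparison: because a new instance is created
// for every prompt, each request is answered without any conversation history.
function generateWithoutChat_(apiKey, prompts) {
  prompts.forEach((q, i) => {
    const g = new GeminiWithFiles.geminiWithFiles({
      apiKey,
      model: "models/gemini-2.0-flash-exp",
      generationConfig: { responseModalities: ["TEXT", "IMAGE"] },
    });
    const res = g.chat({ parts: [{ text: q }], role: "user" }); // single, independent turn
    const imageObj = res.candidates[0].content.parts.find((e) => e.inlineData);
    if (imageObj) {
      const blob = Utilities.newBlob(
        Utilities.base64Decode(imageObj.inlineData.data),
        imageObj.inlineData.mimeType
      );
      DriveApp.createFile(blob.setName(`no_chat_${i + 1}_${q}`));
    }
  });
}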
Summary
This report shows that using the chat, which carries the conversation history, is the better way to generate images that evolve with the conversation.
Applications
As an application of the results from this report, I presented "Create Visualized Recipe Instructions with Gemini using Google Apps Script". I believe this approach can be applied to various other use cases as well.
Note
- Currently, handling images within the chat is necessary to achieve this. However, as of March 18, 2025, the Google Gen AI SDK for Python does not support images in chat. Ref This will be resolved in a future update. For this reason, I used Google Apps Script.
- The top image was generated by Gemini from the text of the introduction.