Abstract
This report presents a Google Apps Script for generating visualized cooking recipes with text and images using the Gemini API, leveraging its image generation capabilities.
Introduction
The Gemini API recently added image generation through Gemini 2.0 Flash Experimental and Imagen 3. I previously published a simple sample script for generating images using the Gemini API with Google Apps Script (Ref). Google Apps Script integrates seamlessly with Google Docs, Sheets, and Slides. This report introduces a script that uses the Gemini API with Google Apps Script to create a recipe for cooking a dish, consisting of text and images.
Usage
1. Get API key
To use the scripts in this report, you need your own API key (Ref). The key is used to call the Gemini API.
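As a minimal sketch, the key can be kept out of the source code by reading it from a property store. The helper name `getApiKey_` and the property name `GEMINI_API_KEY` are assumptions made for this example and are not part of the original script. In Apps Script, the store would be `PropertiesService.getScriptProperties()`.

```javascript
/**
 * Reads the Gemini API key from a property store instead of hardcoding it.
 * `GEMINI_API_KEY` is a hypothetical property name chosen for this sketch.
 * `store` only needs a getProperty(name) method.
 */
function getApiKey_(store, name = "GEMINI_API_KEY") {
  const key = store.getProperty(name);
  if (!key) {
    throw new Error(`Property "${name}" is not set.`);
  }
  return key;
}

// Usage in Apps Script (not executed here):
// const apiKey = getApiKey_(PropertiesService.getScriptProperties());
```

The returned value can then be passed as `apiKey` to the script in section 4.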
2. Google Apps Script project
Please create a Google Apps Script project. Either a standalone script or a container-bound script can be used (Ref).
3. Install library
This script uses GeminiWithFiles, a Google Apps Script library for using the Gemini API. Please install it. You can see how to install it here.
4. Script
Please copy and paste the following scripts into the script editor of the created Google Apps Script project.
This is a sample script. You can modify the prompt text in it to suit your situation.
Class object
/**
 * Class object for creating recipe using Gemini API.
 * Author: Kanshi Tanaike
 * @class
 */
class ConvertTablesToPDFBlobs {
  /**
   * @param {Object} object Object using this script.
   * @param {String} object.apiKey API key for Gemini API.
   * @param {String} object.cook Dish name.
   * @param {Number} [object.width=150] Width of image on Google Document.
   * @param {String} [object.prompt] Prompt for generating the recipe text. A default is used when omitted.
   * @param {String} [object.question] Prompt for generating the images. A default is used when omitted.
   */
  constructor(object = {}) {
    const { apiKey, width = 150, cook, prompt, question } = object;
    this.apiKey = apiKey;
    this.width = width;
    this.cook = cook;
    this.prompt =
      prompt ||
      `Describe the recipe for creating ${cook}. Return only the step of the generated recipe. You are required to create the recipe within 5 steps. Return each step of the recipe in each element in an array. Don't format the result value as Markdown.`;
    this.question =
      question ||
      `The following text is a recipe for cooking ${cook}. Create images for each step of the recipe. Don't insert text in the created image.`;
  }
  /**
   * ### Description
   * Main method of ConvertTablesToPDFBlobs.
   *
   * @return {void}
   */
  run() {
    const generatedRecipe = this.generateRecipe_();
    // Stop when the response could not be parsed into a non-empty array of steps.
    if (!Array.isArray(generatedRecipe) || generatedRecipe.length === 0) {
      console.warn("No recipe.");
      return;
    }
    const steps = generatedRecipe.map(
      ({ text }, i) => `Step ${i + 1}. ${text}`
    );
    console.log(steps);
    // The first element is the instruction; each step follows as its own chat turn.
    const recipeData = this.generateImagesFromRecipe_([
      this.question,
      ...steps,
    ]);
    this.putRecipeToDocs_(recipeData);
  }
  /**
   * ### Description
   * Generate recipe as text.
   *
   * @return {Array} Array including the generated recipe.
   * @private
   */
  generateRecipe_() {
    console.log("Generate recipe as text.");
    const jsonSchema = {
      description: this.prompt,
      type: "array",
      items: {
        type: "object",
        properties: {
          text: { description: "Each step of the recipe.", type: "string" },
        },
        required: ["text"],
      },
    };
    const g = new GeminiWithFiles.geminiWithFiles({
      apiKey: this.apiKey,
      model: "models/gemini-2.0-flash-exp",
      tools: [{ googleSearch: {} }],
      responseMimeType: MimeType.JSON,
    });
    const res = g.generateContent({ jsonSchema });
    const resultText = res;
    let obj = [];
    try {
      obj = JSON.parse(resultText);
    } catch (e) {
      const r = /^```.*([\s\S\w]*)```/g.exec(resultText);
      if (r) {
        try {
          obj = JSON.parse(r[1].trim());
        } catch (e) {
          return [];
        }
      }
    }
    console.log("Done.");
    return obj;
  }
  /**
   * ### Description
   * Generate recipe as image.
   *
   * @param {Array} recipe Generated recipe.
   * @return {Array} 2-dimensional array including the recipe and the image blobs.
   * @private
   */
  generateImagesFromRecipe_(recipe) {
    console.log("Generate recipe as image.");
    // Chat history shared across all turns so that the images form a series.
    const history = [];
    const res = recipe.reduce((ar, q, i) => {
      const g = new GeminiWithFiles.geminiWithFiles({
        apiKey: this.apiKey,
        exportRawData: true,
        history,
        model: "models/gemini-2.0-flash-exp",
        generationConfig: { responseModalities: ["TEXT", "IMAGE"] },
      });
      const response = g.generateContent({ parts: [{ text: q }], role: "user" });
      history.push({ parts: [{ text: q }], role: "user" });
      const content = response.candidates[0].content;
      const image = content.parts.find(({ inlineData }) => inlineData);
      // Index 0 is the instruction (this.question), so only later turns yield step images.
      if (i > 0 && image) {
        console.log(`Creating image: ${q}`);
        const imageBlob = Utilities.newBlob(
          Utilities.base64Decode(image.inlineData.data),
          image.inlineData.mimeType
        );
        ar.push([q, imageBlob]);
      }
      history.push(content);
      return ar;
    }, []);
    if (res.length === 0) {
      throw new Error("Recipe images couldn't be created. Please try again.");
    }
    console.log("Done.");
    return res;
  }
  /**
   * ### Description
   * Put recipe to a Google Document.
   *
   * @param {Array} ar 2-dimensional array including the recipe and the image blobs.
   * @return {void}
   * @private
   */
  putRecipeToDocs_(ar) {
    console.log("Put the generated recipe into a Google Document.");
    const doc = DocumentApp.create(`Recipe of ${this.cook}`);
    const body = doc.getBody();
    body
      .appendParagraph(`Cooking ${this.cook}`)
      .setHeading(DocumentApp.ParagraphHeading.HEADING1);
    const table = body
      .appendTable(ar.map((e) => [e[0], ""]))
      .setColumnWidth(1, this.width);
    ar.forEach(([, blob], i) => {
      const image = table.getCell(i, 1).insertImage(0, blob);
      const width = image.getWidth();
      const height = image.getHeight();
      image.setWidth(this.width).setHeight((this.width * height) / width);
      image
        .getParent()
        .asParagraph()
        .setAlignment(DocumentApp.HorizontalAlignment.CENTER);
    });
    console.log("Done.");
  }
}
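The chat-style loop in generateImagesFromRecipe_ can be illustrated without any network access. The following is only a sketch: `runChat` and `sendTurn` are hypothetical names, and `sendTurn` stands in for the real API call. It shows how every new turn is sent together with all earlier user and model turns, which is what lets the images be generated as a series.

```javascript
/**
 * Sends each prompt as one chat turn, passing all earlier turns as history.
 * `sendTurn(history, userContent)` is a hypothetical stand-in for the API call
 * and must return the model's content object ({ role, parts }).
 */
function runChat(prompts, sendTurn) {
  const history = [];
  const replies = [];
  for (const text of prompts) {
    const userContent = { role: "user", parts: [{ text }] };
    // The model receives a copy of every previous turn plus the new one.
    const modelContent = sendTurn([...history], userContent);
    history.push(userContent, modelContent);
    replies.push(modelContent);
  }
  return { history, replies };
}

// Stub model: reports how many earlier turns it was given.
const stub = (history, user) => ({
  role: "model",
  parts: [{ text: `seen ${history.length} earlier turns: ${user.parts[0].text}` }],
});
const result = runChat(["Question", "Step 1. ...", "Step 2. ..."], stub);
console.log(result.replies.map((r) => r.parts[0].text));
```

In the class above, the first turn carries the instruction text and the later turns carry one recipe step each.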
Main method
Please set your API key and the dish name you want to cook to apiKey and cook, respectively. When the function main is run, a Google Document is created in the root folder. When you open it, you can see the visualized recipe.
function main() {
  const object = {
    apiKey: "###", // Please set your API key.
    cook: "miso soup", // Please set the dish name you want.
    width: 150, // Default value is 150. The width of each image on the document.
  };
  new ConvertTablesToPDFBlobs(object).run();
}
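Because the constructor also reads `prompt` and `question` from the given object, the prompts can be changed without editing the class. The following is a minimal sketch under that assumption. The helper `buildRecipeOptions` and its `maxSteps` parameter are hypothetical names introduced only for this example, and the wording of the prompts is just one possible variation.

```javascript
/**
 * Builds the options object for the class above with a custom step limit.
 * `buildRecipeOptions` and `maxSteps` are hypothetical names for this sketch;
 * `prompt` and `question` are the override keys read by the constructor.
 */
function buildRecipeOptions(apiKey, cook, maxSteps = 5) {
  return {
    apiKey,
    cook,
    width: 150,
    prompt: `Describe the recipe for creating ${cook}. Return only the step of the generated recipe. You are required to create the recipe within ${maxSteps} steps. Return each step of the recipe in each element in an array. Don't format the result value as Markdown.`,
    question: `The following text is a recipe for cooking ${cook}. Create images for each step of the recipe in a consistent illustration style. Don't insert text in the created image.`,
  };
}

// Usage in Apps Script (not executed here):
// new ConvertTablesToPDFBlobs(buildRecipeOptions("###", "sukiyaki", 3)).run();
```

Keeping the prompt wording close to the defaults makes it more likely that the response still matches the array format the script expects.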
5. Testing
When the value of cook is miso soup, the following result is obtained in the created document in the root folder.
When the value of cook is sukiyaki, the following result is obtained in the created document in the root folder.
Summary
This report introduced a script for generating cooking recipes that combine text and images. The same approach could be applied to other procedural content, such as experimental protocols and operational steps, so its potential applications are broad.
Note
- At the current stage, when grounding with Google Search is used with the Gemini API, responseMimeType for outputting a JSON object cannot be used. Therefore, I used the JSON schema in the prompt. This might be resolved in a future update.
- As a technique to obtain images for each step of the recipe, I used a chat with Gemini. This generated the images as a series.
- Although I have tested this script, if the visualized recipe cannot be created, please run the script again or consider modifying the prompt.
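When the JSON schema is given in the prompt, the model may still wrap the JSON in a Markdown code fence, which is why the script falls back to extracting the fenced body before parsing. The following standalone sketch shows that fallback. The helper name `parseJsonArray` is hypothetical, and the regular expression is a simplified variant rather than the one used in the script.

```javascript
/**
 * Parses a model response that should be a JSON array but may be wrapped
 * in a Markdown code fence. Returns [] when nothing parsable is found.
 * `parseJsonArray` is a hypothetical helper name for this sketch.
 */
function parseJsonArray(text) {
  // Build the fence marker at runtime to keep this example easy to embed.
  const fence = "`".repeat(3);
  const candidates = [text];
  // Capture the body of the first fenced block, skipping an optional language tag.
  const fenced = new RegExp(fence + "[^\\n]*\\n([\\s\\S]*?)" + fence).exec(text);
  if (fenced) candidates.push(fenced[1]);
  for (const candidate of candidates) {
    try {
      const value = JSON.parse(candidate.trim());
      if (Array.isArray(value)) return value;
    } catch (e) {
      // Not valid JSON; try the next candidate.
    }
  }
  return [];
}

const marker = "`".repeat(3);
console.log(parseJsonArray('[{"text":"Boil water."}]'));
console.log(parseJsonArray(marker + 'json\n[{"text":"Add miso."}]\n' + marker));
console.log(parseJsonArray("Sorry, no recipe."));
```

Returning an empty array for unparsable text matches how the script reports "No recipe." and stops, at which point running it again or adjusting the prompt is the next step.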