Generating Texts using Files Uploaded by Gemini 1.5 API


Abstract

The Gemini API can generate text from uploaded files, and this can be done with Google Apps Script. It expands the potential of various scripting languages for diverse applications.

Introduction

With the release of the LLM Gemini as an API on Vertex AI and Google AI Studio, a world of possibilities has opened up (Ref). The Gemini API significantly expands the potential of various scripting languages and paves the way for diverse applications. Recently, Gemini 1.5 was also released in AI Studio (Ref), and the Gemini 1.5 API is expected to be released in the near future.

Recently, it became possible to upload files with the Gemini API (Ref1 and Ref2). With this, text can be generated using the uploaded files. This report introduces sample scripts for uploading files and generating text using Google Apps Script.

Usage

To test these scripts, please follow the steps below.

1. Create an API key

Please access https://makersuite.google.com/app/apikey and create your API key. At that time, please enable the Generative Language API in the API console. This API key is used in the sample scripts.

The official document is also available (Ref).

2. Create a Google Apps Script project

In this report, Google Apps Script is used. Of course, the method introduced in this report can also be used in other languages.

Please create a standalone Google Apps Script project. Of course, this script can also be used in a container-bound script.

Then, please open the script editor of the Google Apps Script project.

3. Steps of script

Here, the following 4 sample scripts are introduced.

  1. Upload a file with “Method: media.upload”.
  2. Confirm the uploaded file with “Method: files.list”.
  3. Generate content using the uploaded file with “Method: models.generateContent”.
  4. Delete the uploaded file with “Method: files.delete”.

Limitation of the uploaded file Ref

  - Can only be used with model.generateContent or model.streamGenerateContent
  - Automatic file deletion after 2 days
  - Maximum 2GB per file, 20GB limit per project
  - No downloads allowed

4. Script

Upload a file

You can see the official document at Method: media.upload. The sample script of Google Apps Script is as follows.

Please set your API key and the file ID of the image file. Here, PNG image is used.

function sample1() {
  const apiKey = "###"; // Please set your API key.
  const fileId = "###"; // Please set the file ID of the image file. Here, PNG image is used.

  const url = `https://generativelanguage.googleapis.com/upload/v1beta/files?uploadType=multipart&key=${apiKey}`;
  const metadata = {
    file: { displayName: DriveApp.getFileById(fileId).getName() },
  };
  const payload = {
    metadata: Utilities.newBlob(JSON.stringify(metadata), "application/json"),
    file: UrlFetchApp.fetch(
      `https://drive.google.com/thumbnail?sz=w1000&id=${fileId}`,
      { headers: { authorization: "Bearer " + ScriptApp.getOAuthToken() } }
    ).getBlob(),
  };
  const options = {
    method: "post",
    payload: payload,
    muteHttpExceptions: true,
  };
  const res = UrlFetchApp.fetch(url, options).getContentText();
  console.log(res);
}

When this script is run, the following value is returned.

{
  "file": {
    "name": "files/###",
    "displayName": "###",
    "mimeType": "image/jpeg",
    "sizeBytes": "123456",
    "createTime": "2024-03-30T01:23:00.000000Z",
    "updateTime": "2024-03-30T01:23:00.000000Z",
    "expirationTime": "2024-04-01T01:23:00.000000Z",
    "sha256Hash": "###",
    "uri": "https://generativelanguage.googleapis.com/v1beta/files/###"
  }
}

The values of mimeType and uri are used with generateContent.
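As a minimal sketch of how these two values flow into the next step, the following helper takes the JSON text returned by the upload and builds a request body for generateContent. The function name buildPayloadFromUpload is hypothetical and not part of the Gemini API. The body layout follows the payload used in the generateContent sample in this report.

```javascript
/**
 * Builds a generateContent request body from the JSON text returned by the upload.
 * NOTE: buildPayloadFromUpload is a hypothetical helper name used for illustration.
 */
function buildPayloadFromUpload(uploadResponseText, prompt) {
  const file = (JSON.parse(uploadResponseText) || {}).file;
  // An error response has no "file" property, so stop here instead of sending a broken request.
  if (!file || !file.uri || !file.mimeType) {
    throw new Error("Unexpected upload response: " + uploadResponseText);
  }
  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { fileData: { fileUri: file.uri, mimeType: file.mimeType } },
        ],
      },
    ],
  };
}
```

In Apps Script, the first argument would be the string logged as res in sample1, and the returned object would be passed through JSON.stringify before being sent.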

In this sample, I used uploadType=multipart because the image file is small. If you want to upload a large file, I think that a resumable upload can also be used.

Get the file list

You can see the official document at Method: files.list. The sample script of Google Apps Script is as follows.

Please set your API key.

function sample2() {
  const apiKey = "###"; // Please set your API key.

  const url = `https://generativelanguage.googleapis.com/v1beta/files?pageSize=100&key=${apiKey}`;
  const res = UrlFetchApp.fetch(url);
  console.log(res.getContentText());
}

When this script is run, the following value is returned.

{
  "files": [
    {
      "name": "files/###",
      "displayName": "###",
      "mimeType": "image/jpeg",
      "sizeBytes": "123456",
      "createTime": "2024-03-30T01:23:00.000000Z",
      "updateTime": "2024-03-30T01:23:00.000000Z",
      "expirationTime": "2024-04-01T01:23:00.000000Z",
      "sha256Hash": "###",
      "uri": "https://generativelanguage.googleapis.com/v1beta/files/###"
    },
    ,
    ,
    ,
  ]
}

When there are more than 100 files, please retrieve all of them using pageToken.
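The paging can be sketched as follows, assuming that the response carries the token for the next page in a nextPageToken property and that the request accepts it as the pageToken query parameter. The names listAllFiles and fetchJson are hypothetical. fetchJson is injected so that the loop itself does not depend on UrlFetchApp.

```javascript
/**
 * Collects all uploaded files by following nextPageToken.
 * NOTE: listAllFiles and fetchJson are hypothetical names used for illustration.
 * fetchJson takes a URL and returns the parsed JSON object of the response.
 */
function listAllFiles(apiKey, fetchJson) {
  const baseUrl = "https://generativelanguage.googleapis.com/v1beta/files";
  const files = [];
  let pageToken = "";
  do {
    let url = `${baseUrl}?pageSize=100&key=${encodeURIComponent(apiKey)}`;
    if (pageToken) {
      url += `&pageToken=${encodeURIComponent(pageToken)}`;
    }
    const obj = fetchJson(url);
    files.push(...(obj.files || [])); // "files" can be missing when nothing is uploaded.
    pageToken = obj.nextPageToken || ""; // An empty token ends the loop.
  } while (pageToken);
  return files;
}

// In Apps Script, the injected function could be written like this:
// const fetchJson = (url) => JSON.parse(UrlFetchApp.fetch(url).getContentText());
// const files = listAllFiles("###", fetchJson);
```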

Generate content from the uploaded file

You can see the official document at Method: models.generateContent. The sample script of Google Apps Script is as follows.

Please set your API key, the URI of the uploaded file, and the mimeType of the file.

function sample3() {
  const apiKey = "###"; // Please set your API key.
  const fileUri = "https://generativelanguage.googleapis.com/v1beta/files/###"; // Please set your file uri of the uploaded file.
  const mimeType = "image/jpeg"; // Please set the mimeType of the uploaded file.

  const q = "Describe the image and count apples in the image.";
  const model = "models/gemini-1.5-pro-gf-fc";
  const baseUrl = `https://generativelanguage.googleapis.com/v1beta/${model}`;
  const payload = {
    contents: [{ parts: [{ text: q }, { fileData: { fileUri, mimeType } }] }],
  };
  const options = {
    method: "post", // Stated explicitly to match sample1.
    payload: JSON.stringify(payload),
    contentType: "application/json",
    muteHttpExceptions: true,
  };
  const res = UrlFetchApp.fetch(
    `${baseUrl}:generateContent?key=${apiKey}`,
    options
  );
  console.log(res.getContentText());
}

In this sample, an image created by Gemini was uploaded and used as the sample file.

When this script is run, the following generated content is returned.

There are 12 apples including 7 red apples, 4 green apples, and 1 yellow apple are shown in the image.
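The script above logs the raw JSON text of the response. As a minimal sketch for taking only the generated text out of it, the following helper assumes the response stores the text under candidates[0].content.parts[].text and reports errors in an error object. The function name extractGeneratedText is hypothetical.

```javascript
/**
 * Returns the generated text from the JSON text of a generateContent response.
 * NOTE: extractGeneratedText is a hypothetical helper name used for illustration.
 */
function extractGeneratedText(responseText) {
  const obj = JSON.parse(responseText);
  if (obj.error) {
    throw new Error("Gemini API error: " + obj.error.message);
  }
  const candidate = (obj.candidates || [])[0];
  const parts = (candidate && candidate.content && candidate.content.parts) || [];
  // Parts without a text property are skipped by mapping them to an empty string.
  return parts.map((p) => p.text || "").join("");
}
```

In sample3, this could be called as extractGeneratedText(res.getContentText()). Because muteHttpExceptions is true there, an error response also arrives as JSON text, which is why the error check comes first.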

Delete the uploaded file

You can see the official document at Method: files.delete. The sample script of Google Apps Script is as follows.

Please set your API key and the name of the uploaded file.

function sample4() {
  const apiKey = "###"; // Please set your API key.
  const name = "files/###"; // Please set the name of the uploaded file.

  const url = `https://generativelanguage.googleapis.com/v1beta/${name}?key=${apiKey}`;
  const res = UrlFetchApp.fetch(url, { method: "delete" });
  console.log(res.getContentText()); // {}
}

In this case, an empty object like {} is returned.

At the current stage, the expiration time of an uploaded file is 2 days, so the uploaded file is automatically deleted after 2 days.
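Since each file object returned by the API includes expirationTime, the remaining lifetime of a file can be checked before it is used. The following is a minimal sketch. The function name hoursUntilExpiration is hypothetical, and trimming the fractional seconds to 3 digits is only a precaution for date parsers that do not accept microseconds.

```javascript
/**
 * Returns the number of hours until the uploaded file expires (negative when already expired).
 * NOTE: hoursUntilExpiration is a hypothetical helper name used for illustration.
 */
function hoursUntilExpiration(file, now) {
  // "2024-04-01T01:23:00.000000Z" -> "2024-04-01T01:23:00.000Z"
  const iso = String(file.expirationTime).replace(/(\.\d{3})\d+/, "$1");
  const expires = new Date(iso).getTime();
  if (Number.isNaN(expires)) {
    throw new Error("Invalid expirationTime: " + file.expirationTime);
  }
  return (expires - now.getTime()) / (60 * 60 * 1000);
}
```

For example, hoursUntilExpiration(file, new Date()) could be applied to each item of the files array returned by sample2, and files close to expiration could be uploaded again.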

Summary

In this report, we presented sample scripts for using the Gemini API’s generateContent method with uploaded files.
