Guide to Function Calling with Gemini and Google Apps Script


Abstract

The release of the Gemini API on Vertex AI and Google AI Studio opens the door to diverse applications, and the recent announcement of Gemini 1.5 extends its capabilities further. This report demonstrates function calling with the Gemini API from a simple Google Apps Script, for both data retrieval and content generation.

Introduction

The recent release of the LLM Gemini as an API on Vertex AI and Google AI Studio unlocks a world of possibilities. (Ref) Gemini 1.5 was also just announced, further expanding its capabilities. (Ref) I believe Gemini significantly expands the potential in various situations and paves the way for diverse applications. Notably, through function calling, the Gemini API can retrieve new data and use it to generate content. In this report, I introduce the basic flow of function calling with the Gemini API using a simple Google Apps Script.

Workflow of function calling

The official document is here.

The above figure illustrates a simple workflow for function calls to the Gemini API, implemented using Google Apps Script.

Steps:

  1. User poses a question.
  2. Google Apps Script sends a request to the Gemini API.
  3. The Gemini API selects a suitable function from those declared by the Google Apps Script project and returns its name and arguments.
  4. Google Apps Script executes the chosen function. The function also has access to Google Docs files (Documents, Spreadsheets, Slides, etc.) and any associated corpora.
  5. The function running in Google Apps Script returns a response value.
  6. Google Apps Script transmits the response value back to the Gemini API.
  7. The Gemini API returns its own response.
  8. The combined response reaches the user.

The Gemini API chooses which function should be called and integrates the function's output into its own response, offering enhanced flexibility.
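The steps above can be sketched without any network access. The following is a minimal sketch in which a hypothetical `fakeGemini_` function stands in for the Gemini API. Its name, its canned replies, and the `getGreeting` function are assumptions for illustration only; the real API is called with UrlFetchApp, as in the sample scripts later in this report.

```javascript
// Hypothetical local function that the model is allowed to choose (steps 4 and 5).
const localFunctions = {
  getGreeting: (args) => `Hello, ${args.name}!`,
};

// Stand-in for the Gemini API (an assumption for illustration, not the real API).
// If the last turn holds no function result, it asks for a function call (step 3).
// Otherwise, it returns a final text built from the function result (step 7).
function fakeGemini_(contents) {
  const last = contents[contents.length - 1];
  const fr = last.parts[0].functionResponse;
  if (fr) {
    return { parts: [{ text: `The function said: ${fr.response.content}` }] };
  }
  return {
    parts: [{ functionCall: { name: "getGreeting", args: { name: "Tanaike" } } }],
  };
}

function runWorkflow_(question) {
  // Steps 1 and 2: the user's question is sent as the first turn.
  const contents = [{ role: "user", parts: [{ text: question }] }];
  // The upper bound of 5 rounds avoids an endless loop.
  for (let i = 0; i < 5; i++) {
    const reply = fakeGemini_(contents);
    const call = reply.parts.find((p) => p.functionCall);
    if (!call) {
      // Steps 7 and 8: no function call remains, so the text is the final answer.
      return reply.parts[0].text;
    }
    // Steps 4 to 6: run the chosen function and send its value back.
    const { name, args } = call.functionCall;
    const value = localFunctions[name](args);
    contents.push({ role: "model", parts: reply.parts });
    contents.push({
      role: "function",
      parts: [{ functionResponse: { name, response: { name, content: value } } }],
    });
  }
  return null;
}
```

Calling `runWorkflow_("Say hello.")` walks through one function call round and then returns the final text produced by the stand-in model.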

In this report, Google Apps Script is used. However, this workflow can also be used with other languages.

Usage

To test this script, please follow the steps below.

1. Create an API key

Please access https://makersuite.google.com/app/apikey and create your API key. When doing so, please enable the Generative Language API in the API console. This API key is used in the sample scripts below.

The official document is also available. (Ref)
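Rather than pasting the key into the source, one option is to keep it in the script properties. The following is a minimal sketch, assuming the key has been saved under a property named `GEMINI_API_KEY`. That name and the `getApiKey_` helper are hypothetical and are not used by the sample scripts in this report.

```javascript
// Reads the API key from the script properties of the Google Apps Script project.
// "GEMINI_API_KEY" is a hypothetical property name; any name can be used
// as long as the key was saved under the same name beforehand.
function getApiKey_() {
  const key = PropertiesService.getScriptProperties().getProperty("GEMINI_API_KEY");
  if (!key) {
    throw new Error("API key was not found in the script properties.");
  }
  return key;
}
```

With this helper, `const apiKey = getApiKey_();` could replace the hard-coded `"###"` value in the sample scripts.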

2. Create a Google Apps Script project

Please create a standalone Google Apps Script project. This script can also be used in a container-bound script.

Then, open the script editor of the Google Apps Script project.

3. Sample script 1

This is a sample script that does not use function calling. Please set your API key.

function sample1() {
  const q = "What is Tanaike? Return answer within 50 words.";
  const apiKey = "###"; // Please set your API key.

  const url = `https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=${apiKey}`;
  const payload = { contents: [{ parts: [{ text: q }] }] };
  const res = UrlFetchApp.fetch(url, {
    payload: JSON.stringify(payload),
    contentType: "application/json",
  });
  const obj = JSON.parse(res.getContentText());
  if (
    obj.candidates &&
    obj.candidates.length > 0 &&
    obj.candidates[0].content.parts.length > 0
  ) {
    console.log(obj.candidates[0].content.parts[0].text);
  } else {
    console.warn("No response.");
  }
}

Here, the sample question is "What is Tanaike? Return answer within 50 words." Tanaike is me; you can read about me on my site. When this script is run, the following result is obtained.

Tanaike is a traditional Japanese rainwater harvesting pond used for irrigation and flood control. Typically found in rural areas, tanaike are earthen ponds constructed on slopes or in valleys to collect and store rainwater runoff. They play a vital role in agricultural communities by providing a reliable water source during dry periods and preventing flooding during heavy rainfall.

This is not about me. To supply information about me, function calling is used in the next section.
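The check in sample script 1 relies on the shape of the JSON returned by generateContent. The following sketch isolates that step. The `extractText_` helper and the trimmed `sampleResponse` object are written here for illustration, and a real response may carry more fields than shown.

```javascript
// Returns the text of the first candidate, or null when no text is present
// (for example, when the response has no candidates, no content, or no parts).
function extractText_(obj) {
  const parts = obj?.candidates?.[0]?.content?.parts || [];
  const part = parts.find((p) => typeof p.text === "string");
  return part ? part.text : null;
}

// Trimmed example of a response body (an assumption for illustration).
const sampleResponse = {
  candidates: [
    {
      content: { role: "model", parts: [{ text: "Sample answer." }] },
      finishReason: "STOP",
    },
  ],
};

console.log(extractText_(sampleResponse)); // Sample answer.
console.log(extractText_({})); // null
```

Because of the optional chaining, this version also returns null for a candidate that has no content property, a case in which the inline check in sample script 1 would throw.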

4. Sample script 2

This is a sample script that uses function calling. Please set your API key.

const apiKey = "###"; // Please set your API key.

const functions = {
  params_: {
    getTanaike: {
      description: "Get information about Tanaike. Value is a text.",
    },
  },
  getTanaike: (
    _ // ref: https://tanaikech.github.io/about/
  ) =>
    "As a Japanese scientist holding a Ph.D. in Physics, I am also a Google Developer Expert (GDE) in Google Workspace and a Google Cloud Champion Innovator. I am driven by a deep curiosity to explore, think creatively, and ultimately create new things. Specifically, I have a passion for crafting innovative solutions that are entirely novel, solutions that haven't yet been introduced to the world. It's in this spirit that I approach innovation. Interestingly, these new ideas often come to me during sleep, which I then strive to bring to life in the real world. Thankfully, some of these have already found practical applications.",
};

function doGemini_(text) {
  const function_declarations = Object.keys(functions).flatMap((k) =>
    k != "params_"
      ? {
          name: k,
          description: functions.params_[k].description,
          parameters: functions.params_[k]?.parameters,
        }
      : []
  );
  const url = `https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=${apiKey}`;
  const contents = [{ parts: [{ text }], role: "user" }];
  let check = true;
  const results = [];
  do {
    const payload = { contents, tools: [{ function_declarations }] };
    const res = UrlFetchApp.fetch(url, {
      payload: JSON.stringify(payload),
      contentType: "application/json",
    });
    const { candidates } = JSON.parse(res.getContentText());
    if (!candidates[0].content?.parts) {
      results.push(candidates[0]);
      break;
    }
    const parts = candidates[0].content?.parts || [];
    check = parts.find((o) => o.hasOwnProperty("functionCall"));
    if (check) {
      contents.push({ parts: parts.slice(), role: "model" });
      const functionName = check.functionCall.name;
      const res2 = functions[functionName](check.functionCall.args || null);
      contents.push({
        parts: [
          {
            functionResponse: {
              name: functionName,
              response: { name: functionName, content: res2 },
            },
          },
        ],
        role: "function",
      });
      parts.push({ functionResponse: res2 });
    }
    results.push(...parts);
  } while (check);
  return results.pop().text;
}

// Please run this function.
function main() {
  const q = "What is Tanaike? Return answer within 50 words.";
  const res = doGemini_(q);
  console.log(res);
}

In this sample script, the functions used for function calling are declared in the variable functions. The property params_ holds the function declarations that are sent to Gemini in the request body. The key “getTanaike” inside params_ is the name of the function, and the property “getTanaike” outside params_ is the function that is actually executed.
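Since getTanaike takes no arguments, params_ has no parameters entry for it. As a minimal sketch of how a function with arguments could be declared in the same structure, the following adds a hypothetical `getProfile` function (invented here for illustration) and builds the declarations in a way equivalent to doGemini_. The schema uses the OpenAPI-style `type`, `properties`, and `required` fields; please check the official document for the exact subset that is supported.

```javascript
const functionsWithArgs = {
  params_: {
    getProfile: {
      description: "Get a profile text of a person. Value is a text.",
      parameters: {
        type: "OBJECT",
        properties: {
          name: { type: "STRING", description: "Name of the person." },
        },
        required: ["name"],
      },
    },
  },
  // The argument object corresponds to functionCall.args in the response.
  getProfile: ({ name }) => `${name} is a sample person.`,
};

// Equivalent to the conversion in doGemini_:
// every key except "params_" becomes one function declaration.
function buildDeclarations_(functions) {
  return Object.keys(functions)
    .filter((k) => k != "params_")
    .map((k) => ({
      name: k,
      description: functions.params_[k].description,
      parameters: functions.params_[k]?.parameters,
    }));
}

console.log(JSON.stringify(buildDeclarations_(functionsWithArgs), null, 2));
```

The resulting array is what would be placed in `tools: [{ function_declarations }]` of the request body.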

When this script is run, the following result is obtained. The information about Tanaike retrieved by the function “getTanaike” is returned as a summary within 50 words.

Tanaike is a Japanese scientist holding a Ph.D. in Physics. He is also a Google Developer Expert (GDE) in Google Workspace and a Google Cloud Champion Innovator. He is passionate about crafting innovative solutions that are entirely novel, solutions that haven't yet been introduced to the world.
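In sample script 2, the name returned by the model is used directly as a key of functions. A more defensive dispatch, assuming you prefer to hand an error text back to the model instead of throwing, might look like the following sketch. `callFunction_` is a hypothetical helper and not part of the original script.

```javascript
// Runs the function requested by the model only when it is a declared function.
function callFunction_(functions, functionCall) {
  const { name, args } = functionCall;
  const declared =
    name != "params_" && Object.prototype.hasOwnProperty.call(functions, name);
  if (!declared || typeof functions[name] != "function") {
    return `Error: function "${name}" is not declared.`;
  }
  return functions[name](args || null);
}

// Reduced copy of the functions object, with a placeholder return value.
const sampleFunctions = {
  params_: {
    getTanaike: { description: "Get information about Tanaike. Value is a text." },
  },
  getTanaike: (_) => "sample text",
};

console.log(callFunction_(sampleFunctions, { name: "getTanaike" })); // sample text
console.log(callFunction_(sampleFunctions, { name: "unknown" })); // Error: function "unknown" is not declared.
```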

Sample script 2 is a simple script intended to help you understand function calling with Gemini in Google Apps Script. There is now also a Google Apps Script library, created by Martin Hawksey, that simplifies the workflow described in this report and makes it easy to use Gemini's features from Google Apps Script. (Ref)
