Expanding Gemini API's Capabilities: A Practical Solution for Web Content Summarization

Abstract

This study proposes a workaround to address the Gemini API’s current inability to directly process web content from URLs. By utilizing Google Apps Script, the method extracts relevant information from a specified URL and feeds it into the API for summarization. This approach offers a solution for generating comprehensive summaries from web-based content until the API’s limitations are resolved.

Introduction

While the Gemini API offers powerful text generation capabilities, it currently cannot directly access and process web content from URLs. When prompted with a request such as “Summarize the article at the following URL. https://###”, the API often returns an error message indicating that it cannot retrieve the necessary information. This limitation presumably arises from the API’s current design, which may not be equipped to send web requests and parse HTML content.

This limitation is expected to be addressed in future updates. As a current workaround, this report proposes a method using Google Apps Script. Because Apps Script can fetch web content and manipulate HTML data, it can extract the relevant information from a given URL and feed it into the Gemini API for summarization. This makes it possible to generate summaries of web-based content despite the API’s current limitation.

Steps of the workaround

  1. Retrieve HTML from a URL.
  2. Convert the source URL of each “img” tag to a data URL.
  3. Convert HTML to PDF.
  4. Upload PDF to Gemini.
  5. Generate content using the uploaded PDF.
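Step 2 is the least obvious of these steps, so here is a minimal sketch of it as a standalone function. The name inlineImages and the callback fetchImage are hypothetical and are not part of the script in this post; the callback is assumed to return an object with contentType and base64 fields, or null when the image cannot be retrieved. Passing the fetcher in as a parameter keeps the string handling separate from the network access.

```javascript
/**
 * Replace the http(s) source URL of each <img> tag with a data URL.
 *
 * @param {string} html HTML text.
 * @param {function(string): ({contentType: string, base64: string}|null)} fetchImage
 *   Hypothetical callback that downloads one image, or returns null on failure.
 * @return {string} HTML text with the image URLs inlined.
 */
function inlineImages(html, fetchImage) {
  let out = html;
  for (const m of html.matchAll(/<img\b[^>]*>/gi)) {
    // Only absolute http(s) URLs are handled; relative paths are left as-is.
    const src = m[0].match(/\ssrc\s*=\s*["'](https?:[^"']+)["']/i);
    if (!src) continue;
    const image = fetchImage(src[1].trim());
    if (!image) continue;
    const dataUrl = `data:${image.contentType};base64,${image.base64}`;
    // split/join replaces every occurrence of the URL, not only the first one.
    out = out.split(src[1]).join(dataUrl);
  }
  return out;
}

// In Google Apps Script, the callback could wrap UrlFetchApp, for example:
// const fetchImage = (url) => {
//   const r = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
//   if (r.getResponseCode() != 200) return null;
//   const blob = r.getBlob();
//   return { contentType: blob.getContentType(), base64: Utilities.base64Encode(blob.getBytes()) };
// };
```

Images that cannot be fetched keep their original URL, so they may be missing from the converted PDF.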

Usage

1. Create Google Apps Script project

Please create a Google Apps Script project. Either a container-bound script or a standalone script can be used.

2. Create an API key

Please access https://ai.google.dev/gemini-api/docs/api-key and create your API key. At that time, please enable the Generative Language API in the API console. This API key is used by the sample script.

The official documentation also explains this procedure. Ref.

If you link a Google Cloud Platform project to the Google Apps Script project, you can also use an access token instead of the API key.

3. Install a Google Apps Script library

This script uses the Google Apps Script library GeminiWithFiles, so please install it. See the library’s documentation for installation instructions.

4. Script

Please copy and paste the following script to the script editor of your created Google Apps Script project.

Please set your API key and URLs in the function main.

/**
 * ### Description
 * Convert HTML of the inputted URL to PDF blob.
 *
 * @param {Object} object Object for running this method.
 * @param {String} object.url URL you want to use.
 * @param {Boolean} object.convertByGoogleDoc When true, the HTML is converted to PDF via a Google Document, which requires the Drive API (v3) advanced service to be enabled. Most cases do not need this. The default value is false.
 *
 * @return {Blob} PDF blob converted from HTML of the URL is returned.
 */
function convertHTMLToPDFBlob_(object) {
  const { url, convertByGoogleDoc = false } = object;
  console.log(`--- Get HTML from "${url}".`);
  const res = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
  let text = res.getContentText();
  if (res.getResponseCode() != 200) {
    throw new Error(text);
  }
  console.log(`--- Convert image data.`);

  // Convert the source URL of img tag to the data URL.
  [...text.matchAll(/<img.*?>/g)].forEach(e => { // Spread the iterator into an array so forEach is always available.
    const t = e[0].match(/src\=["'](http.*?)["']/);
    if (t) {
      const imageUrl = t[1];
      const r = UrlFetchApp.fetch(imageUrl.trim(), { muteHttpExceptions: true });
      if (r.getResponseCode() == 200) {
        const blob = r.getBlob();
        const dataUrl = `data:${blob.getContentType()};base64,${Utilities.base64Encode(blob.getBytes())}`;
        text = text.replace(imageUrl, dataUrl);
      }
    }
  });

  // For medium
  if (url.includes("medium.com")) {
    [...text.matchAll(/<picture>.*?<\/picture>/g)].forEach(e => { // Spread the iterator into an array so forEach is always available.
      const t = e[0].match(/srcSet\=["'](http.*?)["']/);
      if (t) {
        const imageUrl = t[1].split(" ")[0].trim();
        const r = UrlFetchApp.fetch(imageUrl.trim(), { muteHttpExceptions: true });
        if (r.getResponseCode() == 200) {
          const blob = r.getBlob();
          const dataUrl = `data:${blob.getContentType()};base64,${Utilities.base64Encode(blob.getBytes())}`;
          text = text.replace(e[0], `<img src="${dataUrl}">`); // Close the img tag so the replaced HTML stays well-formed.
        }
      }
    });
  }

  let pdfBlob;
  if (convertByGoogleDoc) {
    console.log(`--- Convert HTML to PDF blob with Google Docs.`);
    const doc = Drive.Files.create({ name: "temp", mimeType: MimeType.GOOGLE_DOCS }, Utilities.newBlob(text, MimeType.HTML));
    pdfBlob = DriveApp.getFileById(doc.id).getBlob().setName(url);
    Drive.Files.remove(doc.id);
  } else {
    console.log(`--- Convert HTML to PDF blob.`);
    pdfBlob = Utilities.newBlob(text, MimeType.HTML).getAs(MimeType.PDF).setName(url);
  }
  console.log(`--- Completely converted HTML to PDF blob.`);
  return pdfBlob;
}


// Please run this function.
function main() {
  const apiKey = "###"; // Please set your API key.

  // Please set your URLs you want to use.
  // These are the samples.
  const urls = [
    "https://tanaikech.github.io/2024/06/15/unlock-smart-invoice-management-gemini-gmail-and-google-apps-script-integration/",
    "https://tanaikech.github.io/2024/08/08/a-novel-approach-to-learning-combining-gemini-with-google-apps-script-for-automated-qa/",
  ];

  // Prompt
  const jsonSchema = {
    description: "Summarize the articles of the following PDF files within 100 words, respectively. Return the result as an array.",
    type: "array",
    items: {
      type: "object",
      properties: {
        url: { type: "string", description: "Filename" },
        summary: { type: "string", description: "Summary of PDF." }
      },
      required: ["summary"],
      additionalProperties: false,
    }
  };
  const q = `Follow JSON schema.<jsonSchema>${JSON.stringify(jsonSchema)}</jsonSchema>`;

  // Upload PDF
  const blobs = urls.map(url => convertHTMLToPDFBlob_({ url, convertByGoogleDoc: false }));

  // Generate content with Gemini API.
  const g = GeminiWithFiles.geminiWithFiles({ apiKey, response_mime_type: "application/json" });
  const fileList = g.setBlobs(blobs).uploadFiles();
  const res = g.withUploadedFilesByGenerateContent(fileList).generateContent({ q });
  console.log(res);
}

When main is run, the following result is obtained.

[
  {
    "url": "https://tanaikech.github.io/2024/06/15/unlock-smart-invoice-management-gemini-gmail-and-google-apps-script-integration/",
    "summary": "This article describes an invoice processing application built with Google Apps Script that leverages Gemini, a large language model, to automate the parsing of invoices received as email attachments and streamlines the processing of invoices. It details how the application retrieves emails from Gmail, uses the Gemini API to parse the extracted invoices, and leverages time-driven triggers for automatic execution."
  },
  {
    "url": "https://tanaikech.github.io/2024/08/08/a-novel-approach-to-learning-combining-gemini-with-google-apps-script-for-automated-qa/",
    "summary": "This article proposes a novel learning method using Gemini to automate Q&A generation, addressing the challenges of manual Q&A creation. By integrating with Google tools, this approach aims to enhance learning efficiency, accessibility, and personalization while reducing costs. It presents a groundbreaking learning approach that integrates Gemini with widely used Google tools: Forms, Spreadsheets, and Apps Script as an application implemented on Google Spreadsheet."
  }
]

The summaries of the HTML content at each URL were generated correctly.
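Because the JSON schema in main requires only the summary property, a returned item may lack url. The following is a minimal sketch of a check that could be applied to the result before using it. The function normalizeSummaries is hypothetical, and it assumes that the result is either an array or a JSON string of an array, and that the items come back in the same order as the input URLs.

```javascript
/**
 * Validate the generated summaries and fill in a missing "url".
 *
 * @param {Array|string} res Result from Gemini (array, or JSON string of an array).
 * @param {string[]} urls Input URLs, assumed to be in the same order as the result.
 * @return {{url: string, summary: string, words: number}[]} Normalized items.
 */
function normalizeSummaries(res, urls) {
  const items = typeof res === "string" ? JSON.parse(res) : res;
  if (!Array.isArray(items)) {
    throw new Error("Expected an array of summaries.");
  }
  return items.map((item, i) => {
    if (!item || typeof item.summary !== "string" || item.summary.trim() === "") {
      throw new Error(`Item ${i} has no summary.`);
    }
    const summary = item.summary.trim();
    // The word count makes it easy to compare against the limit given in the prompt.
    const words = summary.split(/\s+/).length;
    return { url: item.url || urls[i] || "", summary, words };
  });
}
```

In main, this could be called as normalizeSummaries(res, urls) right after generateContent.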

To use a single URL, set the array as const urls = ["###URL###"];.
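When several URLs are used, main converts all of them before uploading, so one URL that cannot be fetched stops the whole run with an exception. The following is a minimal sketch of one way to collect failures instead. The function convertAll is hypothetical, and the converter is passed in as a parameter so that the sketch does not depend on any Apps Script service.

```javascript
/**
 * Convert every URL, collecting failures instead of stopping at the first one.
 *
 * @param {string[]} urls URLs to convert.
 * @param {function(string): *} convert Converter that returns a blob or throws.
 * @return {{blobs: Array, failures: {url: string, message: string}[]}}
 */
function convertAll(urls, convert) {
  const blobs = [];
  const failures = [];
  urls.forEach((url) => {
    try {
      blobs.push(convert(url));
    } catch (err) {
      const message = err && err.message ? err.message : String(err);
      failures.push({ url, message });
    }
  });
  return { blobs, failures };
}

// In main, the upload step could then become:
// const { blobs, failures } = convertAll(urls, (url) => convertHTMLToPDFBlob_({ url }));
// failures.forEach((f) => console.warn(`Skipped ${f.url}: ${f.message}`));
```

Only the successfully converted blobs are uploaded in this variation, so the result array can be shorter than urls.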

IMPORTANT

  • The function convertHTMLToPDFBlob_ converts HTML data to PDF data. To keep the images in the PDF, the source URLs of the “img” tags are converted to data URLs. I expect that most HTML can be converted this way. However, sites embed images in different ways. On medium.com, for example, images are placed in “picture” tags and the source URL is stored in the srcSet attribute, so the script includes a branch for parsing HTML from Medium. If the correct PDF cannot be created for your URLs, please check the HTML and adapt convertHTMLToPDFBlob_ to your situation.
  • The function convertHTMLToPDFBlob_ cannot log in automatically, so it cannot retrieve HTML that requires authentication. Please note this limitation.
  • Google Apps Script offers more than one way to convert HTML to PDF. Two common ones are conversion via Google Document and the getAs method, and they can produce different results. By default, convertHTMLToPDFBlob_ uses getAs. If you are not satisfied with the resulting PDF, try Google Document by changing the call from convertHTMLToPDFBlob_({ url, convertByGoogleDoc: false }) to convertHTMLToPDFBlob_({ url, convertByGoogleDoc: true }).
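When adapting the script to a site that uses the srcSet attribute, the attribute value has to be split into its candidates first. The script in this post takes the first candidate. As a variation, the following sketch picks the candidate with the largest width descriptor. The function pickSrcsetUrl is hypothetical, and it assumes that candidates are separated by a comma followed by whitespace, which is a heuristic because URLs themselves can contain commas.

```javascript
/**
 * Pick one image URL from a srcset value such as
 * "https://example.com/a.png 640w, https://example.com/b.png 1100w".
 *
 * @param {string} srcset Value of the srcSet attribute.
 * @return {string|null} URL with the largest width descriptor, or null if empty.
 */
function pickSrcsetUrl(srcset) {
  const candidates = srcset
    .split(/,\s+/)
    .map((part) => part.trim())
    .filter(Boolean)
    .map((part) => {
      const [url, descriptor = ""] = part.split(/\s+/);
      // Candidates without a width descriptor (e.g. "2x" or none) count as width 0.
      const width = /^\d+w$/.test(descriptor) ? parseInt(descriptor, 10) : 0;
      return { url, width };
    });
  if (candidates.length === 0) return null;
  // On ties, the earlier candidate is kept.
  return candidates.reduce((best, c) => (c.width > best.width ? c : best)).url;
}
```

Larger images produce larger data URLs and therefore larger PDFs, so reversing the comparison to pick the smallest candidate may be preferable for long articles.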
