Gists
Abstract A new library, MimeTypeApp, simplifies using Gmail messages and attachments with the Gemini API for tasks like text analysis. It converts unsupported formats for seamless integration with Google Apps Script and Gemini.
Introduction Recently, I published MimeTypeApp, a Google Apps Script library that simplifies parsing Gmail messages, including attachments, for use with the Gemini API. Ref This library addresses a key challenge: Gmail attachments come in various MIME types, while the Gemini API currently only accepts a limited set for processing.
GeminiWithFiles was updated to v2.0.3 v2.0.3 (November 19, 2024)
I changed the specification of setFileIdsOrUrlsWithResumableUpload. From v2.0.3, when you use this method, please include propertiesService: PropertiesService.getScriptProperties() in the initial object. This is because, when PropertiesService.getScriptProperties() is called inside the library, the values are stored in the library's own properties. When I created Ref and Ref, I assumed that the script would be used by copying and pasting rather than as a library.
Gists
Abstract Gemini excels at text generation with RAG for large datasets, but smaller ones benefit from prompting or data upload. This report explores using Gemini 1.5 Flash/Pro with RAG on medium-sized, Google Spreadsheet-stored datasets for improved accuracy and effectiveness.
Introduction Gemini’s text generation capabilities have seen significant advancements with Retrieval-Augmented Generation (RAG). This approach excels for large datasets, where embedding data and querying the model leads to high-quality answers.
Gists
Abstract This research explores “pseudo function calling” in Gemini API using prompt engineering with JSON schema, bypassing model dependency limitations.
Introduction Large Language Models (LLMs) like Gemini and ChatGPT offer powerful functionalities, but their capabilities can be further extended through function calling. This feature allows the LLM to execute pre-defined functions with arguments generated based on the user’s prompt. This unlocks a wide range of applications, as demonstrated in these resources (see References).
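The idea in this introduction, where the model supplies a function name and arguments and the script runs the matching pre-defined function, can be sketched as follows. This is a minimal illustration and not the author's script: the `functions` table, `addNumbers`, `getWeather`, and the `{ name, args }` reply shape are hypothetical, and the model's reply is hard-coded rather than fetched from the Gemini API.

```javascript
// Hypothetical table of pre-defined functions the model is allowed to call.
const functions = {
  addNumbers: ({ a, b }) => a + b,
  getWeather: ({ city }) => `Weather for ${city} is not available in this sketch.`,
};

// Parse a JSON reply of the assumed shape { "name": "...", "args": { ... } }
// and run the matching function. Unknown names are rejected, not executed.
function dispatchFunctionCall(replyText, table) {
  const call = JSON.parse(replyText);
  if (!Object.prototype.hasOwnProperty.call(table, call.name)) {
    throw new Error(`Unknown function: ${call.name}`);
  }
  return table[call.name](call.args || {});
}

// Stand-in for text returned by the model (assumption: it follows the schema).
const sampleReply = '{"name": "addNumbers", "args": {"a": 2, "b": 3}}';
const result = dispatchFunctionCall(sampleReply, functions);
console.log(result); // prints 5
```

Checking the name against a fixed table keeps the model from triggering anything that was not explicitly registered.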
GeminiWithFiles was updated to v2.0.2 v2.0.2 (September 26, 2024)
The properties response_schema and temperature were added as options for generationConfig. You can see the detailed information here https://github.com/tanaikech/GeminiWithFiles
Gists
Abstract This report presents a method to train AI to effectively generate content from smaller, structured datasets using Python. Gemini’s token processing capabilities are leveraged to effectively utilize limited data, while techniques for interpreting CSV and JSON formats are explored.
Introduction In the era of rapidly advancing artificial intelligence (AI), the ability to analyze and leverage large datasets is paramount. While RAG (Retrieval Augmented Generation) environments are often ideal for such tasks, there are scenarios where content generation needs to be achieved with smaller datasets.
Gists
Abstract This report improves Gmail email labeling with Gemini API using JSON schema and leverages advancements in Gemini 1.5 Flash for faster processing.
Introduction As Gemini continues to evolve, existing scripts utilizing its capabilities can be revisited to improve efficiency and accuracy. This includes the process of flexible labeling for Gmail emails using the Gemini API. I have previously explored this topic in two reports:
December 19, 2023: Demonstrating Gmail label selection based solely on prompts.
Gists
Abstract This report presents a method to optimize AI-generated scripts for processing costs using Gemini and Google Apps Script. By incorporating external knowledge from sources like StackOverflow, we demonstrate the effective generation of efficient scripts that minimize overhead while maintaining desired outcomes. This approach can be considered a dynamic pseudo-RAG technique.
Introduction The proliferation of generative AI, exemplified by Google Gemini, has led to a surge in AI-generated scripts. This trend is evident in the growing number of questions on platforms like StackOverflow that involve AI-generated scripts.
Gists
Abstract A script using resumable upload with file streams is proposed to enhance file handling within the Gemini Generative AI API for Node.js. This script allows uploading from web URLs and local storage, efficiently handles large files, and offers potential reusability with other Google APIs.
Description The @google/generative-ai library provides a powerful way to interact with the Gemini Generative AI API using Node.js. This enables developers to programmatically generate creative text formats, translate languages, write different kinds of creative content, and answer your questions in an informative way, all powered by Gemini’s advanced AI models.
Gists
Abstract This study proposes a workaround to address the Gemini API’s current inability to directly process web content from URLs. By utilizing Google Apps Script, the method extracts relevant information from a specified URL and feeds it into the API for summarization. This approach offers a solution for generating comprehensive summaries from web-based content until the API’s limitations are resolved.
Introduction While Gemini API offers powerful text generation capabilities, it currently faces limitations in directly accessing and processing web content from URLs.
Gists
Abstract Linking a Google Apps Script project to a GCP project enables you to export logs from the Class console to Logs Explorer for simplified analysis and debugging. By overcoming the limitations of in-script logging methods, this report outlines a method for exporting logs using the Cloud Logging API with Google Apps Script.
Introduction While developing applications with Google Apps Script, the Class console is a valuable tool for debugging individual components.
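As a rough sketch of what exporting a log entry through the Cloud Logging API involves, the function below builds a request body for the `entries.write` method. The project ID `my-gcp-project`, the log ID `apps-script-log`, and the helper name `buildWriteLogRequest` are placeholders, and the author's script may structure the request differently.

```javascript
// Build a request body for the Cloud Logging API method entries.write.
// projectId and logId are placeholders supplied by the caller.
function buildWriteLogRequest(projectId, logId, messages, severity = "INFO") {
  return {
    logName: `projects/${projectId}/logs/${encodeURIComponent(logId)}`,
    resource: { type: "global", labels: { project_id: projectId } },
    entries: messages.map((message) => ({
      severity,
      // Strings go to textPayload; objects go to jsonPayload.
      ...(typeof message === "string"
        ? { textPayload: message }
        : { jsonPayload: message }),
    })),
  };
}

const body = buildWriteLogRequest("my-gcp-project", "apps-script-log", [
  "script started",
  { step: "fetch", ok: true },
]);
console.log(JSON.stringify(body, null, 2));
```

In Apps Script, a body like this could be sent with `UrlFetchApp.fetch` as a POST to `https://logging.googleapis.com/v2/entries:write`, using an OAuth token that carries a logging scope.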
Gists
Abstract This report proposes a novel learning method using Gemini to automate Q&A generation, addressing the challenges of manual Q&A creation. By integrating with Google tools, this approach aims to enhance learning efficiency, accessibility, and personalization while reducing costs.
Introduction Mastering a new subject often demands a significant time commitment. A proven strategy for efficient learning is through question-and-answer (Q&A) practice. This method typically involves constructing a dataset of pertinent Q&A pairs and subsequently engaging in repeated practice until desired proficiency levels are achieved.
GeminiWithFiles was updated to v2.0.1 v2.0.1 (August 4, 2024)
From this version, codeExecution can be used. Ref You can see the detailed information here https://github.com/tanaikech/GeminiWithFiles
UnlockSmartInvoiceManagementWithGeminiAPI was updated to v1.0.3. v1.0.3 (August 3, 2024)
On August 3, 2024, I updated GeminiWithFiles (https://github.com/tanaikech/GeminiWithFiles). In this version, PDF data can be processed with the Gemini API without async/await. So, I updated UnlockSmartInvoiceManagementWithGeminiAPI. You can see the detailed information here https://github.com/tanaikech/UnlockSmartInvoiceManagementWithGeminiAPI
GeminiWithFiles was updated to v2.0.0 v2.0.0 (August 3, 2024)
From this version, the following changes were made. PDF data can be used directly. Ref As a result, PDFApp is no longer required, and the script can be used without async/await. By default, functions: {} is used, so the default function calling was removed. This is because JSON output can now be easily returned using a JSON schema and response_mime_type.
Gists
Abstract Gemini API now enables direct PDF processing for content generation, eliminating image conversion and reducing costs. This report provides a sample script to demonstrate this new capability and its potential applications.
Introduction Gemini API has recently introduced the ability to directly process PDF data for content generation, significantly enhancing its capabilities. Previously, to utilize PDF data for content creation, it was necessary to convert each PDF page into a separate image format.
UnlockSmartInvoiceManagementWithGeminiAPI was updated to v1.0.2. v1.0.2 (July 23, 2024)
On July 23, 2024, I noticed that PDF data could be parsed directly by the Gemini API. This is probably due to an update on Google's side. So, I changed setBlobs([blob], true) to setBlobs([blob], false) in the method parseInvoiceByGemini_. With this modification, the PDF blob is used directly with the Gemini API. Ref You can see the detailed information here https://github.
Gists
Abstract Uploads in Google Apps Script are limited to 50 MB, hindering work with large datasets. This report introduces a script with uploadType=resumable to overcome this limit, enabling uploads over 50 MB to Gemini and other services.
Introduction This report explores the limitations of data upload size using Google Apps Script and introduces a script to overcome these limitations. Currently, the Gemini API can generate content using data uploaded to Gemini.
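A resumable upload sends a file in pieces, so no single request has to carry the whole payload. The sketch below only plans the byte ranges and the `Content-Range` header for each piece; it does not perform any HTTP request. The helper name `planChunks` is hypothetical, and the 256 KiB multiple rule and header format follow Google's documented resumable upload protocol for the Drive API, which is assumed here to apply to the author's script as well.

```javascript
// Chunks (except the last) must be a multiple of 256 KiB in Google's
// resumable upload protocol as documented for the Drive API.
const UNIT = 256 * 1024;

// Plan the byte ranges for uploading totalBytes in pieces of chunkSize bytes.
function planChunks(totalBytes, chunkSize) {
  if (!Number.isInteger(totalBytes) || totalBytes <= 0) {
    throw new Error("totalBytes must be a positive integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0 || chunkSize % UNIT !== 0) {
    throw new Error("chunkSize must be a positive multiple of 256 KiB");
  }
  const chunks = [];
  for (let start = 0; start < totalBytes; start += chunkSize) {
    const end = Math.min(start + chunkSize, totalBytes) - 1; // inclusive end
    chunks.push({
      start,
      end,
      contentRange: `bytes ${start}-${end}/${totalBytes}`,
    });
  }
  return chunks;
}

// Example: a 600 KiB payload split into 256 KiB pieces.
const plan = planChunks(600 * 1024, UNIT);
plan.forEach((c) => console.log(c.contentRange));
```

In Apps Script, each range could be sliced from the file's bytes and sent with `UrlFetchApp.fetch` to the session URL returned by the initial `uploadType=resumable` request, which keeps every individual request well under the 50 MB limit.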
GeminiWithFiles was updated to v1.0.7. v1.0.7 (July 4, 2024)
From this version, when doCountToken: true and exportTotalTokens: true are used in the object of the argument of geminiWithFiles, the total tokens are returned. In this case, the returned value is an object like {returnValue: "###", totalTokens: ###}. Ref You can see the detailed information here https://github.com/tanaikech/GeminiWithFiles
UnlockSmartInvoiceManagementWithGeminiAPI was updated to v1.0.1. v1.0.1 (June 17, 2024)
In order to easily customize the value of “jsonSchema” for generating content with Gemini API, I added a new “jsonSchema” sheet to the Spreadsheet. To customize it, edit the cell “A1” of the “jsonSchema” sheet. The script then generates content with Gemini API using your customized JSON schema. The cell “A2” shows the number of characters in “A1”.
GeminiWithFiles was updated to v1.0.6. v1.0.6 (June 15, 2024)
The script of PDFApp was included in this library. You can see the detailed information here https://github.com/tanaikech/GeminiWithFiles
Gists
You can see the presentation of this application at https://www.youtube.com/watch?v=Dc2WPQkovZE.
Abstract This report describes an invoice processing application built with Google Apps Script. It leverages Gemini, a large language model, to automatically parse invoices received as email attachments and automates the process using time-driven triggers.
Introduction The emergence of large language models (LLMs) like ChatGPT and Gemini has significantly impacted various aspects of our daily lives. One such example
GeminiWithFiles was updated to v1.0.5. v1.0.5 (June 7, 2024)
Spelling mistakes in the warning message were corrected. The wait time per cycle while waiting for the value of state of a movie file to change was increased from 5 seconds to 10 seconds. You can see the detailed information here https://github.com/tanaikech/GeminiWithFiles
My post was featured in the section “Community cuts” of “The overwhelmed person’s guide to Google Cloud: week of May 23”.
Gists
Abstract This report builds on prior work using Gemini 1.0 Pro to expand Google Apps Script error messages. It highlights how the script’s execution time limit created a bottleneck, but the introduction of Gemini 1.5 Flash eliminates this issue.
Introduction After the release of the Gemini API, I previously reported on “Expanding Error Messages of Google Apps Script using Gemini Pro API with Google Apps Script”. Ref In that report, I utilized the Gemini 1.
GeminiWithFiles was updated to v1.0.4. v1.0.4 (May 29, 2024)
Recently, I confirmed that when model.countToken is used with uploaded files, an error like "You do not have permission to access the File ### or it may not exist." occurs. I modified the library to handle this issue. I also modified the library so that movie files can be used for generateContent. Ref You can see the detailed information here https://github.
Gists
Abstract The Gemini API traditionally required specific prompts for desired output formats. This report explores two new GenerationConfig properties: “response_mime_type” and “response_schema”. These allow developers to directly specify formats like JSON, enhancing control and predictability. We analyze and compare the effectiveness of both properties for controlling Gemini API output formats.
Introduction One of the key challenges when working with the Gemini API is ensuring the output data is delivered in the format your application requires.
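As a minimal sketch of the two properties named in the abstract, the function below builds a `generateContent` request body that asks for JSON output. The helper name `buildJsonRequest`, the prompt, and the schema are illustrative only; the snake_case property names are taken from the abstract, and the schema uses the upper-case type names of the Gemini API's schema object.

```javascript
// Build a generateContent request body that requests JSON output.
// The schema argument describes the JSON structure the model should return.
function buildJsonRequest(prompt, schema) {
  return {
    contents: [{ role: "user", parts: [{ text: prompt }] }],
    generationConfig: {
      response_mime_type: "application/json",
      response_schema: schema,
    },
  };
}

// Illustrative schema: an array of { name, price } objects.
const schema = {
  type: "ARRAY",
  items: {
    type: "OBJECT",
    properties: { name: { type: "STRING" }, price: { type: "NUMBER" } },
    required: ["name", "price"],
  },
};

const requestBody = buildJsonRequest("List three fruits with prices.", schema);
console.log(JSON.stringify(requestBody));
```

Omitting `response_schema` and describing the structure in the prompt instead leaves only `response_mime_type` in play, which is the other configuration the report compares.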
GeminiWithFiles was updated to v1.0.3. v1.0.3 (May 17, 2024)
Bugs were fixed. You can see the detailed information here https://github.com/tanaikech/GeminiWithFiles
Gists
Abstract This report examines leveraging Gemini 1.5 API with Google Apps Script to automate sample input creation during script reverse engineering. Traditionally, this process is manual and time-consuming, especially for functions with numerous test cases. Gemini 1.5 API’s potential to streamline development by automating input generation is explored through applying reverse engineering techniques to Google Apps Script samples.
Introduction With the release of Gemini 1.5 API, users gained the ability to process more complex data, opening doors for various application developments.
Gists
Overview These are sample scripts in Python and Node.js for controlling the output format of the Gemini API using JSON schemas.
Description In a previous report, “Taming the Wild Output: Effective Control of Gemini API Response Formats with response_mime_type,” I presented sample scripts created with Google Apps Script. Ref Following its publication, I received requests for sample scripts using Python and Node.js. This report addresses those requests by providing sample scripts in both languages.
GeminiWithFiles was updated to v1.0.2. v1.0.2 (May 7, 2024)
parts was added for generating content. From this version, you can select one of q, jsonSchema, and parts, and systemInstruction can be used. In order to use function calling, toolConfig was added to the request body. You can see the detailed information here https://github.com/tanaikech/GeminiWithFiles
GeminiWithFiles was updated to v1.0.1. v1.0.1 (May 2, 2024)
response_mime_type can now be used for controlling the output format. Ref You can see the detailed information here https://github.com/tanaikech/GeminiWithFiles
Gists
Abstract This report explores controlling output formats for the Gemini API. Traditionally, prompts dictated the format. A new property, “response_mime_type”, allows specifying the format (e.g., JSON) directly. Testing confirms this property improves control over output format, especially for complex JSON schemas. The recommended approach is to combine a detailed JSON schema with “response_mime_type” for clear and consistent outputs.
Introduction One of the key challenges when working with the Gemini API is ensuring the output data is in the format your application requires.
Overview This is a Google Apps Script library for Gemini API with files.
A new Google Apps Script library called GeminiWithFiles simplifies using Gemini, a large language model, to process unstructured data like images and PDFs. GeminiWithFiles can upload files, generate content, and create descriptions from multiple images at once. This significantly reduces workload and expands possibilities for using Gemini.
Description Recently, Gemini, a large language model from Google AI, has brought new possibilities to various tasks by enabling the use of unstructured data as structured data.
Gists
Abstract A new Google Apps Script library, “GeminiWithFiles”, simplifies using the powerful Gemini 1.5 AI model. It lets users directly upload files for content generation or create descriptions for many images at once, making it much faster than prior methods. This is helpful for tasks involving large amounts of text or images.
Introduction Recently, Gemini, a family of Google’s most capable AI models, has revolutionized various tasks by allowing unstructured data to be used as structured data.
Gists
Abstract The Gemini API generates different outputs depending on the prompts. This report explains how to use function calling in the new Gemini 1.5 API to control the output format (string, number, etc.) within a script during a chat session. This allows for more flexibility in using the Gemini API’s results.
Introduction The appearance of Gemini has already brought a wave of innovation to various fields. When the Gemini API returns a response, the format of the response is highly dependent on the input text provided as a prompt.
Gists
Abstract This report explores using Gemini, a new AI model, to parse invoices in Gmail attachments. Traditional text searching proved unreliable due to invoice format variations. Gemini’s capabilities can potentially overcome this inconsistency and improve invoice data extraction.
Introduction Since its release, Gemini, a large language model from Google AI, has shown the potential to improve various situations, including information extraction from documents. In my specific case, I work with invoices in PDF format.
Gists
Abstract A new large language model (LLM) called Gemini with an API is now available, allowing developers to analyze vast amounts of data. This report explores trends in Google Apps Script by using the Gemini 1.5 API to analyze questions on Stack Overflow.
Introduction The release of the LLM model Gemini as an API on Vertex AI and Google AI Studio has opened a world of possibilities. Ref The Gemini API significantly expands the potential of various scripting languages, paving the way for diverse applications.
Gists
Abstract The Gemini API allows generating text from uploaded files using Google Apps Script. It expands the potential of various scripting languages for diverse applications.
Introduction With the release of the LLM model Gemini as an API on Vertex AI and Google AI Studio, a world of possibilities has opened up. Ref The Gemini API significantly expands the potential of various scripting languages and paves the way for diverse applications.
Gists
Abstract The Gemini API unlocks potential for diverse applications but requires consistent output formatting. This report proposes a method using question phrasing and API calls to craft a bespoke output, enabling seamless integration with user applications. Examples include data categorization and obtaining multiple response options.
Introduction With the release of the LLM model Gemini as an API on Vertex AI and Google AI Studio, a world of possibilities has opened up.
Gists
Abstract Gemini API on Vertex AI/Studio unlocks new applications with data retrieval and content generation through function calls. This report explores using the API for reverse engineering with a sample interpreter in Google Apps Script.
Introduction The recent release of the LLM model Gemini as an API on Vertex AI and Google AI Studio unlocks a vast potential for new applications and methodologies. It significantly expands capabilities across diverse situations, paving the way for groundbreaking applications.
Gists
Abstract The Gemini API can now do semantic searches, going beyond content generation. This means it can understand the meaning of your search and provide better results, even if your words don’t exactly match the data. This report introduces the enhanced search capabilities of the Gemini API.
Introduction The Gemini API expands its potential beyond content generation to encompass powerful semantic search capabilities. Searching existing data is crucial in various situations.
CorporaApp was updated to v1.0.3. v1.0.3 (March 6, 2024)
A new method, getChunk, was added. With this method, you can retrieve a single chunk using the resource name of the chunk. You can see the detailed information here https://github.com/tanaikech/CorporaApp
Gists
Abstract The Gemini API enables both content generation and semantic search, managing data effectively. This report introduces a Gemini-powered similarity viewer for easy visualization of complex text similarity scores, using Google Spreadsheet and Apps Script.
Introduction The Gemini API unlocks new possibilities, extending its capabilities beyond content generation to encompass semantic search. Within this context, the API excels at efficiently managing data within corpora. While semantic search provides valuable similarity scores (chunkRelevanceScore) for text pairs, interpreting these numerical values can be cumbersome.
CorporaApp was updated to v1.0.2. v1.0.2 (February 26, 2024)
A new method, setAccessToken, was added. With this method, you can use an access token retrieved from a service account. The default access token is retrieved by ScriptApp.getOAuthToken(). You can see the detailed information here https://github.com/tanaikech/CorporaApp
Gists
Abstract New Gemini API opens doors for developers to integrate its AI power into apps, potentially impacting education, healthcare, and business. The latest Gemini 1.5 brings even more features. This report showcases an image bot using Gemini as one example of its diverse applications across various fields.
Introduction The recent release of Gemini as an accessible API on Vertex AI and Google AI Studio empowers developers to integrate its vast capabilities into their applications, potentially revolutionizing fields like education, healthcare, and business.
Gists
Abstract Powerful AI tool Gemini’s API release (Vertex AI & Google AI Studio) opens doors for diverse applications. Its recent upgrade to version 1.5 boosts capabilities. This report demonstrates using simple Google Apps Script function calls to leverage Gemini’s power for both data retrieval and content generation.
Introduction The recent release of the LLM model Gemini as an API on Vertex AI and Google AI Studio unlocks a world of possibilities.
CorporaApp was updated to v1.0.1. v1.0.1 (February 16, 2024)
A new method, searchQueryWithGenerateAnswer, was added. You can see the detailed information here https://github.com/tanaikech/CorporaApp
Gists
Abstract New “semantic search” features in Gemini API help find desired information within its corpora. While using these features with Google Apps Script was complex, a new library simplifies the process. This report proposes using this library with Gemini-generated content to automate template processes in Google Docs and Slides, creating a more flexible workflow.
Introduction Semantic search opens up a new way of finding the expected values. Recently, APIs for managing corpora have been added to the Gemini API.
Gists
Description Currently, v1beta of the Gemini API can use corpora. Ref When corpora are used, values can be searched with semantic search. At present, 5 corpora can be created in a single project, and each corpus can have 10,000 documents and 1,000,000 chunks. In this report, I would like to introduce a method for achieving semantic search using corpora with Google Apps Script.
Gists
Description I published “Flexible Labeling for Gmail using Gemini Pro API with Google Apps Script” on December 19, 2023. Today, I published “Categorization using Gemini Pro API with Google Apps Script”.
In this report, as part 2, I would like to introduce 2 sample scripts for flexible labeling for Gmail using the semantic search and the function calling of Gemini Pro API with Google Apps Script.
Usage In order to test this script, please do the following flow.
Gists
Abstract This report explores using the Gemini Pro API with Google Apps Script to achieve flexible data categorization.
Introduction The recent release of the LLM model Gemini as an API on Vertex AI and Google AI Studio opens a world of possibilities. Ref and Ref I believe Gemini API significantly expands the potential of Google Apps Script and paves the way for diverse applications. In this report, I present the flexible categorization of data using Gemini Pro API with Google Apps Script.
Gists
Abstract Gemini API unlocks semantic search for Google Apps Script, boosting its power beyond automation. This report explores the result of attempting the semantic search using Gemini Pro API with Google Apps Script.
Introduction The recent release of the LLM model Gemini as an API on Vertex AI and Google AI Studio opens a world of possibilities. Ref and Ref I believe Gemini API significantly expands the potential of Google Apps Script and paves the way for diverse applications.
Gists
Description It would be useful for users if generated text could be automatically inserted at the cursor position of a Google Document, a Google Spreadsheet, or a Google Slide. This report introduces sample scripts for achieving this.
Sample scripts Here, I would like to introduce 3 sample scripts for a Google Document, a Google Spreadsheet, and a Google Slide.
Create an API key These sample scripts request Gemini Pro API using an API key.
Gists
Abstract Expanding the current error messages of Google Apps Script would be useful for many users. This report introduces a sample script for expanding the error message of Google Apps Script using Gemini Pro API with Google Apps Script.
Introduction The recent release of the LLM model Gemini as an API on Vertex AI and Google AI Studio opens a world of possibilities.
Gists
Abstract The release of Gemini API is expected to expand the future of Google Apps Script. This report introduces a sample script for flexible email labeling in Gmail using Gemini API with Google Apps Script.
Introduction The recent release of the LLM model Gemini as an API on Vertex AI and Google AI Studio opens a world of possibilities. Ref and Ref I believe Gemini API significantly expands the potential of Google Apps Script and paves the way for diverse applications.
Gists
Abstract Gemini LLM, now a Vertex AI/Studio API, unlocks easy document summarization and image analysis via Google Apps Script. This report details an example script for automatically creating the description of the files on Google Drive and highlights seamless integration options with API keys.
Introduction Recently, the LLM model Gemini has been released and is now available as an API on Vertex AI and Google AI Studio. Ref and Ref This report presents a simple Google Apps Script example for automatically creating descriptions of files on Google Drive using the Gemini Pro API.