Unlock Smart Invoice Management: Gemini, Gmail, and Google Apps Script Integration

Gists

You can see the presentation of this application at https://www.youtube.com/watch?v=Dc2WPQkovZE.

Abstract

This report describes an invoice processing application built with Google Apps Script. It leverages Gemini, a large language model, to automatically parse invoices received as email attachments and automates the process using time-driven triggers.

Introduction

The emergence of large language models (LLMs) like ChatGPT and Gemini has significantly impacted various aspects of our daily lives. One such example is their ability to automate tasks previously requiring manual effort. In my case, Gemini has streamlined the processing of invoices I receive as email attachments in PDF format.

Before Gemini, I manually reviewed each invoice, extracting crucial information. However, with Gemini’s release, this tedious process transformed. As previously reported, Gemini can automatically parse invoices. Ref Furthermore, my subsequent reports explored the ability to:

  • Return parsed invoice data in JSON format Ref Ref
  • Upload files to Gemini for content generation, as demonstrated through my creation of a dedicated library Ref

These advancements allowed me to expand a simple invoice parsing script into a full-fledged application. This report details an application built using Google Apps Script. The application retrieves emails containing invoices from Gmail, utilizes the Gemini API to parse the extracted invoices, and leverages time-driven triggers for automatic execution. This approach ensures seamless processing of all invoice-related emails.

Repository of this application

https://github.com/tanaikech/UnlockSmartInvoiceManagementWithGeminiAPI

Why Google Apps Script?

I chose Google Apps Script to create this application for several reasons:

  • Ease of Use: Google Apps Script is a low-code platform, making it accessible to users with no coding experience.
  • Seamless Integration: It integrates effortlessly with Gmail, Google Docs (including Docs, Sheets, Slides, and more), and various Google APIs.
  • Cloud-Based Automation: As a cloud-based scripting language, Google Apps Script can be triggered to run by time-based triggers.- Personal opinion: I would like to introduce the advantages of Google Apps Script.

Origin for constructing this application

The origin for constructing this application is as follows.

This application was created by integrating those sources.

Comments:

Before the response_mime_type property was released, I used function calls to control the output format. However, after the release of response_mime_type, it became the preferred method for controlling the output format. I also tested using the response_schema property when it was released, but found that the combination of response_mime_type and a JSON schema in the prompt provides the most control over the output format. As a result, this application parses invoices using the combination of response_mime_type and a JSON schema in the prompt.

Usage

1. Copy Google Spreadsheet

Please copy a Google Spreadsheet including this application to your Google Drive. So, please access the following URL.

https://docs.google.com/spreadsheets/d/17UjnOcz8vyrpDI6pNhms4d2ZXAy6aWTYgmTkvZcuqaA/copy

When you open the script editor of this Spreadsheet, you can see the following script files.

2. Create an API key

Please access https://ai.google.dev/gemini-api/docs/api-key and create your API key. At that time, please enable Generative Language API at the API console. This API key is used for this sample script.

This official document can be also seen. Ref.

Of course, if you can link the Google Cloud Platform Project to the Google Apps Script Project in the copied Spreadsheet, you can also use the access token.

3. Setup

Open the “configuration” sheet and enter your API key. If you want to use the access token, leave the API key field blank and set useAccessToken to TRUE. You can also set other parameters as needed.

By default, this application checks emails in the “INBOX” label that include invoices. To search emails from specific labels, set the label containing invoices to labelName. The script will then search for emails with that label.

4. Testing

Clicking the “START” button on the “Configuration” sheet runs the application’s script. However, you might see “No emails were processed” in the dialog even after running the script. In this case, to test the application, please send an email to your account that includes an invoice as a PDF attachment. Then, click the “START” button again.

The script execution triggers a time-driven trigger, which automatically runs the application based on the cycleMinTimeDrivenTrigger interval.

Sample invoice 1

This sample invoice is from here. When this application parses this invoice, the following JSON is returned.

{
  "check": {
    "invoice": true,
    "invalidCheck": false,
    "invalidPoints": null
  },
  "parse": {
    "name": "invoice1.png",
    "invoiceTitle": "INVOICE",
    "invoiceDate": "6月 4, 2024",
    "invoiceNumber": "100",
    "invoiceDestinationName": "Nazar Neill",
    "invoiceDestinationAddress": "Downtown Pets\n123 South Street\nManhattan, NY 15161",
    "totalCost": "$4350",
    "table": [
      ["DESCRIPTION", "HOURS", "RATE", "AMOUNT"],
      ["Pour cement foundation", "4.00", "$150.00", "$600"],
      ["Framing and drywall", "16.00", "$150.00", "$2400"],
      ["Tiling and flooring install", "9.00", "$150.00", "$1350"]
    ]
  }
}

When you see this JSON, you can see that the values invalidCheck and invalidPoints are false and null, respectively. From this, it is found that this invoice has no issues.

Sample invoice 2

This sample invoice is from here. When this application parses this invoice, the following JSON is returned.

{
  "check": {
    "invoice": true,
    "invalidCheck": true,
    "invalidPoints": "The total amount calculated is incorrect. The total amount should be a summation of prices, but it's not. In addition, the table has an item without item number and description."
  },
  "parse": {
    "name": "invoice2.png",
    "invoiceTitle": "INVOICE",
    "invoiceDate": "2024/6/7",
    "invoiceNumber": "10654",
    "invoiceDestinationName": null,
    "invoiceDestinationAddress": "123 Avenue A,\nBurbank, CA 56789",
    "totalCost": "$1,077.25",
    "table": [
      ["Item #", "Description", "Qty", "Unit price", "Discount", "Price"],
      ["A875", "Peonies", "35", "$1.05", null, "$36.75"],
      ["K245", "Tulips", "25", "$2.00", null, "$50.00"],
      ["U123", "Buttercup", "30", "$1.35", null, "$40.50"],
      ["sample", "sample item", "10", "$10.00", null, "$1,000.00"]
    ]
  }
}

When you see this JSON, you can see that the values invalidCheck and invalidPoints are true and The total amount calculated is incorrect. The total amount should be a summation of prices, but it's not. In addition, the table has an item without item number and description., respectively. From this, it is found that this invoice has issues and the modification points are suggested.

Log

After processing the two invoices mentioned above, the application displays the following information in the “log” sheet:

  • Valid invoices: These will be displayed in green.
  • Invalid invoices: These will be displayed in red.

For invalid invoices, the application can automatically send an email response containing the necessary corrections. However, during my testing with various invoices, I encountered instances where the invalidCheck value was flagged as true even for invoices without errors. Due to this, I have temporarily set the notifyModificationPointsToSender value to FALSE by default. I anticipate the content generation accuracy to improve in future updates.

Also, when you want to directly see the target mail, you can see it by clicking the “searchUrl”. When you click this link, Gmail is opened and you can see the target mail.

Future update

  • Currently, Gemini 1.5 API cannot directly use PDF data. As a workaround, this sample converts each page of the PDF data to PNG images. When direct PDF processing is available in a future update, I expect the accuracy for generating content to improve.
    • On July 23, 2024, I confirmed that this was achieved. By this, I updated the script for directly using PDF data. You can confirm the latest script at my repository.

Note

  • The top abstract image was created by Gemini.

 Share!