Text-To-Speech (TTS) using Gemini API with Google Apps Script
Description
This script is a simple example of generating Text-To-Speech (TTS) audio with the Gemini API from Google Apps Script. The Gemini API returns audio as audio/L16;codec=pcm;rate=24000, that is, raw 16-bit PCM at 24,000 Hz with no container header, so most players cannot open it directly. Since there is no built-in method to convert this data to the standard audio/wav format, the sample script includes a custom function that handles the conversion.
Limitations and Considerations
- The provided convertL16ToWav_ function is specifically designed for the audio/L16;codec=pcm;rate=24000 MIME type. Using it with other audio formats will result in an error.
- The script uses a hardcoded WAV header. This header assumes specific audio parameters (e.g., sample rate, bit depth, number of channels) that match the Gemini API's output for this format. If the Gemini API's output format changes, this header might need adjustment.
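To make the header assumptions above concrete, here is a minimal sketch of how a 44-byte WAV header can be built for this kind of PCM data. It is not the sample script's own convertL16ToWav_ implementation: the helper names parseL16Rate, buildWavHeader, and wrapPcmAsWav are hypothetical, and the sketch assumes mono, 16-bit samples that are already little-endian. Unlike a hardcoded header, it reads the sample rate from the MIME type and rejects anything that is not audio/L16.

```javascript
// Hypothetical helpers for illustration only; assumes mono, 16-bit, little-endian PCM.

// Extract the sample rate from a MIME type such as "audio/L16;codec=pcm;rate=24000".
function parseL16Rate(mimeType) {
  const match = /^audio\/L16;.*\brate=(\d+)/i.exec(mimeType);
  if (!match) {
    throw new Error("Unsupported MIME type: " + mimeType);
  }
  return Number(match[1]);
}

// Build the canonical 44-byte RIFF/WAVE header for uncompressed PCM.
function buildWavHeader(dataLength, sampleRate, numChannels, bitsPerSample) {
  const blockAlign = (numChannels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;             // bytes per second
  const view = new DataView(new ArrayBuffer(44));
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataLength, true);  // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);              // size of the fmt chunk for PCM
  view.setUint16(20, 1, true);               // audio format 1 = uncompressed PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, dataLength, true);      // size of the PCM payload in bytes
  return new Uint8Array(view.buffer);
}

// Prepend the header to the raw PCM bytes and return the complete WAV byte array.
function wrapPcmAsWav(pcmBytes, mimeType) {
  const sampleRate = parseL16Rate(mimeType);
  const header = buildWavHeader(pcmBytes.length, sampleRate, 1, 16);
  const wav = new Uint8Array(header.length + pcmBytes.length);
  wav.set(header, 0);
  wav.set(pcmBytes, header.length);
  return wav;
}
```

In Apps Script, the PCM bytes would typically come from decoding the base64 audio data in the API response (for example with Utilities.base64Decode), and the resulting bytes would still need to be wrapped in a blob with the audio/wav content type. Those steps are omitted from this sketch.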
Sample Script
Before running the script, replace "###" in myFunction with your actual Gemini API key.