Workaround: Exporting Google Documents as HTML with Image Hyperlinks

This is a sample script for exporting Google Documents as HTML with image hyperlinks using Google Apps Script.

Recently, the specification for exporting Google Documents as HTML seems to have changed. Previously, when a Google Document was exported as HTML, the images in the document were exported as publicly shared image hyperlinks. Now, when a Google Document is exported as HTML, the images are embedded as data URLs (base64 data). I guess that this might be related to enhancing security. When the Google Document is exported as a ZIP file, the HTML and the images are separated. In that case, however, the images must be placed in a specific folder such as “/images”, and I’m worried that this might cause another issue.

Because of this, when a Google Document that includes images is currently exported as HTML, the HTML data becomes large. For example, when the HTML is used as an email body, the resource cost of each email becomes high.
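To get a rough idea of how much of an exported HTML string is image data, a small pure JavaScript helper like the following could be used. This is only a sketch: `summarizeDataUrls` and `sampleHtml` are hypothetical names that do not appear in the scripts of this post, and the byte count is an estimate that ignores base64 padding.

```javascript
// Hypothetical helper (not part of the original scripts): reports how much of
// an exported HTML string is taken up by embedded base64 image data.
function summarizeDataUrls(html) {
  // Capture group 2 holds only the base64 payload of each embedded image.
  const matches = [
    ...html.matchAll(/src="(data:image\/[^"]*?;base64,([^"]*))"/g),
  ];
  const base64Chars = matches.reduce((sum, m) => sum + m[2].length, 0);
  return {
    imageCount: matches.length,
    base64Chars,
    // Every 4 base64 characters encode up to 3 bytes; padding is ignored here.
    approxImageBytes: Math.floor((base64Chars * 3) / 4),
    htmlChars: html.length,
  };
}

// Made-up sample: one embedded image and one ordinary image hyperlink.
const sampleHtml =
  '<p>Hello</p><img src="data:image/png;base64,iVBORw0KGgo="><img src="https://example.com/a.png">';
console.log(summarizeDataUrls(sampleHtml));
```

In Apps Script, the same function could be called with the string returned by `getContentText()` to compare the HTML before and after the data URLs are replaced.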

In this post, I would like to introduce a method for exporting a Google Document as HTML with image hyperlinks instead of data URLs, using Google Apps Script.

There are 2 patterns.

Pattern 1:

In this pattern, the images in the Google Document are created as image files in a specific folder. The image files are then publicly shared, and the webContentLinks of those images are used.

function sample1() {
  const folderId = "###"; // Please set the folder ID. The image files are put into this folder.

  // Export the active Google Document as HTML. The images are embedded as data URLs.
  const doc = DocumentApp.getActiveDocument();
  const url = `https://docs.google.com/document/d/${doc.getId()}/export?format=html`;
  let html = UrlFetchApp.fetch(url, {
    headers: { Authorization: "Bearer " + ScriptApp.getOAuthToken() },
  }).getContentText();
  const folder = DriveApp.getFolderById(folderId);
  [...html.matchAll(/src="(data:image.*?)"/g)].forEach(([, e], i) => {
    // e is like "data:image/png;base64,AAAA...". Split it into the header and the base64 data.
    const [header, data] = e.split(",");
    const mimeType = header.replace("data:", "").split(";")[0]; // e.g. "image/png"
    const temp = folder.createFile(
      Utilities.newBlob(Utilities.base64Decode(data), mimeType, `image${i + 1}`)
    );
    // Publicly share the image file so that the link can be loaded without authorization.
    temp.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.VIEW);
    // Replace the data URL with the download link of the created file.
    html = html.replace(
      e,
      `https://drive.google.com/uc?export=download&id=${temp.getId()}`
    );
  });

  console.log(html);
}
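The data URL handling in the script above can also be isolated as a small pure function, which makes it easier to check outside Apps Script. The following is a minimal sketch, assuming the exported data URLs have the form `data:<mime type>;base64,<data>`. `parseDataUrl` is a hypothetical name and is not part of the original script.

```javascript
// Hypothetical helper (not part of the original script): splits a data URL
// into its MIME type and payload. Returns null when the input is not a data URL.
function parseDataUrl(dataUrl) {
  // Group 1: MIME type, group 2: optional ";param" parts, group 3: payload.
  const m = dataUrl.match(/^data:([^;,]+)((?:;[^;,]+)*),([\s\S]*)$/);
  if (!m) return null;
  const params = m[2].split(";").filter(Boolean);
  return {
    mimeType: m[1], // e.g. "image/png"
    isBase64: params.includes("base64"),
    data: m[3], // the base64 text when isBase64 is true
  };
}

console.log(parseDataUrl("data:image/png;base64,iVBORw0KGgo="));
console.log(parseDataUrl("https://example.com/a.png"));
```

In Apps Script, the returned `mimeType` and `data` could then be passed to `Utilities.newBlob(Utilities.base64Decode(data), mimeType, name)` so that the created file gets a plain MIME type such as `image/png`.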

Pattern 2:

In this pattern, the images in the Google Document are created as image files in a specific folder, and the thumbnail links of those images are used.

Before you test this script, please enable Drive API at Advanced Google services. In this case, the image files are not required to be publicly shared.

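function sample2() {
  const folderId = "###"; // Please set the folder ID. The image files are put into this folder.

  // Export the active Google Document as HTML. The images are embedded as data URLs.
  const doc = DocumentApp.getActiveDocument();
  const url = `https://docs.google.com/document/d/${doc.getId()}/export?format=html`;
  let html = UrlFetchApp.fetch(url, {
    headers: { Authorization: "Bearer " + ScriptApp.getOAuthToken() },
  }).getContentText();
  const folder = DriveApp.getFolderById(folderId);
  [...html.matchAll(/src="(data:image.*?)"/g)].forEach(([, e], i) => {
    // e is like "data:image/png;base64,AAAA...". Split it into the header and the base64 data.
    const [header, data] = e.split(",");
    const mimeType = header.replace("data:", "").split(";")[0]; // e.g. "image/png"
    const temp = folder.createFile(
      Utilities.newBlob(Utilities.base64Decode(data), mimeType, `image${i + 1}`)
    );
    // Request thumbnailLink explicitly so that it is returned regardless of the Drive API version.
    const { thumbnailLink } = Drive.Files.get(temp.getId(), { fields: "thumbnailLink" });
    // Replace the data URL with the thumbnail link, enlarged from the default size.
    html = html.replace(e, thumbnailLink.replace("=s220", "=s1000"));
  });

  console.log(html);
}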
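The script above assumes that the thumbnail link ends with exactly `=s220`. As a slightly more tolerant sketch, the size suffix could be replaced with a regular expression. `resizeThumbnailLink` is a hypothetical name, the URL in the example is made up, and the `=s<number>` suffix is an observed format rather than something this post can guarantee.

```javascript
// Hypothetical helper (not part of the original script): changes the size
// suffix of a thumbnail link, assuming the link ends with "=s<number>".
function resizeThumbnailLink(thumbnailLink, size) {
  // Links without such a suffix are returned unchanged.
  return thumbnailLink.replace(/=s\d+$/, `=s${size}`);
}

// Made-up link used only for illustration.
console.log(resizeThumbnailLink("https://lh3.googleusercontent.com/abc=s220", 1000));
```

If the suffix is different from `=s220`, the simple string replacement in the script leaves the link as it is, while this version still changes the size.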

Results

The results of both patterns are the same except for the image URLs. When the above sample scripts are used, the following result is obtained.

The sample images are from http://k3-studio.deviantart.com/.

From

This is a sample Google Document.

To

This is the exported HTML from the sample Google Document. In this case, each image source is a direct link to the image, set by the above scripts.
