Workaround: Exporting Google Documents as HTML with Image Hyperlinks


This is a sample script for exporting a Google Document as HTML with image hyperlinks instead of embedded image data, using Google Apps Script.

Recently, it seems that the specification for exporting Google Documents as HTML has been changed. Previously, when a Google Document was exported as HTML, the images in the document were referenced by publicly shared image hyperlinks. Now, when a Google Document is exported as HTML, the images are embedded as data URLs (base64 data). I guess that this might be related to enhancing security. When the Google Document is exported as a ZIP file, the HTML and the images are separated. In that case, however, the images must be placed in a specific folder such as “/images”, and I’m worried that this might bring another issue.

Because of this, when a Google Document including images is exported as HTML, the HTML data becomes large, since base64 encoding makes each image about one third larger than its binary size. For example, when the HTML is used as an email body, the size of each email increases accordingly.
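To get a feeling for the size increase, the following is a minimal sketch that estimates the length of base64 text for a given number of image bytes. The function name `base64Length` and the 300,000-byte image are just assumptions for illustration and are not part of the export itself.

```javascript
// Estimate the number of base64 characters needed for binary data.
// Base64 encodes every 3 bytes as 4 characters, padding the last group.
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Hypothetical example: one image of 300,000 bytes in the document.
const imageBytes = 300000;
const encodedChars = base64Length(imageBytes);
console.log(encodedChars); // 400000
console.log((encodedChars / imageBytes).toFixed(2)); // "1.33"
```

So, each embedded image adds about 33% on top of its original size to the HTML.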

In this post, I would like to introduce a method for exporting a Google Document as HTML with image hyperlinks instead of data URLs, using Google Apps Script.

There are 2 patterns.

Pattern 1:

In this pattern, the images in the Google Document are created as image files in a specific folder. The image files are then publicly shared, and their webContentLinks are used as the image sources.

function sample1() {
  const folderId = "###"; // Please set the folder ID. The image files are put into this folder.

  const doc = DocumentApp.getActiveDocument();
  const url = `https://docs.google.com/document/d/${doc.getId()}/export?format=html`;
  // Export the document as HTML. The images are included as data URLs.
  let html = UrlFetchApp.fetch(url, {
    headers: { Authorization: "Bearer " + ScriptApp.getOAuthToken() },
  }).getContentText();
  const folder = DriveApp.getFolderById(folderId);
  [...html.matchAll(/src\="(data:image.*?)"/g)].forEach(([, e], i) => {
    // e is like "data:image/png;base64,iVBOR...". Split it into the header and the base64 data.
    const [header, data] = e.split(",");
    const mimeType = header.replace(/^data:/, "").split(";")[0]; // e.g. "image/png"
    const temp = folder.createFile(
      Utilities.newBlob(Utilities.base64Decode(data), mimeType, `image${i + 1}`)
    );
    // Publicly share the image file and replace the data URL with its download link.
    temp.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.VIEW);
    html = html.replace(
      e,
      `https://drive.google.com/uc?export=download&id=${temp.getId()}`
    );
  });

  console.log(html);
}
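The replacement step of the above script can also be checked without touching Google Drive. The following is a minimal sketch in plain JavaScript, where `replaceDataUrls` and the callback `toUrl` are hypothetical names introduced here for illustration. In an actual script, `toUrl` would create the file in Drive and return its link.

```javascript
// Replace each data URL in the src attributes with a URL returned by toUrl.
// toUrl receives the MIME type, the base64 data, and a 1-based image index.
function replaceDataUrls(html, toUrl) {
  let index = 0;
  return html.replace(/src="(data:image\/[^"]*)"/g, (_, dataUrl) => {
    const [header, data] = dataUrl.split(",");
    const mimeType = header.replace(/^data:/, "").split(";")[0];
    index += 1;
    return `src="${toUrl({ mimeType, data, index })}"`;
  });
}

// Sample HTML with 2 embedded images (the base64 data is a dummy value).
const sampleHtml =
  '<img src="data:image/png;base64,AAAA"><img src="data:image/jpeg;base64,BBBB">';
const replaced = replaceDataUrls(
  sampleHtml,
  ({ index }) => `https://example.com/image${index}`
);
console.log(replaced);
```

Using a replacer function handles every matched image in one pass, including the case that the same image is used several times in the document.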

Pattern 2:

In this pattern, the images in the Google Document are also created as image files in a specific folder, but the thumbnail links of the files are used as the image sources.

Before you test this script, please enable Drive API at Advanced Google services. In this pattern, the image files do not need to be publicly shared.

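function sample2() {
  const folderId = "###"; // Please set the folder ID. The image files are put into this folder.

  const doc = DocumentApp.getActiveDocument();
  const url = `https://docs.google.com/document/d/${doc.getId()}/export?format=html`;
  // Export the document as HTML. The images are included as data URLs.
  let html = UrlFetchApp.fetch(url, {
    headers: { Authorization: "Bearer " + ScriptApp.getOAuthToken() },
  }).getContentText();
  const folder = DriveApp.getFolderById(folderId);
  [...html.matchAll(/src\="(data:image.*?)"/g)].forEach(([, e], i) => {
    // e is like "data:image/png;base64,iVBOR...". Split it into the header and the base64 data.
    const [header, data] = e.split(",");
    const mimeType = header.replace(/^data:/, "").split(";")[0]; // e.g. "image/png"
    const temp = folder.createFile(
      Utilities.newBlob(Utilities.base64Decode(data), mimeType, `image${i + 1}`)
    );
    // Request thumbnailLink explicitly, because Drive API v3 does not return it by default.
    const { thumbnailLink } = Drive.Files.get(temp.getId(), { fields: "thumbnailLink" });
    // Enlarge the thumbnail by changing the size suffix (e.g. "=s220") to "=s1000".
    html = html.replace(e, thumbnailLink.replace(/=s\d+$/, "=s1000"));
  });

  console.log(html);
}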
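The size of the thumbnail is changed by rewriting the suffix of the thumbnail link. The following is a minimal sketch of that step, assuming that the link ends with a size suffix like `=s220` as in the above script. The function name `resizeThumbnailLink` and the sample URL are hypothetical.

```javascript
// Change the trailing size suffix ("=s" + digits) of a thumbnail link.
// When the link has no such suffix, it is returned unchanged.
function resizeThumbnailLink(thumbnailLink, size) {
  return thumbnailLink.replace(/=s\d+$/, `=s${size}`);
}

// Hypothetical thumbnail link for illustration.
const sampleLink = "https://lh3.googleusercontent.com/sampleThumbnail=s220";
console.log(resizeThumbnailLink(sampleLink, 1000));
// https://lh3.googleusercontent.com/sampleThumbnail=s1000
```

Matching any digits after `=s` avoids depending on the default size being exactly 220.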

Results

The results of both patterns are the same except for the image URLs. When the above sample scripts are used, the following result is obtained.

The sample images are from http://k3-studio.deviantart.com/.

From

This is a sample Google Document.

To

This is the HTML exported from the sample Google Document. Here, each image source is a direct link to the image, as set by the above scripts.

Note

  • As an important point, this method uses the direct links of the created image files. So, when you delete those image files, the links in the HTML no longer work. Please be careful about this.
  • In the case of pattern 2, the thumbnail link is used. The thumbnail link can be accessed publicly, but it seems that it might not be a permanent link. Please be careful about this.
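To know which image files must not be deleted, the file IDs can be collected from the HTML created by pattern 1. This is a minimal sketch, assuming that the links have the form `https://drive.google.com/uc?export=download&id=...` as in the sample script. The function name `extractDriveFileIds` and the file IDs in the sample are hypothetical.

```javascript
// Collect the Drive file IDs from the download links in the HTML.
// Both "&id=" and the HTML-escaped "&amp;id=" are accepted.
function extractDriveFileIds(html) {
  const pattern =
    /https:\/\/drive\.google\.com\/uc\?export=download&(?:amp;)?id=([\w-]+)/g;
  return [...html.matchAll(pattern)].map(([, id]) => id);
}

// Sample HTML with hypothetical file IDs.
const exportedHtml =
  '<img src="https://drive.google.com/uc?export=download&id=abc123">' +
  '<img src="https://drive.google.com/uc?export=download&amp;id=XyZ_-9">';
console.log(extractDriveFileIds(exportedHtml)); // [ 'abc123', 'XyZ_-9' ]
```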
