19 Apr 2018, 14:24

With Google’s update on January 19, 2018, the fetchAll method was added to the UrlFetch service. When I looked at its usage, I couldn’t find detailed information about how it actually runs, so I investigated it.

As a result, I found that the fetchAll method works asynchronously, and that the returned data is reordered to match the order of the requests. It follows that when you want to retrieve data from several URLs, the processing cost of UrlFetchApp.fetchAll() is much lower than that of calling UrlFetchApp.fetch() in a for loop.

The sample scripts for the server side and the client side are as follows.

Sample script for server side

In this report, 5 Web Apps were used as the servers. First, 5 standalone projects were created and the following server script was put into each project. Then a Web App was deployed from each project. These deployed Web Apps were accessed from the client-side script. When the client accesses a server, the server returns the Unix time (in milliseconds) and its server ID. From the Unix times, the order in which the requests were actually fetched can be obtained.

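function doGet(e) {
    // The deployed URL has the form https://script.google.com/macros/s/{ID}/exec,
    // so the second-to-last path segment is used as the server ID.
    var url = ScriptApp.getService().getUrl().split("/");
    var val = JSON.stringify({
        date: new Date().getTime().toString(), // Unix time (ms) recorded before the sleep
        serverId: url[url.length - 2],
        key: e.parameter.key,
    });

    Utilities.sleep(5000); // Waiting for 5 seconds

    return ContentService.createTextOutput(val).setMimeType(ContentService.MimeType.JSON);
}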
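
The server ID in the script above depends on the shape of the Web Apps URL. As a minimal sketch of that step, assuming the URL always ends with `/s/{ID}/exec`, the extraction can be tried outside Apps Script as follows. The helper name `extractServerId` and the ID `AKfycbEXAMPLE` are made up for illustration.

```javascript
// Hypothetical helper: returns the ID segment of a Web Apps URL of the form
// https://script.google.com/macros/s/{ID}/exec
function extractServerId(webAppUrl) {
  var parts = webAppUrl.split("/");
  return parts[parts.length - 2]; // second-to-last path segment
}

// "AKfycbEXAMPLE" is a made-up ID used only for this illustration.
var id = extractServerId("https://script.google.com/macros/s/AKfycbEXAMPLE/exec");
```

Because the client builds its request URLs from the same IDs, the value returned as serverId can be used directly as a key of the `actions` object on the client side.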

Sample script for client side

To use this script, deploy the Web Apps first and then run the function main(). The reason that I often use map() is explained here.

// As a sample, 5 URLs were used.
var actions = {
  "### ID for Web Apps URL ###": "action1", // https://script.google.com/macros/s/### ID for Web Apps URL ###/exec
  "### ID for Web Apps URL ###": "action2",
  "### ID for Web Apps URL ###": "action3",
  "### ID for Web Apps URL ###": "action4",
  "### ID for Web Apps URL ###": "action5",
};

function createReq() {
  return Object.keys(actions).map(function(e, i) {
    return {
      "url": "https://script.google.com/macros/s/" + e + "/exec?key=" + actions[e],
      "method": "get",
    };
  });
}

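function dispResults(response, elapsedTime) {
  var res = response.map(function(e) {return JSON.parse(e)});
  // Order in which the data was returned to the client.
  var responsedOrder = res.map(function(e) {return actions[e.serverId]});
  // Order in which the servers were actually reached, from the server-side timestamps.
  // The array is copied before sorting and the dates are compared as numbers.
  var actualOrder = res.slice()
                       .sort(function(x, y) {return Number(x.date) - Number(y.date)})
                       .map(function(e) {return actions[e.serverId]});
  return {
    elapsedTime: elapsedTime,
    responsedOrder: responsedOrder,
    actualOrder: actualOrder,
  };
}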

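// Using UrlFetchApp.fetchAll()
function useFetchAll(req) {
  var startTime = Date.now();
  var response = UrlFetchApp.fetchAll(req);
  var endTime = Date.now();
//  Logger.log(response);
  // fetchAll() returns HTTPResponse objects, so take their text content explicitly.
  var contents = response.map(function(e) {return e.getContentText()});
  return dispResults(contents, (endTime - startTime) / 1000);
}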

// Using UrlFetchApp.fetch()
function useFetch(req) {
  var startTime = Date.now();
  var response = req.map(function(e) {return UrlFetchApp.fetch(e.url).getContentText()});
  var endTime = Date.now();
//  Logger.log(response);
  return dispResults(response, (endTime - startTime) / 1000);
}

// Run
function main() {
  var req = createReq();
  var results = [useFetchAll(req), useFetch(req)];
  Logger.log(JSON.stringify(results));
}

Result

In the results, elapsedTime, responsedOrder and actualOrder are the elapsed time for the 5 requests (in seconds), the order of the returned data and the order in which the requests were actually fetched, respectively.
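
To make the difference between responsedOrder and actualOrder concrete, here is a small sketch that runs outside Apps Script. It applies the same logic as dispResults() to made-up data: the IDs `idA`, `idB`, `idC`, the timestamps and the names `sampleActions`, `sampleResponses` and `summarize` are all assumptions for illustration, not values from the benchmark.

```javascript
// Made-up server IDs standing in for the real Web Apps IDs.
var sampleActions = {"idA": "action1", "idB": "action2", "idC": "action3"};

// Made-up server responses, listed in the order of the requests.
// The dates mean that server idC was reached first and idB last.
var sampleResponses = [
  JSON.stringify({date: "1524115440200", serverId: "idA", key: "action1"}),
  JSON.stringify({date: "1524115440300", serverId: "idB", key: "action2"}),
  JSON.stringify({date: "1524115440100", serverId: "idC", key: "action3"})
];

function summarize(responses, elapsedTime) {
  var res = responses.map(function(e) {return JSON.parse(e)});
  // Order of the returned data.
  var responsedOrder = res.map(function(e) {return sampleActions[e.serverId]});
  // Order by server-side timestamp, sorted on a copy of the array.
  var actualOrder = res.slice()
    .sort(function(x, y) {return Number(x.date) - Number(y.date)})
    .map(function(e) {return sampleActions[e.serverId]});
  return {elapsedTime: elapsedTime, responsedOrder: responsedOrder, actualOrder: actualOrder};
}

var summary = summarize(sampleResponses, 5.5);
// summary.responsedOrder is ["action1", "action2", "action3"]
// summary.actualOrder    is ["action3", "action1", "action2"]
```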

UrlFetchApp.fetchAll()

{
    "elapsedTime"    : 5.502,
    "responsedOrder" : ["action1","action2","action3","action4","action5"],
    "actualOrder"    : ["action5","action1","action3","action2","action4"]
}

UrlFetchApp.fetch()

{
    "elapsedTime"    : 27.011,
    "responsedOrder" : ["action1","action2","action3","action4","action5"],
    "actualOrder"    : ["action1","action2","action3","action4","action5"]
}

These 2 results are taken from several runs and can be regarded as typical. From these results, the following points can be seen.

The server script sleeps for 5 seconds. For fetchAll(), the total elapsed time is 5.5 s.
For fetchAll(), responsedOrder is the same as the order of the requests. The order of the requests is “action1”, “action2”, “action3”, “action4” and “action5”.
For fetchAll(), actualOrder is NOT the same as the order of the requests. The order is different every time.
For fetch(), the 5-second sleep is added at each fetch, so the total elapsed time is 27.0 s.
For fetch(), responsedOrder is the same as the order of the requests.
For fetch(), actualOrder is the same as the order of the requests.
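
The two elapsed times can be checked with simple arithmetic. The sketch below assumes that the only significant cost is the 5-second sleep on each server: sequential requests then add up their delays, while overlapping requests are bounded by the longest one. The variable names are made up for illustration, and the measured values are copied from the results above.

```javascript
// Per-request server delay in seconds, from Utilities.sleep(5000) in the server script.
var delays = [5, 5, 5, 5, 5];

// Sequential fetch(): each request waits for the previous one, so the delays add up.
var sequentialFloor = delays.reduce(function(sum, d) {return sum + d}, 0);

// fetchAll(): the requests overlap, so the longest single delay dominates.
var concurrentFloor = Math.max.apply(null, delays);

// Measured values from the results above, in seconds.
var measured = {fetch: 27.011, fetchAll: 5.502};

// What remains after subtracting the sleep time (network and processing overhead).
var overhead = {
  fetch: measured.fetch - sequentialFloor,
  fetchAll: measured.fetchAll - concurrentFloor
};
```

Under this assumption the sleep accounts for 25 s of the fetch() run and 5 s of the fetchAll() run, leaving roughly 2.0 s and 0.5 s of overhead, respectively.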

Summary

It was found that the fetchAll method works asynchronously.
After the asynchronous processing, the returned values are reordered to match the order of the requests.
It was also found that when you want to retrieve data from several URLs, the processing cost of UrlFetchApp.fetchAll() is much lower than that of UrlFetchApp.fetch() in a for loop.

In this report, I confirmed that 30 requests work with the fetchAll method, but I couldn’t find the limit on the number of requests.

Limitation of number of requests

I confirmed that 1000 requests work fine, but I couldn’t find the limit.
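
Since the upper limit is unknown, one cautious option is to split a large request list into batches and call fetchAll() once per batch. The following is only a sketch of that idea and was not part of the benchmark: the helper name `chunkRequests` is hypothetical, and the batch size of 100 in the comment is an arbitrary assumption, not a documented limit.

```javascript
// Hypothetical helper: split an array of request objects into batches of batchSize.
function chunkRequests(requests, batchSize) {
  var batches = [];
  for (var i = 0; i < requests.length; i += batchSize) {
    batches.push(requests.slice(i, i + batchSize));
  }
  return batches;
}

// In Apps Script, each batch could then be passed to UrlFetchApp.fetchAll(), e.g.:
//   var responses = [];
//   chunkRequests(req, 100).forEach(function(batch) { // 100 is an arbitrary choice
//     responses = responses.concat(UrlFetchApp.fetchAll(batch));
//   });

// Demonstration with plain numbers instead of request objects.
var demo = chunkRequests([1, 2, 3, 4, 5, 6, 7], 3);
// demo is [[1, 2, 3], [4, 5, 6], [7]]
```

The batches run one after another, so the total time grows with the number of batches, but the responses stay in the order of the requests within and across batches.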

Benchmark: fetchAll method in UrlFetch service for Google Apps Script