Benchmark: Importing CSV Data to Spreadsheet using Google Apps Script

Gists

Kanshi Tanaike

Introduction

Please be careful! This result can be only used for Google Apps Script.

There are a limit executing time for Google Apps Script (GAS). That is 6 minutes for Consumer and Google Apps free edition, and 30 minutes for G Suite and Early Access. 1 So many users always have to pay attention to reducing the process cost of scripts. So it is very important to know the process cost of various situations. I have already reported the costs for various processes as the reports. 2 In this report, the process cost for importing CSV data to Spreadsheet using GAS has been investigated.

As the result, it was found that the method using Sheets API is the lowest cost for importing CSV data to the Spreadsheet.

Experimental procedure

In order to measure the process cost for importing CSV data to a Spreadsheet, 4 patterns for importing it were used as shown in Table 1. The sample CSV data was created as a text file. The sample CSV data can be seen at here. The numbers of rows and columns of the CSV data are 50,000 and 10, respectively. Each value of the cells is a string value of 15 bytes. At each pattern, the CSV data is retrieved from the file. And then, the CSV data is parsed and put to the existing Spreadsheet. The CSV data is put to the 1st tab of the existing Spreadsheet.

The 4 patterns for importing the CSV data are shown in Table 1.

Tests Process
pattern1 CSV data is parsed using split(), and put the parsed values to Spreadsheet using setValues(). 3
pattern2 CSV data is parsed using Utilities.parseCsv(), and put the parsed values to Spreadsheet using setValues(). 4
pattern3 CSV data is converted to Spreadsheet using files.copy Drive API, and retrieved values from the converted Spreadsheet using getValues(), then, the retrieved values are put to Spreadsheet using setValues(). 5
pattern4 CSV data is directly put to Spreadsheet using pasteData of Sheets API. 6

The sample scripts of GAS which were used for this report are here. By the way, at GAS, the processing time is not stable as you know. So the average value for more than 100 times measurements was used for each data point which is shown by figures. At this time, the fluctuation of the average values was less than 1 %. I worry that each detailed-data point at my environment might be different from that at other user’s environment. But I think that the trend of this result can be used.

Results

Figure 1 shows the process costs for importing CSV data by 4 patterns.

Fig 1. Process costs for importing CSV data by 4 patterns. About pattern 1 to 4, you can see them at Table 1.

From Fig.1, the following results can be obtained.

  • Costs of pattern1 and pattern2 are almost the same.

    • When the scripts of pattern1 and pattern2 are seen, pattern1 parses the CSV file with the loop process, while pattern2 parses it with the method of Utilities.parseCsv(). From this, it is found that if you want to modify the values when the CSV data is parsed, pattern1 is suitable for this.
  • Costs of pattern3 is highest of all.

    • It is found that the cost for creating new file is very high. By this, the cost became highest of all patterns.
  • Costs of pattern4 is lowest of all.

    • It is found that when Sheets API is used, both parsing the CSV data and putting to Spreadsheet can be done by one API call. It is considered that the parse is processed at the internal server. By this, it is considered that the cost became lowest of all patterns.
    • pattern4 can reduce the cost by 56 % from pattern1 and pattern2.
    • pattern4 can reduce the cost by 72 % from pattern3.

Summary

In this report, the process cost for importing CSV data to Spreadsheet using GAS has been investigated. As the result, the following results were obtained.

  • It was found that pattern4, which uses pasteData of Sheets API, was the lowest cost of all.
  • When pattern4 is used for importing CSV data to Spreadsheet, the cost can be reduced by 56 % from pattern1 and pattern2, which use the method parsing and putting values.
  • When pattern4 is used for importing CSV data to Spreadsheet, the cost can be reduced by 72 % from pattern3, which use the method converting mimeType from CSV to Spreadsheet.

As a note, I have to describe that this is the result for Google Apps Script. For other languages, this result might be difference. And also, the process cost of this report might be modified by future update of Google.

References

  1. Current limitations at Quotas for Google Services
  2. Benchmarks for Google Apps Script
  3. split()
  4. Class Utilities
  5. Google Drive API
  6. Google Sheets API

Appendix

A1. Scripts

Here, the scripts for measuring the process cost are shown as follows. When you use these scripts, please enable Drive API and Sheets API at Advanced Google services.

Pattern1

CSV data is parsed using split(), and put the parsed values to Spreadsheet using setValues(). I have already know that the process cost of map() is the lowest of all loop processes for Google Apps Script. So I used this here.

var data = DriveApp.getFileById(csvFileId)
  .getBlob()
  .getDataAsString();
var values = data.split("\n").map(function(e) {
  return e.split(",");
});
var sheet = SpreadsheetApp.openById(spreadsheetId).getSheets()[0];
sheet.getRange(1, 1, values.length, values[0].length).setValues(values);

Pattern2

CSV data is parsed using Utilities.parseCsv(), and put the parsed values to Spreadsheet using setValues().

var data = DriveApp.getFileById(csvFileId)
  .getBlob()
  .getDataAsString();
var values = Utilities.parseCsv(data);
var sheet = SpreadsheetApp.openById(spreadsheetId).getSheets()[0];
sheet.getRange(1, 1, values.length, values[0].length).setValues(values);

Pattern3

CSV data is converted to Spreadsheet using files.copy Drive API, and retrieved values from the converted Spreadsheet using getValues(), then, the retrieved values are put to Spreadsheet using setValues().

var fileId = Drive.Files.copy(
  {
    title: "sample",
    parents: [{ id: folderId }],
    mimeType: MimeType.GOOGLE_SHEETS
  },
  csvFileId
).id;
var values = SpreadsheetApp.openById(fileId)
  .getSheets()[0]
  .getDataRange()
  .getValues();
var sheet = SpreadsheetApp.openById(spreadsheetId).getSheets()[0];
sheet.getRange(1, 1, values.length, values[0].length).setValues(values);
Drive.Files.remove(fileId);

Pattern4

CSV data is directly put to Spreadsheet using pasteData of Sheets API.

var data = DriveApp.getFileById(csvFileId)
  .getBlob()
  .getDataAsString();
var sheetId = SpreadsheetApp.openById(spreadsheetId)
  .getSheets()[0]
  .getSheetId();
var resource = {
  requests: [
    {
      pasteData: {
        data: data,
        coordinate: { sheetId: sheetId },
        delimiter: ","
      }
    }
  ]
};
Sheets.Spreadsheets.batchUpdate(resource, spreadsheetId);

A2. Sample CSV data

The sample CSV data which was used at this report is as follows. The numbers of rows and columns are 50,000 and 10, respectively. Each value of the cells is a string value of 15 bytes.

r1-c1##########,r1-c2##########,r1-c3##########,r1-c4##########,r1-c5##########,r1-c6##########,r1-c7##########,r1-c8##########,r1-c9##########,r1-c10##########
r2-c1##########,r2-c2##########,r2-c3##########,r2-c4##########,r2-c5##########,r2-c6##########,r2-c7##########,r2-c8##########,r2-c9##########,r2-c10##########
r3-c1##########,r3-c2##########,r3-c3##########,r3-c4##########,r3-c5##########,r3-c6##########,r3-c7##########,r3-c8##########,r3-c9##########,r3-c10##########
r4-c1##########,r4-c2##########,r4-c3##########,r4-c4##########,r4-c5##########,r4-c6##########,r4-c7##########,r4-c8##########,r4-c9##########,r4-c10##########
r5-c1##########,r5-c2##########,r5-c3##########,r5-c4##########,r5-c5##########,r5-c6##########,r5-c7##########,r5-c8##########,r5-c9##########,r5-c10##########
.
.
.

TOP

 Share!