Resumable Download of File from Google Drive using Drive API with Python

This is a sample script for achieving a resumable download of a file from Google Drive using the Drive API with Python.

There may be cases where you want to resume the download of a file from Google Drive using the Drive API with Python. For example, when a large file is downloaded, the download might stop partway through. At that time, you might want to resume the download rather than start over. In this post, I would like to introduce a sample Python script for this.

In order to achieve a partial download from Google Drive, a Range header such as Range: bytes=500-999 must be included in the request. Unfortunately, at the current stage, MediaIoBaseDownload does not provide a way to set this header yourself. When MediaIoBaseDownload is used, the download always starts from the beginning of the file, so all data is downloaded.
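
As a rough illustration of the header involved, the following sketch builds a Range header value and reads the total size from a Content-Range response header. The helper names build_range_header and parse_content_range_total are hypothetical. They are not part of the Drive API or of requests.

```python
import re
from typing import Optional


def build_range_header(start: int, end: Optional[int] = None) -> str:
    """Return a Range header value; an open end means 'to the end of the file'."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    return f"bytes={start}-" if end is None else f"bytes={start}-{end}"


def parse_content_range_total(value: str) -> Optional[int]:
    """Return the total size from 'bytes 500-999/1234', or None when it is '*'."""
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if m is None:
        raise ValueError(f"unexpected Content-Range: {value!r}")
    return None if m.group(3) == "*" else int(m.group(3))


print(build_range_header(500, 999))  # bytes=500-999
print(build_range_header(500))       # bytes=500-
```

The open-ended form bytes=500- is the one that matters for resuming, because it asks for everything from the given offset to the end of the file.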

So, in order to achieve this goal, a workaround is required. In this workaround, I propose the following flow.

  1. Retrieve the filename and file size of the file on Google Drive that you want to download.
  2. Check whether a local file with that filename already exists.
    • When there is no existing file, the file is downloaded as a new file.
    • When there is an existing file, the download is resumed from the end of that file.
  3. Download the file content with requests.

When this flow is reflected in a sample Python script, it becomes as follows.

Sample script

import os
import sys

import requests
from googleapiclient.discovery import build

service = build("drive", "v3", credentials=creds)  # Here, please use your client.
file_id = "###"  # Please set the file ID of the file you want to download.

# The access token is retrieved from the creds used for service = build("drive", "v3", credentials=creds).
access_token = creds.token

# Get the filename and file size.
obj = service.files().get(fileId=file_id, fields="name,size").execute()
filename = obj.get("name", "sampleName")
size = obj.get("size", None)
if not size:
    sys.exit("No file size.")
size = int(size)

# Check existing file.
file_path = os.path.join("./", filename)  # Please set your path.
o = {}
if os.path.exists(file_path):
    o["start_byte"] = os.path.getsize(file_path)
    o["mode"] = "ab"
    o["download"] = "As resume"
else:
    o["start_byte"] = 0
    o["mode"] = "wb"
    o["download"] = "As a new file"
if o["start_byte"] == size:
    sys.exit("The download of this file has already been finished.")
if o["start_byte"] > size:
    sys.exit("The existing file is larger than the file on Google Drive.")

# Download process
print(o["download"])
headers = {
    "Authorization": f"Bearer {access_token}",
    "Range": f'bytes={o["start_byte"]}-',
}
url = f"https://www.googleapis.com/drive/v3/files/{file_id}?alt=media"
with requests.get(url, headers=headers, stream=True) as r:
    r.raise_for_status()
    # 206 means the Range header was honored. On a plain 200 the whole file
    # is returned, so overwrite the existing file instead of appending to it.
    if r.status_code != 206:
        o["mode"] = "wb"
    with open(file_path, o["mode"]) as f:
        for chunk in r.iter_content(chunk_size=10240):
            f.write(chunk)

  • When this script is run, the file of file_id is downloaded. When the download is stopped partway through and you run the script again, the download is resumed, and the remaining content is appended to the existing file. I think this might be the situation you expect.
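
After a resumed download, it may be worth checking that the appended file is complete. The following is a minimal sketch, assuming the file is a binary file for which the Drive API returns the md5Checksum field (it would have to be requested, for example with fields="name,size,md5Checksum"). The function names local_md5 and is_download_complete are hypothetical helpers, not part of any library.

```python
import hashlib


def local_md5(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 hex digest of a local file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_download_complete(path: str, remote_md5: str) -> bool:
    """Compare the local digest with the md5Checksum value reported by Drive."""
    return local_md5(path) == remote_md5.lower()
```

If the check fails, one option is to delete the local file and download it again from the beginning.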

Note

  • This script supposes that the file to download is not a Google Docs file (Document, Spreadsheet, Slides, and so on). Please be careful about this.
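
One way to guard against this is to look at the mimeType field before downloading. This is a sketch under the assumption that Google Docs files report a MIME type beginning with application/vnd.google-apps. and that mimeType has been added to the requested fields. The helper name is_google_docs_file is hypothetical.

```python
GOOGLE_APPS_PREFIX = "application/vnd.google-apps."


def is_google_docs_file(mime_type: str) -> bool:
    """Return True for Google-native types (Document, Spreadsheet, Slides, folders, ...)."""
    return mime_type.startswith(GOOGLE_APPS_PREFIX)


print(is_google_docs_file("application/vnd.google-apps.spreadsheet"))  # True
print(is_google_docs_file("application/pdf"))                          # False
```

When the check returns True, the script could stop early instead of requesting alt=media.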

  • This script supposes that your client service = build("drive", "v3", credentials=creds) can be used for downloading the file from Google Drive. Please be careful about this.
