gdown: Fixing the frequent failures when downloading large datasets from Google Drive

GitHub repository: https://github.com/wkentaro/gdown

Installation

pip install gdown

# to upgrade
pip install --upgrade gdown
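
To confirm that the install or upgrade took effect, the installed version can be read from the package metadata. This is a minimal sketch, assuming Python 3.8 or newer; the helper name `installed_version` is hypothetical and not part of gdown.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("gdown")
    print(f"gdown {found}" if found else "gdown is not installed")
```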

Usage

  • Download via URL (large file, ~500MB)
$ gdown https://drive.google.com/uc?id=1l_5RK28JRL19wpT22B-DY9We3TVXnnQQ
$ md5sum fcn8s_from_caffe.npz
256c2a8235c1c65e62e48d3284fbd384
  • Download by file ID
$ gdown 1l_5RK28JRL19wpT22B-DY9We3TVXnnQQ
  • Download a folder
$ gdown https://drive.google.com/drive/folders/15uNXeRBIhVvZJIhL4yTw4IsStMhUaaxl -O /tmp/folder --folder
  • Download from Python
import gdown

# a file
url = "https://drive.google.com/uc?id=1l_5RK28JRL19wpT22B-DY9We3TVXnnQQ"
output = "fcn8s_from_caffe.npz"
gdown.download(url, output, quiet=False)

# the same call, but passing a file ID instead of a URL
# (note that this ID differs from the one in the URL above)
id = "0B9P1L--7Wd2vNm9zMTJWOGxobkU"
gdown.download(id=id, output=output, quiet=False)

# same as the above, and you can copy-and-paste a URL from Google Drive with fuzzy=True
url = "https://drive.google.com/file/d/0B9P1L--7Wd2vNm9zMTJWOGxobkU/view?usp=sharing"
gdown.download(url=url, output=output, quiet=False, fuzzy=True)

# cached download with identity check via MD5
md5 = "fa837a88f0c40c513d975104edf3da17"
gdown.cached_download(url, output, md5=md5, postprocess=gdown.extractall)

# a folder
url = "https://drive.google.com/drive/folders/15uNXeRBIhVvZJIhL4yTw4IsStMhUaaxl"
gdown.download_folder(url, quiet=True, use_cookies=False)

# same as the above, but with the folder ID
id = "15uNXeRBIhVvZJIhL4yTw4IsStMhUaaxl"
gdown.download_folder(id=id, quiet=True, use_cookies=False)
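
The file URLs used above carry the file ID in different places: as an `id` query parameter in the `uc?id=...` form, and as a path segment in the `/file/d/.../view` form. The sketch below pulls the ID out of either form. It is not gdown's own parsing logic (`fuzzy=True` handles this inside the library); `extract_file_id` is a hypothetical helper that covers only the two URL forms shown in this post.

```python
import re
from urllib.parse import parse_qs, urlparse


def extract_file_id(url):
    """Return the Drive file ID found in `url`, or None if no known form matches."""
    parsed = urlparse(url)
    # Form 1: https://drive.google.com/uc?id=<ID>
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]
    # Form 2: https://drive.google.com/file/d/<ID>/view?usp=sharing
    match = re.search(r"/file/d/([^/]+)", parsed.path)
    return match.group(1) if match else None
```

The returned ID can then be passed to `gdown.download(id=...)` as in the example above.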

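On systems without the `md5sum` command, the same integrity check can be done from Python with the standard library. This is a minimal sketch; `file_md5` and `matches_md5` are hypothetical helper names, not gdown functions, and the file is read in chunks so that large downloads do not have to fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected):
    """Return True if the file's MD5 equals `expected` (case-insensitive)."""
    return file_md5(path) == expected.lower()
```

For the file downloaded in the first example, `matches_md5("fcn8s_from_caffe.npz", "256c2a8235c1c65e62e48d3284fbd384")` performs the same comparison as the `md5sum` command shown there.
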
https://wangyinan.cn/gdown:解决在-Google-Drive-上下载大型数据集常发生的失败现象
Author
yinan
Published
December 24, 2023
License