
Speed-up database downloads #1805

@aibaars

Is your feature request related to a problem? Please describe.

It takes too long to download a CodeQL database from GitHub into VS Code.

Describe the solution you'd like

Use multi-threaded downloads, fetching several chunks of the file concurrently, to speed things up.

Describe alternatives you've considered
N/A

Additional context

For example, the QL database from github/codeql is only 160MB, but it takes about 2 minutes to download. If I download the file as 10 concurrent chunks, the download takes less than 10 seconds. I wrote a small bash script to demonstrate.
A single 160MB chunk:

time sh script.sh github/codeql ql 1
gh api -H Accept: application/zip -H Range: bytes=0-165712932 /repos/github/codeql/code-scanning/codeql/databases/ql

real 2m9.894s
user 0m0.439s
sys 0m1.426s

and a download with 10 chunks of 16MB:

time sh script.sh github/codeql ql 10
gh api -H Accept: application/zip -H Range: bytes=0-16571293 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=16571294-33142587 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=33142588-49713881 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=49713882-66285175 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=66285176-82856469 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=82856470-99427763 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=99427764-115999057 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=115999058-132570351 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=132570352-149141645 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=149141646-165712932 /repos/github/codeql/code-scanning/codeql/databases/ql

real 0m9.752s
user 0m1.069s
sys 0m2.009s

The script

#! /bin/bash

nwo="$1"
lang="$2"
count="$3"

URL="/repos/${nwo}/code-scanning/codeql/databases/${lang}"

# Request the first two bytes to read the total size from the Content-Range header.
SIZE=$(gh api -H "Accept: application/zip" -H "Range: bytes=0-1" -i "${URL}" | tr -d '\r' | grep "Content-Range: bytes 0-1/" | cut -d / -f 2)
CHUNK_SIZE=$(expr "${SIZE}" / "${count}")

start=0
parts=""
# Download the first count-1 chunks in the background.
for i in $(seq $(expr "${count}" - 1))
do
  end=$(expr "${start}" + "${CHUNK_SIZE}")
  echo gh api -H "Accept: application/zip" -H "Range: bytes=${start}-${end}" "${URL}"
  gh api -H "Accept: application/zip" -H "Range: bytes=${start}-${end}" "${URL}" > "part-$i" &
  start=$(expr "${end}" + 1)
  parts="${parts} part-${i}"
done

# Download whatever remains as the final chunk.
if [ "${start}" -lt "${SIZE}" ] ; then
  echo gh api -H "Accept: application/zip" -H "Range: bytes=${start}-${SIZE}" "${URL}"
  gh api -H "Accept: application/zip" -H "Range: bytes=${start}-${SIZE}" "${URL}" > "part-${count}"
  parts="${parts} part-${count}"
fi

# Wait for the background downloads, then concatenate the parts in order.
wait
cat $parts > database.zip
rm -f $parts
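The range arithmetic the script relies on can be sketched on its own, without any network access. HTTP byte ranges are inclusive at both ends, so the last byte of a SIZE-byte file has offset SIZE-1. The helper below is a minimal sketch under that assumption; `byte_ranges` is a hypothetical name used only for illustration and is not part of the script above or of `gh`. It also assumes the chunk count does not exceed the file size.

```shell
#!/bin/bash
# Hypothetical helper (illustration only): print COUNT inclusive byte ranges,
# one "start-end" per line, that together cover bytes 0..SIZE-1 exactly once.
# Assumes 1 <= COUNT <= SIZE.
byte_ranges() {
  size="$1"
  count="$2"
  chunk=$(( size / count ))
  start=0
  i=1
  while [ "$i" -le "$count" ]; do
    if [ "$i" -eq "$count" ]; then
      # The last chunk absorbs the remainder of the integer division.
      end=$(( size - 1 ))
    else
      end=$(( start + chunk - 1 ))
    fi
    echo "${start}-${end}"
    start=$(( end + 1 ))
    i=$(( i + 1 ))
  done
}

# Example: split a file of the size reported above into 10 ranges.
byte_ranges 165712932 10
```

Each printed line could be passed as the value of a `Range: bytes=...` header. Because every range ends one byte before the next one starts, concatenating the parts in order should reproduce the original file.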
