using gdrive and bash to clean up after your pi makes a mess on the cloud

04 Sep 2017

So my pi made a mess. A >10 GB mess made of very small ~50 KB files in the root of a Google Drive. I liked that drive. I missed it. And I was going to get it back.

Here’s how.

First try was select all in the drive. This doesn’t exist. If it does, please don’t tell me. At least not for a few days.

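Next was Google Apps Script. This has a five minute runtime limit and seems to delete about 3 files per second. With more than 10 GB of ~50 KB files, that’s on the order of 200,000 files, or roughly 18 hours at that rate. This job is going to take a lot more than 5 minutes.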

But I found a command line utility that talks to google drive! Maybe it has a mass… nope.

I could write a python or perl or whatever script at this point, but strangely that didn’t occur to me. My path led to a bash script.

This is somewhat surprising. It’s a bit embarrassing, like when you forget someone’s name but wait too long to ask it… but I’ve been working with computers for a long time and never once written a bash script. But you can’t be afraid of your ignorances; shrinking from them makes you smaller. And now I can feel the teacher marking me down for switching points of view, then further for pointing out that I realize I’m doing it.

Anyway, code time. Starting with gdrive list, we see some ids. These must be what I need to delete. To grab the id column (googled for awk here), awk -F ' ' '{print $1}' seems to work; I had to fiddle with the $1 to get it pointing at the right column. That still includes the header line, so it needs a tail -n +2. Now I can test it just by pasting the pipeline in: gdrive list | awk -F ' ' '{print $1}' | tail -n +2. Cool, now I need to call the delete with each of these ids.
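
If you want to poke at the id-extraction step without touching a real drive, here is a small sketch. The sample listing is made up for illustration (the real gdrive list columns may look different), and sample_listing and extract_ids are hypothetical helper names. It also folds the tail step into awk by skipping the first record, which should be equivalent for this purpose.

```shell
# Made-up listing for illustration only; real gdrive output may differ.
sample_listing() {
cat <<'EOF'
Id        Name        Type   Size      Created
abc123    img001.jpg  bin    50.1 KB   2017-09-01 10:00:00
def456    img002.jpg  bin    49.8 KB   2017-09-01 10:00:05
EOF
}

# Print the first whitespace-separated column, skipping the header row (NR == 1).
extract_ids() {
    awk 'NR > 1 { print $1 }'
}

sample_listing | extract_ids
# prints:
# abc123
# def456
```

Swapping sample_listing for the real gdrive list should give the same kind of output, one id per line.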

Here’s the script:

IFS=$'\n'       # make newlines the only separator
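# the pipeline always "succeeds", so also stop once the listing comes back empty
while OUTPUT=$(gdrive list | awk -F ' ' '{print $1}' | tail -n +2) && [ -n "$OUTPUT" ]; do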
for j in $OUTPUT
do
        echo "$j"
        gdrive delete -r "$j"
done
done

But it’s super slow. By default, list only gives 30 ids per call, so let’s ask for more with -m… and maybe we can parallelize it?

IFS=$'\n'       # make newlines the only separator
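# the pipeline always "succeeds", so also stop once the listing comes back empty
while OUTPUT=$(gdrive list -m 999 | awk -F ' ' '{print $1}' | tail -n +2) && [ -n "$OUTPUT" ]; do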
for j in $OUTPUT
do
        echo "$j"
        gdrive delete -r "$j"&
done
wait
done

Holy rate limiting, Batman! Let’s slow it down to the google script speed, about 3 deletes per second.

IFS=$'\n'       # make newlines the only separator
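# the pipeline always "succeeds", so also stop once the listing comes back empty
while OUTPUT=$(gdrive list -m 999 | awk -F ' ' '{print $1}' | tail -n +2) && [ -n "$OUTPUT" ]; do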
for j in $OUTPUT
do
        echo "$j"
        gdrive delete -r "$j"&
        sleep .3
done
wait
done

There we go. I think GNU parallel might have solved this too, and it would have been a one-liner then, though I’m not sure if it offers options to slow down the execution. And this did the trick, after running for a while.
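
For what it’s worth, GNU parallel does have a --delay option that pauses between job starts, so a throttled version could look roughly like the sketch below. I haven’t run this against a real drive. throttled_delete and the DELETE_CMD override are hypothetical names introduced here so the pipeline can be dry-run with echo, and the job count of 4 is an arbitrary pick.

```shell
# Read ids on stdin, one per line, and run the delete command on each with GNU parallel.
# --delay 0.3 waits 0.3s between job starts; -j 4 caps concurrent jobs (arbitrary choice).
# -k keeps the output in input order.
# DELETE_CMD is a hypothetical override, e.g. DELETE_CMD=echo for a dry run.
throttled_delete() {
    cmd="${DELETE_CMD:-gdrive delete -r}"
    parallel -k --delay 0.3 -j 4 "$cmd" {}
}

# Usage, assuming the same listing pipeline as in the scripts above:
# gdrive list -m 999 | awk -F ' ' '{print $1}' | tail -n +2 | throttled_delete
```

Running it with DELETE_CMD=echo first is a cheap way to see which ids would be deleted before letting it loose.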