Thursday, June 12, 2008

Garbage collect every git repository on your machine

One of the things I really like about git is that it doesn't automatically garbage collect and compact repositories. It's kind of like how you don't clean your room and take out the garbage (or at least I don't) every time you make a mess or throw something away. If you accidentally throw something away, you can pull it out of the trash before it goes to the dump. You don't have to be slowed down by a cleanup operation when you're focusing on getting things done.

However, there is a drawback to this. Git repositories DO start to slow down after a while if you aren't "cleaning your room", and they use disk space less efficiently.
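If you want a rough idea of how messy a given repository has become, one option is to count its loose (not yet packed) objects. The sketch below assumes a git new enough to support the -C option; loose_objects is a hypothetical helper name, not a git command.

```shell
#!/bin/sh
# loose_objects is a hypothetical helper name, not part of git.
# Prints the number of loose objects in the repository at $1
# (defaults to the current directory).
loose_objects() {
  # "git count-objects -v" prints a line of the form "count: N",
  # where N is the number of loose objects.
  git -C "${1:-.}" count-objects -v | awk '$1 == "count:" { print $2 }'
}
```

Calling loose_objects ~/src/myproject (a made-up path) prints a single number; a large number suggests the repository would benefit from a garbage collect.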

Lucky for us, since we are working with computers here, we don't have to clean our rooms by hand. The following little bash script crawls everything below the current directory (run it from / or your home directory to cover the whole drive), finds every git repository, and then garbage collects, prunes, and packs each one, reclaiming disk space and making your repositories faster:

#!/bin/bash
# pushd and popd are bash builtins, so this needs bash rather than plain sh.
find . -type d -name .git | while read -r dir; do pushd "$dir"; git gc --prune; popd; done
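Here is a sketch of one possible variant of the same crawl, wrapped in a function that takes the starting directory as an argument and survives directory names containing spaces. gc_all is a hypothetical helper name, not a git command. On current git versions --prune takes a date, so --prune=now is assumed here as the closest match to the bare --prune used above.

```shell
#!/bin/sh
# gc_all is a hypothetical helper name, not part of git.
# Garbage collects every repository found below $1
# (defaults to the current directory).
gc_all() {
  root="${1:-.}"
  # Hand the matching .git directories to a small inline script as
  # arguments, so odd characters in path names are passed through intact.
  find "$root" -type d -name .git -exec sh -c '
    for dir do
      echo "== $dir"
      # --git-dir points git at the repository without changing directory.
      git --git-dir="$dir" gc --prune=now
    done' sh {} +
}
```

Running gc_all "$HOME" would cover everything under your home directory. Note that --prune=now removes unreachable loose objects immediately, so leave that flag off if you would rather keep git's default grace period.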

PS: reflogs keep otherwise unreachable objects from being pruned. More on reflogs in a future post.
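As a rough illustration of that point: if you really do want the trash emptied right now, the reflog entries have to go first. The sketch below assumes a reasonably recent git (it relies on the -C option and on --prune taking a date); purge_unreachable is a hypothetical helper name, not a git command. Be careful, because this throws away exactly the safety net described above.

```shell
#!/bin/sh
# purge_unreachable is a hypothetical helper name, not part of git.
# WARNING: this discards the reflog safety net for the repository at $1
# (defaults to the current directory).
purge_unreachable() {
  repo="${1:-.}"
  # Expire every reflog entry right away so old commits are no longer
  # held in place by the reflog.
  git -C "$repo" reflog expire --expire=now --all &&
  # Then collect garbage and prune unreachable objects regardless of age.
  git -C "$repo" gc --prune=now
}
```

After this runs, a commit you amended or reset away from a minute ago can no longer be recovered, so it is probably best kept for repositories where you are sure nothing needs rescuing.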

5 comments:

Jakub Narebski said...

Current version of git has "git gc --auto", which is invoked by some git commands, but by default it expires (removes) only 'old garbage'.

Jakub Narebski said...

Instead of 'pushd "$dir"; git gc --prune; popd;' you can simply use 'git --git-dir="$dir" gc --prune;'

Tim Harper said...

That's a great tip, Jakub.

I still slightly prefer the "pushd" operation, because pushd and popd will output the directory stack upon execution, which makes it a little more verbose.

I also know you can pass the -exec parameter to find and substitute in a parameter there. But, for some reason, the while loop is so much more legible to me than:

find . -type d -name .git -exec git --git-dir={} gc --prune \;

Anonymous said...

Thanks. And a nice use of pushd/popd and find.

Joel Parker Henderson said...

Great idea Tim & Jakub, thanks for posting the pushd & the exec ways.