One of the things I really like about git is that it doesn't automatically garbage collect and compact repositories. It's kind of like how you don't clean your room and take out the garbage (or at least I don't) every time you make a mess or throw something away. If you accidentally throw something away, you can pull it out of the trash before it goes to the dump. You don't have to be slowed down by a cleanup operation when you're focusing on getting things done.
However, there is a drawback to this. Git repositories DO start to slow down after a while if you aren't "cleaning your room", and they use more disk space than they need to.
Lucky for us, since we are working with computers here, we don't have to clean our rooms by hand. The following little bash script will crawl everything below the current directory (run it from / if you really want your whole hard drive), look for any git repositories, and then garbage collect, prune, and pack them, regaining your disk space and making your repositories operate faster:
#!/bin/bash
# pushd and popd are bash builtins, so this needs bash rather than plain sh.
find . -type d -name .git | while read -r dir; do
    pushd "$dir"
    git gc --prune
    popd
done
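If you want to see what the cleanup actually bought you, here is a rough sketch of a variation on the script above. It assumes bash and a find that supports -print0; gc_all is a made-up helper name, not something git ships. It uses --git-dir instead of pushd/popd and prints the loose object count for each repository before and after the collection.

```shell
#!/bin/bash
# gc_all is a hypothetical helper name, not part of git.
# Usage: gc_all [start-directory]   (defaults to the current directory)
gc_all() {
    local root="${1:-.}"
    # -print0 with read -d '' keeps paths containing spaces intact.
    find "$root" -type d -name .git -print0 |
    while IFS= read -r -d '' dir; do
        echo "== $dir"
        # Loose object count and size before the cleanup.
        git --git-dir="$dir" count-objects
        # </dev/null so git can never swallow the path list from the pipe.
        git --git-dir="$dir" gc --quiet --prune </dev/null
        # Same numbers after the cleanup.
        git --git-dir="$dir" count-objects
    done
}
```

Call it as gc_all ~/src (or with no argument for the current directory). The before and after lines come straight from git count-objects, so the difference between them is the loose objects that got packed or pruned.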
PS: reflogs keep objects from being pruned, so anything a reflog entry still points at will survive the cleanup. More on reflogs in a future post.
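To make the PS concrete, here is a minimal sketch of how you could clear the reflogs first so that pruning really removes unreachable objects. It assumes a git new enough to have the -C option, and purge_unreachable is a hypothetical name I made up for illustration. Be careful: this throws away exactly the "pull it out of the trash" safety net described above, so only run it on a repository where you are sure nothing needs rescuing.

```shell
#!/bin/bash
# purge_unreachable is a hypothetical helper name, not a git command.
# WARNING: after this runs, commits reachable only via the reflog are gone.
# Usage: purge_unreachable [repository-directory]
purge_unreachable() {
    local repo="${1:-.}"
    # Drop every reflog entry so nothing is kept alive by a reflog alone.
    git -C "$repo" reflog expire --expire=now --all
    # Collect garbage and prune unreachable objects regardless of their age.
    git -C "$repo" gc --quiet --prune=now
}
```

Without the first command, the garbage collection leaves alone any commit that a reflog entry still mentions, for example the old version of a commit you just amended.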
5 comments:
Current version of git has "git gc --auto", which is invoked by some git commands, but by default it expires (removes) only 'old garbage'.
Instead of 'pushd "$dir"; git gc --prune; popd;' you can simply use 'git --git-dir="$dir" gc --prune;'
That's a great tip, Jakub.
I still slightly prefer the "pushd" operation, because pushd and popd will output the directory stack upon execution, which makes it a little more verbose.
I also know you can pass the -exec parameter to find and substitute in a parameter there. But, for some reason, the while loop is so much more legible to me than:
find . -type d -name .git -exec git --git-dir={} gc --prune \;
Thanks. And a nice use of pushd/popd and find.
Great idea Tim & Jakub, thanks for posting the pushd & the exec ways.