# Removing Large Files From a Git Repository's History

Part of what makes Git so useful is that it keeps every version of every file you commit to a repository. For files that are changed frequently, this implies a lot of storage overhead. Git tries to keep this overhead manageable by compressing similar files in the repository’s history into packfiles, making storage of code and other text-based file formats very space-efficient.

However, this deduplicating approach to compression doesn’t work well for large binary files where the differences between versions are significant, such as images, PDFs, SQLite databases or other non-source data. This is why almost every Git guide discourages committing these types of files¹ – they make your repository balloon in size much more quickly than you might expect!
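
If you aren't sure which files are responsible for the bloat, a rough way to find out is to list the largest blobs reachable from any branch. The following is a minimal sketch, not a definitive tool: the function name `largest_blobs` and its optional count argument are my own invention, and it assumes you run it from inside the repository.

```shell
# Hypothetical helper: print the N largest blobs in the history of all refs
# as "<size in bytes> <path>", biggest first (N defaults to 10).
largest_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { print substr($0, 6) }' |  # drop the leading "blob "
    sort -n -r -k1,1 |
    head -n "${1:-10}"
}
```

Calling `largest_blobs 5` prints the five biggest blobs. A file that was changed many times shows up once per stored version, which is usually a good hint that it is worth purging.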

If you, like me, have resisted this advice² and now want to purge a large file from your Git history, run the following commands:

    FILE='path/to/file.ext'

    # optional: make a backup of the file's current version
    cp "$FILE" /tmp/

    # purge the file from the history of the current branch; this might take a while
    # (replace HEAD with "-- --all" to rewrite every branch)
    git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch '$FILE'" HEAD
    git gc

    # optional: restore the backed-up version and stage it for a new commit
    cp "/tmp/$(basename "$FILE")" "$FILE"
    git add "$FILE"
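
One caveat: a plain `git gc` may not shrink the repository right away. `git filter-branch` keeps backup refs under `refs/original/`, the reflog still points at the old commits, and by default `git gc` only prunes unreachable objects once they are older than its grace period. The sketch below shows one way to drop those leftovers immediately. The function name `reclaim_space` is hypothetical, and because this throws away your only way back to the old history, treat it as something to run only after you've checked the rewritten repository.

```shell
# Hypothetical helper: make the pre-rewrite commits unreachable and delete them.
reclaim_space() {
  # Delete the backup refs that filter-branch stores under refs/original/.
  git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git update-ref -d "$ref"
    done
  # Expire all reflog entries so nothing references the old commits anymore.
  git reflog expire --expire=now --all
  # Repack and remove unreachable objects without waiting for the grace period.
  git gc --prune=now --quiet
}
```

If the repository has already been pushed somewhere, the purged file still exists in that remote and in every other clone until the rewritten history is force-pushed and the clones are re-created.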