Sometimes you commit and push large files, or files containing sensitive data, to your git repository. In this article I will explain how to remove these files.
I was playing around with automated image processing and accidentally committed some raw images. That is why I went looking for a tool to remove the files and clean the git history.
After a quick search I found the tool BFG Repo-Cleaner.
Make a backup of your git repository before you start and contact all contributing developers. Removing files from the history rewrites past commits and changes their IDs.
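One possible way to take that backup is a second mirror clone that you keep aside and never touch. This is only a sketch: the function name `backup_repo` and the backup directory name are my own choices, not something the BFG requires.

```shell
#!/bin/sh
# Sketch: keep an untouched mirror clone as a backup before rewriting history.
# backup_repo is a hypothetical helper name; both arguments are up to you.
backup_repo() {
    remote="$1"   # URL or path of the repository to back up
    dest="$2"     # directory for the backup, for example repo-backup.git

    # A mirror clone copies every ref (branches, tags, notes) as-is.
    git clone --mirror "$remote" "$dest" || return 1

    # Check that the copied object database is intact.
    git -C "$dest" fsck
}

# Example call (assumed URL from this article):
# backup_repo git@git.example.org:your/repo.git repo-backup.git
```

If something goes wrong later, the backup can be pushed back to restore the old history.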
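First of all, remove all unwanted files from your git repository, then commit and push your changes. This matters because the BFG protects the contents of your latest commit by default and only cleans the older history. Then clone the repository using the --mirror flag to create a bare repo.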
$ git clone --mirror git@git.example.org:your/repo.git
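If you are not sure which files are bloating the history, you can list the largest blobs in the fresh mirror clone before running the BFG. The following is a minimal sketch using plain git commands. The function name `largest_blobs` is hypothetical, and the default of 10 entries is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: print the N largest blobs reachable from any ref, largest first.
# Output columns: size in bytes, object id, path.
largest_blobs() {
    repo="$1"          # path to the repository (a bare mirror clone works too)
    count="${2:-10}"   # how many entries to show

    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file \
            --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        # Keep blobs only and drop the leading type column.
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        sort -k1,1nr |
        head -n "$count"
}

# Example call:
# largest_blobs repo.git 5
```

The file names in the last column are the ones you would pass to --delete-files in the next step.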
At this point the BFG Repo-Cleaner can remove all unwanted files from the git repository and from its history.
$ java -jar ~/downloads/bfg-1.13.0.jar --delete-files my-files.jpg repo.git
[... more output with additional information ...]
Deleted files
-------------
Filename Git id
--------------------------------
my-files.jpg | 1fab5f18 (8,2 MB)
[... more output with additional information ...]
After the tool has rewritten the history, git has to do some internal cleanup.
$ cd repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
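Before pushing, it may be worth checking that the file is really gone from every branch and tag. Here is a small sketch of such a check. The helper name `file_in_history` is made up for this example, and it expects a git pathspec such as my-files.jpg.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if any commit on any ref still touches the path.
file_in_history() {
    repo="$1"   # path to the repository, for example repo.git
    path="$2"   # pathspec to look for, for example my-files.jpg

    # git log prints one commit id per matching commit; empty output
    # means no commit reachable from a ref touches the path.
    [ -n "$(git -C "$repo" log --all --format=%H -- "$path")" ]
}

# Example call:
# if file_in_history repo.git my-files.jpg; then
#     echo "my-files.jpg is still in the history"
# fi
```

Running `git count-objects -vH` inside the repository before and after the cleanup also shows how much space was freed.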
If everything finished successfully, the repo can be pushed back. Be aware that you need permission to force push.
$ git push
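A plain push is enough here because a clone created with --mirror is configured to push all refs in mirror mode, which overwrites the refs on the remote. If you want to confirm that you are inside such a clone before pushing, a check along these lines should work. The function name `is_mirror_clone` is hypothetical, and the sketch assumes the remote is called origin, which is the default after cloning.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the repository was cloned with --mirror,
# i.e. its origin remote has mirror mode enabled.
is_mirror_clone() {
    repo="$1"   # path to the repository, for example repo.git
    [ "$(git -C "$repo" config --get remote.origin.mirror)" = "true" ]
}

# Example call:
# is_mirror_clone repo.git && echo "git push will mirror all refs"
```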
That's it.
You can find additional information and examples in the BFG Repo-Cleaner documentation.
Links
- Website: BFG Repo-Cleaner (english)
- Website: Removing sensitive data from a repository (english)