Git/Large File Hunt

From Omnia
Revision as of 03:31, 29 June 2025 by Kenneth (talk | contribs)

Kill Large Files

Find Large Files

git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | sed -n 's/^blob //p' | sort --numeric-sort --key=2 | cut -c 1-12,41- | $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
  • git rev-list --objects --all: Lists every object reachable from any ref; trees and blobs are printed together with the path at which they were found.
  • git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)': Prints the type, full ID, and size in bytes of each object; %(rest) carries the path through from the input line.
  • sed -n 's/^blob //p': Keeps only the blob lines and strips the "blob " prefix.
  • sort --numeric-sort --key=2: Sorts numerically on the second field (size in bytes) in ascending order, so the largest blobs appear last.
  • cut -c 1-12,41-: Keeps the first 12 characters (an abbreviated object ID) plus everything from the 41st character onwards (the size and the file name). This assumes 40-character SHA-1 object IDs.
  • $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest: Converts the size in field 2 to a human-readable IEC value such as 73MiB. gnumfmt is used when it exists (the name GNU numfmt typically has on macOS when installed through Homebrew coreutils); otherwise numfmt is used.
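To make the text-processing stages concrete, here is a minimal sketch that runs them on a single made-up line instead of a real repository. The object ID, size, and path in `line` are hypothetical placeholders, and the variable names are only for illustration.

```shell
# Hypothetical cat-file output line: "<type> <40-char id> <size> <path>".
line='blob 0123456789abcdef0123456789abcdef01234567 104857600 big/file.bin'

# Step 1: keep blob lines only and drop the "blob " prefix.
stripped=$(printf '%s\n' "$line" | sed -n 's/^blob //p')

# Step 2: characters 1-12 are the abbreviated ID; character 41 is the space
# that follows the full ID, so "41-" keeps both the size and the path.
short=$(printf '%s\n' "$stripped" | cut -c 1-12,41-)

# Step 3: convert the byte count in field 2 to IEC units (MiB, GiB, ...).
numfmt_bin=$(command -v gnumfmt || echo numfmt)
pretty=$(printf '%s\n' "$short" | "$numfmt_bin" --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest)

printf '%s\n%s\n%s\n' "$stripped" "$short" "$pretty"
```

On the sample line, the cut step yields `0123456789ab 104857600 big/file.bin`, and the last step rewrites the byte count as a MiB value.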

ref: [1] [2]

--- Example

Junk files that should not have been committed:

...
b90d01c3dded   73MiB reserve-app/client/node_modules/.cache/default-development/9.pack
ee46b2ee4ad9   73MiB reserve-app/client/node_modules/.cache/default-development/4.pack
293ee8349dbd   74MiB reserve-app/client/node_modules/.cache/default-development/3.pack
02dbad83c8e6   74MiB reserve-app/client/node_modules/.cache/default-development/13.pack
60cbbaa4850f  100MiB mongo_local/logs/journal/TigerLog.0000000005
aa52a216f4fc  100MiB mongo_local/logs/journal/TigerPreplog.0000000001

---
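When the full listing is too long to scan, one option is to drop everything below a size cutoff before sorting. The sketch below is an assumption layered on the pipeline above, not part of the original command: it filters on the raw byte count in field 2 with awk. The sample lines and the 50 MiB cutoff are hypothetical.

```shell
# Hypothetical "<id> <size-in-bytes> <path>" lines, shaped like the output
# of the sed step in the pipeline above.
sample='1111111111111111111111111111111111111111 2048 src/main.c
2222222222222222222222222222222222222222 104857600 logs/journal.bin
3333333333333333333333333333333333333333 78643200 cache/9.pack'

# Arbitrary cutoff for this sketch: 50 MiB expressed in bytes.
min_bytes=$((50 * 1024 * 1024))

# Keep only lines whose size field is at least the cutoff, smallest first.
large=$(printf '%s\n' "$sample" \
  | awk -v min="$min_bytes" '$2 >= min' \
  | sort --numeric-sort --key=2)

printf '%s\n' "$large"
```

In the real pipeline the awk stage would sit between sed and sort, before numfmt replaces the byte counts with unit suffixes.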


Show full ID:

git rev-list --all --objects | grep <blob-id>
git rev-list --all --objects | grep 60cbbaa4850f
# or rerun the pipeline without the cut step, which prints the full IDs
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | sed -n 's/^blob //p' | sort --numeric-sort --key=2 | $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
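Another way to expand a 12-character prefix is `git rev-parse`, which resolves any unique abbreviation to the full object ID. The sketch below demonstrates this in a throwaway repository so it does not touch real data; the blob content and variable names are hypothetical.

```shell
# Throwaway repository so the example does not touch real data.
repo=$(mktemp -d)
git init -q "$repo"

# Store a small blob (hypothetical content) and record its full ID.
full=$(printf 'example payload\n' | git -C "$repo" hash-object -w --stdin)

# The 12-character prefix, as the cut step in the pipeline would print it.
short=$(printf '%s' "$full" | cut -c 1-12)

# rev-parse expands a unique prefix back to the full object ID.
resolved=$(git -C "$repo" rev-parse "$short")
printf '%s -> %s\n' "$short" "$resolved"

rm -rf "$repo"
```

In a real repository, only the `git rev-parse <blob-id>` call is needed.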


Find the commits that add or remove these objects

git log --find-object=<blob-id> --all

Example:

git log --find-object=60cbbaa4850f --all
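For a self-contained illustration, the sketch below builds a two-commit throwaway repository and asks which commit changed the number of occurrences of a given blob. File names, commit messages, and the inline identity settings are hypothetical and exist only so the example runs anywhere.

```shell
# Throwaway repository with two commits; only the second adds the blob.
repo=$(mktemp -d)
git init -q "$repo"

# Helper for this sketch: commit with an inline identity so no global
# git configuration is required.
commit() {
  git -C "$repo" -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "$1"
}

printf 'readme\n' > "$repo/README"
git -C "$repo" add README
commit 'add readme'

printf 'pretend this is large\n' > "$repo/payload.bin"
git -C "$repo" add payload.bin
commit 'add payload'

# Look up the blob ID of payload.bin, then search all refs for commits
# that add or remove that blob; print only the commit subjects.
blob=$(git -C "$repo" rev-parse HEAD:payload.bin)
found=$(git -C "$repo" log --all --find-object="$blob" --format=%s)
printf '%s\n' "$found"

rm -rf "$repo"
```

Here the search should report only the second commit. A commit that later deleted payload.bin would also be listed, since removing the blob changes its occurrence count as well.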