Git/Large File Hunt
ref: [https://www.pixelite.co.nz/article/finding-and-deleting-large-files-in-a-git-repo/#:~:text=To%20find%20the%20largest%20files%20in%20a,fetching%20operations%2C%20and%20make%20developers%20less%20efficient.] [https://stackoverflow.com/questions/64397278/understanding-git-rev-list#:~:text=If%20you%20are%20using%20Git%20in%20the,all%20commits%20are%20reachable%20from%20all%20references.]
Revision as of 03:31, 29 June 2025
Kill Large Files
Find Large Files
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | sed -n 's/^blob //p' | sort --numeric-sort --key=2 | cut -c 1-12,41- | $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
- git rev-list --objects --all: Lists every object (commits, trees, and blobs) reachable from any reference; trees and blobs are followed by their path.
- git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)': Prints each object's type, ID, and size in bytes; %(rest) carries over the remainder of the input line, which is the path.
- sed -n 's/^blob //p': Keeps only the blob (file content) lines and strips the "blob " prefix.
- sort --numeric-sort --key=2: Sorts numerically on the second field (object size), so the largest files end up at the bottom.
- cut -c 1-12,41-: Keeps the first 12 characters (the abbreviated object ID) and everything from the 41st character onwards (size and file name), dropping the rest of the 40-character ID.
- $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest: Converts the size in the second field to a human-readable IEC unit such as KiB or MiB. gnumfmt is tried first because that is the name under which GNU numfmt is usually installed on macOS.
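To see what the filtering and formatting stages do without touching a real repository, the second half of the pipeline can be fed canned input. This is only a sketch: format_blob_sizes is a hypothetical helper name, and the object IDs, sizes, and paths below are made up for illustration. It assumes GNU numfmt is available.

```shell
#!/bin/sh
# Sketch: the formatting half of the pipeline, reading
# "<type> <object-id> <size-in-bytes> <path>" lines on stdin
# (the shape produced by the git cat-file --batch-check stage).
# format_blob_sizes is a hypothetical name, not a Git command.
format_blob_sizes() {
  sed -n 's/^blob //p' |
    sort --numeric-sort --key=2 |
    cut -c 1-12,41- |
    "$(command -v gnumfmt || echo numfmt)" --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
}

# Made-up sample input: two blobs and one tree. The tree line is
# dropped by sed; the blobs are sorted by size, smallest first.
printf '%s\n' \
  'blob 0123456789012345678901234567890123456789 104857600 logs/big.bin' \
  'tree 0000000000000000000000000000000000000000 64 src' \
  'blob abcdefabcdabcdefabcdabcdefabcdabcdefabcd 2048 src/small.txt' |
  format_blob_sizes
```

The 100 MiB entry is printed last, which is why the interesting files are at the end of the real output.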
--- Example
Junk files that should not have been committed:
...
b90d01c3dded 73MiB reserve-app/client/node_modules/.cache/default-development/9.pack
ee46b2ee4ad9 73MiB reserve-app/client/node_modules/.cache/default-development/4.pack
293ee8349dbd 74MiB reserve-app/client/node_modules/.cache/default-development/3.pack
02dbad83c8e6 74MiB reserve-app/client/node_modules/.cache/default-development/13.pack
60cbbaa4850f 100MiB mongo_local/logs/journal/TigerLog.0000000005
aa52a216f4fc 100MiB mongo_local/logs/journal/TigerPreplog.0000000001
---
Show full ID:
git rev-list --all --objects | grep <blob-id>
git rev-list --all --objects | grep 60cbbaa4850f
# or just rerun without cut
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | sed -n 's/^blob //p' | sort --numeric-sort --key=2 | $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
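Another option, sketched here as an alternative rather than what this page originally used, is to let Git expand the abbreviated ID itself with git rev-parse. full_blob_id is a hypothetical helper name; it has to be run inside the repository that contains the object.

```shell
#!/bin/sh
# Sketch: expand an abbreviated blob ID to the full object ID.
# full_blob_id is a hypothetical helper name, not a Git command.
full_blob_id() {
  # --verify makes rev-parse fail unless the argument resolves to a
  # single object; the ^{blob} suffix requires that object to be a blob.
  git rev-parse --verify "$1^{blob}"
}

# Usage (inside the repository), with the short ID from the example:
#   full_blob_id 60cbbaa4850f
```

Unlike the grep approach this prints only the ID, not the path, and it fails with an error if the abbreviation is ambiguous or unknown.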
Find the commit that has these objects
git log --find-object=<blob-id> --all
Example:
git log --find-object=60cbbaa4850f --all
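For a more compact view, the same search can be combined with standard git log formatting options so that each matching commit is shown with the path it touched. This is a sketch only: blob_history is a hypothetical helper name, and the chosen format string is just one possibility.

```shell
#!/bin/sh
# Sketch: list the commits that add or remove a given blob, one summary
# line per commit followed by the affected path and its status
# (A = added, D = deleted, M = modified).
# blob_history is a hypothetical helper name, not a Git command.
blob_history() {
  git log --all --find-object="$1" --date=short --format='%h %ad %s' --name-status
}

# Usage (inside the repository), with the short ID from the example:
#   blob_history 60cbbaa4850f
```

The commit that first lists the path with status A is the one that introduced the file.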