Turns out git fails spectacularly when working with large files. I was surprised, but the behavior is pretty well documented. In typical git fashion, there is an obscure error message and an equally obscure command to fix it.
A real-life example (with repository names changed):
The Problem
artem@MBP:~/git$ git clone git@gitlab:has_a_large_file.git
Cloning into 'has_a_large_file'...
Identity added: /Users/artem/.ssh/devkey (/Users/artem/.ssh/devkey)
remote: Counting objects: 6, done.
error: git upload-pack: git-pack-objects died with error.
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: Compressing objects: 100% (5/5), done.
remote: fatal: Out of memory, malloc failed (tried to allocate 1857915877 bytes)
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack failed
I had pushed the large file without issue, but couldn't pull it back because the remote kept dying. The astute reader will notice that the remote was running GitLab. The push also broke the GitLab web interface for the repository.
From my Googling, the problem is that the remote side runs out of memory while compressing the large file into a packfile (read more about git packfiles here). Judging by the error, git attempts a single malloc(size_of_large_file), roughly 1.8 GB in this case, and that allocation fails.
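If you want to confirm which blob is to blame before touching anything, git's plumbing can list the largest objects in a repository. A quick sketch using standard commands (the custom --batch-check format needs a reasonably recent git; the count of five is arbitrary):

# List the five largest blobs in the repository, biggest first.
git rev-list --objects --all \
  | git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' \
  | awk '$1 == "blob"' \
  | sort -k2 -rn \
  | head -n 5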
This situation raises conundrums that may only be answered by Master Git:
- Why was I able to push a large file, but not pull it?
- Why would one malloc(size_of_large_file)?
- What happens when you push a >4 GiB file to a 32-bit remote?
I was curious enough about the last one to look at the code: git will most likely die gracefully (see line 49 of wrapper.c). An integer overflow is probably avoided as well, but I would have to read a lot more of the code to say for sure.
The Solution
git repack -a -f -d
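That runs on the remote repository, i.e. the GitLab server. For the record: -a repacks everything into a single pack, -d deletes the packs that become redundant as a result, and -f tells git to recompute deltas instead of reusing existing ones.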
Of course, repacking the remote but having non-repacked local repositories around may cause other problems.
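If you administer the remote, you can also make this failure mode less likely up front by limiting how much memory packing may use and by skipping delta compression for huge blobs. A sketch of the relevant knobs (the option names are standard git configuration; the values are guesses to tune against the server's memory):

# Run these in the remote repository.
# Store blobs above this size without delta compression (default 512m).
git config core.bigFileThreshold 100m
# Cap the memory used by the delta search window, per thread.
git config pack.windowMemory 256m
# Fewer packing threads means a smaller peak memory footprint.
git config pack.threads 1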
Just For Fun
artem@MBP:~/temp/largerandomfile$ dd if=/dev/urandom of=./random_big_file bs=4096 count=1048577
1048577+0 records in
1048577+0 records out
4294971392 bytes transferred in 437.836959 secs (9809522 bytes/sec)
artem@MBP:~/temp/largerandomfile$ git add random_big_file
artem@MBP:~/temp/largerandomfile$ git commit -m "Added a big random file"
[master 377db57] Added a big random file
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 random_big_file
artem@MBP:~/temp/largerandomfile$ git push origin master
Counting objects: 4, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (2/2), done.
error: RPC failed; result=22, HTTP code = 413 KiB/s
fatal: The remote end hung up unexpectedly
Writing objects: 100% (3/3), 4.00 GiB | 18.74 MiB/s, done.
Total 3 (delta 0), reused 1 (delta 0)
fatal: recursion detected in die handler
Everything up-to-date
Everything up-to-date, indeed.
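Incidentally, the HTTP 413 ("Request Entity Too Large") in that transcript comes from the web server fronting the git HTTP endpoint, not from packing. If you actually needed to push something this size over HTTP, these are the usual knobs (assuming an nginx in front of GitLab, which may not match your setup):

# Client side: raise the buffer for a single HTTP POST (default 1 MiB;
# beyond the buffer git falls back to chunked transfer encoding,
# which some front-end servers reject).
git config http.postBuffer 524288000
# Server side, in the nginx configuration:
#   client_max_body_size 0;    # 0 disables the request body size check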