Re: zfs and git upload-pack

From: Philipp <satanist+freebsd_at_bureaucracy.de>
Date: Tue, 09 Aug 2022 13:16:16 UTC
[2022-08-07 20:52] David Christensen <dpchrist@holgerdanske.com>
> On 8/7/22 12:13, Philipp Takacs wrote:
> > On Sun, 7 Aug 2022 11:12:20 -0700
> > David Christensen <dpchrist@holgerdanske.com> wrote:
> > 
> >> On 8/7/22 10:57, Philipp Takacs wrote:
> >>> On Sun, 7 Aug 2022 09:54:41 -0700
> >>> David Christensen <dpchrist@holgerdanske.com> wrote:
> >>>    
> >>>> On 8/7/22 01:28, Philipp wrote:
> >>>>> Hi all
> >>>>>
> >>>>> I host a rather unusual git repository consisting mostly of binary
> >>>>> files. Every time this repo is cloned, the host allocates memory
> >>>>> and starts swapping. This leaves the host unusable, and it has to
> >>>>> be forcibly rebooted.
> >>>>>
> >>>>> The repo is stored on ZFS and nullfs-mounted into a jail to run the
> >>>>> git service over ssh. The host runs FreeBSD 13.1 with 4GB RAM and
> >>>>> 4GB swap.
> >>>>>
> >>>>> What I have noticed is that the biggest memory consumption comes
> >>>>> from mmap() of a pack file. For the given repo this file has a size
> >>>>> of 6.7G. I suspect this file is mapped into memory but not
> >>>>> correctly handled/unmapped (by the kernel) when not enough memory
> >>>>> is available.
> >>>>>
> >>>>> I have tested some options to solve/workaround this issue:
> >>>>>
> >>>>> * limit the zfs ARC size in loader.conf
> >>>>> * zfs set primarycache none for the dataset
> >>>>> * limit datasize, memoryuse and vmemoryuse via login.conf
> >>>>> * limit git packedGitLimit
> >>>>>
> >>>>> None of them have solved the issue.
>
>
> I would restore them to previous values.

I have done this, and the behavior has changed. One clone was
successful, but later clones stop with an error (Cannot allocate
memory). This is better but still not good.

> >>> this repo gets cloned a few times a month. Currently
> >>> the host dies whenever a single client tries to clone this repo.
>
>
> What happens if the clone is attempted by a different user on the same 
> workstation?
>
>
> What happens if the clone is attempted from another workstation?

The same as described.

> >> Please post console sessions that demonstrate cloning without failure
> >> and cloning with failure.
> > 
> > Not sure what you mean. 
>
>
> Please post client console sessions that demonstrate correct operation 
> and failed operation.

Let me rephrase: I'm not sure what you expect from this, but OK:

Successful:

satanist@hell tmp$ git clone -v ssh://bigrepo@git.bureaucracy.de:2222/bigrepo
Cloning into 'bigrepo'...
remote: Enumerating objects: 9661, done.
remote: Total 9661 (delta 0), reused 0 (delta 0), pack-reused 9661
Receiving objects: 100% (9661/9661), 6.73 GiB | 5.96 MiB/s, done.
Resolving deltas: 100% (3/3), done.
Updating files: 100% (6591/6591), done.

Unsuccessful:

satanist@hell tmp$ git clone -v ssh://bigrepo@git.bureaucracy.de:2222/bigrepo
Cloning into 'bigrepo'...
remote: Enumerating objects: 9661, done.
error: git upload-pack: git-pack-objects died with error.
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: fatal: packfile ./objects/pack/pack-6fee671a31a59454b539c88d674373d88ad67780.pack cannot be mapped: Cannot allocate memory
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack failed

As mentioned earlier, the "Cannot allocate memory" error is new. The old
behavior was that the server was unusable until I restarted it.
I currently don't know exactly how that looks on the client, but there
is not much info in the output.

> > This is a server; a client connects with a
> > git client over ssh and uses git-upload-pack
>
>
> https://git-scm.com/docs/git-upload-pack

Yes, this program, but I post here because I suspect this is a FreeBSD
issue, not an issue with git. This program basically mmap()s some files,
parses them, and writes parts of the content (selected based on stdin)
to stdout.
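
To illustrate the access pattern I mean, here is a minimal Python sketch.
It is not git's actual code: the helper name send_range and the small
temporary file standing in for a pack file are made up for this mail. It
maps a file read-only and copies one byte range from the mapping to an
output stream:

```python
import io
import mmap
import os
import tempfile


def send_range(path, offset, length, out):
    """Map `path` read-only and write `length` bytes from `offset` to `out`.

    Hypothetical helper: it only mimics the mmap-then-write pattern,
    not the real pack parsing done by git.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are faulted in lazily on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            out.write(m[offset:offset + length])


if __name__ == "__main__":
    # Stand-in for a pack file: a small temporary file with known content.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"PACK" + bytes(range(256)))
        name = tmp.name
    try:
        buf = io.BytesIO()
        send_range(name, 0, 4, buf)
        print(buf.getvalue())
    finally:
        os.unlink(name)
```

Because the mapping is read-only and file-backed, I would expect the kernel
to be able to discard those pages under memory pressure and fault them in
again from disk later.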

> > to receive the content of
> > the repo. The communication of the git client and git-upload-pack works
> > with stdin/stdout. I can give the logs of my git authorization handler
> > (inside jail):
>
> <snip>
>
> What file?

This file is called fugit.log and is created by an authorization handler
for git over ssh called fugit. I use fugit to manage authorization for
multiple git repositories. See https://github.com/cbdevnet/fugit/blob/master/fugit

> > The last line means the clone was finished[0]. But at this time
> > everything else on the host was unusable. Here the corresponding content
> > of /var/log/messages:
>
> <snip>
>
> > Between 12:00 and 14:00 the server started to be slow and running
> > ssh/mosh sessions stopped working. Starting new sessions over ssh was
> > not possible. The root login at 14:38 was me, over ipmi, trying to
> > somehow get the host working again. But I could only log in and run
> > top; then this session was also unusable. I then restarted the server
> > over ipmi.
>
>
> It looks like the server is getting overloaded with incoming TCP 
> packets, the  client is closing connections, the client is timing out 
> when reconnecting, etc.. Near the end, I see jails being killed.  I do 
> not see reasons why.  Perhaps there are clues in other logs.  Perhaps 
> you can increase the logging verbosity of the Git and/or SSH services to 
> obtain clues.

From my perspective this looks a bit different, because I could reproduce
the behavior. It looks like git-upload-pack and the corresponding IO
cause a lot of cache allocation, which leaves no free memory for the
rest of the server's operation.

I don't know where to increase the verbosity. ssh just starts a session
and goes on. The fugit logs have already been posted completely.
git-upload-pack does not log.

> Start the following command in a terminal on the server to monitor ZFS 
> disk activity (press Ctrl+C to exit):
>
> # zpool iostat -v 60

This looks quite normal. Here are parts of the output:

Normal operation without a git clone running:

              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zroot       51.8G   408G      3     14  25.4K   150K
  mirror-0  51.8G   408G      3     14  25.4K   150K
    ada1p3      -      -      1      7  12.2K  74.8K
    ada0p3      -      -      1      7  13.2K  74.8K
----------  -----  -----  -----  -----  -----  -----

During a clone:

              capacity     operations     bandwidth 
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zroot       51.8G   408G     22     24  2.87M   226K
  mirror-0  51.8G   408G     22     24  2.87M   226K
    ada1p3      -      -     11     12  1.47M   113K
    ada0p3      -      -     11     12  1.40M   113K
----------  -----  -----  -----  -----  -----  -----

As expected, reads go up during the clone, but not to a level where I
would be concerned about the load.

> Start the following command in another terminal on the server to monitor 
> CPU and/or IO activity (press 'm' to switch between the two) (press 'q' 
> to exit):
>
> # top -S -s 60

This also looks as expected. The git process (child of git-upload-pack)
uses CPU and memory and also creates IO. Here is some output from a few
seconds before the git process was killed (sorted by RES):

Mem: 598M Active, 426M Inact, 166M Laundry, 1223M Wired, 1020M Free
ARC: 543M Total, 185M MFU, 82M MRU, 16K Anon, 8618K Header, 267M Other
     98M Compressed, 240M Uncompressed, 2,45:1 Ratio
Swap: 4096M Total, 17M Used, 4079M Free
 Displaying CPU statistics.
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
45719 satanist      1  30    0  1289M   986M pipdwt   2   0:26  17,77% git
53388  10001       44  52    0  2755M   237M uwait    2   2:23   0,15% java

After the java process there are only processes with less than 100MB
resident.

I don't know exactly, but it looks like RES and SIZE count both allocated
memory and memory-mapped files. In this case I would argue there is
sufficient memory available to drop, because the mapped file contents
can be read from disk again.
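
To put numbers on that, here is a small Python sketch that adds up the
"Mem:" line from the top output above. Treating Inact plus Free as cheaply
reusable is my assumption, not something I have verified in the kernel,
and the helper name parse_mem_line is made up for this mail:

```python
import re

# "Mem:" line copied from the top output above.
MEM_LINE = "Mem: 598M Active, 426M Inact, 166M Laundry, 1223M Wired, 1020M Free"


def parse_mem_line(line):
    """Return {category: megabytes} from a top(1) 'Mem:' line using M units only."""
    return {name: int(value) for value, name in re.findall(r"(\d+)M (\w+)", line)}


mem = parse_mem_line(MEM_LINE)
total = sum(mem.values())
# Assumption: Inact and Free pages can be reused without much work.
# Laundry pages are dirty and have to be written out first, so they are
# reported separately instead of being counted as reusable.
easy = mem["Inact"] + mem["Free"]
print(f"accounted for: {total}M, Inact + Free: {easy}M, Laundry: {mem['Laundry']}M")
# prints: accounted for: 3433M, Inact + Free: 1446M, Laundry: 166M
```

So under that assumption roughly 1.4G looked reusable a few seconds before
the git process was killed, and swap was almost unused.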

Philipp