python 3 subprocess performance

Kubilay Kocak koobs at FreeBSD.org
Sun Apr 14 04:59:59 UTC 2019


On 12/04/2019 8:41 pm, Dima Pasechnik wrote:
> On Fri, Apr 12, 2019 at 9:46 AM Alexander Zagrebin <alex at zagrebin.ru> wrote:
>>
>> On Fri, 12 Apr 2019 09:36:13 +0200
>> Dima Pasechnik <dimpase+freebsd at gmail.com> wrote:
>>
>>> On Fri, Apr 12, 2019 at 9:11 AM Alexander Zagrebin <alex at zagrebin.ru>
>>> wrote:
>>>>
>>>> On Thu, 11 Apr 2019 17:32:42 +0200
>>>> Jan Bramkamp <crest at rlwinm.de> wrote:
>>>>
>>>>> The reason is that python does something stupid (tm). It
>>>>> tries to close all file descriptors (except a few whitelisted
>>>>> ones) up to the maximum file descriptor number. It does this by
>>>>> asking the kernel for the maximum possible number and closing
>>>>> everything it doesn't want to keep. Some time later someone came
>>>>> up with an optimization (read the open file descriptors
>>>>> from /dev/fd). All of this pain and suffering is caused by good
>>>>> old Ulrich Drepper braindamage:
>>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=10353.
>>>>>
>>>>> Most Linux distros have lower default file descriptor limits than
>>>>> FreeBSD making this workaround less painful. The correct solution
>>>>> would be to teach python3 about closefrom(2).
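To get a feel for the gap Jan describes, here is a minimal Python sketch that compares the descriptor limit (the range a brute-force loop has to walk) with the descriptors that are actually open. It assumes a POSIX system. The helper names are illustrative only, and /dev/fd may be missing or incomplete: on FreeBSD it lists every open descriptor only when fdescfs is mounted.

```python
import os
import resource


def max_fd_limit():
    """Soft RLIMIT_NOFILE: the upper bound a brute-force close loop walks."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def open_fds(fd_dir="/dev/fd"):
    """Descriptors listed in fd_dir, or None if the directory cannot be read.

    The listing may include the descriptor used to read the directory itself.
    """
    try:
        names = os.listdir(fd_dir)
    except OSError:
        return None
    return sorted(int(name) for name in names if name.isdigit())


if __name__ == "__main__":
    fds = open_fds()
    print("descriptor limit:", max_fd_limit())
    print("descriptors listed:", "unavailable" if fds is None else len(fds))
```

On a system with a large limit and only a handful of open descriptors, the difference between the two numbers is roughly the number of wasted close calls per spawned child.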
>>>>
>>>> Thank you for the hint and the testing!
>>>>
>>>> Indeed, the problem is closing more than 400,000 file descriptors
>>>> in a loop. It seems that all current versions of Python are affected.
>>>> Python 2 uses False as the default value for the close_fds parameter
>>>> of the Popen constructor, so this issue is mostly not visible there.
>>>> Python 3 has changed this default to True.
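The effect of the close_fds default can be measured with a rough Python sketch like the one below. It is an illustration under assumptions and not a benchmark from this thread: time_spawn is a made-up helper, and the child is the Python interpreter itself (chosen for portability), so interpreter startup contributes to both numbers.

```python
import subprocess
import sys
import time


def time_spawn(close_fds, runs=5):
    """Average wall-clock seconds to spawn and reap a trivial child."""
    cmd = [sys.executable, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(runs):
        # close_fds=True is the Python 3 default; False skips the close loop.
        subprocess.Popen(cmd, close_fds=close_fds).wait()
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print("close_fds=True :", time_spawn(True))
    print("close_fds=False:", time_spawn(False))
```

Passing close_fds=False is only a workaround: the child then inherits any inheritable descriptors of the parent, which is exactly what the Python 3 default was meant to prevent.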
>>>>
>>>> As Jan Bramkamp suggested, I've written a simple patch to fix the
>>>> issue (see attached file). The problem seems to be gone.
>>>
>>> The attachment has been stripped out. Could you paste the diff into
>>> the message?
>>
>> Yes, sure.
>>
>> --- Modules/_posixsubprocess.c.orig     2018-12-24 00:37:14.000000000 +0300
>> +++ Modules/_posixsubprocess.c          2019-04-12 09:25:21.549389000 +0300
>> @@ -235,11 +235,15 @@ _close_fds_by_brute_force(long start_fd,
>>           }
>>           start_fd = keep_fd + 1;
>>       }
>> +#if defined(__FreeBSD__)
>> +    closefrom(start_fd);
>> +#else
>>       if (start_fd <= end_fd) {
>>           for (fd_num = start_fd; fd_num < end_fd; ++fd_num) {
>>               close(fd_num);
>>           }
>>       }
>> +#endif
>>   }
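The logic the patch touches can be modelled in a few lines of Python. This is a sketch of the loop structure visible in the diff, not the CPython code itself: ranges_to_close and close_calls are hypothetical names, and end_fd stands in for the descriptor limit. It shows which ranges the brute-force loop walks, and that the last one is the part the patch hands to a single closefrom(start_fd) call.

```python
def ranges_to_close(start_fd, end_fd, fds_to_keep):
    """Return the (first, last_exclusive) ranges the close loop walks.

    fds_to_keep must be sorted in ascending order.
    """
    ranges = []
    for keep_fd in fds_to_keep:
        if keep_fd < start_fd:
            continue
        if start_fd < keep_fd:
            # Gap between the previous kept descriptor and this one.
            ranges.append((start_fd, keep_fd))
        start_fd = keep_fd + 1
    # Tail range: what the patch replaces with closefrom(start_fd) on FreeBSD.
    if start_fd < end_fd:
        ranges.append((start_fd, end_fd))
    return ranges


def close_calls(ranges):
    """Number of close(2) calls a brute-force loop issues for these ranges."""
    return sum(hi - lo for lo, hi in ranges)


if __name__ == "__main__":
    walked = ranges_to_close(3, 400000, [5, 7])
    print(walked)               # [(3, 5), (6, 7), (8, 400000)]
    print(close_calls(walked))  # 399995
```

With a limit of 400,000 and descriptors 5 and 7 kept, all but three of the 399,995 close calls fall in the tail range, which is why replacing only that final loop is enough to remove the slowdown.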
>>
>>> If this is a Python issue, shouldn't this be reported upstream, on
>>> https://bugs.python.org ?
>>
>> Maybe. Rather, it is a FreeBSD-specific optimization.
> 
> Well, closefrom() is also available in Darwin (a.k.a. MacOSX :-)),
> OpenBSD and NetBSD. (It's not documented in current MacOSX, but it is
> there, I just checked)
> Anyway, FreeBSD Python maintainers will ask for an upstream PR.
> 
> I can do such a PR if no one else is willing to...
> 
> Dima
> 
> 

Hi Dima,

An issue already exists for this:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700

It is pending *upstreamable* patches for lang/python* that we can carry
locally until they are released.


