svn commit: r324405 - head/sys/kern
Jason Eggleston
eggnet at gmail.com
Tue Oct 10 00:09:51 UTC 2017
https://reviews.freebsd.org/D12633
On Mon, Oct 9, 2017 at 5:01 PM, Jason Eggleston <eggnet at gmail.com> wrote:
> In tcp_input.c, this is where a RST is handled. so_error gets ECONNRESET,
> then tcp_close() is called.
>
> case TCPS_ESTABLISHED:
> case TCPS_FIN_WAIT_1:
> case TCPS_FIN_WAIT_2:
> case TCPS_CLOSE_WAIT:
> case TCPS_CLOSING:
> case TCPS_LAST_ACK:
> so->so_error = ECONNRESET;
> close:
> /* FALLTHROUGH */
> default:
> tp = tcp_close(tp);
>
> tcp_close() calls soisdisconnected() which sets this flag:
>
> so->so_state |= SS_ISDISCONNECTED;
>
> Followed by:
>
> ...
> socantrcvmore_locked(so);
> ...
> socantsendmore_locked(so);
> ...
>
> Those two functions set these flags:
>
> so->so_rcv.sb_state |= SBS_CANTRCVMORE;
> so->so_snd.sb_state |= SBS_CANTSENDMORE;
>
> A TCP session cannot survive a RST, and i/o calls will not work afterword,
> no matter how many times you try.
>
> Having said that, there's a race condition between when so_error is set
> and when the socket and socket buffer flags are set, and I agree with your
> strategy of:
>
> - have sendfile() match send()
> - The current send() behavior is probably fine as is
>
> I agree sendfile() should match write() and send(), by checking in this
> order:
>
> SBS_CANTSENDMORE
> so_error
> SS_ISCONNECTED
>
> and will prep a new review with your patch.
>
> On Mon, Oct 9, 2017 at 4:20 PM, Gleb Smirnoff <glebius at freebsd.org> wrote:
>
>> Sean & Jason,
>>
>> On Sat, Oct 07, 2017 at 11:30:57PM +0000, Sean Bruno wrote:
>> S> Author: sbruno
>> S> Date: Sat Oct 7 23:30:57 2017
>> S> New Revision: 324405
>> S> URL: https://svnweb.freebsd.org/changeset/base/324405
>> S>
>> S> Log:
>> S> Check so_error early in sendfile() call. Prior to this patch, if a
>> S> connection was reset by the remote end, sendfile() would just report
>> S> ENOTCONN instead of ECONNRESET.
>> S>
>> S> Submitted by: Jason Eggleston <jason at eggnet.com>
>> S> Reviewed by: glebius
>> S> Sponsored by: Limelight Networks
>> S> Differential Revision: https://reviews.freebsd.org/D12575
>> S>
>> S> Modified:
>> S> head/sys/kern/kern_sendfile.c
>> S>
>> S> Modified: head/sys/kern/kern_sendfile.c
>> S> ============================================================
>> ==================
>> S> --- head/sys/kern/kern_sendfile.c Sat Oct 7 23:10:16 2017
>> (r324404)
>> S> +++ head/sys/kern/kern_sendfile.c Sat Oct 7 23:30:57 2017
>> (r324405)
>> S> @@ -514,6 +514,11 @@ sendfile_getsock(struct thread *td, int s, struct
>> file
>> S> *so = (*sock_fp)->f_data;
>> S> if ((*so)->so_type != SOCK_STREAM)
>> S> return (EINVAL);
>> S> + if ((*so)->so_error) {
>> S> + error = (*so)->so_error;
>> S> + (*so)->so_error = 0;
>> S> + return (error);
>> S> + }
>> S> if (((*so)->so_state & SS_ISCONNECTED) == 0)
>> S> return (ENOTCONN);
>> S> return (0);
>>
>> Despite my initial positive review, now I am quite unsure on that.
>>
>> Problem is that sendfile(2) isn't defined by SUS, so there is no
>> distinctive final answer on that. Should we match other OSes?
>> Should we match our historic behaviour? Or should we match
>> write(2)/send(2) to socket, which are closest analogy. I probably
>> believe in the latter: sendfile(2) belongs to write(2)/send(2)
>> family.
>>
>> SUS specifies that write may return ECONNRESET. It also documents
>> EPIPE. However, our manual page documents only EPIPE for both
>> send(2) and write(2). For write we have:
>>
>> SOCKBUF_LOCK(&so->so_snd);
>> if (so->so_snd.sb_state & SBS_CANTSENDMORE) {
>> SOCKBUF_UNLOCK(&so->so_snd);
>> error = EPIPE;
>> goto out;
>> }
>> if (so->so_error) {
>> error = so->so_error;
>> so->so_error = 0;
>> SOCKBUF_UNLOCK(&so->so_snd);
>> goto out;
>> }
>>
>> Indeed, EPIPE will be returned prior to return/clear of so_error,
>> which supposedly is ECONNRESET.
>>
>> In the sendfile(2) implementation we see exactly same code:
>>
>> if (so->so_snd.sb_state & SBS_CANTSENDMORE) {
>> error = EPIPE;
>> SOCKBUF_UNLOCK(&so->so_snd);
>> goto done;
>> } else if (so->so_error) {
>> error = so->so_error;
>> so->so_error = 0;
>> SOCKBUF_UNLOCK(&so->so_snd);
>> goto done;
>> }
>>
>> But it isn't reached. Before due to SS_ISCONNECTED check, now
>> due to your change. Now we got two spots where so_error is
>> returned/cleared.
>>
>> For the receive family (read(2) and recv(2)), SUS specifies
>> ECONNRESET and our code does that. It also clears so_error.
>>
>> So, after your change at least one standard case is broken: an
>> application that polls/selects for both readability and
>> writeability of a socket. It discovers that socket is available
>> for writing, does sendfile(2), receives ECONNRESET. Then it does
>> read on the socket, and blocks(?) or returns a bogus error(?),
>> not sure. But definitely not ECONNRESET, since it is now cleared.
>>
>> Now, where does this ENOTCONN come from? These two lines:
>>
>> if (((*so)->so_state & SS_ISCONNECTED) == 0)
>> return (ENOTCONN);
>>
>> can be traced back all the way to original 1998 commit of sendfile.
>>
>> Here comes difference between sendfile(2) and send(2). In send(2),
>> this check is done _after_ so_error check, so your change is correct
>> in placing of so_error check wrt so_state check, but is incorrect
>> in placing wrt socket buffer state check.
>>
>> After all this considerations, I would suggest to revert your change,
>> and apply this patch. It would make behavior of sendfile(2) to match
>> send(2). I would appreciate more reviews around it.
>>
>> --
>> Gleb Smirnoff
>>
>
>
More information about the svn-src-all
mailing list