svn commit: r324405 - head/sys/kern

Tue Oct 10 07:17:15 UTC 2017

On Mon, Oct 09, 2017 at 04:20:52PM -0700, Gleb Smirnoff wrote:
>   Sean & Jason,
> 
> On Sat, Oct 07, 2017 at 11:30:57PM +0000, Sean Bruno wrote:
> S> Author: sbruno
> S> Date: Sat Oct  7 23:30:57 2017
> S> New Revision: 324405
> S> URL: https://svnweb.freebsd.org/changeset/base/324405
> S> 
> S> Log:
> S>   Check so_error early in sendfile() call.  Prior to this patch, if a
> S>   connection was reset by the remote end, sendfile() would just report
> S>   ENOTCONN instead of ECONNRESET.
> S>   
> S>   Submitted by:	Jason Eggleston <jason at eggnet.com>
> S>   Reviewed by:	glebius
> S>   Sponsored by:	Limelight Networks
> S>   Differential Revision:	https://reviews.freebsd.org/D12575
> S> 
> S> Modified:
> S>   head/sys/kern/kern_sendfile.c
> S> 
> S> Modified: head/sys/kern/kern_sendfile.c
> S> ==============================================================================
> S> --- head/sys/kern/kern_sendfile.c	Sat Oct  7 23:10:16 2017	(r324404)
> S> +++ head/sys/kern/kern_sendfile.c	Sat Oct  7 23:30:57 2017	(r324405)
> S> @@ -514,6 +514,11 @@ sendfile_getsock(struct thread *td, int s, struct file
> S>  	*so = (*sock_fp)->f_data;
> S>  	if ((*so)->so_type != SOCK_STREAM)
> S>  		return (EINVAL);
> S> +	if ((*so)->so_error) {
> S> +		error = (*so)->so_error;
> S> +		(*so)->so_error = 0;
> S> +		return (error);
> S> +	}
> S>  	if (((*so)->so_state & SS_ISCONNECTED) == 0)
> S>  		return (ENOTCONN);
> S>  	return (0);
> 
> Despite my initial positive review, now I am quite unsure on that.
> 
> Problem is that sendfile(2) isn't defined by SUS, so there is no
> distinctive final answer on that. Should we match other OSes?
> Should we match our historic behaviour? Or should we match
> write(2)/send(2) to socket, which are closest analogy. I probably
> believe in the latter: sendfile(2) belongs to write(2)/send(2)
> family.
> 
> SUS specifies that write may return ECONNRESET. It also documents
> EPIPE. However, our manual page documents only EPIPE for both
> send(2) and write(2). For write we have:
> 
>         SOCKBUF_LOCK(&so->so_snd);
>         if (so->so_snd.sb_state & SBS_CANTSENDMORE) {
>                 SOCKBUF_UNLOCK(&so->so_snd);
>                 error = EPIPE;
>                 goto out;
>         }
>         if (so->so_error) {
>                 error = so->so_error;
>                 so->so_error = 0;
>                 SOCKBUF_UNLOCK(&so->so_snd);
>                 goto out;
>         }
> 
> Indeed, EPIPE will be returned prior to return/clear of so_error,
> which supposedly is ECONNRESET.
> 
> In the sendfile(2) implementation we see exactly same code:
> 
>                 if (so->so_snd.sb_state & SBS_CANTSENDMORE) {
>                         error = EPIPE;
>                         SOCKBUF_UNLOCK(&so->so_snd);
>                         goto done;
>                 } else if (so->so_error) {
>                         error = so->so_error;
>                         so->so_error = 0;
>                         SOCKBUF_UNLOCK(&so->so_snd);
>                         goto done;
>                 }
> 
> But it isn't reached. Before due to SS_ISCONNECTED check, now
> due to your change. Now we got two spots where so_error is
> returned/cleared.
Do you mean that EPIPE could be returned from sendfile(2) after some
round of changes ?  It should be not.

EPIPE is a workaround for naive programs that use stdio and cannot detect
the broken pipe when used as an element in the  shell plumbing.  EPIPE
results in SIGPIPE terminating such programs (instead of looping endlessly
when write(2) returns error).

Any application using sendfile(2) must understand the API to properly handle
disconnects, so SIGPIPE workaround would be avoided.  Often such applications
are already sofisticated enough to block SIGPIPE, but if blocking becomes
a new requirement because EPIPE was not returned in earlier version of
the system, they might not block it still.

> 
> For the receive family (read(2) and recv(2)), SUS specifies
> ECONNRESET and our code does that. It also clears so_error.
> 
> So, after your change at least one standard case is broken: an
> application that polls/selects for both readability and
> writeability of a socket. It discovers that socket is available
> for writing, does sendfile(2), receives ECONNRESET. Then it does
> read on the socket, and blocks(?) or returns a bogus error(?),
> not sure. But definitely not ECONNRESET, since it is now cleared.
> 
> Now, where does this ENOTCONN come from? These two lines:
> 
> 	if (((*so)->so_state & SS_ISCONNECTED) == 0)
> 		return (ENOTCONN);
> 
> can be traced back all the way to original 1998 commit of sendfile.
> 
> Here comes difference between sendfile(2) and send(2). In send(2),
> this check is done _after_ so_error check, so your change is correct
> in placing of so_error check wrt so_state check, but is incorrect
> in placing wrt socket buffer state check.
> 
> After all this considerations, I would suggest to revert your change,
> and apply this patch. It would make behavior of sendfile(2) to match
> send(2). I would appreciate more reviews around it.
> 
> -- 
> Gleb Smirnoff

> Index: kern_sendfile.c
> ===================================================================
> --- kern_sendfile.c	(revision 324457)
> +++ kern_sendfile.c	(working copy)
> @@ -507,13 +507,6 @@ sendfile_getsock(struct thread *td, int s, struct
>  	*so = (*sock_fp)->f_data;
>  	if ((*so)->so_type != SOCK_STREAM)
>  		return (EINVAL);
> -	if ((*so)->so_error) {
> -		error = (*so)->so_error;
> -		(*so)->so_error = 0;
> -		return (error);
> -	}
> -	if (((*so)->so_state & SS_ISCONNECTED) == 0)
> -		return (ENOTCONN);
>  	return (0);
>  }
>  
> @@ -622,6 +615,12 @@ retry_space:
>  			SOCKBUF_UNLOCK(&so->so_snd);
>  			goto done;
>  		}
> +		if ((so->so_state & SS_ISCONNECTED) == 0) {
> +			SOCKBUF_UNLOCK(&so->so_snd);
> +			error = ENOTCONN;
> +			goto done;
> +		}
> +
>  		space = sbspace(&so->so_snd);
>  		if (space < rem &&
>  		    (space <= 0 ||