Network pipes

Thu Jul 24 02:14:30 PDT 2003

Hi,
I have the following questions:

* strange benchmark results! Given the description, I would expect
  the "|@ rsh" and "|@ ssh" cases to give the same throughput, and
  in any case "| rsh" to be faster than "| ssh". How come, instead,
  the times differ by an order of magnitude? Can you run the
  tests under similar conditions so that the gains are easier to judge?

* I do not understand how you can remove the pipe on the remote host
  without modifying both "sshd" and "sh" there.
  I think it would be very important to understand how much
  |@ depends on the behaviour of the remote daemon.

* the loss of encryption on the channel is certainly something that might
  escape the attention of the user. I also wonder in how many cases you
  really need the extra performance to justify the extra plumbing
  mechanism.

* there are subtle implications of your new plumbing in the way
  processes are started. With "A | B | C" the shell first creates the
  pipes, then it can start the processes in any order, and they can
  individually fail to start without any direct consequence other
  than an I/O failure. "A |@ B |@ C" requires that you start things
  from the end of the chain (because you cannot start a process 
  until you have a [socket] descriptor from the next stage in the
  chain), and if a process fails to start you cannot even start the
  next one in the sequence. Not that this is bad, just very different
  from regular pipes.
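
To make the ordering constraint concrete, here is a minimal sketch in
plain sh. It only models the order in which the stages would have to be
started; start_order and start_chain are made-up helper names, and nothing
here reflects how the modified shell actually launches processes.

```sh
#!/bin/sh
# Hypothetical illustration only: these helpers are not part of the
# proposed /bin/sh changes.

# start_order STAGE...
# For "A |@ B |@ C" each stage needs a descriptor from the stage after
# it, so the start order is the stage list reversed.
start_order() {
    order=""
    for stage in "$@"; do
        order="$stage${order:+ $order}"   # prepend to reverse the list
    done
    printf '%s\n' "$order"
}

# start_chain FAILING STAGE...
# Walk the stages from last to first and stop at the first one that
# "fails": the stages before it would have nothing to connect to.
start_chain() {
    failing=$1; shift
    started=""
    for stage in $(start_order "$@"); do
        [ "$stage" = "$failing" ] && break
        started="${started:+$started }$stage"
    done
    printf '%s\n' "$started"
}

start_order A B C      # prints: C B A
start_chain B A B C    # prints: C   (B failed, so A is never started)
```

With regular pipes the equivalent of start_chain would have no early
exit: every stage can be attempted regardless of its neighbours.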

All of the above leaves me a bit puzzled as to whether or not this is a
useful addition... In fact, I am not convinced that network pipes
should be implemented in the shell...

        cheers
        luigi

On Thu, Jul 24, 2003 at 11:19:49AM +0300, Diomidis Spinellis wrote:
> I am currently testing a set of modifications to /bin/sh that allow a
> user to create a pipeline over the network using a socket as its
> endpoints.  Currently a command like
> 
> tar cvf - / | ssh remotehost dd of=/dev/st0 bs=32k
> 
> has tar sending each block through a pipe to a local ssh process, ssh
> communicating through a socket with a remote ssh daemon and dd
> communicating with sshd through a pipe again.  The changed shell allows
> you to write
> 
> tar cvf - / |@ ssh remotehost -- dd of=/dev/st0 bs=32k | :
> 
> The effect of the above command is that a socket is created between the
> local and the remote host with the standard output of tar and the
> standard input of dd redirected to that socket.  Authentication is still
> performed using ssh (or any other remote login mechanism you specify
> before the -- argument), but the flow between the two processes is from
> then on not protected in terms of integrity and privacy.  Thus the
> method will mostly be useful within the context of a LAN or a VPN.
> 
> The authentication design requires the users to have a special command
> in their path on the remote host, but does not require an additional
> privileged server or the reservation of special ports.
> 
> By eliminating two processes, the associated context switches, the data
> copying, and (in the case of ssh) encryption, performance is markedly
> improved:
> 
> dd if=/dev/zero bs=4k count=8192 | ssh remotehost -- dd of=/dev/null
> 33554432 bytes transferred in 17.118648 secs (1960110 bytes/sec)
> dd if=/dev/zero bs=4k count=8192 |@ ssh remotehost -- dd of=/dev/null | :
> 33554432 bytes transferred in  4.452980 secs (7535276 bytes/sec)
> 
> Even eliminating the encryption overhead by using rsh you can still see
> an improvement:
> 
> dd if=/dev/zero bs=4k count=8192 | rsh remotehost -- dd of=/dev/null
> 33554432 bytes transferred in 131.907130 secs (254379 bytes/sec)
> dd if=/dev/zero bs=4k count=8192 |@ rsh remotehost -- dd of=/dev/null | :
> 33554432 bytes transferred in 86.545385 secs (387709 bytes/sec)
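
For reference, the ratios implied by the timings quoted above can be
worked out with a small helper. This is only arithmetic on the elapsed
times as reported; speedup is a made-up name, and the sketch assumes an
awk that accepts -v assignments.

```sh
#!/bin/sh
# speedup BASELINE_SECS NEW_SECS
# Print the ratio of two elapsed times, rounded to two decimals.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

speedup 17.118648 4.452980      # "| ssh"  vs "|@ ssh"
speedup 131.907130 86.545385    # "| rsh"  vs "|@ rsh"
speedup 131.907130 17.118648    # "| rsh"  vs "| ssh"
speedup 86.545385 4.452980      # "|@ rsh" vs "|@ ssh"
```

The last two lines are the comparisons that look odd: in both the plain
and the |@ case the rsh run is the slower one.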
> 
> My questions are:
> 
> 1. How do you feel about integrating these changes into the /bin/sh in
> -CURRENT?  Note that network pipes are a different process plumbing
> mechanism, so they really do belong in the shell; implementing them
> through a separate command would be inelegant.
> 
> 2. Do you see any problems with the new syntax introduced?
> 
> 3. After the remote process starts running, its standard error output is
> lost.  Do you find this a significant problem?
> 
> 4. Both sides of the remote process are communication endpoints and have
> to be connected to other local processes via pipes.  Would it be enough
> to document this behaviour or should it be hidden from the user by means
> of forked read/write processes?
> 
> Diomidis - http://www.spinellis.gr