UDP dont fragment bit

Robert Watson rwatson at FreeBSD.org
Wed Sep 21 05:48:46 PDT 2005


On Wed, 21 Sep 2005, Sten Daniel Sørsdal wrote:

> Robert Watson wrote:
>>
>> So if someone could generate some application pseudo-code that suggests
>> what specifically is necessary from the socket layer in order for the
>> application to function, we can talk about socket service extensions
>> that might support the application.  For example, a way to query
>> detailed error information rather than just the SO_ERROR socket option.
>> Or a longer haul PMTU data gathering mechanism for UDP sockets.  Or ways
>> for UDP applications to more usefully query the kernel for the TCP PMTU
>> data already being recorded.
>>
>> It sounds like for the bandwidth tester, IP raw sockets already provide 
>> what you need, since you want to be able to do fairly irregular UDP 
>> things (i.e., receive UDP packets with bad checksums, and see 
>> fragments).
>
> IP raw sockets? Sure, everything can be solved the complicated way :o) 
> Some userland applications could benefit from having the option of DF 
> flag set/unset.

UDP sockets are defined as being a way to send and receive valid UDP 
datagrams.  Your list of things to receive included fragments and invalid 
datagrams.  While I agree with your comments below about things UDP 
applications want to do, I don't agree that we should teach UDP sockets to 
receive UDP datagram fragments or packets with bad checksums. 
Applications looking for non-accepted IP packets and complete ICMP 
messages should be using the raw socket interface.  Applications looking 
for post-processed abstracted interfaces to a datagram service should be 
using UDP sockets.  See below for discussion of enhancing UDP sockets.

> What about applications that want a way of optimizing UDP
> transfers along their network path?
>
> Some networks filter ICMP and fragments irresponsibly (imho), and 
> sometimes the combination of two or more networks causes problems for 
> multicast/video/VoIP applications.
>
> Sometimes in one network udp packets need fragmenting and in the next 
> network fragments need to get reassembled to pass a firewall which in 
> turn runs out of reassembling resources. ( It is more common to block 
> icmp messages about reassembly problems than DF problems IF a message is 
> generated in the first place. )
>
> Sure, all of this could be fixed the complicated way but what if one 
> already has an application that runs in unprivileged userland. How many 
> lines of code would a simple socket option plus the "tuning" code 
> require?

You're still not answering my question about application pseudo-code, 
however. Adding an IP_DF option to UDP sockets is easy, and can be done in 
ten lines of code or less.  Adding a way to provide detailed feedback on 
error conditions associated with UDP packets sent at arbitrary points in 
the path is not something that falls naturally out of the socket API, and 
will require non-trivial amounts of work.  Hence my asking about the 
structure and event model of your application: what exactly do you want to 
know about UDP packet delivery?
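
To make the "easy" half concrete, here is a minimal sketch of how an 
unprivileged application might request DF on a UDP socket. It assumes the 
platform exposes a per-socket option for this, such as Linux's 
IP_MTU_DISCOVER or an IP_DONTFRAG-style option on FreeBSD. The numeric 
values are assumptions taken from platform headers, not from this thread, 
and the helper names df_sockopt and set_dont_fragment are hypothetical.

```python
import socket
import sys

# Assumed numeric values from platform headers; Python's socket module
# may not export these names, so they are spelled out here.
_LINUX_IP_MTU_DISCOVER = 10   # <linux/in.h>
_LINUX_IP_PMTUDISC_DO = 2     # always set DF, never fragment locally
_FREEBSD_IP_DONTFRAG = 67     # <netinet/in.h> on FreeBSD


def df_sockopt(platform=sys.platform):
    """Return (level, optname, value) that requests DF, or None if unknown."""
    if platform.startswith("linux"):
        return (socket.IPPROTO_IP, _LINUX_IP_MTU_DISCOVER, _LINUX_IP_PMTUDISC_DO)
    if platform.startswith("freebsd"):
        return (socket.IPPROTO_IP, _FREEBSD_IP_DONTFRAG, 1)
    return None


def set_dont_fragment(sock, platform=sys.platform):
    """Ask the stack to set DF on datagrams sent from sock.

    Returns False when no suitable option is known for the platform.
    """
    opt = df_sockopt(platform)
    if opt is None:
        return False
    sock.setsockopt(*opt)
    return True
```

An application would call set_dont_fragment() once, right after creating 
its SOCK_DGRAM socket. Note that this only covers setting the bit; it says 
nothing about how errors from further along the path get back to the 
application, which is the harder question.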

Specifically, what information do you as a developer need in order to 
handle asynchronous error delivery from UDP packet send, and how will it 
affect your application's interaction with the network stack? We can 
already deliver a synchronous EMSGSIZE when you try to send a UDP packet 
out of an interface with an MTU that is lower than the packet size, given 
a socket option to force IP_DF. However, if the packet hits a potential 
fragmentation problem out in the wide area network, that notification is 
completely asynchronous from packet transmission, and we will need a way 
to feed more detailed ICMP information to the application.  Right now 
asynchronous error delivery on a UDP socket is already fairly messy 
because applications can generally only pick up the error when doing 
further I/O, which obscures which operation actually generated the error.
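
As a rough illustration of the two error paths, the sketch below separates 
the synchronous case (EMSGSIZE returned directly by the send call) from a 
pending asynchronous error read back through SO_ERROR. It assumes DF has 
already been forced on the socket by some option. The function names 
send_probe and pending_error, and the string results they return, are 
hypothetical and only there to show the shape of the application logic.

```python
import errno
import socket


def send_probe(sock, payload, addr):
    """Send one datagram and classify what the send call itself reports."""
    try:
        sock.sendto(payload, addr)
    except OSError as exc:
        if exc.errno == errno.EMSGSIZE:
            # Synchronous case: the datagram was rejected locally for
            # being too large to send without fragmenting.
            return "too_big"
        # Anything else may be a deferred error caused by an earlier
        # datagram, so it cannot safely be blamed on this payload.
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))
    return "sent"


def pending_error(sock):
    """Fetch and clear the socket's pending error; None if there is none."""
    code = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    return errno.errorcode.get(code, str(code)) if code else None
```

Even with both helpers, the application learns only an errno value: not 
which datagram triggered it, which hop reported it, or what MTU that hop 
would accept. That is the gap a richer interface would have to fill.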

Robert N M Watson

