probably stupid questions about select() and FS_SET in a multithreaded environment [ select() failed (Bad file descriptor) ]

Tijl Coosemans tijl at coosemans.org
Sun Oct 16 19:35:07 UTC 2011


On Sunday 16 October 2011 18:18:39 Vikash Badal wrote:
> Greetings,
> 
> Can someone point me in the right direction, please?
> 
> I have a threaded socket application that has a problem with select()
> returning -1.
> The select() and accept() calls are handled in one thread. The worker
> threads deal with client requests after the new client connection is
> pushed to a queue.
> 
> The logged error is :
> select() failed (Bad file descriptor) getdtablesize = 65536
> 
> Sysctls at the moment  are:
> kern.maxfiles: 65536 
> kern.maxfilesperproc: 65536
> 
> 
> <code>
> void client_accept(int listen_socket)
> {
> ...
>    while ( loop )
>    {
>       FD_ZERO(&socket_set);
>       FD_SET(listen_socket, &socket_set);
>       timeout.tv_sec = 1;
>       timeout.tv_usec = 0;
> 
>       rcode = select(listen_socket + 1, &socket_set, NULL, NULL, &timeout);
> 
>       if ( rcode < 0 )
>       {
>          Log(DEBUG_0, "ERROR: select() failed (%s) getdtablesize = %d",
>              strerror(errno), getdtablesize());
>          loop = 0;
>          sleep(30);
>          fcloseall();
>          assert(1==0);
>       }
> 
>       if ( rcode > 0 )
>       {
>           remotelen = sizeof(remote);
>           client_sock = accept(listen_socket, .....
>           
>           if (client_sock != -1 )
>           { 
>              // Allocate memory for request
>              request = malloc(sizeof(struct requests));
>              // test for malloc etc ...
>              // set request values ...
>              //
>              // Push request to a queue. 
>           }
>       }
> 
>    }
>  ...
> }
> void* tcpworker(void* arg)
> {
>    // initialise stuff
> 
>    while ( loop )
>    {
>       // pop request from queue
>
>       if ( request != NULL )
>       {
>          // deal with request
>          free(request);
>       }
>       }
>    }   
> }
> 
> </code>
> When the problem occurs, I have between 1000 and 1400 clients
> connected.
> 
> Questions:
> 1. Do I need to FD_CLR(client_sock, &socket_set) before I push to the
> queue?
> 2. Do I need to FD_CLR(client_sock, &socket_set) when this client
> request closes in the tcpworker() function?
> 3. Would setting kern.maxfilesperproc and kern.maxfiles to higher
> values solve the problem, or would it just take longer for the
> problem to reappear?
> 4. Should I replace select() with kqueue()? From googling, it
> seems select() is not that great.

The size of an fd_set is limited by FD_SETSIZE, which is 1024 by
default. Valid descriptors for an fd_set are 0 through FD_SETSIZE - 1,
so if you pass a descriptor greater than or equal to FD_SETSIZE to
FD_SET() or select(), you have a buffer overflow and memory beyond the
fd_set can become corrupted.
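
To make the limit concrete: an fd_set is a fixed-size bitmap, so its
capacity is known at compile time. A small sketch (the function name
fd_set_capacity_bits is made up for illustration, it is not part of any
system header):

```c
#include <limits.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: number of descriptor bits an fd_set can hold.
 * FD_SET() on a descriptor at or beyond this count writes outside
 * the structure. */
size_t fd_set_capacity_bits(void)
{
    return sizeof(fd_set) * CHAR_BIT;
}
```

On typical systems this should come out as FD_SETSIZE, possibly rounded
up to a whole machine word.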

You can define FD_SETSIZE to a larger value before including
sys/select.h, but you should also verify that a descriptor is less than
FD_SETSIZE before using it with select() or any of the fd_set macros,
and return an error if it is not.
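
A minimal sketch of that check, assuming a wrapper around FD_SET(). The
name fd_set_checked and the choice of EINVAL are only illustrative, not
an existing API:

```c
#include <errno.h>
#include <sys/select.h>

/* Hypothetical helper: add fd to the set only if it fits in an fd_set.
 * Returns 0 on success, or -1 with errno set to EINVAL when fd is
 * negative or not below FD_SETSIZE. */
int fd_set_checked(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    FD_SET(fd, set);
    return 0;
}
```

In the accept loop above, the same test could be applied to
listen_socket before calling FD_SET() and select(), logging the
descriptor value when the check fails.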

kqueue doesn't have this problem, but it's not as portable as select.

