Weird behavior with either reading or write()ing !?
Brandon Erhart
berhart at ErhartGroup.COM
Sat Apr 10 17:57:52 PDT 2004
Hello,
This is a rather odd bug/weird behavior. I am fairly confident that it is
not a logic error in my code this time. Please read the following carefully!
In a web-crawling program I am writing, I deal with several thousand fds at
a time. I am using FreeBSD's kqueue to keep track of them all so that I may
be notified when an event is pending on a given socket. The program works
as it should for about 75% of the connections. The other 25% don't work so
well.
I have implemented read timeouts as follows: whenever I am in the callback
function for data waiting to be read off an fd (EVFILT_READ or whatever), I
store the time (via gettimeofday()) at which data was last read on that
socket. Then, in my main loop, I check all sockets to see whether more than
10 seconds have passed since data was last read.
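A minimal sketch of the timeout bookkeeping described above, not the actual crawler code. The struct conn record, the function names and the READ_TIMEOUT_SEC constant are hypothetical names chosen for illustration; only gettimeofday() and the 10-second limit come from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

#define READ_TIMEOUT_SEC 10  /* the 10-second limit described above */

/* Hypothetical per-connection record; only the timestamp matters here. */
struct conn {
    int fd;
    struct timeval last_read;  /* time at which data was last read */
};

/* Call from the read callback once data has been read off the fd. */
void conn_mark_read(struct conn *c)
{
    assert(c != NULL);
    gettimeofday(&c->last_read, NULL);
}

/* Seconds elapsed between two timestamps. */
double elapsed_sec(const struct timeval *from, const struct timeval *to)
{
    return (double)(to->tv_sec - from->tv_sec) +
           (double)(to->tv_usec - from->tv_usec) / 1e6;
}

/* Returns 1 if more than READ_TIMEOUT_SEC have passed since the last read. */
int conn_timed_out(const struct conn *c, const struct timeval *now)
{
    assert(c != NULL && now != NULL);
    return elapsed_sec(&c->last_read, now) > (double)READ_TIMEOUT_SEC;
}
```

With this scheme the clock keeps running from the last read, so any time spent parsing a page before the next request goes out counts toward the 10 seconds. Calling conn_mark_read() when a new request is sent would avoid that, if the real code does not already do so.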
However, I am receiving a lot of read timeouts. I keep track of the last
response from the remote server, and the current state I'm in (e.g., sent
another GET request on a keepalive connection). In several cases, I had
received a response for the last page I requested, processed/parsed it, and
sent down another request. However, data never got back to me. Even after
10 seconds. Hell, even after 30 seconds in some cases.
What I am wondering is whether it is possible either for my write() to be
failing to get data to the remote site (I check the return value of
write(), and it's always returning the number of bytes I am writing), or
for data to be "dropped", so to speak, on my end by the kernel (no
data waiting on the socket). I have all my sockets in O_NONBLOCK mode.
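For reference, a successful write() on a socket only means the kernel copied the bytes into the socket's send buffer, not that the remote site received them. The sketch below shows one way to handle short writes on an O_NONBLOCK socket and to ask the kernel for a pending socket error through getsockopt() with SO_ERROR. The function names write_nonblocking and socket_pending_error are hypothetical, and this is an illustration under those assumptions rather than a diagnosis.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the pending error recorded on the socket (0 if none),
 * or -1 if getsockopt() itself fails. Reading SO_ERROR clears it. */
int socket_pending_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;
    return err;
}

/* Writes as much of buf as the kernel will accept right now.
 * Returns the number of bytes accepted (possibly less than len if the
 * send buffer is full), or -1 on an error other than EAGAIN/EINTR. */
ssize_t write_nonblocking(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n > 0) {
            done += (size_t)n;
        } else if (n == 0) {
            break;                      /* nothing accepted; stop here */
        } else if (errno == EINTR) {
            continue;                   /* interrupted; retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;                      /* send buffer full; retry later */
        } else {
            return -1;                  /* real error, errno is set */
        }
    }
    return (ssize_t)done;
}
```

If write_nonblocking() returns less than len, the remainder would have to be queued and sent once the socket becomes writable again (EVFILT_WRITE in kqueue terms).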
To test the possibility of kqueue not notifying me of data waiting,
or of me not grabbing the event off the queue in time, I call read() on the
socket one last time when I catch the read timeout. Most of the time (99%
of it) there is no data waiting.
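A sketch of that last-chance read, assuming the socket is in O_NONBLOCK mode. It separates "no data waiting" (EAGAIN) from "remote side closed the connection" (read() returns 0) and from a real error, since these three look alike if only "got data or not" is checked. The enum and the function name last_chance_read are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

enum drain_result {
    DRAIN_DATA,   /* data was waiting; *nread holds the byte count */
    DRAIN_EMPTY,  /* nothing waiting (EAGAIN / EWOULDBLOCK) */
    DRAIN_EOF,    /* remote side closed the connection */
    DRAIN_ERROR   /* some other error; errno is set */
};

/* One final non-blocking read() on fd before declaring a timeout. */
enum drain_result last_chance_read(int fd, char *buf, size_t cap,
                                   ssize_t *nread)
{
    ssize_t n;

    assert(buf != NULL && nread != NULL && cap > 0);
    do {
        n = read(fd, buf, cap);
    } while (n == -1 && errno == EINTR);  /* retry if interrupted */

    *nread = n;
    if (n > 0)
        return DRAIN_DATA;
    if (n == 0)
        return DRAIN_EOF;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return DRAIN_EMPTY;
    return DRAIN_ERROR;
}
```

Logging which of the four outcomes occurs at each timeout would show whether the remote servers are closing keepalive connections rather than staying silent.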
This all seems to be random. The failures are never consistent (they do
not hit the same servers) across several runs of the program.
Any ideas, folks? This has completely stumped me.
Thanks for your support,
Brandon
More information about the freebsd-net mailing list