weird network/DNS issues (nsd not returning answer)
alexmiroslav at gmail.com
Tue Mar 20 23:11:45 UTC 2018
I have a number of FreeBSD servers online. The other day, one of them
that I set up a month ago started exhibiting really weird behavior. It
doesn't get answers back to queries made to my two DNS servers, both of
which are running nsd.
Initially I suspected pf or sshguard to be the issue, but this happens
with pf and sshguard turned off on all servers in question.
The other weird thing is that all other network traffic between these
servers is passing back and forth normally; only nsd replies are not.
Here is the issue, roughly:
- given multiple servers, labeled a-z
- servers k and z run nsd
- with the exception of server b, all other servers can communicate
normally with servers k and z
- with the exception of DNS queries, server b can communicate
normally with servers k and z
- b can ping, ssh, rsync, and scp to and from servers k and z
The only issue is when b makes a DNS query to k or z. I see those two
servers get the query, and return the answer, but that answer never
reaches b. I have sniffed the network to confirm this.
# in these examples:
# b.example.org = 22.214.171.124, the server that is misbehaving
# k.example.org = 126.96.36.199, one of my DNS servers
# c.example.org = 188.8.131.52, another server of mine, whose address
#   I am looking up in DNS
# b makes the initial query to k
14:11:46.912995 IP 184.108.40.206.18394 > 220.127.116.11.53: 22479+ A?
# k receives query and immediately returns the answer
14:11:46.931605 IP 18.104.22.168.18394 > 22.214.171.124.53: 22479+ A?
14:11:46.931854 IP 126.96.36.199.53 > 188.8.131.52.18394: 22479*-
1/2/1 A 184.108.40.206 (103)
# this second line, the answer, never makes it to b
# after a second or two, it makes another query:
14:11:51.969083 IP 220.127.116.11.12645 > 18.104.22.168.53: 22479+ A?
# k receives the second query and immediately returns the answer again
14:11:51.991267 IP 22.214.171.124.12645 > 126.96.36.199.53: 22479+ A?
14:11:51.991508 IP 188.8.131.52.53 > 184.108.40.206.12645: 22479*-
1/2/1 A 220.127.116.11 (103)
# there is still nothing in tcpdump on b's interface showing that it
#   received the answer
# [DNS names and IPs have been changed above.]
Here's what it looks like from b's command line:
$ host c.example.org k.example.org
# a few seconds delay
;; connection timed out; no servers could be reached
b has the same problem with my other server z, which also runs nsd.
All my other servers can query k and z just fine. Only b is exhibiting
this problem.
All the servers run pf/sshguard, but these rules/configs have not been
updated in months.
I did do one other thing to debug. I shut down nsd on k, and set up a
listener on b like this:
nc -l 10000
And on k, I did this:
ls /etc | sudo nc -s 18.104.22.168 -p 53 b.example.org 10000
This printed the listing of k's /etc on b. So that means that without
nsd in the picture, k is able to talk to b from source port 53 just fine.
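
One caveat on that test: nc without -u uses TCP, while host sends its
query over UDP by default, so the two are not quite comparable. A minimal
sketch of a side-by-side check is below. It is an illustration only, not
something that was run as part of the original debugging. The function
names and the script name probe.py are hypothetical; the query ID 22479
and the recursion-desired flag simply mirror the tcpdump lines above.

```python
import socket
import struct
import sys


def build_query(name, qid):
    """Build a minimal DNS query for an A record, with the RD flag set."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack(">HH", 1, 1)


def probe_udp(server, query, timeout=3.0):
    """Send the query over UDP; return reply length, or None on timeout/error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(query, (server, 53))
            reply, _ = s.recvfrom(4096)
        except OSError:
            return None
        return len(reply)


def probe_tcp(server, query, timeout=3.0):
    """Send the query over TCP; return the reply length the server announces."""
    try:
        with socket.create_connection((server, 53), timeout=timeout) as s:
            # DNS over TCP prefixes each message with a 2-byte length
            s.sendall(struct.pack(">H", len(query)) + query)
            prefix = s.recv(2)
            if len(prefix) < 2:
                return None
            return struct.unpack(">H", prefix)[0]
    except OSError:
        return None


if __name__ == "__main__":
    server, name = sys.argv[1], sys.argv[2]
    q = build_query(name, 22479)
    print("udp reply bytes:", probe_udp(server, q))
    print("tcp reply bytes:", probe_tcp(server, q))
```

Run from b as, for example, "python3 probe.py k.example.org c.example.org".
If the TCP line shows a length and the UDP line shows None, the loss is
specific to UDP replies rather than to port 53 in general.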
All the above servers in question are running FreeBSD 11.1-RELEASE-p6.
I'm not sure how to debug this problem further, or where the block is
happening.
Any help appreciated.