Sendmail: host name lookup failure

J65nko BSD j65nko at gmail.com
Thu Jan 20 19:06:34 PST 2005


On Thu, 23 Dec 2004 15:09:08 +1030, Paul A. Hoadley
<paulh at logicsquad.net> wrote:
> On Mon, Dec 20, 2004 at 10:54:42PM +1030, Paul A. Hoadley wrote:
> 
> > I have actually solved the problem.  I intend to post a summary for
> > the archive when I return to the site later in the week, at which
> > time I'll be able to identify the OS/nameserver combination at
> > fault.
> 
> I am told it's running Windows 2000 DNS Server.  Presumably that's
> Microsoft's own DNS implementation built into Windows 2000.
> 
> > Here's a teaser, though: it's a Microsoft product (I just don't know
> > which), and it's returning SERVFAIL status for an AAAA record query.
> 
> Sometimes it behaves:
> 
> > dig tsb.coremedicalsolutions.com. AAAA
> 
> ; <<>> DiG 9.3.0 <<>> tsb.coremedicalsolutions.com. AAAA
> ;; global options:  printcmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8959
> ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
> 
> ;; QUESTION SECTION:
> ;tsb.coremedicalsolutions.com.  IN      AAAA
> 
> ;; AUTHORITY SECTION:
> coremedicalsolutions.com. 3600  IN      SOA     archibald2.coremedicalsolutions.com. marc.coremedicalsolutions.com. 1480 900 600 86400 3600
> 
> ;; Query time: 281 msec
> ;; SERVER: 192.168.10.2#53(192.168.10.2)
> ;; WHEN: Thu Dec 23 15:03:23 2004
> ;; MSG SIZE  rcvd: 98
> 
> But sendmail seems intent on asking for just about every permutation
> on each domain name involved, so sometimes it returns the bogus
> answer:
> 
> > dig tsb AAAA
> 
> ; <<>> DiG 9.3.0 <<>> tsb AAAA
> ;; global options:  printcmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 43109
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
> 
> ;; QUESTION SECTION:
> ;tsb.                           IN      AAAA
> 
> ;; Query time: 245 msec
> ;; SERVER: 192.168.10.2#53(192.168.10.2)
> ;; WHEN: Thu Dec 23 15:04:42 2004
> ;; MSG SIZE  rcvd: 21
> 
> (By 'sometimes' I don't mean it's non-deterministic.  Every time
> sendmail asks for the AAAA record of an unqualified hostname, the
> nameserver responds with SERVFAIL.)
> 
> The consequence of this is that sendmail repeatedly defers delivery
> until the mail expires.
> 
> > Curiously, sendmail's WorkAroundBrokenAAAA option did not help, and
> > I don't know why.  Daryl Tester suggested using a mailertable entry,
> > and this worked.
> 
> I still don't know why WorkAroundBrokenAAAA isn't working in this
> case.



A couple of months ago some root servers started doing something they
never did before: handing out IPv6 referrals:

$ dig +norecurse kpn.com @a.root-servers.net

; <<>> DiG 9.2.3 <<>> +norecurse kpn.com @a.root-servers.net
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25453
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 14

;; QUESTION SECTION:
;kpn.com.                       IN      A

;; AUTHORITY SECTION:
com.                    172800  IN      NS      A.GTLD-SERVERS.NET.
com.                    172800  IN      NS      G.GTLD-SERVERS.NET.
com.                    172800  IN      NS      H.GTLD-SERVERS.NET.
com.                    172800  IN      NS      C.GTLD-SERVERS.NET.
com.                    172800  IN      NS      I.GTLD-SERVERS.NET.
com.                    172800  IN      NS      B.GTLD-SERVERS.NET.
com.                    172800  IN      NS      D.GTLD-SERVERS.NET.
com.                    172800  IN      NS      L.GTLD-SERVERS.NET.
com.                    172800  IN      NS      F.GTLD-SERVERS.NET.
com.                    172800  IN      NS      J.GTLD-SERVERS.NET.
com.                    172800  IN      NS      K.GTLD-SERVERS.NET.
com.                    172800  IN      NS      E.GTLD-SERVERS.NET.
com.                    172800  IN      NS      M.GTLD-SERVERS.NET.

;; ADDITIONAL SECTION:
A.GTLD-SERVERS.NET.     172800  IN      AAAA    2001:503:a83e::2:30
A.GTLD-SERVERS.NET.     172800  IN      A       192.5.6.30
G.GTLD-SERVERS.NET.     172800  IN      A       192.42.93.30
H.GTLD-SERVERS.NET.     172800  IN      A       192.54.112.30
C.GTLD-SERVERS.NET.     172800  IN      A       192.26.92.30
I.GTLD-SERVERS.NET.     172800  IN      A       192.43.172.30
B.GTLD-SERVERS.NET.     172800  IN      AAAA    2001:503:231d::2:30
B.GTLD-SERVERS.NET.     172800  IN      A       192.33.14.30
D.GTLD-SERVERS.NET.     172800  IN      A       192.31.80.30
L.GTLD-SERVERS.NET.     172800  IN      A       192.41.162.30
F.GTLD-SERVERS.NET.     172800  IN      A       192.35.51.30
J.GTLD-SERVERS.NET.     172800  IN      A       192.48.79.30
K.GTLD-SERVERS.NET.     172800  IN      A       192.52.178.30
E.GTLD-SERVERS.NET.     172800  IN      A       192.12.94.30

;; Query time: 115 msec
;; SERVER: 198.41.0.4#53(a.root-servers.net)
;; WHEN: Fri Jan 21 01:06:01 2005
;; MSG SIZE  rcvd: 497

Somehow an IPv6 referral may entice a nameserver into actually issuing a
query over IPv6. BIND in the OpenBSD base install showed this behaviour.
OpenBSD issued an erratum with the following explanation:

"# 002: RELIABILITY FIX: November 10, 2004 BIND contains a bug which 
results in BIND trying to contact nameservers via IPv6, even in cases 
where IPv6 connectivity is non-existent. This results in unnecessary
timeouts and thus slow DNS queries. A source code patch exists which 
remedies this problem."

It could be that the MS Windows DNS server, which is derived from or
patterned after named (or however you want to call it), suffers from a
similar defect. I am not familiar with the MS DNS server, but you could
find out whether it is possible to disable IPv6 on it. BIND has this option.
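
To make the failure mode concrete, here is a minimal Python sketch of what
a resolver without IPv6 connectivity should do with a referral like the
one above: discard the AAAA glue before picking a server to query. The
function name usable_glue and the have_ipv6 flag are my own illustrative
names, not anything taken from BIND or the MS DNS server.

```python
# Hypothetical sketch: drop AAAA glue when the host has no IPv6 connectivity.
# Records are (owner, rrtype, address) tuples copied from the ADDITIONAL
# section of the referral shown above.

def usable_glue(additional, have_ipv6):
    """Return the glue records a resolver can actually reach."""
    wanted = {"A", "AAAA"} if have_ipv6 else {"A"}
    return [rr for rr in additional if rr[1] in wanted]

additional = [
    ("A.GTLD-SERVERS.NET.", "AAAA", "2001:503:a83e::2:30"),
    ("A.GTLD-SERVERS.NET.", "A", "192.5.6.30"),
    ("B.GTLD-SERVERS.NET.", "AAAA", "2001:503:231d::2:30"),
    ("B.GTLD-SERVERS.NET.", "A", "192.33.14.30"),
]

for owner, rrtype, address in usable_glue(additional, have_ipv6=False):
    print(owner, rrtype, address)
```

With have_ipv6=False only the two A records remain, so the resolver never
waits for a timeout on an address it cannot reach.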

RE: resolving unqualified hostnames

Some programs respect the "search" or "domain" settings in 
"/etc/resolv.conf". An example from my local LAN:

$ cat /etc/resolv.conf
search utp.xnet
nameserver 192.168.222.10

$ host notexisting
Host notexisting not found: 3(NXDOMAIN)

tcpdump shows the following:

192.168.222.44.3904 > 192.168.222.10.53:  3323+ A? notexisting.utp.xnet. (38)
192.168.222.10.53 > 192.168.222.44.3904:  3323 NXDomain* 0/0/0 (38)
192.168.222.44.12590 > 192.168.222.10.53:  62093+ A? notexisting. (29)
192.168.222.10.53 > 192.168.222.44.12590:  62093 NXDomain* 0/0/0 (29)

The "host" program uses the "search" directive to convert the unqualified 
name "notexisting" into a query for "notexisting.utp.xnet."  With an 
existing name, "host" succeeds with this strategy:

$ host parmenides
parmenides.utp.xnet has address 192.168.222.10

192.168.222.44.25740 > 192.168.222.10.53:  35854+ A? parmenides.utp.xnet. (37)
192.168.222.10.53 > 192.168.222.44.25740:  35854 1/0/0 A 192.168.222.10 (53)

Our friend dig, however, cannot resolve "parmenides":

$ dig parmenides

; <<>> DiG 9.2.3 <<>> parmenides
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 5025
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;parmenides.                    IN      A

;; Query time: 2 msec
;; SERVER: 192.168.222.10#53(192.168.222.10)
;; WHEN: Fri Jan 21 01:42:49 2005
;; MSG SIZE  rcvd: 28

192.168.222.44.23821 > 192.168.222.10.53:  5025+ A? parmenides. (28)
192.168.222.10.53 > 192.168.222.44.23821:  5025 NXDomain* 0/0/0 (28)

The tcpdump output shows that dig does not use the "/etc/resolv.conf"
search directive by default; its +search option turns that on.
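
The lookup order seen in the tcpdump output for "host" can be sketched in
a few lines of Python. This is a simplified model of the stub resolver's
search logic, assuming the default "ndots" threshold of 1; the function
name candidates is made up for illustration.

```python
# Simplified model of resolv.conf "search" handling (assumes ndots:1).

def candidates(name, search, ndots=1):
    """Return the fully qualified names tried for `name`, in order."""
    if name.endswith("."):
        return [name]                      # already fully qualified
    as_is = name + "."
    with_search = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        return [as_is] + with_search       # try the name as given first
    return with_search + [as_is]           # try the search list first

print(candidates("notexisting", ["utp.xnet"]))
# ['notexisting.utp.xnet.', 'notexisting.']
```

A real resolver stops at the first name that resolves, which is why the
lookup for "parmenides" needed only one query.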

Please note that, for lack of a FreeBSD box, these pasted examples were
run on an OpenBSD box, sometimes using djbdns.

RE: ns.chariot.net.au. and ns2.chariot.net.au.

These nameservers are not configured correctly. For comparison, this is
the response from a correctly set up authoritative nameserver. Please note
the use of the +norecurse flag to mimic the queries of a resolving nameserver.

$  dig +norecurse ns.vuurwerk.nl @62.250.2.2

; <<>> DiG 9.2.3 <<>> +norecurse ns.vuurwerk.nl @62.250.2.2
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47182
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 2

;; QUESTION SECTION:
;ns.vuurwerk.nl.                        IN      A

;; ANSWER SECTION:
ns.vuurwerk.nl.         3600    IN      A       62.250.2.2

;; AUTHORITY SECTION:
vuurwerk.nl.            3600    IN      NS      ns.vuurwerk.nl.
vuurwerk.nl.            3600    IN      NS      ns2.vuurwerk.net.
vuurwerk.nl.            3600    IN      NS      ns3.vuurwerk.net.

;; ADDITIONAL SECTION:
ns2.vuurwerk.net.       3600    IN      A       212.204.221.71
ns3.vuurwerk.net.       3600    IN      A       213.136.0.173

;; Query time: 15 msec
;; SERVER: 62.250.2.2#53(62.250.2.2)
;; WHEN: Fri Jan 21 02:28:27 2005
;; MSG SIZE  rcvd: 142

Please note
"flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 2"
"aa" = "authoritative answer"
ANSWER: 1,
AUTHORITY: 3,
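
The +norecurse flag clears the RD ("recursion desired") bit in the query
header, which is why the reply above shows "qr aa" without "rd". As a
rough illustration, the following Python sketch builds a DNS query by hand
using only the standard library. The function name build_query and its
parameters are my own; the header layout is the one from RFC 1035.

```python
import struct

RD = 0x0100  # "recursion desired" bit in the header flags word

def build_query(name, qtype=1, recurse=True, query_id=0):
    """Build a DNS query packet (qtype 1 = A, class IN)."""
    flags = RD if recurse else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    # Each label is sent as a length byte followed by its characters;
    # a zero byte (the root label) terminates the name.
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels)
    qname += b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

packet = build_query("ns.vuurwerk.nl.", recurse=False)
print(len(packet), hex(struct.unpack("!H", packet[2:4])[0]))
```

For the name "parmenides." the function yields a 28-byte packet, matching
the "(28)" that tcpdump printed for dig's query earlier.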

Now the result from your ISP's nameserver, which is supposed to be authoritative:

$  dig +norecurse -t ns coremedicalsolutions.com @203.30.237.3

; <<>> DiG 9.2.3 <<>> +norecurse -t ns coremedicalsolutions.com @203.30.237.3
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21640
;; flags: qr aa ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2

;; QUESTION SECTION:
;coremedicalsolutions.com.      IN      NS

;; ANSWER SECTION:
coremedicalsolutions.com. 86400 IN      NS      ns2.chariot.net.au.
coremedicalsolutions.com. 86400 IN      NS      ns.chariot.net.au.

;; ADDITIONAL SECTION:
ns.chariot.net.au.      86400   IN      A       203.30.237.2
ns2.chariot.net.au.     86400   IN      A       203.30.237.3

;; Query time: 375 msec
;; SERVER: 203.30.237.3#53(203.30.237.3)
;; WHEN: Fri Jan 21 02:34:12 2005
;; MSG SIZE  rcvd: 123

Here the flags line is:
"flags: qr aa ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2"
  aa = authoritative answer
  ra = recursion available
ANSWER: 2,
AUTHORITY: 0,

Imagine how confusing this is for a nameserver.

You don't have to believe me, just do a NS type search for 
coremedicalsolutions.com at http://www.squish.net/dnscheck/

That dnscheck produces the following (repeated 13 times):

"7.7% of queries will be returned by 192.5.6.30 (A.GTLD-SERVERS.NET) - 
answer was not authoritative

coremedicalsolutions.com.	172800	IN	NS	ns.chariot.net.au.
coremedicalsolutions.com.	172800	IN	NS	ns2.chariot.net.au"

ns.chariot.net.au. really is recursive, while it shouldn't be. A supposedly
authoritative nameserver should only play that role and not offer recursion.
Offering it makes the server susceptible to attacks. And it really is recursive.

Although I am located in Holland, outside their netblock, I can use
ns.chariot.net.au. to run queries:

$ dig -t ns kpn.com @203.30.237.2

; <<>> DiG 9.2.3 <<>> -t ns kpn.com @203.30.237.2
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63508
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2

;; QUESTION SECTION:
;kpn.com.                       IN      NS

;; ANSWER SECTION:
kpn.com.                172800  IN      NS      dns.kpn.com.
kpn.com.                172800  IN      NS      ns2.kpn.net.

;; ADDITIONAL SECTION:
dns.kpn.com.            172800  IN      A       145.7.191.35
ns2.kpn.net.            125347  IN      A       194.151.228.50

;; Query time: 571 msec
;; SERVER: 203.30.237.2#53(203.30.237.2)
;; WHEN: Fri Jan 21 03:11:01 2005
;; MSG SIZE  rcvd: 100

The flags here are normal for a recursive resolver,
"rd" = "recursion desired" 
"ra" = "recursion available" 
ANSWER: 2,
AUTHORITY: 0.

But of course a recursive resolver cannot give authoritative answers, not
being the officially appointed nameserver for the "kpn.com" domain.

Have you noticed that the flags returned for the non-recursive authoritative
query for your domain are similar to those of the recursive
non-authoritative query for kpn.com?
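
The comparison can be made mechanical. Below is a small Python sketch that
parses the ";; flags:" line of dig output and reports whether a reply is
authoritative and whether the server offers recursion. The function names
are hypothetical, and the parser only handles the header format shown in
this message.

```python
import re

def parse_flags(line):
    """Parse a dig ';; flags:' line into (set of flags, dict of counts)."""
    match = re.search(r"flags:\s*([a-z ]*);\s*(.*)", line)
    if match is None:
        raise ValueError("not a dig flags line: %r" % line)
    flags = set(match.group(1).split())
    counts = {key: int(value)
              for key, value in re.findall(r"([A-Z]+):\s*(\d+)", match.group(2))}
    return flags, counts

def describe(line):
    """Summarise the two properties discussed in this message."""
    flags, _ = parse_flags(line)
    return {"authoritative": "aa" in flags, "offers_recursion": "ra" in flags}

vuurwerk = ";; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 2"
chariot = ";; flags: qr aa ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2"

print(describe(vuurwerk))
print(describe(chariot))
```

The vuurwerk.nl reply comes out as authoritative without recursion, while
the chariot.net.au reply is flagged as both authoritative and recursive.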

Ask your ISP to fix their nameserver.

=Adriaan=

