Broken ppp LQR - FreeBSD or ERX ?
Mike Tancsa
mike at sentex.net
Fri Nov 21 23:56:02 PST 2003
I have been trying to track down a rather old mystery as to why when using
PPPoE via userland PPP the daemon would not be able to detect if the DSL
connection was down when the other side was a Juniper ERX. The same
machines when using mpd works fine and can detect via LCP echos if the link
is down. Also, both mpd and ppp work correctly when talking PPPoE to an
SMS. Looking through the logs, it would appear that LQR doesnt work
against the ERX (ie the ERX is broken) or FreeBSD's LQR is broken, or the
two dont like each other. According to the man page,
When LQR is enabled, ppp sends
the QUALPROTO option (see ``set lqrperiod'' below) as part of the
LCP request. If the peer agrees, both sides will exchange LQR
packets at the agreed frequency, allowing detailed link quality
monitoring by enabling LQM logging. If the peer doesn't agree,
ppp will send ECHO LQR requests instead. These packets pass no
information of interest, but they MUST be replied to by the peer.
So, there is a sort of "Plan A" and "Plan B". e.g on a session against an ERX
Mov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Will send
LQR every 10.00 secs
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output:
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0:
LQM: Magic: 860c0808 LastOutLQRs: 00000000
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: LastOutPackets:
00000000 LastOutOctets: 00000000
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0:
LQM: PeerInLQRs: 00000000 PeerInPackets: 00000006
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: PeerInDiscards:
00000000 PeerInErrors: 00000000
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0:
LQM: PeerInOctets: 00000076 PeerOutLQRs: 00000001
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: PeerOutPackets:
0000000a PeerOutOctets: 00000141
Looking at the source code this implies the LQM_LQR method.
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output (again):
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0:
LQM: Magic: 860c0808 LastOutLQRs: 00000000
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: LastOutPackets:
00000000 LastOutOctets: 00000000
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0:
LQM: PeerInLQRs: 00000000 PeerInPackets: 00000011
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: PeerInDiscards:
00000000 PeerInErrors: 00000000
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0:
LQM: PeerInOctets: 00000196 PeerOutLQRs: 00000001
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: PeerOutPackets:
00000017 PeerOutOctets: 000002ef
Nov 22 02:34:48 datawest3 ppp.prev[52036]: tun0: LCP: deflink:
RecvEchoRequest(4) state = Opened
Nov 22 02:34:48 datawest3 ppp.prev[52036]: tun0: LCP: deflink:
SendEchoReply(4) state = Opened
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output (again):
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0:
LQM: Magic: 860c0808 LastOutLQRs: 00000000
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: LastOutPackets:
00000000 LastOutOctets: 00000000
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0:
LQM: PeerInLQRs: 00000000 PeerInPackets: 00000012
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: PeerInDiscards:
00000000 PeerInErrors: 00000000
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0:
LQM: PeerInOctets: 000001a5 PeerOutLQRs: 00000001
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: PeerOutPackets:
00000019 PeerOutOctets: 00000335
Nov 22 02:34:58 datawest3 ppp.prev[52036]: tun0: LCP: deflink:
RecvEchoRequest(5) state = Opened
Nov 22 02:34:58 datawest3 ppp.prev[52036]: tun0: LCP: deflink:
SendEchoReply(5) state = Opened
Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: Phase: deflink: ** Too
many LQR packets lost **
Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Too many LQR
packets lost
Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: CCP: deflink: State change
Stopped --> Closed
But as you can see from the logs, it doesnt work. The host never responds
and ppp hangs up even though it really is connected.
This compared to a connection against an SMS
Nov 22 02:38:02 gastest ppp[5018]: tun0: LQM: deflink: LQR/ECHO LQR not
negotiated
Note in this case, ppp reverts to "plan B" where it will just send LCP echos.
Now, getting back to the machine that connects to the ERX, if I make the
following change to the source code,
datawest3# diff -u lcp.c lcp.c.patch
--- lcp.c Sat Aug 31 22:12:28 2002
+++ lcp.c.patch Fri Nov 21 19:50:07 2003
@@ -575,6 +575,11 @@
log_Printf(LogLCP, "%s: LayerUp\n", fp->link->name);
physical_SetAsyncParams(p, lcp->want_accmap, lcp->his_accmap);
lqr_Start(lcp);
+
+
+ p->hdlc.lqm.method = LQM_ECHO;
+
+
hdlc_StartTimer(&p->hdlc);
fp->more.reqs = fp->more.naks = fp->more.rejs = lcp->cfg.fsm.maxreq * 3;
datawest3#
i.e. to force "plan b", where it just uses LCP echos everything against the
ERX works! It sends and receives LCP echo requests just fine, and if it
does not hear 5 back in a row, it determines the links is down.
As a work around, is there any way to force the LQM_ECHO method through the
ppp.conf ? Has anyone else run into this ?
---Mike
--------------------------------------------------------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike at sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike
More information about the freebsd-net
mailing list