Broken ppp LQR - FreeBSD or ERX ?

Mike Tancsa mike at sentex.net
Fri Nov 21 23:56:02 PST 2003


I have been trying to track down a rather old mystery as to why when using 
PPPoE via userland PPP the daemon would not be able to detect if the DSL 
connection was down when the other side was a Juniper ERX.  The same 
machines when using mpd works fine and can detect via LCP echos if the link 
is down.  Also, both mpd and ppp work correctly when talking PPPoE to an 
SMS.  Looking through the logs, it would appear that LQR doesnt work 
against the ERX (ie the ERX is broken) or FreeBSD's LQR is broken, or the 
two dont like each other.  According to the man page,

When LQR is enabled, ppp sends
              the QUALPROTO option (see ``set lqrperiod'' below) as part of the
              LCP request.  If the peer agrees, both sides will exchange LQR
              packets at the agreed frequency, allowing detailed link quality
              monitoring by enabling LQM logging.  If the peer doesn't agree,
              ppp will send ECHO LQR requests instead.  These packets pass no
              information of interest, but they MUST be replied to by the peer.

So, there is a sort of "Plan A" and "Plan B".  e.g on a session against an ERX

Mov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Will send 
LQR every 10.00 secs
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output:
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: 
LQM:   Magic:          860c0808   LastOutLQRs:    00000000
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM:   LastOutPackets: 
00000000   LastOutOctets:  00000000
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: 
LQM:   PeerInLQRs:     00000000   PeerInPackets:  00000006
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM:   PeerInDiscards: 
00000000   PeerInErrors:   00000000
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: 
LQM:   PeerInOctets:   00000076   PeerOutLQRs:    00000001
Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM:   PeerOutPackets: 
0000000a   PeerOutOctets:  00000141

Looking at the source code this implies the LQM_LQR method.

Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output (again):
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: 
LQM:   Magic:          860c0808   LastOutLQRs:    00000000
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM:   LastOutPackets: 
00000000   LastOutOctets:  00000000
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: 
LQM:   PeerInLQRs:     00000000   PeerInPackets:  00000011
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM:   PeerInDiscards: 
00000000   PeerInErrors:   00000000
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: 
LQM:   PeerInOctets:   00000196   PeerOutLQRs:    00000001
Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM:   PeerOutPackets: 
00000017   PeerOutOctets:  000002ef
Nov 22 02:34:48 datawest3 ppp.prev[52036]: tun0: LCP: deflink: 
RecvEchoRequest(4) state = Opened
Nov 22 02:34:48 datawest3 ppp.prev[52036]: tun0: LCP: deflink: 
SendEchoReply(4) state = Opened
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output (again):
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: 
LQM:   Magic:          860c0808   LastOutLQRs:    00000000
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM:   LastOutPackets: 
00000000   LastOutOctets:  00000000
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: 
LQM:   PeerInLQRs:     00000000   PeerInPackets:  00000012
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM:   PeerInDiscards: 
00000000   PeerInErrors:   00000000
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: 
LQM:   PeerInOctets:   000001a5   PeerOutLQRs:    00000001
Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM:   PeerOutPackets: 
00000019   PeerOutOctets:  00000335
Nov 22 02:34:58 datawest3 ppp.prev[52036]: tun0: LCP: deflink: 
RecvEchoRequest(5) state = Opened
Nov 22 02:34:58 datawest3 ppp.prev[52036]: tun0: LCP: deflink: 
SendEchoReply(5) state = Opened
Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: Phase: deflink: ** Too 
many LQR packets lost **
Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Too many LQR 
packets lost
Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: CCP: deflink: State change 
Stopped --> Closed

But as you can see from the logs, it doesnt work. The host never responds 
and ppp hangs up even though it really is connected.

This compared to a connection against an SMS
Nov 22 02:38:02 gastest ppp[5018]: tun0: LQM: deflink: LQR/ECHO LQR not 
negotiated
Note in this case, ppp reverts to "plan B" where it will just send LCP echos.

Now, getting back to the machine that connects to the ERX, if I make the 
following change to the source code,
datawest3# diff -u lcp.c lcp.c.patch
--- lcp.c       Sat Aug 31 22:12:28 2002
+++ lcp.c.patch Fri Nov 21 19:50:07 2003
@@ -575,6 +575,11 @@
    log_Printf(LogLCP, "%s: LayerUp\n", fp->link->name);
    physical_SetAsyncParams(p, lcp->want_accmap, lcp->his_accmap);
    lqr_Start(lcp);
+
+
+  p->hdlc.lqm.method = LQM_ECHO;
+
+
    hdlc_StartTimer(&p->hdlc);
    fp->more.reqs = fp->more.naks = fp->more.rejs = lcp->cfg.fsm.maxreq * 3;

datawest3#

i.e. to force "plan b", where it just uses LCP echos everything against the 
ERX works!  It sends and receives LCP echo requests just fine, and if it 
does not hear 5 back in a row, it determines the links is down.

As a work around, is there any way to force the LQM_ECHO method through the 
ppp.conf ?  Has anyone else run into this ?

	---Mike
--------------------------------------------------------------------
Mike Tancsa,                          	          tel +1 519 651 3400
Sentex Communications,     			  mike at sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada			  www.sentex.net/mike



More information about the freebsd-net mailing list