How to setup IPFW working with blacklistd

Mon Nov 13 17:38:02 UTC 2017

Greetings all!

Sorry for not responding to your request for help sooner.

I had a bit of a hardware crisis here last week, where
what I thought was merely a blown power supply turned
out to be a failed motherboard.  Getting the 2.5" SAS
drives back up and running in a different machine took
far longer than I would have guessed.  That, along with
a secondary MX host that was offline for the first 36
hours after the main mail server went down, was a cause
for additional excitement.

Anyway.

I've read through the mail exchange, although it's a bit
hard to follow all of it.

I'll offer a couple of observations about blacklistd
and how it operates, and maybe that will shed some light
on the problem at hand.  If not, well, I'd like to start
fresh with the current configuration, and what you're
seeing on your host.

Observations that might help:

1) The blacklistd support in 11.0 was broken in a couple
of significant ways.  The blacklistd support in 11.1 is
thought to be fully functional.  If you're not running 11.1,
you will need to update to 11.1.

2) I only use blacklistd with 'pf' in my day-to-day usage.
I extended the support in blacklistd-helper to hopefully
handle both ipfw and ipf, and it seemed to work OK for my
test setup.  HOWEVER, it is entirely possible that the way
I did the ipf/ipfw support has a flaw (or more) in it.

3) The changes to the various daemons to support the
blacklist just enable sending messages (and a copy of the
socket's fd) to the blacklist daemon.  The blacklist daemon
extracts information from the kernel about the socket's
other end (i.e., the information about the remote system),
and stores that information in a database.

4) After the information is stored in the database, the
blacklist daemon calls the blacklistd-helper script and
that script is responsible for modifying the firewall
rules that are in effect.  If the script has a bug, it's
entirely possible that the information in the database
will be out of sync with the current firewall rules in
effect.
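
One way to look for that kind of drift is to compare what the
database says is blocked against what the firewall table actually
holds.  The following is only a sketch under a few assumptions:
missing_from_firewall is a name invented for this example, it
handles IPv4 entries only, it expects the output of
"blacklistctl dump -b" and "ipfw table port22 list" to have been
saved to two files, and it assumes the table listing uses
"addr/mask value" lines.

```shell
#!/bin/sh
# Sketch only; missing_from_firewall is a hypothetical helper name.
# $1: saved output of `blacklistctl dump -b`
# $2: saved output of `ipfw table port22 list`
#     (assumed to contain "addr/mask value" lines)
# Prints each IPv4 addr/mask that the database marks as blocked
# (id column "OK", as echoed by the helper for ipfw) but that does
# not appear in the firewall table listing.
missing_from_firewall() {
    awk '
        # First file on the awk command line: the table listing.
        FILENAME == ARGV[1] {
            if ($1 ~ /^[0-9.]+\/[0-9]+$/)
                fw[$1] = 1
            next
        }
        # Second file: database entries whose id column is "OK".
        $2 == "OK" {
            entry = $1
            sub(/:[0-9]+$/, "", entry)   # drop the ":port" suffix
            if (!(entry in fw))
                print entry
        }
    ' "$2" "$1"
}
```

Any address it prints is one the database believes is blocked
while the firewall has no matching table entry.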

5) If you're experiencing a situation where the number
of login attempts is greater than the cutoff for the
service (e.g., the "1662/1" noted in the email thread),
that means that whatever firewall rule is supposed
to be blocking the service isn't blocking the traffic.
(See next item for a case where the right rules are in
the filter, but you still get a "modest" overage of
attempts vs the cutoff.)

6) On a slow-ish single-CPU host (like the sparc64 that I use
as my gateway), it's possible to get more attempts than
the cutoff from a persistent, high-speed attacker.

Basically, it takes a while before the system context switches
to the blacklist daemon and the entry gets added to the pf table.
"A while" here is still less than a second, but the machine has
already seen 10 or 12 attempts!

For example, here's a partial list of what my gateway is reporting
right now:

root@gatekeeper-130: blacklistctl dump -a
         address/ma:port	id	nfail	last access
[...]
  61.126.187.219/32:22	OK	3/3	2017/11/12 17:31:40
   156.212.51.78/32:22	OK	23/3	2017/11/12 19:09:38
  179.53.156.109/32:22	OK	3/3	2017/11/12 19:58:57
220.174.236.220/32:22		2/3	2017/11/12 23:39:58
  198.245.63.120/32:22	OK	3/3	2017/11/13 10:36:15

You can see a couple of "normally blocked" attempts (3/3),
a single IP address that has 2 of 3 attempts, and a very,
very persistent/fast host that got in 23 attempts before
it got blocked.
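
To pick out the entries in a dump that ran past the cutoff,
something like the following could be used.  It is only a sketch:
"overruns" and its optional "slack" argument are names invented
for this example, and it reads saved "blacklistctl dump -a" output
from a file rather than talking to the daemon.

```shell
#!/bin/sh
# Sketch only; overruns is a hypothetical helper name.
# $1: saved output of `blacklistctl dump -a`
# $2: optional slack; only report entries whose attempt count
#     exceeds the configured limit by more than this (default 0)
# Prints "address/mask:port count/limit" for each such entry.
overruns() {
    awk -v slack="${2:-0}" '
        {
            # The nfail column looks like "23/3"; a limit of -1 means
            # "never block", so those entries are skipped.
            for (i = 2; i <= NF; i++)
                if ($i ~ /^[0-9]+\/-?[0-9]+$/) {
                    split($i, n, "/")
                    if (n[2] + 0 > 0 && n[1] - n[2] > slack + 0)
                        print $1, $i
                    break
                }
        }
    ' "$1"
}
```

A small overrun fits the slow-host case in item 6; a count that
keeps climbing over days points back at item 5.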

7) There was a note about different usernames from the same
remote host.  The blacklist support currently does not
differentiate between usernames.  It is just counting the
number of attempts from a remote IP address.

There's unfinished support for having a "known bad" set of usernames,
where a single login attempt for one of those usernames will block
the remote address.  When finished, this will allow easy
blocking of the twenty or so most common usernames that are
probed.
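
As a rough illustration of counting per remote address while
ignoring the username, the sketch below tallies "Invalid user"
lines from an sshd log by address.  This is not how blacklistd
gets its data (that arrives from the daemons over the socket, as
in item 3); it assumes the "Invalid user NAME from ADDR" log line
format quoted later in this thread, and count_by_address is a name
invented for this example.

```shell
#!/bin/sh
# Sketch only; count_by_address is a hypothetical helper name.
# $1: sshd log file with lines such as
#     "Nov 11 03:50:49 res sshd[57898]: Invalid user admin from 123.31.26.123"
# Prints "address count" per remote address, sorted by address text,
# without regard to which usernames were tried.
count_by_address() {
    awk '
        $6 == "Invalid" && $7 == "user" && $(NF - 1) == "from" {
            count[$NF]++
        }
        END {
            for (addr in count)
                print addr, count[addr]
        }
    ' "$1" | sort
}
```

Log formats that append "port N" after the address would need the
field positions adjusted.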

Hopefully this will help.

-Kurt

On 11/13/17 9:17 AM, Cos Chan wrote:
> 
> 
> On Sat, Nov 11, 2017 at 1:42 PM, Ian Smith <smithi at nimnet.asn.au> wrote:
> 
>     On Thu, 9 Nov 2017 14:25:52 +0100, Cos Chan wrote:
> 
>       > Dear All
>       >
>       > Thanks Ian's great help, I have solved problem to post banned
>       > entries from blacklistd to ipfw.
> 
>     Well, we're some of the way there :)  We really need Kurt Lidl's eyes on
>     this to make real progress, and indications are that my and your emails
>     cc'ing him were still being deferred for some reason - maybe he's away?
> 
>     > The original message was received at Tue, 7 Nov 2017 10:12:05 -0500 (EST)
>     > from mx2.freebsd.org [8.8.178.116]
>     >
>     >    ----- Transcript of session follows -----
>     > <lidl at pix.net>... Deferred: Operation timed out with hydra.pix.net.
>     > Warning: message still undelivered after 4 hours
>     > Will keep trying until message is 1 week, 3 days old
> 
> 
>       > To my knowledge the problem is:
>       >
>       > I setup sshd+blacklistd without ipfw at first. Then I got problem
>       > the entry was never reached nfail number (is it a bug?).
> 
>     The first issue was because of a severe deficiency in blacklistd-helper,
>     in that it doesn't actually check that the chosen firewall is running,
>     and it then fails to detect commands for that firewall that do not (can
>     not) succeed as any sort of error!  More about that below.
> 
>     The second, however, was mainly that you missed that nfail set to '*'
>     means that the host is NOT to be blocked, no matter how many auth or
>     other failures that (in this case) sshd reports.
> 
>     That also answers another question you had .. "nnn/-1" indicates that
>     nfail=* ie never to be blocked.  These still get accumulated in the
>     database, but are not applied as ipfw block rule table entries.
> 
> 
>       > so I have to change the nfail to * to get the entry into banned list.
> 
>     In combination with other factors - like whether ipfw was running at the
>     time - that got blacklistd to record reported failures to its database,
>     but not to execute the 'add' commands to blacklistd-helper, so that
>     address was not in fact blocked, and subsequent attempts kept trying.
> 
>       > But while I setup ipfw, the nfail=* would not activate
>       > blacklistd-helper so no entry in blacklist banned list were added
>       > to ipfw.
> 
>     Yes, nfail=* means NEVER block these failed addresses; see blacklistd.conf(5).
> 
>       > I have modify the blacklistd nfail to 2, sshd MaxAuthTries to 3. The
>       > blacklist entries working fine.
> 
>     With ipfw running, yes :)  But it should have failed - noisily - sooner.
> 
>     When ipfw is running, issuing this will show you the addresses blocked:
> 
>       # ipfw table port22 list
> 
> 
> So far the list updating seems to be working, but I am not sure whether
> it is really working correctly.
> 
> here is one strange record:
> 
> $ sudo blacklistctl dump -b | grep 1662
> 193.201.224.218/32:22   OK      1662/1  2017/11/13 00:31:04
> 
> This IP has been blocked in ipfw since last week. When I checked it last
> Friday it was at 800+/1 in the blacklist, and by today it has become 1662.
>
> As I understand it, ipfw should be blocking the connection, so the count
> for a banned IP should not keep increasing?
> 
> I could see more entries with more than 3/1, for example:
> 
> 89.160.221.132/32:22    OK      18/1    2017/11/13 00:01:21
> 60.125.42.119/32:22     OK      3/1     2017/11/12 16:13:53
> 166.62.35.180/32:22     OK      3/1     2017/11/10 06:36:25
> 202.162.221.51/32:22    OK      6/1     2017/11/10 00:42:14
> 168.0.114.130/32:22     OK      3/1     2017/11/10 23:40:30
> 95.145.71.165/32:22     OK      3/1     2017/11/11 07:07:07
> 123.161.206.210/32:22   OK      3/1     2017/11/12 18:14:00
> 203.146.208.208/32:22   OK      6/1     2017/11/10 10:16:21
> 149.56.223.241/32:22    OK      1/1     2017/11/12 06:09:16
> 121.169.217.98/32:22    OK      9/1     2017/11/12 21:59:57
> 211.251.237.162/32:22   OK      2/1     2017/11/13 12:08:07
> 103.99.0.116/32:22      OK      30/1    2017/11/10 14:56:07
> 
> For these records I am not sure whether the counts increased after they
> were added to the ipfw list, but for the one with 1662 attempts I am sure
> it increased after ipfw had the IP in its list.
> 
> 
>       > BUT I found another problem.
>       >
>       > The output of blacklist dump is strange:
>       >
>       > $ sudo blacklistctl dump
>       >         address/ma:port id      nfail   last access
>       > 96.227.104.132/32:22            0/2     1970/01/01 01:00:00
>       > 89.245.78.187/32:22             0/2     1970/01/01 01:00:00
>       > 116.193.162.203/32:22           1/2     2017/11/09 11:48:05
>       >
>       > Since the blacklistd accepts instruction from sshd. how could be 0/2
>       > entries presented there? I am sure my successful logins were not
>       > added to blacklistd.
> 
>     1970/01/01 01:00:00 is just the UNIX '0' timestamp, in this case plus
>     one hour (your TZ offset).  It here means 'no previous entry'.  Not sure
>     about that 0/2, but there are several different codes returned by sshd
>     including success, failed auth and 'abusive behaviour' .. I'm not sure
>     which ones your reports (including in off-list mail) indicate.
> 
>     As for the mysterious 'n-1' behaviour you mentioned offlist for nfail,
>     in /usr/src/contrib/blacklist/bin/blacklistd.c there's this:
> 
>              switch (bi->bi_type) {
>              case BL_ABUSE:
>                      /*
>                       * If the application has signaled abusive behavior,
>                       * set the number of fails to be one less than the
>                       * configured limit.  Fallthrough to the normal BL_ADD
>                       * processing, which will increment the failure count
>                       * to the threshhold, and block the abusive address.
>                       */
>                      if (c.c_nfail != -1)
>                              dbi.count = c.c_nfail - 1;
>                      /*FALLTHROUGH*/
>              case BL_ADD:
>                      dbi.count++;
>                      dbi.last = ts.tv_sec;
>                      if (dbi.id[0]) {
>                              /*
>                               * We should not be getting this since the rule
>                               * should have blocked the address. A possible
>                               * explanation is that someone removed that rule,
>                               * and another would be that we got another attempt
>                               * before we added the rule. In anycase, we remove
>                               * and re-add the rule because we don't want to add
>                               * it twice, because then we'd lose track of it.
>                               */
>                              (*lfun)(LOG_DEBUG, "rule exists %s", dbi.id);
>                              (void)run_change("rem", &c, dbi.id, 0);
>                              dbi.id[0] = '\0';
>                      }
>                      if (c.c_nfail != -1 && dbi.count >= c.c_nfail) {
>                              int res = run_change("add", &c, dbi.id,
>                                  sizeof(dbi.id));
>                              if (res == -1)
>                                      goto out;
>                              sockaddr_snprintf(rbuf, sizeof(rbuf), "%a",
>                                  (void *)&rss);
>                              (*lfun)(LOG_INFO,
>                                  "blocked %s/%d:%d for %d seconds",
>                                  rbuf, c.c_lmask, c.c_port, c.c_duration);
> 
>                      }
>                      break;
> 
>     But if the 'add' command via blacklistd-helper fails, it will never add
>     the 1 .. I'm not certain about this, but it could explain what you see,
>     although I can't discern whether sshd is reporting BL_ADD or BL_ABUSE.
> 
>     You might instead try MaxAuthTries 4 .. sshd_config(5) says:
> 
>           MaxAuthTries
>                   Specifies the maximum number of authentication attempts
>                   permitted per connection.  Once the number of failures
>                   reaches half this value, additional failures are logged.
>                   The default is 6.
> 
>     Half of 3 as an integer is only 1, but half of 4 is 2.  See if it helps?
> 
> 
> I didn't change MaxAuthTries, since I found something interesting in
> the different logs concerning that issue:
> 
>  From blacklistctl dump:
> 
> $ sudo blacklistctl dump
>          address/ma:port id      nfail   last access
> 78.203.146.34/32:22             0/1     1970/01/01 01:00:00
> 195.225.116.21/32:22            0/1     1970/01/01 01:00:00
> 123.31.26.123/32:22             0/1     1970/01/01 01:00:00
> 112.148.101.13/32:22            0/1     1970/01/01 01:00:00
> 93.23.6.18/32:22                0/1     1970/01/01 01:00:00
> 5.102.197.124/32:22             0/1     1970/01/01 01:00:00
> 193.154.127.32/32:22            0/1     1970/01/01 01:00:00
> 113.232.216.41/32:22            0/1     1970/01/01 01:00:00
> 
>  From sshd log:
> 
> Nov 10 17:57:41 res sshd[49839]: Invalid user pi from 193.154.127.32
> Nov 10 17:57:41 res sshd[49840]: Invalid user pi from 193.154.127.32
> Nov 10 17:57:41 res sshd[49840]: input_userauth_request: invalid user pi [preauth]
> Nov 10 17:57:41 res sshd[49839]: input_userauth_request: invalid user pi [preauth]
> ...
> Nov 11 03:50:47 res sshd[57896]: Invalid user support from 123.31.26.123
> Nov 11 03:50:47 res sshd[57896]: input_userauth_request: invalid user support [preauth]
> Nov 11 03:50:47 res sshd[57896]: error: Received disconnect from 123.31.26.123 port 55811:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
> Nov 11 03:50:49 res sshd[57898]: Invalid user admin from 123.31.26.123
> Nov 11 03:50:49 res sshd[57898]: input_userauth_request: invalid user admin [preauth]
> Nov 11 03:50:49 res sshd[57898]: error: Received disconnect from 123.31.26.123 port 57823:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
> Nov 11 03:50:51 res sshd[57900]: Invalid user admin from 123.31.26.123
> Nov 11 03:50:51 res sshd[57900]: input_userauth_request: invalid user admin [preauth]
> Nov 11 03:50:51 res sshd[57900]: error: Received disconnect from 123.31.26.123 port 59819:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
> Nov 11 03:50:53 res sshd[57902]: Invalid user ubnt from 123.31.26.123
> Nov 11 03:50:53 res sshd[57902]: input_userauth_request: invalid user ubnt [preauth]
> Nov 11 03:50:53 res sshd[57902]: error: Received disconnect from 123.31.26.123 port 61795:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
> Nov 11 03:50:55 res sshd[57904]: Invalid user PlcmSpIp from 123.31.26.123
> Nov 11 03:50:55 res sshd[57904]: input_userauth_request: invalid user PlcmSpIp [preauth]
> Nov 11 03:50:55 res sshd[57904]: error: Received disconnect from 123.31.26.123 port 61920:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
> Nov 11 03:50:57 res sshd[57906]: Invalid user admin from 123.31.26.123
> Nov 11 03:50:57 res sshd[57906]: input_userauth_request: invalid user admin [preauth]
> Nov 11 03:50:57 res sshd[57906]: error: Received disconnect from 123.31.26.123 port 61949:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
> 
> I see 2 problems:
> 
> Problem 1:
> The IP 193.154.127.32 didn't reach the sshd authentication maximum (=3);
> it tried only 2 times.
> But in my opinion it should be recorded in blacklistd as 2/1 instead of 0/1.
>
> Problem 2:
> The IP 123.31.26.123 tried to log in with different user names more than
> 3 times. It was also recorded in blacklistd as 0/1.
>
> In my opinion both of the above should be banned by blacklistd.
> 
> 
>       > I am trying to find out the reason from log but I dont know how to
>       > see blacklistd log. man page said that is to syslogd but what the
>       > facility it is? or some other ways to get out log?
> 
>     Not sure of the facility but when using the -v switch, as you have been,
>     logging goes to stderr instead of syslog.  Without -v you should see it
>     logging to /var/log/messages.  If not, try adding to /etc/syslog.conf:
> 
>     !blacklistd
>     *.*             /var/log/myblacklistd.log
> 
>     then '# touch /var/log/myblacklistd.log && service syslogd restart'
> 
> 
> Unfortunately I started the logging later than Nov 11 03:50:57, so I
> don't have the log for the "0/1" records yet.
> 
> 
>     Ok, problems with blacklistd-helper; the first bit verbatim, tabs lost:
> 
>     #!/bin/sh
>     #echo "run $@" 1>&2
>     #set -x
>     # $1 command
>     # $2 rulename
>     # $3 protocol
>     # $4 address
>     # $5 mask
>     # $6 port
>     # $7 id
> 
>     pf=
>     if [ -f "/etc/ipfw-blacklist.rc" ]; then
>              pf="ipfw"
>              . /etc/ipfw-blacklist.rc
>              ipfw_offset=${ipfw_offset:-2000}
>     fi
> 
>     if [ -z "$pf" ]; then
>              for f in npf pf ipf; do
>                      if [ -f "/etc/$f.conf" ]; then
>                              pf="$f"
>                              break
>                      fi
>              done
>     fi
> 
>     if [ -z "$pf" ]; then
>              echo "$0: Unsupported packet filter" 1>&2
>              exit 1
>     fi
> 
>     Earlier you said you'd run it without /etc/ipfw-blacklist.rc existing.
>     In that case - UNLESS you had either /etc/pf.conf or /etc/ipf.conf lying
>     around from before? it should have failed with 'exit 1' .. though it's
>     not clear from browsing the code that even that would cause it to quit.
> 
> 
> No, there is no /etc/pf.conf or /etc/ipf.conf.
> 
> 
>     So once /etc/ipfw-blacklist.rc exists, that's a flag indicating you
>     intend using ipfw, however there's NO check that ipfw is running ..
> 
>     Then - ignoring the pf) and ipf) sections - though I suspect they'd have
>     the same issue unless really running - here's the ipfw add bit, no tabs:
> 
>     add)
>              case "$pf" in
>     [..]
>              ipfw)
>                      # use $ipfw_offset+$port for rule number
>                      rule=$(($ipfw_offset + $6))
>                      tname="port$6"
>                      /sbin/ipfw table $tname create type addr 2>/dev/null
> 
>     Unless ipfw is running, enabled, that will fail - silently.
> 
>                      /sbin/ipfw -q table $tname add "$addr/$mask"
> 
>     Ditto, perhaps with a message to stderr - that's simply ignored.
> 
>                      # if rule number $rule does not already exist, create it
>                      /sbin/ipfw show $rule >/dev/null 2>&1 || \
>                              /sbin/ipfw add $rule drop $3 from \
>                              table"("$tname")" to any dst-port $6 >/dev/null && \
>                              echo OK
>                      ;;
> 
>     When both of these ipfw commands also fail, it'll only fail to echo OK.
> 
>     Not that failing to echo OK seems to matter to the calling code, but
>     the OK is kept as 'id' which is passed to the rem)ove code, but is
>     unused except by the npf firewall .. 'netbsd packet filter' I guess.
> 
>     I can certainly suggest patches for at least the ipfw sections - and
>     really, if the introductory code checks ipfw is working that should be
>     enough - but I'm unsure whether 'exit 1' after an error message is all
>     that's needed to get blacklistd to whinge loudly and refuse to continue?
> 
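As a rough idea of what "fail noisily" could look like in the
helper, here is a small wrapper that runs a firewall command and
reports a failure instead of discarding it.  This is only a sketch:
run_fw is a name invented for this example and is not part of
blacklistd-helper, and whether blacklistd acts on a non-zero exit
status from the helper is exactly the open question raised above.

```shell
#!/bin/sh
# Sketch only; run_fw is a hypothetical wrapper, not part of
# blacklistd-helper.  It runs the given command with its output
# discarded, but on failure it prints a message on stderr and
# returns non-zero instead of silently ignoring the error.
run_fw() {
    if ! "$@" >/dev/null 2>&1; then
        echo "blacklistd-helper: command failed: $*" 1>&2
        return 1
    fi
    return 0
}
```

In the add) section this might wrap the table add, along the lines
of: run_fw /sbin/ipfw -q table $tname add "$addr/$mask" || exit 1.
The table create would presumably stay unchecked, since it appears
to be expected to fail once the table already exists.
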
>     This should be turned into a PR via bugzilla, but since I'm not running
>     11.x here, I can only really contribute if you do so and add me as a cc.
> 
> 
> Sorry, I don't know how to describe the problem in bugzilla, since I
> don't really understand what you said.
> I have to learn more about the script :)
> 
> 
>     Please try to avoid top-posting on replies, thanks. 
> 
> 
> Sure, I will.
> 
> 
>     cheers, Ian
> 
> 
> 
> 
> -- 
> with kind regards