freebsd-update's install_verify routine excessive stating
Oliver Fromme
olli at lurza.secnetix.de
Fri Jan 23 12:06:29 PST 2009
Doug Barton wrote:
> Oliver Fromme wrote:
> > Yoshihiro Ota wrote:
> > > Oliver Fromme wrote:
> > > > It would be much better to generate two lists:
> > > > - The list of hashes, as already done ("filelist")
> > > > - A list of gzipped files present, stripped to the hash:
> > > >
> > > > (cd files; echo *.gz) |
> > > > tr ' ' '\n' |
> > > > sed 's/\.gz$//' > filespresent
> > > >
> > > > Note we use "echo" instead of "ls", in order to avoid the
> > > > kern.argmax limit. 64000 files would certainly exceed that
> > > > limit. Also note that the output is already sorted because
> > > > the shell sorts wildcard expansions.
> > > >
> > > > Now that we have those two files, we can use comm(1) to
> > > > find out whether there are any hashes in filelist that are
> > > > not in filespresent:
> > > >
> > > > if [ -n "$(comm -23 filelist filespresent)" ]; then
> > > > echo -n "Update files missing -- "
> > > > ...
> > > > fi
> > > >
> > > > That solution scales much better because no shell loop is
> > > > required at all.
> > >
> > > This will probably be the fastest.
> >
> > Are you sure? I'm not.
>
> I'd put money on this being faster for a lot of reasons.
I assume that by "this" you mean my solution to the slow
shell loop problem (not quoted above), not Yoshihiro Ota's
awk proposal?
> test is a
> builtin in our /bin/sh, so there is no exec involved for 'test -f',
> but going out to disk for 64k files on an individual basis should
> definitely be slower than getting the file list in one shot.
Correct.
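
To make the "one shot" approach concrete, here is a minimal, self-contained
sketch of the comm(1) comparison. The temporary directory, the hash names
(aaa, bbb, ccc) and the LC_ALL=C setting are assumptions made for this demo
only, they are not taken from freebsd-update itself. LC_ALL=C is there
because comm(1) expects both inputs to be sorted in the same collation
order that the shell used for the wildcard expansion.

```shell
#!/bin/sh
# Demo only: directory layout and hash names are made up.
LC_ALL=C; export LC_ALL          # same sort order for the glob and comm(1)

workdir=$(mktemp -d) || exit 1
mkdir "$workdir/files"

# Three hashes are expected, but only two .gz files are present.
printf '%s\n' aaa bbb ccc > "$workdir/filelist"
: > "$workdir/files/aaa.gz"
: > "$workdir/files/ccc.gz"

# Build the list of hashes present. echo is a shell builtin, so no
# exec happens and the kern.argmax limit does not apply.
(cd "$workdir/files" && echo *.gz) |
    tr ' ' '\n' |
    sed 's/\.gz$//' > "$workdir/filespresent"

# comm -23 prints the lines that appear only in the first file,
# i.e. the hashes that have no matching .gz file.
missing=$(comm -23 "$workdir/filelist" "$workdir/filespresent")
if [ -n "$missing" ]; then
    echo "Update files missing: $missing"
fi

rm -rf "$workdir"
```

One caveat of this sketch: if the directory contains no .gz files at all,
the pattern stays unexpanded and filespresent contains a literal "*".
Every hash is still reported as missing, so the check errs on the safe
side.
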
> There's no doubt that the current routine is not efficient. The cat
> should be eliminated, the following is equivalent:
>
> cut -f 2,7 -d '|' $@ |
>
> (quoting the $@ won't make a difference here).
Right, technically it doesn't make a difference because the
file names are fixed and don't contain spaces. *But* then
it is better to use $*. Every time I see $@ without double
quotes I wonder if the author forgot to add them. It always
smells like a bug. Using $@ without quotes is pointless
because then it behaves exactly the same as $*.
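
A small sketch to illustrate the difference. count_args and demo are
hypothetical helpers invented for this example, and the default IFS is
assumed:

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { echo $#; }

demo() {
    unquoted_at=$(count_args $@)      # split on whitespace
    unquoted_star=$(count_args $*)    # split on whitespace, same as above
    quoted_at=$(count_args "$@")      # one word per original argument
    quoted_star=$(count_args "$*")    # everything joined into one word
}

# Two arguments, the first of which contains a space.
demo "a b" c
echo "$unquoted_at $unquoted_star $quoted_at $quoted_star"
```

With the two arguments shown, this prints "3 3 2 1": the unquoted forms
are indistinguishable, and only "$@" preserves the original arguments.
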
> I haven't seen the files we're talking about, but I can't help
> thinking that cut | grep | cut could be streamlined.
Yes, it can. I already explained pretty much all of that
(useless cat etc.) in my first post in this thread. Did
you read it? My suggestion (after a small correction by
Christoph Mallon) was to replace the cat|cut|grep|cut
sequence with this single awk command:

   awk -F "|" '$2 ~ /^f/ {print $7}' "$@"

For those not fluent in awk, it means this:

 - Treat "|" as field separator.
 - Search for lines where the second field matches ^f
   (i.e. it starts with an "f").
 - Print the 7th field of those matching lines.
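
Here is a quick way to try the command on some input. The sample records
below are made up for illustration and are not the real index format.
Only the positions of the second and seventh fields matter:

```shell
#!/bin/sh
# Made-up sample records; only fields 2 (type) and 7 (hash) matter here.
sample='/bin/sh|f|0|0|0555|0|hash1|
/bin|d|0|0|0755|0||
/bin/ls|f|0|0|0555|0|hash2|'

# Keep only records whose second field starts with "f" and
# print their seventh field.
hashes=$(printf '%s\n' "$sample" | awk -F "|" '$2 ~ /^f/ {print $7}')
echo "$hashes"
```

The directory record (type "d") is skipped, so only hash1 and hash2 are
printed, one per line.
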
Best regards
Oliver