scripting suggestion: how to make this command shorter

Zhang Weiwu zhangweiwu at realss.com
Sat Jun 27 13:13:14 UTC 2009


Hello. I wrote this one-line command to fetch a page from a long URI,
parse it twice (first to get the subject, then to get the content), and
send the result to me as email.

$ w3m -dump 'http://search1.taobao.com/browse/33/n-g,w6y4zzjaxxymvjomxy----------------40--commend-0-all-33.htm?at_topsearch=1&ssid=e-s5' | grep -A 100 对比 | mail -a 'Content-Type: text/plain; charset=UTF-8' -s '=?UTF-8?B?'`w3m -dump 'http://search1.taobao.com/browse/33/n-g,w6y4zzjaxxymvjomxy----------------40--commend-0-all-33.htm?at_topsearch=1&ssid=e-s5' | grep 找到.*件 | base64 -w0`'?=' zhangweiwu at realss.com


The stupid part of this script is that it fetches and parses the page
twice, which makes the command very long. If I could write the command
so that the URI appears only once, it would be easier for me to
maintain. I plan to put it in cron and want to avoid modifying two
places when the URI changes (and it does!).

How do you suggest optimizing the one-liner?
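
One possible direction, as a minimal sketch rather than a definitive
answer: keep the dump in a shell variable so the page is fetched once
and the URI is written once. The names fetch_page and send_report are
made up for illustration. The sketch assumes the same w3m, base64 and
mail (with -a adding a header) as in the command above, and uses
"base64 | tr -d '\n'" as a portable stand-in for "base64 -w0".

```shell
#!/bin/sh
# Sketch only: fetch_page and send_report are illustrative names.

# Dump the rendered page as text, exactly as the original one-liner does.
fetch_page() {
    w3m -dump "$1"
}

# Usage: send_report URI RECIPIENT
send_report() {
    uri=$1
    rcpt=$2

    # Fetch once; both the subject and the body are cut from this copy.
    page=$(fetch_page "$uri") || return 1

    # Subject: the line with the result count, base64-encoded.
    # tr removes any line wrapping that base64 adds.
    subject=$(printf '%s\n' "$page" | grep '找到.*件' | base64 | tr -d '\n')

    # Body: the matching line plus the 100 lines after it, as before.
    printf '%s\n' "$page" | grep -A 100 '对比' |
        mail -a 'Content-Type: text/plain; charset=UTF-8' \
             -s "=?UTF-8?B?${subject}?=" "$rcpt"
}
```

In a cron script this would be followed by a single call such as
send_report "$URI" "$RCPT", with URI and RCPT assigned once at the top,
so a change of URI touches only one line.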

By the way, I find it stupid to have to wrap the subject using:
$ mail -s '=?UTF-8?B?'`echo "$subject" | base64`'?='

instead of
$ mail -s "$subject"

Because mail(1), as an intelligent user agent, should know that the
current locale is UTF-8 and that a UTF-8 header must be encoded (for
example as base64) for RFC compatibility. It should also know that if
the mail body is UTF-8, the header 'Content-Type: text/plain;
charset=UTF-8' must not be omitted. I think this is a bug, as both are
required by the RFCs. What do you think?
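
Until mail(1) does this by itself, the wrapping could be hidden in a
small shell function. This is only a sketch: encode_subject and
mail_utf8 are hypothetical helpers, not part of any mail program, and
the -a option is assumed to add a header as in the command above. It
also puts the whole subject into one encoded-word, while RFC 2047
limits an encoded-word to 75 characters, so long subjects would need
to be split.

```shell
#!/bin/sh
# Sketch only: encode_subject and mail_utf8 are hypothetical helpers.

# Wrap the text on stdin as a single RFC 2047 base64 encoded-word.
# tr removes any line wrapping that base64 adds.
encode_subject() {
    printf '=?UTF-8?B?%s?=' "$(base64 | tr -d '\n')"
}

# Usage: some_command | mail_utf8 SUBJECT RECIPIENT
# Sends stdin as a UTF-8 plain-text body with an encoded subject.
mail_utf8() {
    subject=$(printf '%s' "$1" | encode_subject)
    mail -a 'Content-Type: text/plain; charset=UTF-8' -s "$subject" "$2"
}
```

With this, the call site shrinks to something like
some_command | mail_utf8 "$subject" recipient.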


More information about the freebsd-questions mailing list