Any tool known to demangle special chars in MS tree path names ?

Kevin Oberman rkoberman at gmail.com
Mon Dec 8 17:39:25 UTC 2014


On Mon, Dec 8, 2014 at 8:18 AM, Julian H. Stacey <jhs at berklix.com> wrote:

> Hi ports@
> Is there a utility in ports/ to automatically clean disgusting path
> names in big trees one acquires from Microsoft users ?
>
> Trees with in both directories & filenames, masses of meta characters
> such as ' ` . * | \ & space (& accents & high parity bit national
> extensions eg german umlauts) etc.
>
> It's known:
>         tr exists,
>         find has -X
>         xargs has -0
>         One can delimit path names on command line,
>         One could reinvent the wheel, writing & improving a scary
>         shell script (that would probably break while debugging &
>         trash some data)
>
> But I'm looking for something better, that probably already exists, that
> will permanently clean trees, to no longer need to delimit various
> utilities against nasty names, each time tree is accessed.
>
> Either:
>         An existing tool (preferably C) one can run automatically
>         to forcibly rename dirs & files in a Unix friendly manner ?
>
> Or if none exists I'll write a C program to run from find.
>         If so, I'll probably just map nasty chars inc. any high bit
>         parity (accents umlauts & other noise) to eg "0xAB" expansion.
>         ( I dont care about national accents & char sets. )
>
> PS Assume files are big, copying not viable, rename/mv via link & unlink
> best.
>
> Any tools (URLs) known ?  Or should I write my own ?
>
> Cheers,
> Julian
> --
> Julian Stacey, BSD Linux Unix'78 C Sys Eng Consultant Munich
> http://berklix.com
>  Indent previous with "> ".  Interleave reply paragraphs like a play
> script.
>  Send plain text, not quoted-printable, HTML, base64, or
> multipart/alternative.
>
>
Around a decade or so ago I was looking for a tool to clean up all of the
Microsoft "special" characters in web pages. I found "demoroniser", a
public domain tool written by John Walker. As is, it does not meet your
needs, but it goes a long way in the right direction. It expects to work with
files, not directory trees, but modifying it to do so would be quite trivial,
mostly wrapping it in a recursive loop that uses opendir, readdir, and
closedir to walk the tree and feed it the directory names. (No, I am not
volunteering.)
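
For illustration only, here is a rough sketch of the kind of tree walk and
rename described above. It is not demoroniser and not a tested tool; it is
in Python rather than C or Perl purely for brevity. The names clean_name,
clean_tree and SAFE are made up for this example, and the choice of "safe"
characters is an assumption you would want to adjust. It maps every other
byte to the "0xAB" style expansion suggested in the original question and
renames in place, so no file data is copied.

```python
import os

# Assumed set of "Unix friendly" bytes; everything else gets expanded.
SAFE = set(b"abcdefghijklmnopqrstuvwxyz"
           b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
           b"0123456789._-+")


def clean_name(name: bytes) -> bytes:
    """Replace each unsafe byte with a '0xAB' style hex expansion."""
    out = bytearray()
    for i, b in enumerate(name):
        # A leading '-' is expanded too, so the name never looks like an option.
        if b in SAFE and not (i == 0 and b == ord("-")):
            out.append(b)
        else:
            out += b"0x%02X" % b
    return bytes(out)


def clean_tree(root: bytes, dry_run: bool = True):
    """Walk root bottom-up and rename entries; return (old, new) pairs.

    Bottom-up order means a directory is renamed only after everything
    inside it has been handled, so the paths being walked stay valid.
    """
    renames = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            new = clean_name(name)
            if new == name:
                continue
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new)
            # The mapping is not one-to-one, so refuse to clobber anything.
            if os.path.lexists(new_path):
                raise FileExistsError(new_path)
            if not dry_run:
                os.rename(old_path, new_path)  # same directory, no copy
            renames.append((old_path, new_path))
    return renames
```

Calling clean_tree(b"/some/tree") with the default dry_run=True only
reports what would be renamed, which seems the safer way to start on data
you care about. Paths are handled as bytes so that names in an unknown or
broken encoding pass through without decoding errors.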

Most notably, it is written in Perl, not C. Perl is now very out of fashion.

In any case, it is available at:
http://www.fourmilab.ch/webtools/demoroniser/
--
Kevin Oberman, Network Engineer, Retired
