Improving BSD licensed text-processing tools

Jesse Hagewood jesse.hagewood at gmail.com
Mon Jun 25 20:49:29 UTC 2012


Progress this week:

- Diff's context, unified, and normal formats seem to be completely GNU
compatible now. Most of the fixes were timestamp issues; a few involved the
output diff gives when it runs across binary files or directories.
- The bug I found that seemed to involve input files over a few hundred bytes
turned out to not be about size. It actually occurred because BSD diff would
search the input file for any non-ASCII characters, and if it found any at
all in the file, would consider the file a binary file. GNU diff doesn't do
that. This means that any text file with Unicode characters would be
considered a binary file. My fix for this problem is to instead check
the first few bytes of the file to see if it is an ELF format file, and if
not, assume the file is a text file.
- Lots of code clean-up with diff. There were lots of uses of putchar(),
puts(), and other output functions like that in diffreg.c, and I substituted
all of them with printf(). I also fixed a lot of style issues. I'm not really
finished in this respect, though.
- Put together a test script for diff.
- Studied the --ignore-*-* options. I've found that the ones that were
previously implemented don't work correctly. For example, in the output of
--ignore-blank-lines, the line in the diff dealing with the blank
lines is followed by an 'o' character.
- Did a write-up for man/mdoc macros on my wiki. So far I've described
the specific source files involved with implementing macros, and I will add
more information soon.
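
The binary-detection change in the list above can be sketched roughly as
follows. This is only an illustration, not the actual code in diffreg.c: the
function names has_non_ascii() and starts_with_elf_magic() are hypothetical,
and treating a file as binary only when it begins with the ELF magic number
is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The four-byte magic number found at the start of an ELF file. */
const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

/*
 * Old heuristic (hypothetical name): report 1 if any byte in the buffer
 * falls outside 7-bit ASCII. UTF-8 text trips this check.
 */
int
has_non_ascii(const unsigned char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++)
		if (buf[i] > 0x7f)
			return (1);
	return (0);
}

/*
 * New heuristic (hypothetical name): report 1 only if the buffer begins
 * with the ELF magic number. Anything else would be treated as text.
 */
int
starts_with_elf_magic(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len < sizeof(elf_magic))
		return (0);
	return (memcmp(buf, elf_magic, sizeof(elf_magic)) == 0);
}
```

With a buffer holding UTF-8 text, has_non_ascii() returns 1 (the old
false positive), while starts_with_elf_magic() returns 0, so the file would
be diffed as text.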

Here's my to-do list for diff:
https://socsvn.freebsd.org/socsvn/soc2012/jhagewood/diff/TODO

