GNU sort is a ubiquitous utility, used on GNU/Linux, FreeBSD, and Mac OS X. Although very mature and flexible, it is quite tricky to use, owing to backwards-compatibility constraints and its many options. Consequently it is probably the utility most often asked about on the main coreutils mailing list. The caveats are well documented, but the documentation is necessarily long and complicated. So to help users more directly, we've added the --debug option, which prints helpful warnings and annotates the input.

There are 3 types of output from --debug: info, warnings and key annotations.


Info

The only info currently reported is the locale that is being used to sort, which is a common cause of confusion for users.
$ sort --debug /dev/null
sort: using `en_US.UTF-8' sorting rules

$ LC_ALL=C sort --debug /dev/null
sort: using simple byte comparison

$ LC_ALL=en_US.missing sort --debug /dev/null
sort: using simple byte comparison
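
As a small illustration of why the locale matters, the sketch below sorts mixed-case input under simple byte comparison. The input data and the variable name c_sorted are made up for this example; other locales will typically order the same lines differently.

```shell
# With LC_ALL=C, sort compares raw bytes, so all uppercase ASCII letters
# (0x41-0x5A) sort before all lowercase ones (0x61-0x7A).
c_sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

If a script depends on a particular ordering, setting LC_ALL explicitly as above makes the result independent of the caller's environment.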


Warnings

Here is a contrived example that shows all of the warnings currently reported.
$ sort --debug -rb -k1n +2.2 -2b /dev/null
sort: using `en_US.UTF-8' sorting rules
sort: key 1 is numeric and spans multiple fields
sort: obsolescent key `+2 -2' used; consider `-k 3,2' instead
sort: key 2 has zero width and will be ignored
sort: leading blanks are significant in key 2; consider also specifying `b'
sort: option `-b' is ignored
sort: option `-r' only applies to last-resort comparison

Taking a more realistic example in isolation:
$ sort --debug -s -r -k1,1n /dev/null
sort: using `en_US.UTF-8' sorting rules
sort: option `-r' is ignored
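
The effect of that warning can be seen on real data. This is a minimal sketch with made-up input and hypothetical variable names: because key 1 carries its own ordering option (n), it does not inherit the global -r, and -s removes the last-resort comparison that -r would otherwise apply to.

```shell
# Global -r is ignored here: the key has its own ordering option (n)
# and -s disables the last-resort comparison.
ignored=$(printf '1\n3\n2\n' | LC_ALL=C sort -s -r -k1,1n)
# Attaching r to the key itself reverses the numeric comparison.
reversed=$(printf '1\n3\n2\n' | LC_ALL=C sort -s -k1,1nr)
printf 'global -r:\n%s\nkey-level r:\n%s\n' "$ignored" "$reversed"
```

The first command yields ascending order despite -r; the second yields descending order.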

[Update Oct 2021:
New warnings were added for the handling of thousands grouping characters, decimal points, and sign characters. For example:
$ printf '0,9\n1,a\n' | sort -nk1 --debug -t, -s
sort: key 1 is numeric and spans multiple fields
sort: field separator ‘,’ is treated as a group separator in numbers
For more examples and details see the commit. ]

Key annotations

Key annotations are generally useful to confirm the extents of the keys being matched, especially when one needs to define character offsets.

In this example we see that there can be 2 comparisons per line. Because we didn't specify -s, the last-resort comparison is also performed, and here it changes the order of lines whose numeric keys compare equal.

$ printf "1.1 four\n1.1 five\n" | sort -n --debug 2>/dev/null
1.1 five
___
________
1.1 four
___
________
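
To make the role of the last-resort comparison concrete, here is a short sketch that runs the same input with and without -s. The variable names are hypothetical, and LC_ALL=C is set only to keep the result independent of the environment.

```shell
# With -s the numeric keys tie (1.1 == 1.1) and input order is preserved.
stable=$(printf '1.1 four\n1.1 five\n' | LC_ALL=C sort -n -s)
# Without -s the tie is broken by comparing the whole lines,
# so "1.1 five" moves ahead of "1.1 four".
default=$(printf '1.1 four\n1.1 five\n' | LC_ALL=C sort -n)
printf 'with -s:\n%s\nwithout -s:\n%s\n' "$stable" "$default"
```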

Here we can see how TAB characters are distinguished with '>', and the complicated number matching of the '-g' option.
$ printf "0x3e4\n1.1\n  +2" | cat -n | sort -gs -k2,2 --debug 2>/dev/null
     2>1.1
       ___
     3>  +2
         __
     1>0x3e4
       _____

Here we see how leading blanks are significant in the comparison fields, which in this case can be used to efficiently sort right-aligned numbers. Note that LANG=C matters here: it avoids blanks being ignored in the comparison, as happens in some locales.
$ printf '...%6s\n' 9 10 | LANG=C sort -s -k2,2 --debug 2>/dev/null
...     9
   ______
...    10
   ______
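
The following sketch contrasts padded and unpadded fields under byte comparison, to show why the right alignment is what makes this work. The unpadded input and the variable names are assumptions made for illustration.

```shell
# Unpadded: field 2 is " 9" vs " 10"; after the shared blank, '1' < '9'
# in byte order, so the line containing 10 sorts first.
unpadded=$(printf '... %s\n' 9 10 | LC_ALL=C sort -s -k2,2)
# Right-aligned to 6 columns: "     9" vs "    10"; the blank in the
# fifth position sorts before '1', so 9 correctly comes first.
padded=$(printf '...%6s\n' 9 10 | LC_ALL=C sort -s -k2,2)
printf 'unpadded:\n%s\npadded:\n%s\n' "$unpadded" "$padded"
```

This relies on every value fitting within the padded width; a plain numeric key such as -k2,2n avoids that constraint at the cost of numeric parsing.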
© May 17 2010