GNU sort is a ubiquitous utility, used in GNU/Linux, FreeBSD, and Mac OS X. While very mature and flexible, it is quite tricky to use, due to backwards compatibility constraints and its many options. Consequently it's probably the utility most often asked about on the main coreutils mailing list. Although the caveats are well documented, the documentation is necessarily long and complicated. So to help users more directly we've added the --debug option, which gives helpful warnings and annotates the input.
There are 3 types of output from --debug: info, warnings and key annotations.
Info
The only info currently reported is the locale that is being used to sort, which is a common cause of confusion for users.
$ sort --debug /dev/null
sort: using `en_US.UTF-8' sorting rules
$ LC_ALL=C sort --debug /dev/null
sort: using simple byte comparison
$ LC_ALL=en_US.missing sort --debug /dev/null
sort: using simple byte comparison
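To illustrate why the reported locale matters, here is a minimal sketch that sorts a few made-up lines with simple byte comparison. The sample data and the variable name `sorted_c` are assumptions for illustration only. `LC_ALL=C` is used because it overrides the other locale variables.

```shell
# Sort made-up sample data using simple byte comparison (C locale).
# In byte order, all uppercase ASCII letters sort before lowercase ones.
sorted_c=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort)
printf '%s\n' "$sorted_c"   # prints A, B, a, b on separate lines

# The same data under whatever locale the environment selects.
# The order may differ from the above, depending on the installed locales.
printf 'b\nA\na\nB\n' | sort
```

If the two outputs differ on your system, the locale is the reason, and the first line of --debug output says which rules were applied.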
Warnings
Here is a contrived example that shows all of the warnings currently reported.
$ sort --debug -rb -k1n +2.2 -2b /dev/null
sort: using `en_US.UTF-8' sorting rules
sort: key 1 is numeric and spans multiple fields
sort: obsolescent key `+2 -2' used; consider `-k 3,2' instead
sort: key 2 has zero width and will be ignored
sort: leading blanks are significant in key 2; consider also specifying `b'
sort: option `-b' is ignored
sort: option `-r' only applies to last-resort comparison

Taking a more realistic example in isolation:
$ sort --debug -s -r -k1,1n /dev/null
sort: using `en_US.UTF-8' sorting rules
sort: option `-r' is ignored
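The effect behind that last warning can be sketched with GNU sort as follows. The input numbers and the variable names `ignored` and `reversed` are made up for illustration. A key that carries its own ordering option (here `n`) does not inherit the global `-r`, and `-s` disables the last-resort comparison, so `-r` has nothing left to act on.

```shell
# Global -r is not inherited by the key, since the key has its own
# ordering option (n), and -s removes the last-resort comparison.
# The result is a plain ascending numeric sort.
ignored=$(printf '1\n10\n2\n' | LC_ALL=C sort -s -r -k1,1n)
printf '%s\n' "$ignored"    # prints 1, 2, 10 on separate lines

# Attaching r to the key itself reverses the numeric comparison.
reversed=$(printf '1\n10\n2\n' | LC_ALL=C sort -s -k1,1nr)
printf '%s\n' "$reversed"   # prints 10, 2, 1 on separate lines
```

Putting ordering options on the key rather than relying on global ones is one way to make the intent explicit.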
Key annotations
Key annotations are generally useful to confirm the extents of the keys being matched, especially when one needs to define character offsets.
In this example we see that there can be 2 comparisons per line. The last-resort comparison, performed because we didn't specify -s, reorders the lines whose numeric keys are equal, which messes up the sort in this case:
$ printf "1.1 four\n1.1 five\n" | sort -n --debug 2>/dev/null
1.1 five
___
________
1.1 four
___
________

Here we can see how TAB characters are distinguished with '>', and the complicated number matching of the '-g' option.
$ printf "0x3e4\n1.1\n +2" | cat -n | sort -gs -k2,2 --debug 2>/dev/null
     2>1.1
       ___
     3> +2
        __
     1>0x3e4
       _____

Here we see how leading blanks are significant in the comparison fields, which in this case can be used to efficiently sort right aligned numbers. Note the significance of LANG=C here to avoid issues with blanks being ignored in the comparison in some locales.
$ printf '...%6s\n' 9 10 | LANG=C sort -s -k2,2 --debug 2>/dev/null
...     9
   ______
...    10
   ______
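As a rough check of the point about leading blanks, the following sketch feeds the same two right-aligned values in the opposite order and compares the result with and without the `b` key option. The variable names `aligned` and `skipped` are made up for illustration, and `LC_ALL=C` is used in place of `LANG=C` so that an `LC_ALL` already set in the environment cannot interfere.

```shell
# Leading blanks are part of field 2, so the padded values compare
# correctly as plain bytes: a space sorts before any digit.
aligned=$(printf '...%6s\n' 10 9 | LC_ALL=C sort -s -k2,2)
printf '%s\n' "$aligned"    # the line ending in 9 comes first

# With b the leading blanks are skipped, so "10" and "9" are compared
# as strings and "10" sorts first.
skipped=$(printf '...%6s\n' 10 9 | LC_ALL=C sort -s -k2b,2)
printf '%s\n' "$skipped"    # the line ending in 10 comes first
```

This only works as a numeric ordering because every value is padded to the same width.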