XML indenting with sed(1)

Some years ago I stumbled upon SedSokoban, a Sokoban game implemented as a sed script. I found it pretty amusing, and it got me interested in the more arcane uses of sed. As an exercise, I set out to write an XML indenter as a sed script.

I recently found that script again, and I thought it would make a nice first post here.

xmlindent.sed looks like this:

/>/!N;s/\n/ /;ta
s/	/ /g;s/^ *//;s/  */ /g
H;x;s/\n//;s/>.*$/>/;s/^	//;p;bb
H;x;s/\n//;s/>.*$/>/;p;s/^/	/;bb
H;x;s/\n//;s/ *<.*$//;p;s/[^	].*$//;x;s/^[^<]*//;ba
s/[^	].*$//;x;s/^<[^>]*>//;ba

Unfortunately it chokes on some XML inputs, but I have been able to use it to pretty-print most of the common XML files I came across (configuration files, XML-based network protocol messages, etc.).
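
To give a feel for how the script starts, here is a minimal sketch of just its first two lines, wrapped in a shell function. It joins input lines until the pattern space contains a `>`, then collapses tabs and repeated spaces. The `:a` label and the function name `join_tag` are assumptions: the listing branches to `a` but does not show where that label is defined, so this is only a guess at the loop, not the full script.

```shell
#!/bin/sh
# Sketch only: the first two lines of xmlindent.sed in isolation.
# The placement of the ":a" label and the name join_tag are assumptions.

tab=$(printf '\t')   # a literal tab, so we do not rely on \t support in sed

join_tag() {
  # 1st script line: while there is no '>' yet, append the next input line
  #                  and turn the embedded newline into a space.
  # 2nd script line: tabs become spaces, leading spaces are dropped,
  #                  runs of spaces are squeezed to one.
  sed -e ':a' \
      -e '/>/!N;s/\n/ /;ta' \
      -e "s/$tab/ /g;s/^ *//;s/  */ /g"
}

# A tag split over two lines, with a tab and repeated spaces:
printf '<a\n\thref="x"   id="y">\n' | join_tag
# prints: <a href="x" id="y">
```

The rest of the script works on this normalized line, keeping the current indentation (a run of tabs) in the hold space.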


Amber said...

my weapon of choice here:

`xmllint --format`

mitchnull said...

xmllint is of course much better than my simple sed script, but 1) sed is available on more systems and 2) this was just a fun exercise anyway ;)