Some years ago I stumbled upon SedSokoban, a Sokoban game implemented as a sed script. I found it pretty amusing and became interested in the more arcane uses of sed. As an exercise, I set out to write an XML indenter as a sed script.
I recently found that script again and thought it would make a nice first post here.
xmlindent.sed looks like this:
:a
/>/!N;s/\n/ /;ta
s/ / /g;s/^ *//;s/ */ /g
/^<!--/{
:e
/-->/!N;s/\n//;te
s/-->/\n/;D;
}
/^<[?!][^>]*>/{
H;x;s/\n//;s/>.*$/>/;p;bb
}
/^<\/[^>]*>/{
H;x;s/\n//;s/>.*$/>/;s/^ //;p;bb
}
/^<[^>]*\/>/{
H;x;s/\n//;s/>.*$/>/;p;bb
}
/^<[^>]*[^\/]>/{
H;x;s/\n//;s/>.*$/>/;p;s/^/ /;bb
}
/</!ba
{
H;x;s/\n//;s/ *<.*$//;p;s/[^ ].*$//;x;s/^[^<]*//;ba
}
:b
{
s/[^ ].*$//;x;s/^<[^>]*>//;ba
}
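To give a feel for how the script works, here is a small sketch of just its opening loop (`:a`, `/>/!N`, `s/\n/ /`, `ta`) run in isolation. As far as I can tell, this loop keeps appending input lines until the pattern space contains a `>`, so a tag that is split across several lines gets joined into one. The helper name `join_tag_lines` and the sample input are made up for this illustration and are not part of xmlindent.sed.

```shell
# Sketch only: isolates the first loop of the script.
# join_tag_lines is a hypothetical helper name used just for this example.
join_tag_lines() {
  # :a        label to loop back to
  # />/!N     if the pattern space has no '>' yet, append the next input line
  # s/\n/ /   replace the newline added by N with a space
  # ta        if that substitution happened, jump back to :a
  sed -e ':a' -e '/>/!N;s/\n/ /;ta'
}

# A start tag split over two lines, followed by a one-line tag.
printf '<a\nb="1">\n<c/>\n' | join_tag_lines
```

With this input the two halves of the first tag should come out on a single line as `<a b="1">`, while `<c/>` passes through unchanged.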
Unfortunately it chokes on some XML inputs, but I could use it to pretty-print most of the common XML files I came across (configuration files, XML-based network protocol messages, etc.).
2 comments:
my weapon of choice here:
`xmllint --format`
xmllint is of course much better than my simple sed script, but 1) sed is available on more systems and 2) this was just a fun exercise anyway ;)