After an unusually productive evening's hacking, I've managed to add another silly feature to my blog: I can now have code samples pretty-printed in tasteful colours.
I've put together a Perl module called Syntax::Vim, which runs Vim to get it to load a piece of code and syntax colour it, and then runs a script based on Bram Moolenaar's 2html.vim to turn the colours into markup (in an ad-hoc syntax). My Perl then parses the output and turns it into a data structure, which it can give back to a program as-is or turn into HTML.
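To give a feel for that last step, here is a minimal sketch of turning a parsed data structure into HTML. The token list layout (pairs of highlight class and text) and the names `escape_html` and `tokens_to_html` are made up for illustration and are not the actual internals of Syntax::Vim; only the `syn` class prefix is taken from the real output.

```perl
use strict;
use warnings;

# Hypothetical token list: each entry is [ highlight_class, text ].
# An undef class means plain, unhighlighted text.
my @tokens = (
    [ 'Comment',   '# say hello' ],
    [ undef,       "\n" ],
    [ 'Statement', 'print' ],
    [ undef,       ' ' ],
    [ 'Constant',  '"<hi>\n"' ],
    [ undef,       ";\n" ],
);

# Escape the characters that are special in HTML text and attributes.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Wrap each highlighted token in a <span>; pass plain text through escaped.
sub tokens_to_html {
    my (@tokens) = @_;
    my $html = '';
    for my $token (@tokens) {
        my ($class, $text) = @$token;
        my $escaped = escape_html($text);
        $html .= defined $class
            ? qq{<span class="syn$class">$escaped</span>}
            : $escaped;
    }
    return $html;
}

print tokens_to_html(@tokens);
```

Escaping each token's text exactly once, at the point where it becomes HTML, is the design choice that matters here: it keeps the data structure itself free of markup, so other output formats can reuse it.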
The point of all this is to make it really easy to turn a source file into useful markup:
    #!/usr/bin/perl -w
    # Make some HTML from my own source.
    use strict;
    use Syntax::Vim;

    my $syntax = Syntax::Vim->new(
        file           => $0,
        filetype       => 'perl',   # Usually can be omitted
        html_full_page => 1,        # Not just a fragment of HTML
    );
    print $syntax->html;
That prints HTML in which each highlighted piece is wrapped in a <span> element:
    <span class="synPreProc">#!/usr/bin/perl -w</span>
    <span class="synComment"># Make some HTML from my own source.</span>
    <span class="synStatement">use strict</span>;
    ...
I've put together a stylesheet based on the default light-background Vim colour scheme, which is what I've used on this site.
There's one problem with using this on my blog: it breaks the display of source files in the context lines in search results (they get HTML-escaped twice). I'm not going to bother fixing this yet, because the code is getting unmanageable. The main reason is that the blog messages are parsed in an event-based way (like a SAX parser, except that in mine the function that wants events pulls them, rather than the parser pushing them through the stream). I'm probably going to switch to a tree-based system, which should reduce the amount of time I spend worrying about which element I'm in.
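For anyone who hasn't met the pull style, here is a rough sketch of the idea. It is not my blog's parser: `make_event_source` and the event hashes are invented for the example, and the "parser" only understands bare tags with no attributes.

```perl
use strict;
use warnings;

# Hypothetical pull-style event source: returns a closure which hands back
# one event per call, and undef once the input is exhausted.
sub make_event_source {
    my ($text) = @_;
    # Split into tags and the text between them, keeping both.
    my @pieces = grep { length } split /(<[^>]*>)/, $text;
    return sub {
        my $piece = shift @pieces;
        return unless defined $piece;
        return { type => 'end',   name => $1 } if $piece =~ m{^</(\w+)>$};
        return { type => 'start', name => $1 } if $piece =~ m{^<(\w+)>$};
        return { type => 'text',  text => $piece };
    };
}

# The consumer asks for events when it wants them, and has to track for
# itself which elements are currently open.
my $next_event = make_event_source('<p>See <code>$0</code> for details</p>');
my @open;
while (defined(my $event = $next_event->())) {
    if    ($event->{type} eq 'start') { push @open, $event->{name} }
    elsif ($event->{type} eq 'end')   { pop @open }
    else  { print join('/', @open), ": $event->{text}\n" }
}
```

The `@open` stack is exactly the bookkeeping that gets tiresome as the code grows. With a tree, each text node would already know its parent.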
I've got more plans for Syntax::Vim. I want to add an XML output format that uses different elements for the different parts of the input file. That should make it fairly simple to write some XSLT and XSL-FO to make nice colourful PDFs from source code.
If I can find the time to tidy this stuff up, I'll stick it on CPAN.
It's a bit of a shame I've had to implement this syntax colouring by
shelling out to Vim. It's dog slow, relies on you having an appropriate
version of Vim available (I don't know what version is required)
and makes it tricky to detect errors and things appropriately. There's
all the nastiness of forking and execing and pointing STDOUT, etc.,
at /dev/null. It would be nice to have a pure-Perl syntax
markup system, with a big set of syntax description files like Vim's,
and an easy way of adding new ones. Sounds like a fun project, but
I'm busy writing training materials till the end of time, so probably
won't be getting round to that one for a while.
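
The fork-and-exec dance looks roughly like the sketch below. This is an illustration rather than the code in Syntax::Vim: `run_quietly` is a made-up name, the running perl binary stands in for Vim so the example works anywhere with a /dev/null, and using exit status 127 to mean "could not start" is a convention borrowed from the shell.

```perl
use strict;
use warnings;

# Run a command with its output thrown away, and return its exit status.
# Dies if the command cannot be started or is killed by a signal.
sub run_quietly {
    my (@command) = @_;

    my $pid = fork;
    die "fork failed: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Child: point STDOUT and STDERR at /dev/null, then become the command.
        open STDOUT, '>', '/dev/null' or exit 127;
        open STDERR, '>', '/dev/null' or exit 127;
        exec { $command[0] } @command or exit 127;
    }

    # Parent: wait for the child and decode the status word in $?.
    waitpid $pid, 0;
    die "$command[0] died from signal " . ($? & 127) . "\n" if $? & 127;
    my $status = $? >> 8;
    die "could not run $command[0]\n" if $status == 127;
    return $status;
}

# $^X is the running perl binary, used here as a stand-in for Vim.
my $status = run_quietly($^X, '-e', 'print "discarded\n"; exit 3');
print "exit status: $status\n";
```

Even this much only tells you that something went wrong, not what. Once STDERR has gone to /dev/null the reason has gone with it, which is a large part of why error detection is awkward.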