After a long Bank Holiday Monday of hacking, I've put together a new search facility for my blog. The indexing of documents is fairly simplistic (a complete mapping of words to documents is stored in a database), but it does have one cool feature: search results are shown with a few lines of context from the message, with the words you searched for highlighted. Just like Google :-)
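To give a rough idea of what that involves, here is a minimal sketch of a word-to-documents mapping plus highlighting. It is not the actual code: the `%docs` data, the `search` and `highlight` helpers, and the use of an in-memory hash instead of a database are all assumptions made for illustration, and it skips trimming the result down to a few lines of context.

```perl
use strict;
use warnings;

# Made-up documents standing in for blog entries.
my %docs = (
    1 => 'Perl makes text munging easy',
    2 => 'Searching text with Perl and a database',
);

# Build the word-to-documents mapping. An in-memory hash is used here
# purely for illustration; the post stores this mapping in a database.
my %index;
for my $id (keys %docs) {
    $index{ lc($_) }{$id} = 1 for split /\W+/, $docs{$id};
}

# Hypothetical helper: ids of the documents that contain a word.
sub search {
    my ($word) = @_;
    return sort { $a <=> $b } keys %{ $index{ lc($word) } || {} };
}

# Hypothetical helper: wrap each occurrence of the word in <b> tags.
sub highlight {
    my ($text, $word) = @_;
    (my $out = $text) =~ s/\b(\Q$word\E)\b/<b>$1<\/b>/gi;
    return $out;
}

print "$_: ", highlight($docs{$_}, 'text'), "\n" for search('text');
```

Lowercasing on both the indexing and the lookup side keeps the search case-insensitive, and `\Q...\E` stops any punctuation in the search term from being treated as regex syntax.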
While writing this stuff I found myself writing a loop like this, which I thought was disgusting enough that it had to be left in the code:
for (0 .. $#$match) { ... }
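For anyone squinting at the punctuation: `$#$match` is the last index of the array that `$match` refers to, so the loop walks over every index of that array. A small sketch with made-up data (the contents of `$match` here are an assumption, not what the search code actually holds):

```perl
use strict;
use warnings;

# Made-up array reference standing in for $match.
my $match = [ 'foo', 'bar', 'baz' ];

# $#$match is shorthand for $#{$match}: the index of the last element.
my @indexes = (0 .. $#$match);

for my $i (@indexes) {
    print "$i: $match->[$i]\n";
}
```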
I expect I'll have to tweak my regexes later. I've already had to add some specialness to make a search for “C++” work. The boundaries between words are currently detected with this regex:
our $DEFAULT_REGEX = qr/(?:[^+\w]|(?<![+\w])\+)+/;
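As a quick illustration of how that regex behaves, the sketch below uses it as a `split` separator. The `split_words` helper and the sample sentence are made up for the example; only the regex itself comes from the code above.

```perl
use strict;
use warnings;

# The word-boundary regex quoted above.
our $DEFAULT_REGEX = qr/(?:[^+\w]|(?<![+\w])\+)+/;

# Hypothetical helper: split text into words. The grep drops the empty
# leading field that split returns when the text starts with a separator.
sub split_words {
    my ($text) = @_;
    return grep { length } split $DEFAULT_REGEX, $text;
}

my @words = split_words('I like C++, and 1 + 1 too.');
print join('|', @words), "\n";    # prints I|like|C++|and|1|1|too
```

A `+` only counts as part of a word when it directly follows a word character or another `+`, which is why “C++” survives intact while the free-standing `+` in “1 + 1” is treated as a separator.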