Syntax highlighting with GeSHi

Since this blog's main subject is development, it's unavoidable that I'll show snippets of code in my posts. And I like my code to be syntax-highlighted, so it's natural that I noticed and downloaded the GeSHi plugin for Habari before I even installed Habari: I considered it that essential a part of my blogging toolbox.

I'm a great fan of GeSHi, which is used in many OSS projects, including Wikka Wiki (where I'm a member of the development crew). The latest stable version of GeSHi, which is 1.0.8.1 as I'm writing this, comes with support for an impressive number of languages out of the box, and it's possible to write (and submit!) your own as well, if you're missing what you need (and know the syntax well, of course).

Setting up the plugin

[Screenshot: GeSHi installed as 3rdparty software on my server]

The plugin is straightforward to install: just put it in your plugins directory and activate it; it doesn't have any configuration. One thing I immediately liked about this simple plugin is that it does not come with GeSHi itself: that way you can make sure you have the latest and greatest, even if the plugin hasn't been updated for a while. It comes with a little documentation file which outlines how to integrate GeSHi with the plugin: it expects to see the main GeSHi class in a subdirectory 'geshi' in (habari)/3rdparty/. That's not quite how I did it, because I don't like duplicating software all over my server. Here's what I did:

First of all, I got the latest version, and unpacked it into a subdirectory of /var/www/ where I keep shared source code; it creates a geshi directory, and I renamed that by adding its version number so I can quickly see what I have (and easily go back to an earlier version if ever needed); then I made a geshi symlink to that — like so:

    [root@vps src]# cd /var/www/src/packages/3rdparty
    [root@vps 3rdparty]# tar xvzf /usr/local/src/GeSHi-1.0.8.1.tar.gz
    [root@vps 3rdparty]# mv geshi geshi-1.0.8.1
    [root@vps 3rdparty]# ln -s geshi-1.0.8.1 geshi

I also made the directory tree readable and executable by all. You can see the result in the screenshot.
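
The command used for the permissions step isn't shown. One common way to do it is chmod's a+rX mode, sketched here in a throwaway sandbox directory rather than the real /var/www tree (the sandbox and the file names in it are assumptions for illustration only):

```shell
# Sandbox standing in for /var/www/src/packages/3rdparty (assumption).
base=$(mktemp -d)
mkdir -p "$base/geshi-1.0.8.1/geshi"
touch "$base/geshi-1.0.8.1/geshi.php"

# Simulate a tree that was unpacked with restrictive permissions.
chmod -R go-rwx "$base/geshi-1.0.8.1"

# a+rX: read permission for everyone; execute (search) permission only on
# directories and on files that are already executable for someone.
chmod -R a+rX "$base/geshi-1.0.8.1"

# Show the resulting modes of the directory and of a plain file.
ls -ld "$base/geshi-1.0.8.1" "$base/geshi-1.0.8.1/geshi.php"
```

The capital X is the reason to prefer this over a blanket a+rx: directories become traversable, but ordinary PHP files don't get marked executable.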

The next step was to "put it in" a geshi directory under (habari)/3rdparty/. But, as you can see in the screenshot above, my Habari code base itself also lives in that shared source directory structure under /var/www/ (more about that in a later post). That doesn't make any difference for this exercise: I just went to the 3rdparty directory in my Habari code base and from there created a symlink to the geshi symlink:

    [root@vps 3rdparty]# cd /var/www/src/packages/3rdparty
    [root@vps 3rdparty]# cd habari/3rdparty
    [root@vps 3rdparty]# pwd
    /var/www/src/packages/3rdparty/habari/3rdparty
    [root@vps 3rdparty]# ln -s /var/www/src/packages/3rdparty/geshi geshi

Now, when I install a new version of GeSHi, all I have to do is change the geshi symlink in /var/www/src/packages/3rdparty to point to the new version. And when I upgrade my version of Habari, all I need to do is create a new symlink from within /var/www/src/packages/3rdparty/habari/3rdparty to that geshi symlink.
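
As a sketch of that upgrade step, again in a sandbox directory instead of the real tree (the sandbox layout and the version number 1.0.8.2 are placeholders, not taken from this post), re-pointing the link can be done in one command:

```shell
# Sandbox standing in for /var/www/src/packages/3rdparty (assumption).
base=$(mktemp -d)
cd "$base" || exit 1

# Existing state: versioned directory, generic symlink, Habari's link to it.
mkdir geshi-1.0.8.1
ln -s geshi-1.0.8.1 geshi
mkdir -p habari/3rdparty
ln -s "$base/geshi" habari/3rdparty/geshi

# A newer release, unpacked and renamed next to the old one (placeholder version).
mkdir geshi-1.0.8.2

# Re-point the generic link: -f replaces the existing link, -n keeps ln
# from following it into the old directory and creating the link there.
ln -sfn geshi-1.0.8.2 geshi

readlink geshi                 # where the generic link points now
ls -l habari/3rdparty/geshi    # Habari's own link is untouched
```

Habari's link points at the generic geshi link, not at a versioned directory, so it picks up the new version without being changed. Going back to the old version is the same command with the old directory name.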

Tweaking the plugin

GeSHi options

As I said, the plugin doesn't have any configuration. It sets up two (sensible) options (enable classes, which provides class attributes as hooks for styling, and enable fancy line numbers). If you want extra (or different) options, just edit the geshipaint() function in the plugin, and add your own options. I haven't done that (yet) as I'm happy with the current options.

Stylesheet

The documentation also points out that GeSHi comes with a little tool to generate a stylesheet for a language. The plugin assumes such a stylesheet, geshi.css, is located in the main geshi directory, and helpfully injects a link to it in the page head section. That's fine, but I work with different languages and normally just use a generic color scheme for all code highlighting, and GeSHi doesn't come with a default stylesheet either. Linking to a non-existent file is not very useful, so in the theme_header() function I changed:

    return "<link rel=\"stylesheet\" type=\"text/css\" media=\"screen\" href=\"" . Site::get_url('habari') . "/3rdparty/geshi/geshi.css\"/>";

into:

    // Test for the file on disk, not via its URL: file_exists() cannot stat
    // an http:// URL. HABARI_PATH is assumed to be Habari's install root.
    $css_file = HABARI_PATH . '/3rdparty/geshi/geshi.css';
    $css_url = Site::get_url( 'habari' ) . '/3rdparty/geshi/geshi.css';
    // include link only if the stylesheet exists
    if ( file_exists( $css_file ) ) {
        return '<link rel="stylesheet" type="text/css" media="screen" href="' . $css_url . '"/>';
    } else {
        return '';
    }

The code generated by the plugin is just what GeSHi produces: <pre> tags (with a class named after the specified language and a style attribute setting a basic monospace font family) surrounding (because line numbers are on) an ordered list of lines of code. That's fine, but it doesn't give us any hook to style all GeSHi code blocks generically, so I added a wrapper around it. In the geshipaint() function I changed:

    $mycode = str_replace($mymatches[0][$i], $geshi->parse_code(),$mycode);

into:

    $mycode = str_replace($mymatches[0][$i], '<div class="geshi">' . $geshi->parse_code() . '</div>' , $mycode);

Finally, I added the following block to my theme's stylesheet:

    /* --- syntax highlighting code - GeSHi --- */
    div.geshi pre {
        margin: 0 1em !important;
        overflow: auto;
        background-color: #ffe;
        border: 1px dotted #ccc;
    }
    div.geshi pre ol li {
        margin-bottom: 0; /* override .entry-content ol li */
        font-family: "Lucida Console", Monaco, monospace;
        /*outline: 1px dotted red;*/ /* DEBUG */
    }
    div.geshi pre ol li.li2 {
        background-color: #ffd; /* gentle zebra stripes */
    }
    div.geshi pre ol li div {
        margin: 0; /* override .entry-content div */
    }
    div.geshi .br0 { color: #6C6; }
    div.geshi .co0 { color: #808080; font-style: italic; } /* comment */
    div.geshi .co1 { color: #808080; font-style: italic; }
    div.geshi .co2 { color: #808080; font-style: italic; }
    div.geshi .coMULTI { color: #808080; font-style: italic; } /* multi-line comment */
    div.geshi .es0 { color: #009; font-weight: bold; }
    div.geshi .kw1 { color: #b1b100; } /* keyword */
    div.geshi .kw2 { color: #000; font-weight: bold; }
    div.geshi .kw3 { color: #006; }
    div.geshi .kw4 { color: #933; }
    div.geshi .kw5 { color: #00F; }
    div.geshi .me0 { color: #060; }
    div.geshi .nu0 { color: #C6C; } /* number */
    div.geshi .re0 { color: #00F; }
    div.geshi .re1 { color: #00F; }
    div.geshi .re2 { color: #00F; }
    div.geshi .re4 { color: #099; }
    div.geshi .sc0 { color: #0BD; }
    div.geshi .sc1 { color: #DB0; }
    div.geshi .sc2 { color: #090; }
    div.geshi .st0 { color: #F00; }
    div.geshi .sy0 { color: #F00; } /* symbol */

Marker tags

Finally, the plugin also assumes you enclose your GeSHi-highlightable code within <geshi> tags. To make that assumption work with Habari, you have to go and edit (habari)/system/classes/inputfilter.php and add geshi to the $whitelist_elements and $whitelist_attributes arrays. While that works, it made me uneasy for two reasons: first, it means that with every update of my Habari code base, that change has to be made again (solvable with a patch, but still an extra step); second, I just don't like to mix real HTML tags with "made up" tags, not even if they get replaced on output. All it took to solve that was another little tweak. In the geshipaint() function I changed:

    $count = preg_match_all("|<geshi(.*?)>(.*?)</geshi>|isx", $mycode,$mymatches);

into:

    $count = preg_match_all("|\[geshi(.*?)\](.*?)\[/geshi\]|isx", $mycode,$mymatches);

Now I can use square brackets like quasi-BBCode tags, and no changes to inputfilter.php are needed any more.

All of which you can see in operation in my earlier post Searching for Subversion and, of course, this one!


4 Responses to Syntax highlighting with GeSHi

  1. 7 BenBE November 28, 2008 5:24pm

    There's an easier way to make GeSHi use a special class for its code. Starting with 1.0.8 the behaviour of

        $GeSHi->set_overall_class();

    has been changed so that GeSHi produces a top-level-tag with two classes. The first of them being the value you set with set_overall_class and the second being the language name. Older versions used the language name by default and replaced this by the overall class if present. More details can be found at http://qbnz.com/highlighter/geshi-doc.html#setting-css-class-id

    The second comment goes to your stylesheet: That one is not fully complete as some languages (like e.g. Java5) have more than 5 Keyword Groups (in case of java5 there are about 150 ;-)). So you might want to complete this stylesheet ;-)

    Regards,

    BenBE.

  2. 8 marjolein November 28, 2008 5:47pm

    Ben, thanks for that useful comment.

    The set_overall_class() would be a good one to add to the settings, or better yet, make configurable for the plugin. (That's another thing I'd like to do now that I'm getting a little more familiar with how Habari works.)

    As to the stylesheet, I know it's not complete! But, 150 extra classes...? They may be needed for Java, but I don't use Java. What I did was start with what I had in my own stylesheet for Wikka Wiki, and add a few more I came across in the code I've used here so far. I'll just keep adding classes as needed, but I don't use all that many languages — it should soon enough become "locally complete". Your local completeness will surely be different from mine, though. :)

    Finally, thanks for demonstrating that the syntax highlighting works in comments, too!

  3. 14 Upgrading fun (well…) ▪ Coadventures December 11, 2008 12:47pm

    ...Suddenly I remembered inputfilter.php which I'd had to edit for the first installation of the GeSHi plugin. There it was: a lot of whitelisted HTML tags — with h1 through h6 conspicuously missing — ever...

  4. 265 Kaira February 25, 2009 1:30pm

    Hmm, very cognitive post.

    Is this theme good unough for the Digg?
