June 09, 2003
Working at Atlassian

This poor neglected blog desperately needs content, so I thought I'd describe the last 5 weeks of my life, spent working at Atlassian. 4 of those weeks were spent doing JIRA 2.1 documentation using Forrest, and the last has been coding.

Atlassian is a company of about 7 people, right in the middle of Sydney, run by Mike and Scott. I ended up working here after responding to a blog post of Mike's.

I really hope Atlassian is a model of post-dotcom software companies, because it's very cool working here. Small, focused, clueful people, quality code, best practices. This is the first time I've seen pair programming and test-driven development in real use. As a newcomer to the code, I found pair programming awesome. Rather than the usual process of spending a week floundering in strange code, trying to get my bearings, I always had someone on hand to ask stupid questions. At the end of each day I could feel my brain creaking under the strain of rapidly acquired information. Perhaps when I'm a JIRA code guru I'll find it frustrating working in pairs. If so, I hope I'll remember what it's like being on the other side.

As for test-driven development, I have one thing to say: mock objects. Wonderful things. Pretty much every action in JIRA has an associated test, and mocks make it simple to test only the bit you're interested in. Makes me wonder how Cocoon's flow layer can be unit-tested. Unit tests would make an interesting addition to the Petstore sample.
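To make the mock idea concrete, here is a minimal hand-rolled sketch. Everything in it (`Notifier`, `AddCommentAction`, `MockNotifier`) is a hypothetical name invented for illustration, not JIRA's real code or any mock library's API. The point is only that the test swaps in a recording stand-in for the mail-sending collaborator, so it exercises the action and nothing else.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point: exercises the action against the mock and checks the result.
class AddCommentActionDemo {
    public static void main(String[] args) {
        MockNotifier notifier = new MockNotifier();
        AddCommentAction action = new AddCommentAction(notifier);

        String result = action.execute("reporter@example.org", "Fixed in CVS");

        // The test only cares about the action's behaviour, not real mail delivery.
        if (!"success".equals(result) || notifier.sent.size() != 1) {
            throw new AssertionError("expected exactly one notification");
        }
        System.out.println("ok: " + notifier.sent.get(0));
    }
}

// Hypothetical collaborator the action depends on (an assumption, not a JIRA interface).
interface Notifier {
    void send(String recipient, String message);
}

// Hypothetical action under test: validates a comment and notifies the reporter.
class AddCommentAction {
    private final Notifier notifier;

    AddCommentAction(Notifier notifier) {
        this.notifier = notifier;
    }

    String execute(String reporter, String comment) {
        if (comment == null || comment.trim().isEmpty()) {
            return "error"; // nothing to add, so nobody is notified
        }
        notifier.send(reporter, "New comment: " + comment);
        return "success";
    }
}

// Hand-rolled mock: records calls instead of sending real mail.
class MockNotifier implements Notifier {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String recipient, String message) {
        sent.add(recipient + " <- " + message);
    }
}
```

A real mock library would generate something like `MockNotifier` for you and let you declare the expected calls up front, but the principle is the same.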

Probably the most mentally stimulating discussion last week was about Webwork 2 and especially XWork. You have to see this thing in action; it is just beautiful :) 3 years ago I got involved in Avalon because I felt there was something cool about IoC and SoC, even if I couldn't quite see it. Looking at XWork, I get the same sense all over again, about 100x stronger; this is the Inversion of Control pattern Done Right.
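For anyone who hasn't met Inversion of Control, a rough sketch of the general idea follows. It is deliberately not XWork's API: `TinyContainer`, `GreeterAware`, `Greeter` and `WelcomeAction` are all made-up names. The component merely declares what it needs by implementing an "aware" interface, and the container hands the dependency over, rather than the component looking it up itself.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point: the container, not the action, decides which Greeter is used.
class IocSketch {
    public static void main(String[] args) {
        TinyContainer container = new TinyContainer();
        container.register(Greeter.class, name -> "Hello, " + name);

        WelcomeAction action = container.wire(new WelcomeAction());
        System.out.println(action.execute("Mike"));
    }
}

// Hypothetical service interface.
interface Greeter {
    String greet(String name);
}

// Hypothetical "aware" interface: implementing it says "I need a Greeter".
interface GreeterAware {
    void setGreeter(Greeter greeter);
}

// The component never looks anything up; it just uses what it was given.
class WelcomeAction implements GreeterAware {
    private Greeter greeter;

    @Override
    public void setGreeter(Greeter greeter) {
        this.greeter = greeter;
    }

    String execute(String user) {
        return greeter.greet(user);
    }
}

// Toy container (an assumption for illustration): holds services and injects them.
class TinyContainer {
    private final Map<Class<?>, Object> services = new HashMap<>();

    <T> void register(Class<T> type, T implementation) {
        services.put(type, implementation);
    }

    <T> T wire(T component) {
        if (component instanceof GreeterAware) {
            ((GreeterAware) component).setGreeter((Greeter) services.get(Greeter.class));
        }
        return component;
    }
}
```

The nice side effect is that a test can call `setGreeter` with a mock directly, with no container in sight.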

I made one awful discovery though; 4 years of exclusive vim usage has almost crippled me on regular editors. While using IDEA, I had this terrible urge to hit escape repeatedly, type :wq, navigate using hjkl keys, and use all the other obscure key combinations that are hardwired in my brain. It's like a right-handed person suddenly being asked to write something left-handed. After a week of embarrassing fumbling, I have learned (or unlearned) the basics and am just beginning to appreciate IDEA.

Having tinkered a lot with JIRA, I can say it's a very fine piece of work, but its best bits are sadly underused. For instance, with services, you can comment on a bug just by replying to the notification email JIRA sends out. Quoted text is automatically stripped out. Bugs can also be closed (with a comment) by an appropriately formatted CVS commit log. I need to ask Steven if I can upgrade cocoondev's JIRA to play with all this stuff in Forrest.

Finally, if everyone in your project uses HTML-capable mailers, then I can highly recommend this CVS commit log formatter. Instead of straight 'diff' output, it colourizes things, adds a summary, and optionally makes hyperlinks out of certain phrases (like bug refs). A small thing, but it makes reviewing code changes much easier.

Posted by jefft at 07:43 PM | Comments (1)
April 23, 2003
RDF: CSS syntax?

Micah Dubinko suggests using a CSS syntax for RDF:


@namespace dc url(http://purl.org/dc/elements/1.1/);

:root {
  dc|description: "A discussion of the broader context and relevance of XML/RDF techniques.";
  dc|creator: "Uche Ogbuji";
}

Note that the CSS3 :root selector is used to make a 'this here document' self-reference. To make assertions about other URIs, you could either use the url() function as a selector, or select an element that points off to some URI.

Some will argue that the last thing RDF needs is another syntax. Yet, none of the existing ones are workable within DTD-valid XHTML, it seems. Re-using CSS parsing technology seems like a good compromise. Maybe.

Sounds like a good idea to me. Metadata inherits in the same way styles do, so the cascading comes in handy. Maybe we should use this in Forrest.

Posted by jefft at 01:20 PM | Comments (0)
April 17, 2003
Bayesian spam zapping with bogofilter

It's now been a week since I started using Bogofilter, a Bayesian spam-catching affair by ESR, to filter out the 11-odd spam messages I get per day. I have previously been using an elaborate procmail system called SpamBouncer, which works reasonably well, but blocks some BigPond users (Telstra being a major source of spam), and is generally hard to update.

So far bogofilter has worked very well, with no false positives, and only a few misses. The best part was that I got to use my lovingly hoarded spam collection to 'train' the filter:

cat Mail/spam.incoming | formail -s bogofilter -s

formail, in particular formail -s <cmd>, is extremely useful for mbox tinkering. The -s option splits an incoming mbox stream, and runs <cmd> for every message; in this case, telling bogofilter to register the contents as spam.

In the same way, one tells bogofilter what isn't spam:

cat $MAIL | formail -s bogofilter -n

bogofilter is now trained and can be used to filter incoming mail. The man page has a sample procmail recipe that positively reinforces whatever decision is made, so the filter is constantly adapting. When bogofilter lets a spam mail through, this can be rectified with (in mutt) '|bogofilter -Ns', and then bouncing the message to oneself to test the change.
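For the curious, the train-then-classify cycle can be sketched in a few lines. This is a toy word-count classifier with add-one smoothing and equal priors, swapped in purely for illustration; it is not bogofilter's actual algorithm or code, and `TinyBayes` and its methods are invented names. `train(..., true)` plays the role of `bogofilter -s`, and `train(..., false)` the role of `bogofilter -n`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Demo entry point: train on a few messages, then classify two new ones.
class TinyBayesDemo {
    public static void main(String[] args) {
        TinyBayes filter = new TinyBayes();
        filter.train("cheap pills buy now", true);
        filter.train("buy cheap watches now", true);
        filter.train("cocoon sitemap patch attached", false);
        filter.train("forrest build patch review", false);

        System.out.println(filter.isSpam("buy cheap pills"));
        System.out.println(filter.isSpam("sitemap patch review"));
    }
}

// Toy classifier (an illustrative assumption, not bogofilter's implementation).
class TinyBayes {
    private final Map<String, Integer> spamCounts = new HashMap<>();
    private final Map<String, Integer> hamCounts = new HashMap<>();
    private int spamTokens;
    private int hamTokens;

    // Register every whitespace-separated token of a message as spam or non-spam.
    void train(String message, boolean spam) {
        for (String token : message.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (spam) {
                spamCounts.merge(token, 1, Integer::sum);
                spamTokens++;
            } else {
                hamCounts.merge(token, 1, Integer::sum);
                hamTokens++;
            }
        }
    }

    // Sum of per-token log likelihood ratios; positive means "looks like spam".
    double spamLogOdds(String message) {
        Set<String> vocabulary = new HashSet<>(spamCounts.keySet());
        vocabulary.addAll(hamCounts.keySet());
        int slots = vocabulary.size() + 1; // one extra slot for unseen tokens

        double logOdds = 0.0;
        for (String token : message.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            double pSpam = (spamCounts.getOrDefault(token, 0) + 1.0) / (spamTokens + slots);
            double pHam = (hamCounts.getOrDefault(token, 0) + 1.0) / (hamTokens + slots);
            logOdds += Math.log(pSpam / pHam);
        }
        return logOdds;
    }

    boolean isSpam(String message) {
        return spamLogOdds(message) > 0.0;
    }
}
```

Correcting a miss is then just another `train` call with the right label, which is roughly what the reinforcing procmail recipe automates.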

Oh, and if bogofilter seems... really uncannily good at classifying existing mail, check that your previous spam software hasn't added custom headers to each mail. Real spammers are very inconsiderate, and don't add headers like 'X-SBClass: spam', so it's no good training on such emails ;)

Posted by jefft at 10:19 PM | Comments (0)
February 05, 2003
Connecting Apache and Java with mod_proxy

Pier Fumagalli sends an email I wish I'd seen 2 years ago: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2, aka how to avoid the whole mod_jk/mod_jk2/mod_webapp mess and connect Apache and a web server with minimal fuss using mod_proxy and mod_rewrite.

Tomcat's Proxy HOWTO explains how to prevent HttpServletRequest.getServerPort() and HttpUtils.getRequestURL() from breaking.

Posted by jefft at 09:59 PM | Comments (2)
January 27, 2003
XHTML2 blogging

From the www-html@w3 list, Sjoerd Visscher's weblog - w3future.com. Looks very pretty in Moz (note the cssedit page). Interesting content too.. looks like a good place to steal ideas as XHTML2-in-Forrest gains momentum.

Posted by jefft at 12:16 PM | Comments (0)
January 13, 2003
Cocoon committer!

As Ovidiu noted, out of the blue Stefano nominated me as a Cocoon committer :) Wow, quite an honour! I've been lurking on cocoon-dev for almost 3 years now (since 1.7), and have always regarded Cocoon as 'home base' amongst the Apache projects. I've learned the secret handshakes, passwords, and am keen to begin the life of partying, carousing and beer-quaffing that Cocoon-committership entails.

Anyway, there are no longer any excuses for not getting that Anteater Cocoon test suite into src/test/anteater/. Scarier still, no excuse for not fixing the (numerous) problems encountered last time I ran it. More immediately, I really need to fix up the half-baked LinkRewriterTransformer samples...

Posted by jefft at 11:53 PM | Comments (2)
December 15, 2002
Browsing Human source code

Cool mostly from a 'because you can' perspective, UCSC Human Genome Browser. Download your own source code :) Someone needs to invent a BCEL-like API for parsing this stuff. Though, I guess if we've scrapped the idea that God wrote the code 4000 years ago and compiled it to DNA, then we're left with the idea that (God) evolved it, and we should be looking for ecologies of 'selfish genes'. And since ecologists have such trouble working out relations between highly observable animals, in real-time, what chance is there of doing it for a static snapshot of billions of base-pair sequences?

Time to get into bioinformatics.. don't want to be writing websites for the next 50 years :)

Posted by jefft at 03:33 PM | Comments (1)
December 02, 2002
Online XForms book

An O'Reilly XForms Book in the making, by Micah Dubinko.

Encountered while browsing the XPath NG mailing list. Hope something comes of it all.

Posted by jefft at 01:20 AM | Comments (1)
October 22, 2002
XQuery implementation

Ivelin pointed out Qexo: The GNU Kawa implementation of XQuery on cocoon-dev. Per Bothner is one smart cookie.

Looking at the examples, it seems as if one could obtain XQuery-like functionality with a Jelly script.

Posted by jefft at 12:23 AM | Comments (1)
October 18, 2002
Composite for blog editing

Following Ugo's blog entry and the subsequent cocoon-dev post, I'm editing this blog with composite, a "Mozilla Editor for html composition in textareas". It all seems very cool to us contenteditable-deprived Moz users, and certainly makes editing blogs in MT much easier.

Posted by jefft at 02:04 AM | Comments (0)
Lost in the woods
I'm happy to say I'm now a Forrest committer! All it took was a few rants, some semi-intelligent ml discussion, and one monster patch which nobody felt like applying :) Forrest has masses of potential, if only it can focus on being a generic doc system instead of an xml.apache.org revamp.

Since then I've been working on making Forrest as braindead simple to use as possible. I'm pretty happy with the progress so far. Once the Forrest binary is installed, you can create and render a template site by typing 'forrest seed site'. To generate a webapp all ready for deployment, type 'forrest webapp'.

I wrote up a Forrest "getting started" guide at http://xml.apache.org/forrest/your-project.html. If you're looking for an XML->{HTML,PDF} doc tool, have a look.

I've also written a [RT] Linking revisited: A general linking system, with some ideas about how Forrest xdocs could link to each other. Currently, each page links to the HTML rendition of other pages, eg <link href="foo.html">. That is conceptually broken; why should a mere link assume what the sitemap is going to render foo.xml as? What happens if foo.xml moves to a different directory? Anyway, the RT proposes a system whereby we could write <link href="site:site/foo"> instead, meaning the 'site/foo' node in an abstract tree of nodes (a node is a link-to-able bit of site content). Implemented with Cocoon Sources, fancy stylesheets and a bit of hand-waving. Thoughts on it are welcome.
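A rough sketch of the resolution step, to show why the indirection helps. This is only an illustration of the idea, not the Cocoon Source implementation the RT has in mind: `SiteLinkResolver`, its methods and the sample paths are all hypothetical. Pages refer to abstract nodes, and one central table (standing in for the tree of nodes) decides what each node currently renders to.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point: the page's link text never changes, only the central table does.
class SiteLinkDemo {
    public static void main(String[] args) {
        SiteLinkResolver resolver = new SiteLinkResolver();
        resolver.addNode("site/foo", "foo.html");
        System.out.println(resolver.resolve("site:site/foo"));

        // foo.xml moves to a subdirectory: update one entry, no page is edited.
        resolver.addNode("site/foo", "docs/foo.html");
        System.out.println(resolver.resolve("site:site/foo"));
    }
}

// Hypothetical resolver mapping abstract node paths to rendered locations.
class SiteLinkResolver {
    private static final String SCHEME = "site:";
    private final Map<String, String> nodes = new HashMap<>();

    void addNode(String nodePath, String renderedLocation) {
        nodes.put(nodePath, renderedLocation);
    }

    String resolve(String href) {
        if (!href.startsWith(SCHEME)) {
            return href; // ordinary links pass through untouched
        }
        String nodePath = href.substring(SCHEME.length());
        String location = nodes.get(nodePath);
        if (location == null) {
            throw new IllegalArgumentException("Unknown site node: " + nodePath);
        }
        return location;
    }
}
```

A broken link also becomes a build-time error (the unknown node) rather than a 404 discovered later.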

Robert Koberg has implemented some very nifty stuff for his LiveStoryBoard CMS. The site looks plain, but poke around at its internals and you'll see some pretty amazing use of XSLT. In particular, LSB makes heavy use of a central site.xml file, similar to what my RT proposes for Forrest. Robert kindly sent me a copy of LSB 2.0 (the site there is 1.0) which I'm looking forward to having a poke at.
Posted by jefft at 01:51 AM | Comments (0)
September 29, 2002
The Helicopter Game

Mindless Flash Fun -- navigate a helicopter through an obstacle-filled tunnel.

Posted by jefft at 10:20 PM
September 13, 2002
Semantic google?

An excellent article about Google and the semantic web, from Ftrain: August 2009: How Google beat Amazon and Ebay to the Semantic Web.

Incidentally, did you know that one can type "related:jakarta.apache.org" and get a list of sites "related" to jakarta?

Posted by jefft at 02:09 PM | Comments (0)
September 11, 2002
Live Forrest?

Posted a '[mini-RT] Using Forrest' suggesting that Forrest development be oriented around a live Cocoon rather than the static crawler.

Posted by jefft at 06:41 PM | Comments (0)
The D Programming Language

Andy Oliver has taken a shine to the D Programming Language, and has suggested a D Services API project on Krysalis, intended to be a more general and faster alternative to servlets.

Not that I think D has a chance of succeeding against C#, but that's because I have a boring, commercial definition of "succeed" :)

One thing that grabbed my interest was the mention of "Versioning" built into the language. It's not what I was hoping, which is a separation of class name from version, so one could do: import org.apache.foo 2.3 - 2.7; and generate a link error if any version of a class outside that range is linked. Oh well..
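The check I had in mind could be sketched like this. It is plain Java standing in for a language feature that, as far as I know, neither D nor Java offers in this form; `VersionRange` and its methods are invented for the example, and it assumes purely numeric dotted versions.

```java
// Demo entry point: a class at version 2.5 links, one at 2.10 does not.
class VersionRangeDemo {
    public static void main(String[] args) {
        System.out.println(VersionRange.inRange("2.5", "2.3", "2.7"));
        System.out.println(VersionRange.inRange("2.10", "2.3", "2.7"));
    }
}

// Hypothetical helper for the imagined "import org.apache.foo 2.3 - 2.7;" check.
class VersionRange {

    // Compare dotted versions numerically, so that 2.10 sorts after 2.7.
    static int compare(String a, String b) {
        String[] left = a.split("\\.");
        String[] right = b.split("\\.");
        int length = Math.max(left.length, right.length);
        for (int i = 0; i < length; i++) {
            int x = i < left.length ? Integer.parseInt(left[i]) : 0;  // missing parts count as 0
            int y = i < right.length ? Integer.parseInt(right[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // True when min <= version <= max (both ends inclusive).
    static boolean inRange(String version, String min, String max) {
        return compare(version, min) >= 0 && compare(version, max) <= 0;
    }

    // What the linker would do: refuse any class outside the declared range.
    static void requireInRange(String className, String version, String min, String max) {
        if (!inRange(version, min, max)) {
            throw new LinkageError(className + " " + version
                    + " is outside the required range " + min + " - " + max);
        }
    }
}
```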

Posted by jefft at 01:33 PM | Comments (0)
August 20, 2002
Forrest: Maven plugin

Forrest is an awful project. You start out dispassionately criticising it on a weblog, but people respond so reasonably and encouragingly on the list that you find yourself sucked into helping ;)

Some good discussion was had on forrest-dev following my rant. See threads here and here. It turns out that forrestbot can be used to build new projects. While this is more a side-effect than its original intention, it's better than nothing. A page describing how to use forrestbot in a new project has been added.

I also wrote a script, acorn.xml, which can be used to bootstrap a project's use of Forrest.

Based on this, I'm now working on a Maven Forrest plugin. I've got a basic version working by simply wrapping acorn.xml. The hope is that it's possible to have one Ant script, used for invoking Forrest from the command-line, and then both Centipede and Maven can provide wrappers around this script.

I hope to have this working in the next few days, and then get back to Anteater.


Posted by jefft at 05:24 PM | Comments (0)
August 16, 2002
Forrest rant: users come first

I spent most of today attempting to use Forrest in my own project, Anteater. It didn't go well. The Forrest site offers no introductory, "here's how you use Forrest in your own site" guide. It turns out this is no accident. The docs state:

Our first target is to create a consistent xml.apache.org website, with a uniform, lightweight and easy to navigate layout and structure.

Now that's all very well and good, but how in hell do they think they're going to get contributors?

As I understand it, BSD-style OSS development is based on the principle of "enlightened self-interest". People don't primarily participate because they're chasing some ideological "free software" dream. They do it for mostly pragmatic reasons: they have an itch to scratch; they need whatever functionality the software provides. If they can help others by contributing back, great.

Now coming back to Forrest; your average potential contributor doesn't give a rat's ass about how the xml.apache.org site looks. They care what their site looks like. The magic of OSS development is when effort expended towards selfish aims (improving one's own site) also results in improving other people's sites.

That is the golden principle which is so sorely missing in Forrest. Consequently, the user community is composed of a few dedicated individuals who truly care about xml.apache.org, and the project's goals are being met by the more lively, user-focused community formed around Apache Maven. This is very sad.

Posted by jefft at 06:16 PM | Comments (0)
August 12, 2002
Cocoon article

Mentioned on the cocoon lists; an interview with Stefano: http://www.oio.de/public/interview-stefano-mazzocchi.htm

Heh.. and I'm sharing blogspace with the guy who put the + in MVC+, and made Cocoon 2.1 truly innovative :)


Posted by jefft at 11:10 PM | Comments (0)
August 08, 2002
A brief bio

Wahay! A blog of My Very Own, thanks to Ovidiu.. let me see.. how does one link.. Ovidiu's weblog.

Who am I anyway? I was born in South Africa 23 years ago, emigrated to Australia 5 years ago, did a comp sci degree at Sydney Uni, and am now working as a Java programmer. When not coding, I'm usually reading, mostly in the closely related subjects of science fiction, fantasy and theology. Favourite authors include Greg Egan (modern SF), Terry Pratchett (Discworld), (though not recently) Douglas Hofstadter (Metamagical Themas; Gödel, Escher, Bach), and C.S. Lewis, a brilliant theologian.

In computing, I seem to spend an awful lot of time reading Jakarta mailing lists, wherein lurk a whole bunch of interesting and clever people who spend their time writing open source Java code. Cool. I've gotten somewhat involved, and ended up a committer in the Avalon and Commons projects.

I have an unnatural fascination with XML, ever since my introduction to it in late 1999 when I worked on an XML database at University. I'd recommend subscribing to the XML-DEV mailing list, which is like listening in on a tea party from Alice in Wonderland, but very educational anyway. One of the more interesting projects involving XML is Apache Cocoon, on whose lists I've lurked for about 2 years. Cocoon has become rather top-heavy, and for XP'ers, it's no longer "the simplest thing that can possibly work", but it's fun nonetheless.

I'm currently working on the Anteater project, about which I'm also meant to be writing a chapter of a book entitled "Extreme XML: Cocoon, and somethingorother", along with Ivelin Ivanov, Ugo Cei and Nicola Ken Barozzi.

Which reminds me, I ought to get some real work done..


Posted by jefft at 12:37 PM | Comments (3)