Weyfour WWWWolf ([info]wwwwolf) wrote,
@ 2006-07-17 12:33:00
Entry tags:mac os x, opml, outliners, plist, programming, xml

File format freakshow: OmniOutliner 2
Urrrgh.

You know, people seem to hate Mozilla's Mork file format. It's certainly ugly and braindamaging enough. I think I've finally found a contender. Not quite as bad, but still pretty darn awful and not very well thought out.

Now, let's face the facts: XML-based outline formats suck. Please consider the implications of that: XML was meant to make it easy and fun to lay out structured information neatly. Yet OPML is a frigging ridiculous attempt at that: it stores data as attribute values, which is a huge big no-no. OML is slightly better, but no one supports it. Sane people would use some RDF-based thingy, but no one supports that, either.

But I forgive all those things right now, right here. Yes, even OPML. I thought OPML was the most annoying and illogical use of XML. How wrong I was!

I'm talking about the OmniOutliner file format. This is version 2.x, which came with OS X 10.3. I have no idea if v3 is any better, because it's apparently a paid upgrade.

The file format itself is based on Apple PLists. PLists are an only slightly broken format: I think the only miserable thing in them is the <dict> tag, which stores a flat stream of <key> elements, each followed by a typed value element, rather than pairing each key and its value in another container tag. That makes XPathing them a bit trickier. But it's better than nothing.
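To show what that flat key/value stream means in practice, here is a minimal Python sketch that pairs each <key> with the element following it. The helper name dict_pairs and the shortened sample are my own, purely for illustration. In full XPath 1.0 you could write key[.='Width']/following-sibling::*[1] instead, but Python's ElementTree only implements a subset of XPath without that axis, hence the manual pairing.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the style of an OmniOutliner 2 column description.
SNIPPET = """<dict>
    <key>Identifier</key>
    <string>39f711d444bac978055e0065</string>
    <key>Title</key>
    <string>Topic</string>
    <key>Width</key>
    <real>5.120000e+02</real>
</dict>"""


def dict_pairs(dict_elem):
    """Pair each <key> child with the value element that follows it.

    Hypothetical helper: it relies only on the plist convention that the
    children of <dict> alternate key, value, key, value, ...
    """
    children = list(dict_elem)
    return [(k.text, v) for k, v in zip(children[0::2], children[1::2])]


pairs = dict_pairs(ET.fromstring(SNIPPET))
width = dict(pairs)["Width"]
print(width.tag, float(width.text))
```

The value's type lives in the tag name (real, string, integer), so anything consuming the pairs still has to switch on width.tag to interpret the text.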

So how do you store that thing in a plist?

Not at all how you'd think.

The first thing you notice when you open the file in Emacs is that the file is just a memory dump. Okay, a neatly text-serialised memory dump, but a memory dump nevertheless.

<dict>
    <key>Identifier</key>
    <string>39f711d444bac978055e0065</string>
    <key>MaximumWidth</key>
    <real>1.000000e+06</real>
    <key>MinimumWidth</key>
    <real>1.300000e+01</real>
    <key>OOPlainTextExportWidthKey</key>
    <integer>72</integer>
    <key>Title</key>
    <string>Topic</string>
    <key>Width</key>
    <real>5.120000e+02</real>
</dict>


Yeah, really useful information for other apps here. Okay, I'm not really criticising this; so far it's easy for other apps to ignore. But then we come to this wart:


<dict>
    <key>Page Layout</key>
    <data>
    BAt0eXBlZHN0cmVhbYED6IQBQISEhAtOU1ByaW50SW5mbwGEhAhOU09iamVj
    dACFkoSEhBNOU011dGFibGVEaWN0aW9uYXJ5AISEDE5TRGljdGlvbmFyeQCU
    hAFpEpKEhIQITlNTdHJpbmcBlIQBKxBOU0pvYkRpc3Bvc2l0aW9uhpKEmZkP


       (... 111 lines of base64-encoded crap omitted ...)


    hJmZDE5TTGVmdE1hcmdpboaShJ2ctaIkhpKEmZkLTlNUb3BNYXJnaW6GkoSd
    nLWiJIaShJmZCk5TTGFzdFBhZ2WGkoSdnISXl4J/////hpKEmZkLTlNGaXJz
    dFBhZ2WGkoSdnKKeAYaShJmZCk5TU2F2ZVBhdGiGkoSZmQCGhoY=
    </data>
    <key>SpellCheckingEnabled</key>
    <true/>
</dict>


Hey Apple! Be a bit more careful about what apps you're bundling with OS X! I mean, people tell this sort of joke about Microsoft Office, not OS X itself!

Okay, even that is easy to ignore for applications that don't need the page layout data (if that indeed is page layout data - could be encrypted blueprints of Russian submarines for all I know!)

What the parsing applications are really interested in is the actual outline data, right?
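For the curious, the blob is at least easy to peek into. The sketch below decodes only the first base64 line of the <data> element shown above (the variable names are mine). The decoded bytes start with "typedstream" and mention NSPrintInfo, which, as far as I can tell, marks an archived Cocoa print-settings object rather than submarine blueprints. That reading of the header is my assumption, not something the file documents.

```python
import base64

# First line of the <data> element, copied from the file as shown above.
# It is 60 characters long, so it decodes cleanly on its own.
FIRST_LINE = "BAt0eXBlZHN0cmVhbYED6IQBQISEhAtOU1ByaW50SW5mbwGEhAhOU09iamVj"

blob = base64.b64decode(FIRST_LINE)

# Show the raw bytes; the printable runs are class names from the archive.
print(blob)
```

Actually reading the margins and page ranges out of it would take a decoder for that archive format, which is exactly the kind of dependency an outline file shouldn't force on anyone.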

Well...

<key>Cols</key>
<array>
  <string>{\rtf1\mac\ansicpg10000\cocoartf102
{\fonttbl}
{\colortbl;\red255\green255\blue255;}
}</string>
  <string>{\rtf1\mac\ansicpg10000\cocoartf102
{\fonttbl\f0\fswiss\fcharset77 Helvetica;}
{\colortbl;\red255\green255\blue255;\red0\green0\blue0;}
\pard\tx560\tx1120\tx1680\tx2240\tx2800\tx3360\tx3920\tx4480\tx5040\tx5600\tx6160\tx6720

\f0\fs24 \cf2 Foo}</string>
</array>
<key>Expanded</key>
<true/>


...if you intend to make any kind of sense at all out of this, please add an RTF parser in addition to an XML parser to your application...

In other news, file formats that offer just a serialisation of what the application has in memory suck. It doesn't change much whether the serialisation format is "human-readable" or not. I sometimes find even Ruby YAML a bit hard to follow... The point is, while memory dumps are easy for the programmer, they're not easy for another app's programmer.
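To make the two-parser point concrete, here is a rough Python sketch. It wraps the row fragment above in a <plist> root (my addition, so that the standard plistlib module accepts it), then runs each cell through a deliberately crude RTF stripper. rtf_to_text, TOKEN and SKIP_DESTINATIONS are hypothetical names of mine. The stripper ignores hex escapes, Unicode escapes and most of the RTF spec, so treat it as an illustration of the extra work and not as a real RTF parser.

```python
import plistlib
import re

# The row fragment from above, wrapped in a <plist> root for plistlib.
SAMPLE = rb"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
<key>Cols</key>
<array>
  <string>{\rtf1\mac\ansicpg10000\cocoartf102
{\fonttbl}
{\colortbl;\red255\green255\blue255;}
}</string>
  <string>{\rtf1\mac\ansicpg10000\cocoartf102
{\fonttbl\f0\fswiss\fcharset77 Helvetica;}
{\colortbl;\red255\green255\blue255;\red0\green0\blue0;}
\pard\tx560\tx1120\tx1680\tx2240\tx2800\tx3360\tx3920\tx4480\tx5040\tx5600\tx6160\tx6720

\f0\fs24 \cf2 Foo}</string>
</array>
<key>Expanded</key>
<true/>
</dict>
</plist>"""

# Groups whose contents are bookkeeping (font and colour tables), not text.
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info"}

# Control word with optional number, escaped character, brace, or plain text.
TOKEN = re.compile(r"\\([a-z]+)(-?\d+)? ?|\\(.)|([{}])|([^\\{}]+)", re.S)


def rtf_to_text(rtf):
    """Crude RTF-to-text: drop control words and skipped groups."""
    out, stack, skipping = [], [], False
    for match in TOKEN.finditer(rtf):
        word, _number, escaped, brace, text = match.groups()
        if brace == "{":
            stack.append(skipping)  # remember the state outside the group
        elif brace == "}":
            skipping = stack.pop() if stack else False
        elif word:
            if word in SKIP_DESTINATIONS:
                skipping = True
            elif word in ("par", "line") and not skipping:
                out.append("\n")
        elif escaped:
            if escaped in "\\{}" and not skipping:
                out.append(escaped)  # literal backslash or brace
        elif text and not skipping:
            # Raw line breaks in RTF source are not part of the text.
            out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out)


row = plistlib.loads(SAMPLE)
cells = [rtf_to_text(cell) for cell in row["Cols"]]
print(cells)  # → ['', 'Foo']
```

About forty lines of tokenising to recover the three letters "Foo", and that is with all the formatting thrown away.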


