Document Management File Formats

I have been thinking a bit lately about file formats related to document management (aka enterprise content management).

Specifically around forms (think about the printed policy you get from your insurance company).

You may just get this in the mail and laugh that it is still often in ALL CAPS.

But this stuff is enormously complex.

I'm trying to figure out the core document formats. I figure if you make good choices there, you have a better chance of getting the details right. If possible I'd like to use open standards, but I'll settle for published, friendly standards - there are many 100% proprietary options out there that frighten me (who wants to be beholden to one vendor forever with no way out short of complete re-implementation?).

There are two types of file formats that have my interest right now:

  1. form file format
  2. archival storage format

There are lots of options here, of course, and lots of pros and cons. I have a long way to go before I have an opinion. The right answer may be more than one format.

For the form file format, a couple of options are perhaps:


For the archival storage format, options could be:

PDF/A, ODF, OOXML (gasp), more TIFF and AFP.

As I learn more and develop opinions, I'll try to share them. If you already have some, please do the same :)


I'm not sure what you mean by "form format". If you mean you want it to include logic or presentation (e.g., you can choose any one of the following five values), then XHTML seems a good bet.

For archival, PDF is pretty cool. You could even store signatures right in there with the printed text. It's all text so you could do diffs, even though they might not be very readable. Best of all, your Mac produces it for you for free. ;)

Requirements, please. :-)

I agree, who wants to be beholden to the same pain in the ass wife for the rest of your life? PLEASE HELP ME.

I responded in detail on my blog. I hope you find it useful.

Thanks Laurence - good stuff.