Johannes Ernst's Blog

More feedback on the XML-RSig proposal

Hans Granqvist writes that I'm making a mistake in treating XML as a wire format, when "XML is a logical format".

I beg to differ: XML can be treated either way. In particular, when it is sent over the wire (such as for RSS feeds) it most certainly is a wire format, and it probably is for most XML generators as well (think of the XML generated by countless Perl scripts). I realize that in order to have the full power of XML both as an on-the-wire format and as an in-memory, transformable format, one needs quite complex machinery to digitally sign it. That is what we have with XML-DSig, and that is the problem that led to this discussion. To make signed XML content more broadly useful and usable, we need to simplify dramatically compared to XML-DSig, and that might mean giving up some, or even a lot of, generality. RSS is a useful guide in this: by dramatic simplification, mass adoption was accomplished; one can always put more machinery in again once mass adoption has been reached. We need to do the same thing for XML signatures.

Eric van der Vlist sent me multiple e-mails commenting on my Really Simple XML Signatures proposal, and agreed that I could blog our discussion. He writes:

Just two comments about your simple signature proposal...

  1. When you're adding the signature element you're actually adding two nodes: a text node with a CR and tabulations and the rsig:signature element node.
  2. More important, if you want to allow the use of XML APIs (such as DOM) to insert the rsig:signature element, you can't avoid the canonicalization nightmare. XML parsers don't carry enough information to ensure that the attributes will be written back in the same order, that double quotes won't be replaced by single quotes, ... and ignoring the fact that "XML is an unstable media" as quoted in the document you've linked in your other post will lead to trouble...

Isn't the blogosphere great: feedback that nobody so far had provided! And he is right, I've got to address it. So let me take a shot:

Ad 1: he is right, my algorithm is a little sloppy. Let me try to restate it:

  1. Pick a node, any node ;-), to sign.
  2. "Cut out" this node, from the first character of the start tag through and including the last character of the end tag.
  3. Apply the signature algorithm to that separate file, as you would for any blob. To convert characters to bytes, use the character encoding of the overall XML file. In my example, I used gpg --clearsign (an enumerated value defined in LID, which is a fancy way of saying "create a gpg signature"). The private key is maintained by me; the public key can be retrieved from the URL given in the lid attribute as specified in the LID spec.
  4. Re-insert the cut-out file exactly where you took it from, so the file looks the same as before.
  5. Insert a new node into the XML file by character insertion. This new node contains the signature. This node becomes a child of the node whose signature it is. Addition: Do not add or remove any white space when inserting the new node. (Alternatively: allow arbitrary white space between the new node and the next node; while possible, this may be very confusing and have little practical value, so let's not do it.)
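
The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an HMAC over the bytes stands in for gpg --clearsign, the key and the function names sign_blob, sign_node and verify_node are made up for this sketch, and placing the rsig:signature element directly before the end tag is just one possible position that the steps do not prescribe.

```python
import hashlib
import hmac

SECRET = b"demo-key"  # hypothetical stand-in for a real private key
SIG_OPEN = "<rsig:signature>"
SIG_CLOSE = "</rsig:signature>"


def sign_blob(blob: bytes) -> str:
    """Stand-in signer: HMAC-SHA256 instead of gpg --clearsign (assumption)."""
    return hmac.new(SECRET, blob, hashlib.sha256).hexdigest()


def sign_node(doc: str, start: int, end: int, encoding: str = "utf-8") -> str:
    """Sign the element occupying doc[start:end] purely by character operations."""
    fragment = doc[start:end]                    # step 2: cut out the node
    sig = sign_blob(fragment.encode(encoding))   # step 3: sign it as a blob
    close = fragment.rfind("</")                 # first character of the end tag
    if close < 0:
        raise ValueError("this sketch needs an element with an explicit end tag")
    # Steps 4 and 5: put the fragment back and insert the signature child,
    # adding no white space on either side of it.
    signed = fragment[:close] + SIG_OPEN + sig + SIG_CLOSE + fragment[close:]
    return doc[:start] + signed + doc[end:]


def verify_node(doc: str, start: int, encoding: str = "utf-8") -> bool:
    """Remove the signature child by character removal and re-check the rest."""
    i = doc.index(SIG_OPEN, start)
    j = doc.index(SIG_CLOSE, i)
    sig = doc[i + len(SIG_OPEN):j]
    after = j + len(SIG_CLOSE)
    end = doc.index(">", after) + 1              # last character of the end tag
    original = doc[start:i] + doc[after:end]
    return hmac.compare_digest(sig, sign_blob(original.encode(encoding)))


doc = '<feed>\n  <entry id="1">money in the bank</entry>\n</feed>'
start = doc.index("<entry")
end = doc.index("</entry>") + len("</entry>")
signed = sign_node(doc, start, end)
print(verify_node(signed, start))
```

Because nothing is ever parsed and re-serialized, the bytes that get verified are exactly the bytes that were signed; any change inside the signed node, including added white space, makes the check fail.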

(That means my example files in the original post are now a bit incorrect and need to be updated slightly)

Ad 2: he's right, once we are at a DOM level, we have to do canonicalization, and that is why things get so complicated with XML-DSig. I think this is largely also Hans's argument.
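
To make Eric's point concrete, here is a small Python sketch using the standard library's minidom (the sample element is invented for illustration). Parsing a fragment and writing it back out produces a logically equivalent document whose characters differ, so a signature computed over the original bytes no longer matches.

```python
import hashlib
from xml.dom import minidom

# Hypothetical fragment that mixes single and double quotes around attributes.
original = "<entry b='2' a=\"1\">money in the bank</entry>"

# Round-trip through a DOM: parse, then serialize again.
roundtrip = minidom.parseString(original).documentElement.toxml()

digest_before = hashlib.sha256(original.encode("utf-8")).hexdigest()
digest_after = hashlib.sha256(roundtrip.encode("utf-8")).hexdigest()

print(original)
print(roundtrip)
print(digest_before == digest_after)
```

The serializer picks its own quoting style (and, depending on the library version, its own attribute order), which is exactly the information a DOM does not carry.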

But this is exactly where I'd like to simplify, dramatically: if doing signatures on a DOM level requires complex machinery that makes it hard for many people to adopt any form of XML signatures, let's not do signatures on the DOM level! This is the mental leap you need to make if you want to follow me from XML-DSig to XML-RSig. There seem to be two primary arguments why one would not want to make this leap:

  • When transforming XML content from one format into another, going through DOM, only DOM-level signatures have any chance of being maintained across the transformation. I'm saying "have any chance" instead of "will be" because from my reading of the spec, that is far from automatic with XML-DSig either.
  • There is no parser infrastructure in place yet to make it real easy to process XML-RSig signatures.

On the first of these two points, I have no qualms. If the choice is between having no signatures (in practice) and having signatures that might not be preserved under some circumstances, I choose the latter. (And some business cases that would seem to depend on this can probably be worked around.) On the second, I realize I have to make a proposal for how best to marry XML-RSig logic with common parsers. (I'm thinking of working at the SAX level, because newer parsers support the locator API, which should do it. But I'll have to prove that, and I will once the conceptual discussion has gone a bit further.)
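
As a first, unproven sketch of the SAX idea, the following Python uses the standard xml.sax module and its locator to find the character offset at which each start tag of a given element begins. The class and function names are made up for illustration, and the sketch assumes ASCII content and "\n" line endings, since the underlying parser may count columns in bytes rather than characters.

```python
import xml.sax


class StartTagLocator(xml.sax.ContentHandler):
    """Records (line, column) of every start tag with the given name."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.locator = None
        self.positions = []

    def setDocumentLocator(self, locator):
        self.locator = locator

    def startElement(self, name, attrs):
        if name == self.target:
            self.positions.append(
                (self.locator.getLineNumber(), self.locator.getColumnNumber())
            )


def start_offsets(text, target):
    """Return the character offsets of the '<' of each matching start tag."""
    handler = StartTagLocator(target)
    xml.sax.parseString(text.encode("utf-8"), handler)
    # Offset of the first character of each line (lines are numbered from 1).
    line_starts = [0]
    for line in text.split("\n")[:-1]:
        line_starts.append(line_starts[-1] + len(line) + 1)
    return [line_starts[line - 1] + col for line, col in handler.positions]


doc = '<feed>\n  <entry id="1">a</entry>\n  <entry id="2">b</entry>\n</feed>'
for offset in start_offsets(doc, "entry"):
    print(offset, doc[offset:offset + 14])
```

With the start offset in hand, the end of the node can be found the same way from the end-element event, and the signed characters can then be cut straight out of the original text instead of being re-serialized.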

Later, Eric points me towards paying more attention to namespaces. He's right. In particular, it would be very bad if party B could re-interpret a signed document fragment from party A by making it resolve to entirely different namespaces (think: "money in the bank" could be made to mean "money owed" simply by pointing it to a different vocabulary with the same namespace prefix). So the namespace declarations must be an inherent part of the signature. (I'm not entirely clear whether this is necessarily the case with XML-DSig. Anybody have any insight here?)

So I'd like to augment the first step in the signing algorithm to read as follows:

  1. Pick a node, any node ;-), to sign. Add all namespace declarations in effect at this node, including the default namespace.
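
Collecting "all namespace declarations in effect" could look roughly like the following Python sketch, which uses the standard library's ElementTree event parser. The function name and the sample vocabulary URIs are invented for illustration, and how the collected declarations are then attached to the signed blob is deliberately left open here.

```python
import io
import xml.etree.ElementTree as ET


def namespaces_in_effect(xml_text, target_tag):
    """Return {prefix: uri} in scope at the first element named target_tag.

    target_tag uses ElementTree's "{uri}local" notation; the default
    namespace is reported under the empty prefix "".
    """
    in_scope = []  # stack of (prefix, uri) declarations currently open
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start", "start-ns", "end-ns")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            in_scope.append(payload)      # declaration comes into scope
        elif event == "end-ns":
            in_scope.pop()                # declaration goes out of scope
        elif payload.tag == target_tag:
            return dict(in_scope)         # inner declarations override outer
    raise ValueError("element not found: " + target_tag)


doc = (
    '<a:bank xmlns:a="urn:example:bank" xmlns="urn:example:default">'
    '<a:account><a:balance xmlns:c="urn:example:currency">10</a:balance>'
    "</a:account></a:bank>"
)
print(namespaces_in_effect(doc, "{urn:example:bank}balance"))
```

If party B later rebinds the a prefix to another vocabulary, the declarations collected at verification time no longer match the ones that were signed.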

By the way, if anybody knows of any real-world, deployed example of XML-DSig in which signature preservation on a DOM level was used and was an essential feature for the business scenario, I'd love to hear about it. I'm sure there must be some; what I'd like to investigate is whether, for those cases where this is really needed, a suitable workaround could be found based on XML-RSig. Or if not, what kinds of business scenarios these are where this is needed.

But regardless, here is the meta-plan:

  1. Broad use of XML as a data representation format; nearly all of what's found in the wild is unsigned. (That's where we are right now.)
  2. Really simple (and somewhat restrictive) XML signature proposal that addresses a substantial subset of all use cases for digital signatures of XML data. (That's what I'm trying to get to.)
  3. Broad deployment of XML signatures for some of the use cases, such as Signed Ping. (The blogosphere will tell me whether that is feasible.)
  4. Extension and generalization of the protocol to cover more use cases.
  5. All use cases for signed XML data are supported. (That is, including the XML-DSig capabilities.)

Does this sound like a reasonable meta-plan, given that skipping all steps between the first and the last does not seem to have worked so far ...?
