Johannes Ernst’s Blog

Towards LID 2.0

The beginnings of a definition for a future LID 2.0 are in the Wiki at http://lid.netmesh.org/wiki/index.php/LID_2.0.

Apart from a number of housekeeping items (clarifying, further simplifying the spec …) the fundamental new idea is to break LID into profiles. Anybody implementing LID thus can pick and choose which profiles they like to implement. That way, we keep LID simple while allowing powerful feature sets for those who need them.

Conversely, all that’s required to implement LID is MinimumLID, which broadens LID’s usefulness from a Digital Identity for people to a Digital Identity for arbitrary objects — from Digital Identities for groups of people and legal personas (corporations…) to distributed computing objects (e.g. software agents) to electronically enabled physical objects and who knows what else.

Comments welcome, preferably on the lid-dev mailing list, or on the Wiki.

Introducing the “that” Pointer

Migrating from C to C++ back then, we all had to get used to the this pointer. The this pointer, of course, identifies the object that is supposed to perform the service that we requested. It must have been a good idea, because these days, it’s pretty much in all programming languages (even in Perl!).

It’s time to introduce the that pointer. This is not a joke (even if initially, it sounds like it).

I believe that by not having a that pointer, we don’t see a lot of innovation that we could otherwise see, and we face a lot of friction that really does not need to be there. Let me explain.

In the real world, lots of entities perform services for me. For example, I can borrow a book from a number of different libraries. So when I choose one particular library to borrow a certain book from, I do the equivalent of setting the this pointer to that library, and request the service borrow from that this pointer. And that’s about all I can do.

From the library’s perspective, it gets an incoming request to borrow a certain book, but it has absolutely no idea who is making the request. And in the real world, that means the library would most certainly go out of business because it would have no way of distinguishing those people who bring books back after they borrowed them, and those who just keep them.

In other words, in the real world the library requires a that pointer as well, i.e. a pointer pointing not only to the object that is supposed to be performing the service (this), but also a pointer to the entity (that) requesting the service, such as "me".

In (pseudo) code in a Java-like language that would support that, the library borrowing code would look as follows:

    class Library {
        ...
        public void borrow( Book toBorrow ) {
            if( that == null ) {
                throw new MustGetLibraryCardFirstException();
            }
            if( ! customerDatabase.isInGoodStanding( that )) {
                throw new PayOutstandingFeesFirstException();
            }
            ...
        }
    }

Isn’t that much more natural than doing whatever you’d be doing to address this without a built-in that pointer? Do any of your alternatives remind you of the arguments of the conservative C crowd back then, saying "why do I need C++, I can pass a first argument called this myself"? (it sounded reasonable back then, but not any more)
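
For readers who want to experiment today, here is a minimal sketch of how a that pointer could be emulated in an existing language, in this case Python. It is an illustration under stated assumptions, not a proposal for how a language should implement it: the context variable named that, the helper call_as, and the set standing in for the customer database are all hypothetical names introduced for this example.

```python
from contextvars import ContextVar

# Hypothetical stand-in for a built-in "that" pointer: it names the caller.
that = ContextVar("that", default=None)


class MustGetLibraryCardFirstError(Exception):
    """Raised when the caller is anonymous (no "that" is set)."""


class PayOutstandingFeesFirstError(Exception):
    """Raised when the caller is known but not in good standing."""


class Library:
    def __init__(self, good_standing):
        # Assumed customer database: the set of callers in good standing.
        self.good_standing = set(good_standing)

    def borrow(self, book):
        caller = that.get()
        if caller is None:
            raise MustGetLibraryCardFirstError()
        if caller not in self.good_standing:
            raise PayOutstandingFeesFirstError()
        return f"{book} lent to {caller}"


def call_as(caller, func, *args):
    """Run func(*args) with "that" set to caller, then restore the old value."""
    token = that.set(caller)
    try:
        return func(*args)
    finally:
        that.reset(token)
```

Calling call_as("http://jim.smith.name/", library.borrow, "Some Book") succeeds if Jim is in good standing, while calling library.borrow directly, with no that set, raises MustGetLibraryCardFirstError.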

But I nevertheless hear you asking: so if you think that is such a good idea, why didn’t we notice and introduce it right along with the this pointer in the 80’s?

There’s a good reason: back then, both the object requesting the service and the object providing the service were entirely under our control, in the same address space, and not very likely attempting to defraud each other. But now, in the distributed, wild wild west environment of the WWW, that assumption is most patently invalid. If I set up the equivalent of the library service on my website without checking for that, I could be certain that within 24 hours, somebody somewhere would have attempted to take advantage of it and defraud me. That’s why we require registration on important websites, and logins, and passwords, so we can get a hold of who that that guy is, and deal with him appropriately.

What we don’t have on the web today is a unifying mechanism for dealing with the that object, but we need one, just like software development started stalling without the this pointer back then.

Turns out that in LID, we have those funny URL arguments all over the place:

   http://some.url/somewhere?...&clientid=http://jim.smith.name/&credtype=...&credential=...

clientid might not have been the best choice of terms, but do you recognize it? It is the that pointer. A pointer to the entity that wishes the HTTP GET service performed on the http://some.url/somewhere resource. Here it is Jim Smith, and he provided his LID URL as the that pointer to the service.
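
To make this concrete, here is a rough sketch of how a service might pull the that pointer out of an incoming request URL, using only Python's standard library. The function name extract_that and the returned dictionary layout are assumptions made for this example; the argument names clientid, credtype and credential are the ones shown in the URL above. The sketch only extracts the values and does not check the credential.

```python
from urllib.parse import urlsplit, parse_qs


def extract_that(url):
    """Return the LID arguments of a request URL, or None if it is anonymous."""
    query = parse_qs(urlsplit(url).query)
    client_ids = query.get("clientid")
    if not client_ids:
        # No clientid argument: the request has an empty "that" pointer.
        return None
    return {
        "clientid": client_ids[0],
        "credtype": query.get("credtype", [None])[0],
        "credential": query.get("credential", [None])[0],
    }
```

A request without a clientid argument simply yields None, which corresponds to an anonymous caller.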

In order to recognize lying, we provide two additional arguments, by which the that object can make a credible statement (using a digital signature, for example) why it is indeed them, and not a fraudster pretending to be them. But that’s the security icing on the cake.

Admittedly, I did not realize at the beginning of LID that we were introducing a that pointer (shame on me, because I had been joking about a that pointer for years). But now that we have recognized that, you will undoubtedly see just how powerful such an authenticated that pointer is:

Using this extremely simple mechanism, we have now a way of uniformly identifying the client for any web request, HTTP GET, POST or otherwise. Everybody can implement this! And it’s useful immediately, if initially for nothing else than to replace internal unique identifiers in a user registration database with URLs. You get many other features for free: such as invoking operations on the that object … and in LID we have barely scratched the surface with what those operations could be (SSO, authenticated messaging, contact management, social networking …)

Somebody suggested to me that the that pointer should be part of the HTTP core protocol, and on technical grounds, it’s hard to disagree (political and practical grounds are of course a different matter). But regardless of how it is implemented, it just makes a lot of sense to me to start using this pattern all over the place, even if many requests have an empty that pointer because anonymity is desired. Why should it be any more complicated??

And yes, it is very REST-ful.

Dan Lyke thinks OpenID is a subset of LID

He says:

Some folks over at LiveJournal have come up with a single sign-on system called OpenID. It’s a lot like a subset of LID, and I’m not terribly excited about it because it seems to be a big step backwards in many ways,

To which I might add: it also seems more complex to implement, which, together, I would consider an unattractive proposition.

SSO in the Age of REST

How does single-sign-on (SSO) look for a representational state transfer (REST) architecture? I began asking myself that question a couple of months ago when people started pointing out to me that LID was very REST-ful.

The SSO idea is typically defined in terms of "applications": sign on once, use many applications without having to sign on again. Unfortunately, REST doesn’t know the concept of an application, only the concept of a resource (i.e. the thing behind a URL). So does or doesn’t SSO make sense in a REST architecture? This question has puzzled me for a while, but I now think there’s a very simple and elegant answer:

The scope of SSO in a REST architecture is not the application, but the resource.

In other words, instead of introducing an artificial concept of an application into a REST architecture (thereby breaking the idea of REST), we simply do SSO on a per-URL basis.

That, of course, turns out to be a bit of a challenge. Most applications’ sign-on functionality uses at least one or two additional screens (and thus URLs): "enter your username here" and "you have been logged on successfully" and stuff like that. But if the scope of SSO is a single URL, how can one reasonably do that?

LID to the rescue ;-) and let me explain how. But before I do that, you should know that we also implemented this scheme and updated our FirstSSO example website accordingly. So you can try it out right there.

Here are the ingredients that we need:

  • One or more resources that need to be access protected, identified by their URLs.
  • A piece of code that protects access to one or more resources, and that can check whether or not a client is authenticated, and authorized to access this resource (let’s call it the Access Control Script). Given that this is REST, you can think of it as using one Access Control Script per URL, although of course a website with multiple protected resources may want to optimize that and use only one script for many resources. But it is essential to understand that it could be as many scripts as resources, without any further change, otherwise it wouldn’t be REST and SSO but something non-REST-ful.
  • A client that has a LID identity, using a standard browser, no extensions or browser plugins required.
  • A lid cookie and a session cookie.

Here is what happens.

  1. You type in the URL of one of the protected resources into your browser.
  2. The Access Control Script attempts to determine who you are. To do that, it looks at the lid and session cookies, as well as any URL arguments that sign the URL (using the LID clientid, credtype and credential arguments). Given that this is your first visit to the protected resource, none are present.
  3. Given that your identity could not be verified (yet), the Access Control Script does not return the resource, but an HTTP 401 Unauthorized error code, with the WWW-Authenticate field in the HTTP header containing the term LID. The body of the returned document should contain a text field that allows you to enter your LID URL.
  4. You enter your LID URL in the returned error document, and submit it to an auxiliary script that redirects you to your own LID, just like in the case of the LID SSO scenario that is defined in the white paper. (Note that from an architectural standpoint, that script is not strictly necessary. It is only there as a user convenience that makes it easy for you to generate signed URLs.)
  5. You authenticate against your own LID management software, and are redirected to the URL of the protected resource, but now with a signature on your URL.
  6. The Access Control Script now can determine who you are by looking at the clientid, credtype and credential arguments to the URL of the protected resource.
  7. The Access Control Script stores your LID URL in the lid cookie, so you do not have to enter your LID URL again when accessing the same resource (it’s up to the site to decide whether or not to use the same cookie for more than one resource; even if it does, that does not change anything). It also issues a session cookie that allows the Access Control Script to skip using the LID SSO protocol for some period of time until the session expires (that’s entirely an optimization step, it would work just fine without the session cookie — except for one thing: it turns out many browsers don’t follow HTTP redirect for included images using the IMG HTML tag, and so we do need the cookie here).
  8. The Access Control Script checks whether or not you (now identified through your LID) may or may not access the resource, using whatever mechanism it desires, such as access control lists. If you are allowed to access the resource, it will return the resource to you.
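
The decision logic in the steps above can be sketched roughly as follows. This is a simplified illustration, not the code behind the FirstSSO example site. The Response class, the in-memory sessions dictionary, the acl set and the verify_credential callback are all hypothetical stand-ins, the actual LID credential check is not shown, and returning 403 to an authenticated but unauthorized client is my assumption, since the steps only describe the success case.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    cookies: dict = field(default_factory=dict)
    body: str = ""


def access_control(args, cookies, sessions, acl, verify_credential, resource):
    """Decide what to return for one request to one protected resource.

    args: URL arguments of the request; cookies: cookies sent by the browser;
    sessions: session id -> LID URL (assumed server-side store);
    acl: LID URLs allowed to access this resource;
    verify_credential: stand-in for the real LID credential check.
    """
    lid = None
    new_cookies = {}
    session_id = cookies.get("session")
    if session_id in sessions:
        # A live session lets us skip the LID SSO round trip (step 7).
        lid = sessions[session_id]
    elif "clientid" in args and verify_credential(
        args["clientid"], args.get("credtype"), args.get("credential")
    ):
        # A signed URL identifies the client (step 6).
        lid = args["clientid"]
        session_id = secrets.token_hex(16)
        sessions[session_id] = lid
        new_cookies = {"lid": lid, "session": session_id}
    if lid is None:
        # Identity unknown: ask for a LID URL (step 3).
        return Response(401, {"WWW-Authenticate": "LID"}, body="Enter your LID URL")
    if lid not in acl:
        # Assumption: authenticated but not authorized yields 403.
        return Response(403, body="Forbidden")
    # Authorized: return the resource itself (step 8).
    return Response(200, cookies=new_cookies, body=resource)
```

Because the function takes the resource and its access control list as arguments, the same logic can sit in front of one URL or many.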

Isn’t this beautiful? SSO on a per-URL basis, with an optimization (using session cookies) that makes this reasonably efficient. Among many other things, that allows you to bookmark URLs to protected resources, and still use SSO to access them.

I’m also beginning to realize that a case could be made to put some of the core LID functionality (such as signing requests and specifying a clientid) right into HTTP. Of course, there would be very substantial practical difficulties in doing so, but architecturally, I really like the idea that this would be quite straightforward. It indicates that we are on the right track with all of this.

What is Microsoft InfoCard?

Slightly updated May 18

If you Google Microsoft’s InfoCard, you mostly seem to find people asking "who can tell me more about InfoCard", but very few actual answers. Here is what I’ve learned from public statements by Kim Cameron and other Microsoft people, and the public demo they did at Digital Identity World 2005. (Disclaimer: I may be wrong about some things, as I don’t work for Microsoft. Also, I believe all of the information here is public. If I’m wrong on either count, please do let me know.)

To understand InfoCard, you need to understand Kim Cameron, InfoCard’s architect. Kim is credited with being the inventor, or at least one of the inventors, of the concept of a meta-directory. A directory (as in corporate directory, LDAP, that kind of thing) is a special kind of database run by companies to manage information about their employees, such as their names, phone numbers, e-mail addresses, office locations, as well as computers, printers and sometimes access permissions to various applications or information. When companies started to deploy directories, very quickly multiple directories were found within the same company, and the question arose how those directories could be used together, because some directories would know information about some employees but not others, etc. The idea of a meta-directory is to have a piece of software that would appear just like any other directory, but that would pull its data from other directories. In other words: have your cake and eat it, too. Keep whatever directories you have, but make all their information appear in one place (coincidentally one of the core principles behind our NetMesh InfoGrid as well).

So when Kim decided to do something about digital identity, he used the same mindset that he used for the idea of a meta-directory, because he saw the same market conditions in this area: lots of incompatible digital identity systems, that prevent everybody from interacting with most other people — just like stovepipe directory systems would prevent one person from accessing a printer defined in another. In the identity space, not only do we have Microsoft Passport, Liberty Alliance, SXIP, Identity Commons, and our LID, but thousands, or maybe far more, home-grown account and user registration systems. In Kim’s view, while there may be advantages that one of those systems has versus others, the real problem is fragmentation of digital identity systems, just like fragmentation of directory systems back then. So the core idea for InfoCard is to be a meta-identity system, with the word "meta" meaning the same thing as it does in the term meta-directory system.

Another way of saying the same thing is by analogy with TCP/IP as the universal abstraction layer that abstracts away from things like Ethernet, but nevertheless depends on them. Using this analogy, we could think of InfoCard just like we do about TCP/IP (in relation to digital identity systems and Ethernet or WiFi, respectively).

Kim’s hope is that by having such an abstraction layer, such a big momma identity backplane (as Marc Canter puts it so memorably), we can get an explosion of identity-enabled new applications. And he adds another analogy: there was little innovation in graphics before there were common APIs that developers could use to talk to any graphics card, but then it exploded, we got graphical user interfaces and all of that. Without that common API, the next level of innovation simply wasn’t possible. He thinks that it will be the same with identity.

Before we get into the guts, let’s list some more of the assumptions behind InfoCard: (you should also read Kim’s Laws of Identity which I won’t cover here but which contain a lot more interesting assumptions)

  • Kim believes that it has to be an entirely open system. My understanding is that Microsoft will find a license (I also understand they have not settled on one, in fact Kim is looking for input), that allows anybody to create any part or all of InfoCard themselves. Unlike some earlier rumors, InfoCard does not seem to be released as open source itself, but admittedly, that would really have surprised me.
  • InfoCard is built entirely on the web services (WS-*) stack. Given that it is a very distributed system, this choice is understandable. Kim says that while not all WS technologies used in InfoCard have been blessed yet by suitable standards bodies, all of them are on the standards track already.
  • Because of the need to combat phishing and other attacks where outside stuff (web pages, viruses popping up application windows etc.) pretends to be something else to the user, InfoCard will be anchored pretty deeply inside the Windows OS in a secure process space.
  • The InfoCard — like a virtual credit card or membership card — metaphor is the central user interface metaphor.
  • InfoCard only defines the "framework" protocols between the InfoCard client-piece (the one inside Windows), an identity provider, and a relying party (e.g. a website that requires identifying information). Lots of parties can be an identity provider or a relying party using many (all?) of today’s identity systems which can plug into the InfoCard system by adding actual content into the defined messages.

Here is an example use case:

  1. An InfoCard-enabled user (e.g. one running the upcoming Windows Longhorn, or the downward-compatible release for XP) first signs up with one or more identity providers of their choice. That could be their ISP, their bank, a site like eBay, or Slashdot. This process is entirely outside of InfoCard, but of course the identity provider must support their part of the InfoCard protocol.
  2. The user visits an InfoCard-enabled relying website (such as an InfoCard-enabled Amazon) that requires certain identity information from the user, say, a shipping address. The website sends a web page which contains an HTML OBJECT tag, which triggers a DLL which invokes the InfoCard system.
  3. The InfoCard system determines which personal information is requested by the website, and matches it to the identities (i.e. InfoCards) that are in possession of the user. It then displays those InfoCards to the user that are applicable, such as: driver’s license (if the government was an InfoCard-enabled identity provider), or credit card from AMEX. Note that the InfoCard selector runs natively on the PC and is not downloaded.
  4. The user selects an InfoCard to use. The dialog shown takes over the entire Windows screen (similar to the Windows login / logout dialogs today) in order to reduce phishing. It would also be difficult for an attacker to bring up a screen that has the exact set of InfoCard pictures on it as the user owns, as the information about which cards the user has is stored securely in a secure area of Windows. As a result of the selection, the InfoCard process on the PC contacts the selected identity provider, and obtains essentially a signed XML document that contains the requested identity information. The signature comes from the identity provider.
  5. The InfoCard PC piece then forwards the obtained document to the relying party (the website).
  6. However, InfoCard does not describe the actual tokens flying around, thereby enabling other identity systems to plug in.
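
To illustrate the matching in step 3 above, here is a toy sketch of how a card selector might filter the user's cards down to those that can supply what the website asks for. It is purely illustrative and based only on my understanding as described here: it is not InfoCard's actual data model or API, and the class InfoCard, the function applicable_cards and the claim names are all invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoCard:
    name: str          # what the user sees, e.g. "Driver's license"
    provider: str      # hypothetical identity provider label
    claims: frozenset  # kinds of information this card can supply


def applicable_cards(cards, requested_claims):
    """Return the cards that can supply every requested piece of information."""
    requested = frozenset(requested_claims)
    return [card for card in cards if requested <= card.claims]
```

With a driver's license card offering a name and a shipping address, and a credit card offering a name and a card number, a website asking for a shipping address would be shown only the driver's license.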

In order to accomplish this, InfoCard employs:

  • SOAP
  • WS-Addressing
  • WS-MetadataExchange
  • WS-Policy
  • WS-Security
  • WS-SecurityPolicy
  • WS-Transfer
  • WS-Trust
  • XML Signature
  • XML Encryption
  • SAML
  • WS-Federation (unclear)

Does this make sense to you? It does to me … Feel free to post back or contact me if I’m wrong or incomplete or you have questions or …
