Monday, September 15, 2008

Pseudonymity would help

Kim Cameron writes of Google's failing to scope SAML assertions:

But according to the research done by the paper’s authors, the Google engineers “simplified” the protocol, perhaps hoping to make it “more efficient”? So they dropped the whole ID and scope “thing” out of the assertion. All that was signed was the client’s identity.

The result was that the relying party had no idea if the assertion was minted for it or for some other relying party. It was one-for-all and all-for-one at Google.

While I agree totally that the intended recipient should have been identified within an <AudienceRestriction> in the SAML assertion (which is how SAML expresses the intended scope of an assertion), the problem would have been moot if Google had used good pseudonymous identifiers for its users.

Pseudonymous identifiers are random identifiers that differ for each relying party (so my identity at relying party A might be 123 while my identity at relying party B might be 345). Good pseudonymous identifiers are large random values (so that they are unpredictable) and are never reused (so the same identifier never appears at two different relying parties, whether for the same user or for different users).
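To make that concrete, here is a minimal sketch of how an IdP might mint such identifiers. The `PseudonymStore` class, its method name, and the in-memory dictionaries are all hypothetical (not taken from SAML or from any product); the only real API used is Python's standard `secrets` module.

```python
import secrets


class PseudonymStore:
    """Hypothetical IdP-side table of per-relying-party identifiers."""

    def __init__(self):
        self._by_pair = {}    # (local_user, relying_party) -> pseudonym
        self._issued = set()  # every pseudonym ever handed out

    def pseudonym_for(self, local_user, relying_party):
        """Return the stable identifier for this user at this relying party."""
        key = (local_user, relying_party)
        if key not in self._by_pair:
            pseudonym = secrets.token_hex(16)  # 128 random bits, unpredictable
            # Never reuse a value, for this user or any other.
            while pseudonym in self._issued:
                pseudonym = secrets.token_hex(16)
            self._issued.add(pseudonym)
            self._by_pair[key] = pseudonym
        return self._by_pair[key]


store = PseudonymStore()
alice_at_a = store.pseudonym_for("alice", "relying-party-a")
alice_at_b = store.pseudonym_for("alice", "relying-party-b")
```

The same user gets a different value at each relying party, and asking again for the same pair returns the value already issued, so the relying party still sees a stable account key.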

The primary impetus behind pseudonymous identifiers is to prevent the use of the identifier as a correlation factor across multiple relying parties -- in contrast, a globally unique identifier would allow relying party A to ask relying party B about what user 123 did yesterday, whether or not the user was around. However, pseudonymous identifiers also provide the following benefits:

  • added security depth - an unknown user identifier adds another layer of security to the SSO system. In this case it would have protected the user accounts from attack: even if the assertion went to a different relying party, that party would have no user account with that specific identifier, so the assertion would be useless there.
  • easier integration of new partners - when integrating new partners, the identity systems of the partners may have different data structures for user identity. In the simplest case, a new relying party may store user identifiers as 32-bit integers while the IdP typically uses 128-bit random values. A system that supports good pseudonymous identifiers, and assumes that identifiers differ on each system, can handle this easily.
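The first benefit can be shown with a toy simulation. This is only a sketch under simplifying assumptions: the assertion is a plain dictionary with no signature and no audience, and `accept_assertion` is a hypothetical stand-in for a relying party that, like the flawed one described above, looks only at the subject.

```python
import secrets

# Each relying party keeps its own account table, keyed by its own pseudonym.
alice_at_a = secrets.token_hex(16)
alice_at_b = secrets.token_hex(16)
rp_a_accounts = {alice_at_a: "Alice's account at A"}
rp_b_accounts = {alice_at_b: "Alice's account at B"}


def accept_assertion(accounts, assertion):
    """Look up the account for the asserted subject; no audience check at all."""
    return accounts.get(assertion["subject"])


# An assertion minted for relying party A, carrying no audience restriction.
assertion_for_a = {"subject": alice_at_a}

at_a = accept_assertion(rp_a_accounts, assertion_for_a)  # finds Alice's account
at_b = accept_assertion(rp_b_accounts, assertion_for_a)  # None: no such identifier at B
```

Even though relying party B never checks who the assertion was minted for, replaying it there matches no account, because Alice is known at B under a different identifier.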

One might be concerned about how relying party A could invoke a service of relying party B when they are all using different identifiers (such as a Google relying party using Google Checkout). This is pretty simple. Typically, any such service invocation requires relying party A to get a security token for the user at relying party B. When that token is obtained, the issuer does the identity translation. Because the assertion is handed to relying party A, and relying party A should never learn the user's identifier at relying party B, SAML provides for encrypting the identifier in the assertion.
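A rough sketch of that translation step follows. The `TokenIssuer` class and its methods are hypothetical and do not correspond to any SAML, Liberty, or WS-* API. Signing and encryption are deliberately left out to keep the example short, so the subject appears here in the clear where a real issuer would encrypt it for relying party B.

```python
import secrets


class TokenIssuer:
    """Hypothetical issuer that knows each user's identifier at every relying party."""

    def __init__(self):
        self._to_user = {}    # (relying_party, pseudonym) -> internal user key
        self._to_pseudo = {}  # (internal user key, relying_party) -> pseudonym

    def register(self, user, relying_party):
        """Mint and record the user's pseudonym at one relying party."""
        pseudonym = secrets.token_hex(16)
        self._to_user[(relying_party, pseudonym)] = user
        self._to_pseudo[(user, relying_party)] = pseudonym
        return pseudonym

    def issue_token(self, requester, requester_pseudonym, target):
        """Translate the requester's identifier into the target's identifier."""
        user = self._to_user[(requester, requester_pseudonym)]
        return {
            "audience": target,
            # Assumption: shown unencrypted only for illustration. In a real
            # deployment this value would be encrypted so the requester cannot read it.
            "subject": self._to_pseudo[(user, target)],
        }


issuer = TokenIssuer()
alice_at_a = issuer.register("alice", "relying-party-a")
alice_at_b = issuer.register("alice", "relying-party-b")
token = issuer.issue_token("relying-party-a", alice_at_a, "relying-party-b")
```

Relying party A presents only the identifier it already knows, and the token that comes back names the user as relying party B knows her.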

Liberty ID-WSF provides several entities that offer this translation service, depending upon the topology of the deployment. The most common such service is the ID-WSF Discovery Service.

Similarly, in WS-*, the WS-Federation Pseudonym service is called out to perform the same translation service (and it is possible for a deployment of a WS-Trust STS to perform this translation internally during token generation).

I strongly recommend that any deployment of SSO, even within a single enterprise, make use of pseudonymous identifiers. They only strengthen the identity infrastructure.
