Identity Metasystems and Lessons from Building the Sovrin Foundation

I recently spoke with Riley Hughes of Trinsic on his Future of Identity podcast about the birth of Sovrin Foundation, its inevitable growing pains, self-sovereign identity, identity metasystems, and adoption. Give it a listen.

I'm grateful to Riley for having me on as a guest.


Zero Trust with Zero Data

We ID Everyone

Presenting your ID to buy beer is used so often as an example of how verifiable credentials work that it's become a cliche. Cliche or not, there's another aspect of using an ID to buy beer that I want to focus on: it's an excellent example of zero trust.

Zero Trust operates on a simple, yet powerful principle: "assume breach." In a world where network boundaries are increasingly porous and cyber threats are more evasive than ever, the Zero Trust model centers around the notion that no one, whether internal or external, should be inherently trusted. This approach mandates continuous verification, strict access controls, and micro-segmentation, ensuring that every user and device proves their legitimacy before gaining access to sensitive resources. If we assume breach, then the only strategy that can protect the corporate network, infrastructure, applications, and people is to authorize every access.
From Zero Trust
Referenced 2024-02-09T08:25:55-0500

The real world is full of zero trust examples. When we're controlling access to something in the physical world—beer, a movie, a boarding gate, points in a loyalty program, prescription drugs, and so on—we almost invariably use a zero trust model. We authorize every access. This isn't surprising: the physical world is remarkably decentralized, there aren't many natural boundaries to exploit, and artificial boundaries are expensive and inconvenient.

The other thing that's interesting about zero trust in the physical world is that authorization is also usually done using Zero Data. Zero data is a name StJohn Deakin gave to the concept of using data gathered just in time to make authorization and other decisions rather than relying on great stores of data. There are obvious security benefits from storing less data, but zero data also offers significantly greater convenience for people and organizations alike. To top all that off, it can save money by reducing the number of partner integrations (i.e., far fewer federations) and enable applications that have far greater scale.

Let's examine these benefits in the scenario I opened with. Imagine that instead of using a credential (e.g., a driver's license) to prove your age when buying beer, we ran convenience stores like a web app. Before you could shop, you'd have to register an account. And if you wanted to buy beer, the company would have to proof your identity to ensure you're over 21. Then, when you bought beer at the store, you'd log in so the system could use your stored attributes to ensure you were allowed to buy it.

This scenario is still zero trust, but not zero data. It's ludicrous to imagine anyone putting up with it, yet we do exactly that online every day. I don't know about you, but I'm comforted to know that every convenience store I visit doesn't have a store of all kinds of information about me in an account somewhere. Zero data means less stored data that can be exploited by hackers (or the companies we trust with it).

The benefit of scale is obvious as well. In a zero data, zero trust scenario we don't have to have long-term transactional relationships with every store, movie theater, restaurant, and barber shop we visit. They don't have to maintain federation relationships with numerous identity providers. There are places where the ability to scale zero trust really matters. For example, it's impossible for every hospital to have a relationship with every other hospital to authorize access for medical personnel who move or need temporary access. Similarly, airline personnel move between numerous airports and need access to various facilities at each.

Finally, the integration burden of zero trust with zero data is much lower. The convenience store selling beer doesn't need an integration with any other system to check your ID. The attributes are self-contained in a tamper-evident package with built-in biometric authentication. Even more important, no legal agreement or prior coordination is needed. A lower integration burden reduces the prerequisites for implementing zero trust.

How do we build zero data, zero trust systems? By using verifiable credentials to transfer attributes about their subject in a way that is decentralized and yet trustworthy. Zero data aligns our online existence more closely with our real-world interactions, fostering new methods of communication while decreasing the challenges and risks associated with amassing, storing, and utilizing vast amounts of data.
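
To make this concrete, here's a minimal Python sketch of a zero data, zero trust authorization check. The DIDs and keys are invented, and an HMAC stands in for real credential cryptography (a production system would use public-key signatures and predicate proofs), but the shape of the decision is the point: verify just in time, authorize, and store nothing.

    import hmac, hashlib, json

    # Issuer keys the verifier already trusts (illustrative; a real verifier
    # would resolve the issuer's DID to find its public key).
    TRUSTED_ISSUER_KEYS = {"did:example:dmv": b"issuer-demo-key"}

    def verify_presentation(presentation: dict) -> bool:
        """Zero trust: check every presentation. Zero data: store nothing."""
        key = TRUSTED_ISSUER_KEYS.get(presentation["issuer"])
        if key is None:
            return False  # unknown issuer; deny by default
        expected = hmac.new(key,
                            json.dumps(presentation["claims"], sort_keys=True).encode(),
                            hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, presentation["proof"])

    def authorize_beer_sale(presentation: dict) -> bool:
        # The decision uses just-in-time data; no account, no stored attributes.
        return (verify_presentation(presentation)
                and presentation["claims"].get("age_over_21") is True)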

Just-in-time, zero data, attribute transfer can make many zero trust scenarios more realizable because it's more flexible. Zero trust with zero data, facilitated by verifiable credentials, represents a pivotal transition in how digital identity is used in authorization decisions. By minimizing centralized data storage and emphasizing cryptographic verifiability, this approach aims to address the prevalent challenges in data management, security, and user trust. By allowing online interactions to more faithfully follow established patterns of transferring trust from the physical world, zero trust with zero data promotes better security with increased convenience and lower cost. What's not to like?


Photo Credit: We ID Everyone from DALL-E (Public Domain) DALL-E apparently thinks a six-pack has 8 bottles but this was the best of several attempts. Here's the prompt: Produce a photo-realistic image of a convenience store clerk. She's behind the counter and there's a six pack of beer on the counter. Behind her, clearly visible, is a sign that says "We I.D. Everyone" .


Acceptance Networks for Self-Sovereign Identity

Data flowing on a network

When I hand a merchant in London a piece of plastic that I got from a bank in Utah to make a purchase, a tiny miracle happens. Despite the fact that the merchant has never met me before and has no knowledge of my bank, she blithely allows me to walk out of the store with hundreds of dollars of merchandise, confident that she will receive payment. I emphasized the word confident in the last sentence because it's core to understanding what's happened. In the past, these kinds of transactions required that the merchant trust me or my bank. But in the modern world, trust has been replaced by confidence.

We often mix these concepts up and I'm as guilty as anyone. But trust always involves an element of risk, whereas confidence does not. These are not binary, but rather represent a spectrum. In the scenario I paint above, the merchant is still taking some risk, but it's very small. Technology, processes, and legal agreements have come together to squeeze out risk. The result is a financial system where the risk is so small that banks, merchants, and consumers alike have confidence that they will not be cheated. There's a name in the financial services industry for the network that reduces risk so that trust can be replaced with confidence: an acceptance network.

Acceptance Networks

An acceptance network is the network of merchants or service providers that accept a particular form of payment, usually credit or debit cards, from a particular issuer or payment network. The term refers to a broad ecosystem that facilitates these transactions, including point-of-sale terminals, online payment gateways, and other infrastructure. Each component of the acceptance network plays a crucial role in ensuring that transactions are processed efficiently, securely, and accurately. This drives out risk and increases confidence. Acceptance networks are foundational components of modern payment ecosystems and are essential to the seamless functioning of digital financial transactions. Visa, Mastercard, American Express, and Discover are all examples of acceptance networks.

Before the advent of acceptance networks, credit was a spotty thing, with each large merchant issuing its own proprietary credit card—good only at that merchant. My mom and dad had wallets full of cards for JC Penney, Sears, Chevron, Texaco, and so on. Sears trusted its card. Chevron trusted its card. But it was impossible to use a Chevron card at Sears: Sears had limited means to verify that the card was real and no way to clear the funds so that Chevron could pay Sears for the transaction.

That scenario is similar to the state of digital identity today. We have identity providers (IdPs) like Google and Apple who control a closed ecosystem of relying parties (with a lot of overlap). These relying parties trust these large IdPs to authenticate the people who use their services. They limit their risk by only using IdPs they're familiar with and only accepting the (usually) self-asserted attributes from the IdP that don't involve much risk. Beyond that they must verify everything themselves.

Fixing this requires the equivalent of an acceptance network for digital identity. When we launched Sovrin Foundation and the Sovrin network1 in 2016, we were building an acceptance network for digital identity, even though we didn't use that term to describe it. Our goal was to create a system of protocols, processes, technology, and governance that would reduce the risk of self-sovereign identity and increase confidence in an identity system that lets subjects present verifiable credentials carrying reliable attributes from many sources.

I've written previously about identity metasystems that provide a framework for how identity transactions happen. Individual identity systems are built according to the architecture and protocols of the metasystem. Acceptance networks are an instantiation of the metasystem for a particular set of users and types of transactions. A metasystem for self-sovereign identity might have several acceptance networks operating in it to facilitate the operation of specific identity systems.

Problems an Acceptance Network Can Solve

To understand why an acceptance network is necessary to reduce risk and increase confidence in identity transactions, let's explore the gaps that exist without it. The following diagram shows the now familiar triangle of verifiable credential exchange. In this figure, issuers issue credentials to holders, who may or may not be the subjects of the credentials. The holder presents cryptographic proofs that assert the values of relevant attributes using one or more of the credentials they hold. The verifier verifies the proof and uses the attributes.

Verifiable Credential Exchange

Let's explore what it means for the verifier to verify the proof. The verifier wants to know a number of things about the credential presentation:

  1. Were the credentials issued to the entity making the presentation?
  2. Have any of the credentials been tampered with?
  3. Have any of the credentials been revoked?
  4. What are the schema for the credentials (to understand the data in them)?
  5. Who issued the credentials in the proof?

The first four of these can be done cryptographically to provide confidence in the attestation. The technology behind the credential presentation is all that's necessary. They can be automated as part of the exchange. For example, the proof can contain pointers (e.g., DIDs) to the credential definitions. These could contain public keys for the credential and references to schema.
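
Here's a rough Python sketch of how a verifier might automate checks 2 through 4. The resolver, keys, and field names are all invented for illustration, and check 1 (that the credentials were issued to the presenter) is elided; real exchange protocols handle that with a holder-binding proof.

    import hmac, hashlib, json

    # A mock resolver: in practice the proof's DID pointer would be
    # dereferenced to a credential definition with keys and schema references.
    CREDENTIAL_DEFINITIONS = {
        "did:example:univ#cred-def-1": {
            "public_key": b"univ-demo-key",
            "schema": "did:example:schemas/transcript-v1",
            "revoked_serials": {"1042"},
        }
    }

    def check_presentation(proof: dict) -> bool:
        cred_def = CREDENTIAL_DEFINITIONS.get(proof["credential_definition"])
        if cred_def is None:
            return False
        # Check 2: recompute the signature to detect tampering.
        expected = hmac.new(cred_def["public_key"],
                            json.dumps(proof["attributes"], sort_keys=True).encode(),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, proof["signature"]):
            return False
        # Check 3: consult the issuer's published revocation list.
        if proof["serial"] in cred_def["revoked_serials"]:
            return False
        # Check 4: confirm the schema so the attributes can be interpreted.
        return proof["schema"] == cred_def["schema"]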

The last one—who issued the credential—is not a technical matter. To see why, imagine that Alice (as holder and subject) has been issued a credential from her university (the issuer) giving information about her educational experiences there. She's applying for a job and wants to present the credential to a prospective employer (the verifier). How does the employer know that Alice didn't just make the credential herself or buy it from a diploma mill?

Knowing who issued the credential is not something that can be done solely with technology (although it can help). The employer in this scenario wants more than an identifier for the issuer. And they want to know that the public key really does belong to the university. In short, the employer wants to resolve the identifier to other information that tells them something about the university and the credential. There are lots of ways to do that—people have been doing this sort of thing for centuries: states keep registries of businesses (universities are businesses), accreditation organizations keep registries of schools they've accredited, the Department of Education has registries of various institutions of higher education in the US, and so on.

The employer could make use of these by building its own database of university identifiers it trusts. And every time a new one shows up, they could investigate and add it to their registry (or not)2. But going back to the magic of the credit card scenario that I opened this article with, if every merchant had to keep their own registry of banks, the experience wouldn't be magical for me or the merchant. The financial acceptance network makes it easy for the merchant to have confidence that they'll be paid because they have not only technology, but processes, protocols, governance, and legal agreements that make the verification process automatable.

Acceptance Networks for Digital Identity

For some use cases, keeping your own registry of the issuers you trust works. But for many, it's just too much work and makes it difficult to use a variety of credentials. This kind of "localized trust" is unwieldy in an identity system that might involve millions of issuers and identifiers and credentials for billions or even trillions of subjects. I've written extensively about identity metasystems and what they provide to help bridge the gap. This one, on how metasystems help provide life-like identity for digital systems, is perhaps the most comprehensive. Acceptance networks implement metasystems.

An acceptance network for digital identity must have a number of important properties, including the following:

  1. Credentials are decentralized and contextual—There is no central authority for all credentials. Every party can be an issuer, a holder (identity owner), or a verifier. Verifiable credentials can be adapted to any country, any industry, any community, or any set of trust relationships.

  2. Credential issuers decide on what data is contained in their credentials—Anyone can create a credential schema for their use case. Anyone can create a credential definition based on any of these schemas.

  3. Verifiers make their own trust decisions about which credentials to accept—There's no central authority who determines which credentials are important or which are used for what purpose. The acceptance network supplies the technical underpinnings for credential exchange and supports protocols for automating the verification of credential issuers.

  4. Credential verifiers don't need to have any specific technical, contractual, or commercial relationship with credential issuers—Verifiers do not need to contact issuers to perform verification.

  5. Credential holders are free to choose which credentials to carry and what information to disclose—People and organizations are in control of the credentials they hold (just as they are with physical credentials) and determine what to share with whom.

You may be thinking "but these are mostly about decentralized decision making." While it would be easier to imagine the acceptance network as a big directory, that solution can't possibly support all the different ways people and organizations might want to use credentials. That doesn't mean an acceptance network couldn't be run by a single organization, like some financial services networks. It just has to support a variety of credential ecosystems running common protocols. I also think that there will be more than one and that most issuers and verifiers will be part of several (again, like in financial services).

Structure of an Acceptance Network

One of the things we can take away from the architecture of financial services acceptance networks is that they are built in layers. No one has thought more about how this can work than Drummond Reed and the Trust Over IP Foundation (ToIP).3 This figure, from ToIP, shows how such a stack works.

Trust Over IP Stack

The layers build on each other, each providing something the layer below doesn't. Layer 1 is the foundational functionality, like DID methods. Layer 2 builds on that to support creating digital relationships with anyone. Layer 3 uses those relationships to effect credential exchange. Layer 4 is the ecosystems that say things about the issuers for different use cases. The dual stack emphasizes the need for governance at every layer.

The acceptance network specifies the accepted protocols and technologies, supports ecosystems by providing governance models and technology, and is involved at each layer. Here are some examples of things an acceptance network might do at each layer (a sketch of a hypothetical network profile follows the list):

  • Layer 1—limit the allowed DID methods and certify them.
  • Layer 2—require that wallets and agents using the network support specific versions of the DIDComm protocol. Provide a certification framework for wallet and agent vendors for security and interoperability.
  • Layer 3—require specific versions of the exchange protocols. Participate in protocol development. Provide a certification framework for specific implementations to aid with security and interoperability.
  • Layer 4—support the formation, certification, and discovery of credential ecosystem providers. Govern what is required to be a certified ecosystem provider and provide models for acceptable ecosystem governance.
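
For instance, a network's published profile might look something like this sketch (every field and value here is hypothetical):

    # A hypothetical governance profile an acceptance network might publish,
    # one entry per layer of the stack:
    NETWORK_PROFILE = {
        "layer1": {"allowed_did_methods": ["did:web", "did:peer"]},
        "layer2": {"didcomm_version": ">=2.0",
                   "wallet_certification": "required"},
        "layer3": {"exchange_protocols": ["issue-credential/3.0",
                                          "present-proof/3.0"]},
        "layer4": {"ecosystem_certification": "required",
                   "governance_framework": "published"},
    }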

As part of its overall governance of the ecosystem, the acceptance network also provides model legal agreements for and between the various participants, manages trust mark rights (think of the Visa logo), and drives a uniform user experience.

The following diagram shows the credential exchange from the preceding figure with an acceptance network providing support to the verifier so that it can have confidence in the data the issuer has supplied through the holder.

Acceptance Network in Operation

Credential issuers who know their credential might be widely used would join one or more acceptance networks. They agree to follow the rules and regulations in the governance framework of the acceptance network. The acceptance network issues a credential to them that they can use to prove they are a member.4 The acceptance network maintains a registry—likely a registry of registries—that verifiers can use to discover information about the issuer of a credential that has been presented to them.
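
Here's a sketch of what the verifier-side lookup might look like, with the registry of registries modeled as simple data. The structure, names, and identifiers are all hypothetical; a real network would expose this through a protocol rather than a dictionary.

    ACCEPTANCE_NETWORK = {
        # Ecosystem certifiers the network has admitted as members.
        "certifiers": {"did:example:nwccu": {"status": "certified"}},
        # Each certifier maintains its own registry of issuers it vouches for.
        "registries": {
            "did:example:nwccu": {
                "did:example:alice-university": {"accredited_until": "2026-06-30"},
            },
        },
    }

    def lookup_issuer(issuer_did: str) -> dict | None:
        """Return what the acceptance network knows about an unknown issuer."""
        for certifier, registry in ACCEPTANCE_NETWORK["registries"].items():
            status = ACCEPTANCE_NETWORK["certifiers"][certifier]["status"]
            if status == "certified" and issuer_did in registry:
                return {"certifier": certifier, **registry[issuer_did]}
        return None  # no certified ecosystem vouches for this issuer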

Using an Acceptance Network

Returning to our previous scenario, Alice holds a credential issued by her university. She presents it to a prospective employer who wants to know that the credential is from an accredited university. Alice's university has been accredited by an accreditation organization5. The accreditor has followed its process for accrediting Alice's university, issued it a credential, and added the university to its registry. The university and the accrediting organization are members of an acceptance network. The employer's systems know to automatically query the acceptance network when they receive a credential proof from an issuer they do not know. Doing so provides assurance that the issuer is legitimate. It could also provide information about the accreditation status of the university. This information reduces the risk that the employer would otherwise bear.

In this scenario, the employer is trusting the processes and structure of the acceptance network. The employer must decide which acceptance networks to use. This is much more scalable than having to make these determinations for every credential issuer. The acceptance network has allowed the verification process to scale and made the overall use of verifiable credentials easier and less risky.

A Note on Implementation

This discussion of acceptance networks has undoubtedly brought images to your mind about how one might be structured or built. The comparison to financial services acceptance networks points to a network run by an organization. And the term registry brings to mind a database of some kind. While these are certainly possibilities, I think it's also possible to imagine more decentralized solutions. For example, the registry could be a distributed ledger or blockchain. The governance is likely most easily done by an organization, but there are other options like a decentralized autonomous organization (DAO). The scenario I described above illustrates a federated system where certifying authorities for specific ecosystems determine their own methods, processes, and requirements, but link their registries to that of the acceptance network.

Conclusion

As I mentioned above, we've been solving the problem of how to know which institutions to trust for centuries. We have ways of knowing whether a university is accredited, whether a bank is real, and whether a company is actually registered and what its reputation is. What has been missing is an easy way to make use of this information digitally so that processes for reducing risk can be automated. Acceptance networks rationalize the process and provide the needed tooling to automate these checks. They replace the many-to-many problem that exists when each verifier has to determine whether to trust each issuer with a more scalable many-to-several arrangement. Acceptance networks allow credential presentation to scale by providing the needed infrastructure for giving verifiers confidence in the facts that holders present to them.


Notes

  1. You can see in the linked post how we used trust to describe what we were building, even as we were reducing risk and inspiring confidence.
  2. Note that this investigation could make use of technology. Knowing the university's name, the employer could look up a well-known location on the university's web site to find the identifier. They could use PKI (digital certificates) to be sure they're talking to the right place. They could look up the university in an online registry of accredited universities.
  3. Trust over IP isn't the only one working on this. Marie Wallace of Accenture and Stephen Wilson of Lockstep Partners have been writing about this idea.
  4. Note that there could be different levels or types of members who perform different roles in the ecosystem and make different agreements.
  5. An example is the Northwest Commission on Colleges and Universities.

Photo Credit: Data flowing over networks from DALL-E


SSI is the Key to Claiming Ownership in an AI-Enabled World

Personal Agents and AI

I've been trying to be intentional about using generative AI for more and more tasks in my life. For example, the image above is generated by DALL-E. I think generative AI is going to upend almost everything we do online, and I'm not alone. One of the places it will have the greatest impact is its use in personal agents, determining whether or not these agents enable people to lead effective online lives.

Jamie Smith recently wrote a great article in Customer Futures about the kind of AI-enabled personal agents we should be building. As Jamie points out: "Digital identity [is how we] prove who we are to others."​ This statement is particularly resonant as we consider not just the role of digital identities in enhancing personal agents, but also their crucial function in asserting ownership of our creations in an AI-dominated landscape.

Personal agents, empowered by AI, will be integral to our digital interactions, managing tasks and providing personalized experiences. As Bill Gates says, AI is about to completely change how you use computers. The key to the effectiveness of these personal agents lies in the robust digital identities they leverage. These identities are not just tools for authentication; they're pivotal in distinguishing our human-generated creations from those produced by AI.

In creative fields, for instance, the ability to prove ownership of one's work becomes increasingly vital as AI-generated content proliferates. A strong digital identity enables creators to unequivocally claim their work, ensuring that the nuances of human creativity are not lost in the tide of AI efficiency. Moreover, in sectors like healthcare and finance, where personal agents are entrusted with sensitive tasks, a trustworthy, robust, self-sovereign identity ensures that these agents act in harmony with our real-world selves, maintaining the integrity and privacy of our personal data.

In this AI-centric era, proving authorship through digital identity becomes not just a matter of pride but a shield against the rising tide of AI-generated fakes. As artificial intelligence becomes more adept at creating content—from written articles to artwork—the line between human-generated and AI-generated creations blurs. A robust, owner-controlled digital identity acts as a bastion, enabling creators to assert their authorship and differentiate their genuine work from AI-generated counterparts. This is crucial in combating the proliferation of deepfakes and other AI-generated misinformation, ensuring the authenticity of content and safeguarding the integrity of our digital interactions. In essence, our digital identity becomes a critical tool in maintaining the authenticity and trustworthiness of the digital ecosystem, protecting not just our intellectual property but the very fabric of truth in our digital world.

A good place to start is with our "humanness." On an earlier version of this post, Timo Hotti commented that the most important use of verifiable credentials might be to prove that you're human. There are a number of ways to do this. Proof that I have a driver's license, at least for now, proves that I'm human. Bank credentials could also be used to prove humanness because of know-your-customer (KYC) regulations. I suggested something like this a year or so ago as a way of cutting down on bots on Twitter.

As we embrace this new digital frontier, the focus must not only be on the convenience and capabilities of AI-driven agents but also on fortifying our digital identities so that your personal agent is controlled by you. Jamie ends his post with five key questions that we shouldn't lose sight of:

  1. Who does the digital assistant belong to?
  2. How will our personal agents be funded?
  3. What will personal agents do tomorrow that we can’t already do today?
  4. Will my personal agent do things WITH me and FOR me, or TO me?
  5. Which brands will be trusted to offer personal agents?

Your digital identity is your anchor in the digital realm, asserting your ownership, preserving your uniqueness, and fostering trust in an increasingly automated world, helping you operationalize your digital relationships. The future beckons with the promise of AI, but it's our digital identity that will define our place in it.


dApps Are About Control, Not Blockchains

Control boundary for a dApp

I recently read Igor Shadurin's article Dive Into dApps. In it, he defines a dApp (or decentralized application):

The commonly accepted definition of a dApp is, in short, an application that can operate autonomously using a distributed ledger system.
From Dive Into dApps
Referenced 2023-11-12T15:39:42-0500

I think that definition is too specific to blockchains. Blockchains are an implementation choice and there are other ways to solve the problem. That said, if you're looking to create a dApp with a smart contract, then Igor's article is a nice place to start.

Let's start with the goal and work backwards from there. The goal of a dApp is to give people control over their apps and the data in them. This is not how the internet works today. As I wrote in The CompuServe of Things, the web and mobile apps are almost exclusively built on a model of intervening administrative authorities. As the operators of hosted apps and controllers of the identity systems upon which they're founded, the administrators can, for any reason whatsoever, revoke your rights to the application and any data it contains. Worse, most use your data for their own purposes, often in ways that are not in your best interest.

dApps, in contrast, give you control of the data and merely operate against it. Since they don't host the data, they can run locally, at the edge. Using smart contracts on a blockchain is one way to do this, but there are others, including peer-to-peer networks and InterPlanetary File System (IPFS). The point is, to achieve their goal, dApps need a way to store data that the application can reliably and securely reference, but that a person, rather than the app provider, controls. The core requirement for achieving control is that the data service be run by a provider who is not an intermediary and that the data model be substitutable. Control requires meaningful choice among a group of interoperable providers who are substitutable and compete for the trust of their customers.
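
One way to see what substitutability demands of the code is that the dApp is written against a storage interface rather than a particular provider. Here's a minimal Python sketch; the interface and names are mine for illustration, not any particular project's API:

    from abc import ABC, abstractmethod

    class PersonalDataStore(ABC):
        """What a dApp needs from storage: a stable, owner-controlled interface.
        Any provider implementing it is substitutable."""
        @abstractmethod
        def get(self, key: str) -> bytes | None: ...
        @abstractmethod
        def put(self, key: str, value: bytes) -> None: ...

    class InMemoryStore(PersonalDataStore):
        """One possible backend; a pico- or IPFS-backed store would be another."""
        def __init__(self):
            self._data: dict[str, bytes] = {}
        def get(self, key):
            return self._data.get(key)
        def put(self, key, value):
            self._data[key] = value

    # The app codes to the interface, so the owner can move their data to a
    # different, interoperable provider without changing the application.
    def record_trip(store: PersonalDataStore, trip_id: str, miles: float):
        store.put(f"trips/{trip_id}", str(miles).encode())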

I started writing about this idea back in 2012 and called it the Personal Cloud Application Architecture. At the time the idea of personal clouds had a lot of traction and a number of supporters. We built a demonstration app called Forever and later, I based the Fuse connected car application on this idea: let people control and use the data from their cars without an intermediary. Fuse's technical success showed the efficacy of the idea at scale. Fuse had a mobile app and felt like any other connected car application, but underneath the covers, the architecture gave control of the data to the car's owner. Dave Winer has also developed applications that use substitutable backend storage based on Node.

Regular readers will wonder how I made it this far without mentioning picos. Forever and Fuse were both based on picos. Picos are designed to be self-hosted or hosted by providers who are substitutable. I've got a couple of projects teed up for two groups of students this winter that will further extend the suitability of picos as backends for dApps:

  • Support for Hosting Picos—the root pico in any instance of the pico engine is the ancestor of all picos in that engine and thus has ultimate control over them. To date, we've used the ability to stand up a new engine and control access to it as the means of providing control for the owner. This project will allow a hosting provider to easily stand up a new instance of the engine and its root pico. For this to be viable, we'll use the support for peer DIDs my students built into the engine last year to give owners a peer DID connection to the root pico on their instance of the engine and thus give them control over the root pico and all its descendants.
  • Support for Solid Pods—at IIW this past October, we had a few sessions on how picos could be linked to Solid pods. This project will marry a pod to each pico that gets created and link their lifecycles. This, combined with their support for peer DIDs, makes the pico and its data movable between engines, supporting substitutability.

If I thought I had the bandwidth to support a third group, I'd have them work on building dApps and an App Store to run on top of this. Making that work has a few other fun technical challenges. We've done this before. As I said, Forever and Fuse were both essentially dApps. Manifold, a re-creation of SquareTag, is a large dApp for the Internet of Things that supports dApplets (is that a thing?) for each thing you store in it. What makes it a dApp is that the data is all in picos that could be hosted anywhere...at least in theory. Making that less theoretical is the next big step. Bruce Conrad has some ideas around this that he calls the Pico Labs Affiliate Network.

I think the work of supporting dApps and personal control of our data is vitally important. As I wrote in 2014:

On the Net today we face a choice between freedom and captivity, independence and dependence. How we build the Internet of Things has far-reaching consequences for the humans who will use—or be used by—it. Will we push forward, connecting things using forests of silos that are reminiscent of the online services of the 1980's, or will we learn the lessons of the Internet and build a true Internet of Things?
From The CompuServe of Things
Referenced 2023-11-12T17:15:48-0500

The choice is ours. We can build the world we want to live in.


Permissionless and One-to-One

Mixtape

In a recent post, Clive Thompson speaks of the humble cassette tape as a medium that had "a weirdly Internet-like vibe". Clive is focusing on how the cassette tape unlocked creativity, but in doing so he describes its properties in a way that is helpful to discussions about online relationships in general.

Clive doesn't speak about cassette tapes being decentralized. In fact, I chuckle as I write that down. Instead he's focused on some core properties. Two I found the most interesting were that cassette tapes allowed one-to-one exchange of music and that they were permissionless. He says:

If you wanted to record a cassette, you didn’t need anyone’s permission.

This was a quietly radical thing, back when cassette recorders first emerged. Many other forms of audio or moving-image media required a lot of capital infrastructure: If you wanted to broadcast a TV show, you needed a studio and broadcasting equipment; the same goes for a radio show or film, or producing and distributing an album. And your audience needed an entirely different set of technologies (televisions, radios, projectors, record players) to receive your messages.

From The Empowering Style of Cassette Tapes
Referenced 2023-11-02T08:01:46-0400

The thing that struck me on reading this was the idea that symmetric technology democratizes speech. The web is based on asymmetric technology: client-server. In theory everyone can have a server, but they don't for a lot of reasons, including cost, difficulty, and friction. Consequently, the web is dominated by a few large players who act as intervening administrative authorities. They decide what happens online and who can participate. The web is not one-to-one and it is decidedly not permissionless.

In contrast, the DIDComm protocol is symmetric, so it fosters one-to-one interactions that provide meaningful, life-like online relationships. DIDComm supports autonomic identity systems that provide a foundation for one-to-one, permissionless interactions. Like the cassette tape, DIDComm is a democratizing technology.


Photo Credit: Mix Tape from Andreanna Moya Photography (CC BY-NC-ND 2.0 DEED)


Cloudless: Computing at the Edge

Cloudless Sunset

Doc Searls sent me a link to this piece from Chris Anderson on cloudless computing. Like the term zero data that I wrote about a few weeks ago, cloudless computing is a great name that captures an idea that is profound.

Cloudless computing uses cryptographic identifiers, verifiable data, and location-independent compute1 to move apps to the data wherever it lives, to perform whatever computation needs to be done, at the edge. The genius of the name cloudless computing is that it gets us out of the trenches of dapps, web3, blockchain, and other specific implementations and speaks to an idea or concept. The abstractions can make it difficult to get a firm hold on the ideas, but they're important for getting past the how so we can speak to the what and why.

You'd be rightly skeptical that any of this can happen. Why will companies move from the proven cloud model to something else? In this talk, Peter Levine speaks specifically to that question.

One of the core arguments for why more and more computing will move to the edge is the sheer size of modern computing problems. Consider one example: Tesla Full Self Driving (FSD). I happen to be a Tesla owner and I bought FSD. At first it was just because I am very curious about it and couldn't stand to not have first-hand experience with it. But now, I like it so much I use it all the time and can't imagine driving without an AI assist. But that's beside the point. To understand why that drives computing to the edge, consider that the round trip time to get an answer from the cloud is just too great. The car needs to make decisions onboard for this to work. Essentially, to put this in the cloudless perspective, the computation has to move to where the data from the sensors is. You move the compute to the data, not the other way around.2

And that's just one example. Levine makes the point, as I and others have done, that the Internet of Things leads to trillions of nodes on the Internet. This is a difference in scale that has real impact on how we architect computer systems. While today's CompuServe of Things still relies largely on the cloud and centralized servers, that model can't last in a true Internet of Things.

The future world will be more decentralized than the current one. Not because of some grand ideal (although those certainly exist) but simply because the problems will force it to happen. We're using computers in more dynamic environments than the more static ones (like web applications) of the past. The data is too large to move and the required latency too low. Cloudless computing is the future.


Notes

  1. Anderson calls this a deterministic computer. He uses that name to describe computation that is consistent and predictable regardless of how the application gets to the data, but I'm not sure that's the core idea. Location independence feels better to me.
  2. An interesting point is that training the AI that drives the car is still done in the cloud somewhere. But once the model is built, it operates close to the data. I think this will be true for a lot of AI models.

Photo Credit: Cloudless Sunset from Dorothy Finley (CC BY 2.0 DEED - cropped)


Internet Identity Workshop 37 Report

IIW 37 Attendee Map

We recently completed the 37th Internet Identity Workshop. We had 315 people from around the world who called 163 sessions. The energy was high and I enjoyed seeing so many people who are working on identity talking with each other and sharing their ideas. The topics were diverse. Verifiable credentials continue to be a hot topic, but authorization is coming on strong. In closing circle someone said (paraphrasing) that authentication is solved and the next frontier is authorization. I tend to agree. We should have the book of proceedings completed in about a month and you'll be able to get the details of sessions there. You can view past Books of Proceedings here.

Heidi facilitating opening circle

As I said, there were attendees from all over the world, as you can see by the pins in the map at the top of this post. Not surprisingly, most of the attendees were from the US (212), followed by Canada (29). Japan, the UK, and Germany rounded out the top five with 9, 8, and 8 attendees respectively. India (5), Thailand (3), and Korea (3) showed IIW's reach in APAC. And there were 4 attendees from South America this time. Sadly, there were no attendees from Africa again. Please remember we offer scholarships for people from underrepresented areas, so if you'd like to come to IIW38, please let us know. If you're working on identity, we want you there.

Sam Curren discusses DIDComm

In terms of states and provinces, California was, unsurprisingly, first with 81. Washington (32), British Columbia (14), Utah (11), Ontario (11) and New York (10) rounded out the top five. Seattle (22), San Jose (15), Victoria (8), New York (8), and Mountain View (6) were the top cities.

Demo hour

Doc Searls has several hundred photos from Day 1, Day 2, and Day 3 of IIW on his Flickr account.

Demo hour

As always the week was great. I had a dozen important, interesting, and timely conversations. If Closing Circle and Open Gifting are any measure, I was not alone. IIW is where you will meet people to help you solve problems and move your ideas forward. Please come! IIW 38 will be held April 16-18, 2024 at the Computer History Museum. We'll have tickets available soon.


Photo Credits: Doc Searls


Zero Data

data minimization

A few weeks ago, I came across this 2020 article from StJohn Deakin about zero data. I've been thinking a lot about zero trust lately, so the name leapt out at me. I immediately knew what StJohn was speaking about because I've been talking about it too. My new book, Learning Digital Identity, discusses the concept. But the name—zero data—is simply brilliant. I want to dig into zero data in this post. I'll discuss the link between zero data and zero trust in a future post.

StJohn describes the idea like this:

Personal data should be held by humans first, and by the companies, organisations and governments that humans choose to interact with, second. The ultimate in ‘data minimisation’ is for the platforms in the centre to simply facilitate the interactions and not hold any data at all. This is the exact opposite of our Google/Facebook/Amazon dominated world where all human data is being concentrated in a few global silos.

A zero data society doesn’t mean that data isn’t shared between us, quite the opposite. With increased trust and participation, the data available and needed to drive our global society will explode exponentially.

From The Future of Data is ‘Zero Data’
Referenced 2023-09-30T17:37:15-0600

If you think about this in the context of how the internet has worked for the last three decades, the concept of zero data might seem baffling. Yet, consider a day in your life. How often do you establish lasting relationships—and thus share detailed information about yourself—with every individual or entity you come across? Almost never. It would be absurd to think that every time you grab a coffee from the local store, you'd need to form a lasting bond with the coffee machine, the cashier, the credit card terminal, and other customers just to facilitate your purchase. Instead, we exchange only the essential information required, and relevant parties retain just the data that is needed long term.

To build a zero data infrastructure we need to transfer trustworthy data just-in-time. Verifiable credentials (VCs) offer a way to represent information so that its authenticity can be verified through cryptographic means. They can be thought of as digital attestations or proofs that are created by an issuer about a subject and are presented by the holder to a verifier as required.
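
Here's a toy version of that flow in Python. An HMAC stands in for the issuer's signature, which would force the verifier to share a secret with the issuer; real VC systems use public-key signatures precisely so no such sharing is needed. All identifiers and keys are invented:

    import hmac, hashlib, json

    def sign(key: bytes, payload: dict) -> str:
        return hmac.new(key, json.dumps(payload, sort_keys=True).encode(),
                        hashlib.sha256).hexdigest()

    # Issuer: creates an attestation about a subject and hands it to the holder.
    issuer_key = b"bank-demo-key"
    credential = {"subject": "did:example:alice", "claims": {"kyc_passed": True}}
    credential["proof"] = sign(issuer_key, {"subject": credential["subject"],
                                            "claims": credential["claims"]})

    # Holder: keeps the credential and presents it when a verifier asks.
    presentation = credential  # a real wallet would add holder binding here

    # Verifier: checks the proof without ever calling the issuer.
    ok = hmac.compare_digest(
        presentation["proof"],
        sign(issuer_key, {"subject": presentation["subject"],
                          "claims": presentation["claims"]}))
    print("verified:", ok)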

Verifiable Credential Exchange

Here are some of the interaction patterns facilitated by verifiable credentials:

  • Selective Disclosure: VCs enable users to share only specific parts of a credential. For instance, a user can prove they are of legal age without revealing their exact date of birth (a simplified sketch of one way to do this follows this list).
  • Credential Chaining: Multiple credentials can be linked together, enabling more complex proofs and interactions. For example, an employer might hire an employee only after receiving a VC proving they graduated and another proving their right to work.
  • Holder-Driven Data Exchange: Instead of organizations pulling data about users from third parties, VCs shift the interaction model to users pushing verifiable claims to organizations when needed.
  • Anonymous Credential Proofs: VCs can be designed to be presented anonymously, allowing users to prove a claim about themselves without revealing their identity. For example, VCs can be used to prove the customer is a human with less friction than CAPTCHAs.
  • Proofs without Data Transfer: Instead of transferring actual data, users can provide cryptographic proofs that they possess certain data or prove predicates about the data, reducing the exposure of personal information. For example, VCs can be used to prove that the subject is over 21 without revealing who the subject is or even their birthdate.
  • Adaptive Authentication: Depending on the sensitivity of an online interaction, users can be prompted to provide VCs of varying levels of assurance, enhancing security in adaptable ways. I plan to talk about this more in my next post about zero data and zero trust.
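
Here's the promised sketch of selective disclosure, in the spirit of SD-JWT's salted-hash approach. The issuer signs salted digests of each attribute; the holder can then reveal any subset. The format is simplified for clarity and isn't an actual wire format:

    import hashlib, json, secrets

    def blind(attributes: dict) -> tuple[dict, dict]:
        """Return (digests for the issuer to sign, disclosures the holder keeps)."""
        digests, disclosures = {}, {}
        for name, value in attributes.items():
            blob = json.dumps([secrets.token_hex(8), name, value])
            digests[name] = hashlib.sha256(blob.encode()).hexdigest()
            disclosures[name] = blob
        return digests, disclosures

    digests, disclosures = blind({"name": "Alice", "birthdate": "2000-01-02",
                                  "age_over_21": True})
    # The issuer signs `digests`. To prove legal age, the holder reveals only
    # the age_over_21 disclosure; the verifier hashes it and matches the signed
    # digest, learning nothing about the name or birthdate.
    revealed = disclosures["age_over_21"]
    assert hashlib.sha256(revealed.encode()).hexdigest() == digests["age_over_21"]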

These interaction patterns change traditional data management and verification models, enabling businesses to retain considerably less data on their clients. Verifiable credentials have numerous benefits and features that positively impact data management, security, and user trust:

  • Data Minimization: As we've seen, with VCs, users can prove facts without revealing detailed data. By selectively sharing parts of a credential, businesses only see necessary information, leading to overall reduced data storage and processing requirements.
  • Reduced Redundancy & Data Management: Trustworthy VCs reduce the need for duplicate data, simplifying data management. There's less need to track, backup, and maintain excess data, reducing complexity and associated costs.
  • Expiration, Revocation, & Freshness of Data: VCs can be designed with expiration dates and can be revocable. This ensures verifiers rely on up-to-date credentials rather than outdated data in long-term databases.
  • Trust through Standardized Protocols: VCs, built on standardized protocols, enable a universal trust framework. Multiple businesses can thus trust and verify the same credential, benefiting from reduced integration burdens and ensuring less custom development.
  • Enhanced Security & Reduced Exposure to Threats: Data minimization reduces the size of the so-called honey pot, reducing its attraction for cyber-attacks and, in the event of a breach, limiting the potential damage, both in terms of data exposed and reputational harm.
  • Compliance, Regulatory Benefits & Reduced Liability: Adhering to data minimization aligns with many regulations, reducing potential legal complications. Storing minimal data also decreases organizational liability and regulatory scrutiny.
  • Cost Efficiency: By storing less data, organizations can achieve significant savings in storage infrastructure and IT operations, while also benefiting from focused data analytics.
  • Enhanced User Trust & Reputation: By collecting only essential data, organizations can build trust with users, gaining a competitive edge in a privacy-conscious market that is increasingly growing tired of the abuses of surveillance capitalism.

In essence, verifiable credentials shift the paradigm from "data collection and storage" to "data verification and trust." This is what Marie Wallace means with her analogy between VCs and music streaming. Online interactions get the assurance they need without the business incurring the overhead (and risk) of storing excessive customer data. Zero data strategies not only reduce the potential attack surface for cyber threats but also offer a variety of operational, financial, and compliance benefits.

The biggest objection to a zero data strategy is likely due to its decentralized nature. Troves of user data make people comfortable by giving them the illusion of ready access to the data they need, when they need it. The truth is that the data is often unverified and stale. Nevertheless, it is the prevailing mindset. Getting used to just-in-time, trustworthy data requires changing attitudes about how we work online. But the advantages are compelling.

And, if your business model depends on selling data about your customers to others (or facilitating their use of this data in, say, an ad network) then giving up your store of data may threaten precious business models. But this isn't an issue for most businesses who just want to facilitate transactions with minimal friction.

Zero data aligns our online existence more closely with our real-world interactions, fostering new methods of communication while decreasing the challenges and risks associated with amassing, storing, and utilizing vast amounts of data. When your customers can prove things about themselves in real time, you'll see several benefits beyond just better security.

Zero data, facilitated by verifiable credentials, represents a pivotal transition in how digital identity is used in online interactions. By minimizing centralized data storage and emphasizing cryptographic verifiability, this approach aims to address the prevalent challenges in data management, security, and user trust. Allowing online interactions to more faithfully follow established patterns of transferring trust from the physical world, the model promotes just-in-time data exchanges and reduces unnecessary data storage. As both businesses and individual users grapple with the increasing complexities of digital interactions, the integration of verifiable credentials and a zero data framework stands out as a practical, friction-reducing, security-enhancing solution for the modern digital landscape.


Digital Identity Podcasts

I've been invited to be on a few more podcasts to talk about my new book Learning Digital Identity from O'Reilly Media. That's one of the perks of writing a book. People like to talk about it. I always enjoy talking about identity and especially how it's so vital to the quality of our lives in the digital world, so I'm grateful these groups found time to speak with me.

First, I had a great discussion about identity, IIW, and the book with Rich Sordahl of Anonyome Labs.

I also had a good discussion with Joe Malenfant of Ping Identity on digital identity fundamentals and future trends. We did this in two parts. Part 1 focused on the problems of digital identity and the Laws of Identity. Digital Identity Fundamentals and Future Trends with Phil Windley, Part 1

Part 2 discussed how SSI is changing the way we see online identity and the future of digital identity. We even got into AI and identity a bit at the end. Digital Identity Fundamentals and Future Trends with Phil Windley Part 2

Finally, Harrison Tang spoke with me in the W3C Credentials Community Group. We had a fun discussion about the definition of identity, administrative and autonomic identity systems, and SSI wallets.

I hope you enjoy these. As I said, I'm always excited to talk about identity, so if you'd like to have me on your podcast, let me know.