Isn't it great to be able to sign in to one application (Trello, for example) using your credentials and user profile that are stored somewhere else (say, Google)? OAuth2 and OpenID Connect can give you a wonderful experience, so you don't have to create another password, re-enter your profile, or even type the same password again to access multiple services during the same session.
As a user, I was content knowing that this all just worked, and I didn't think much about it. As a developer, however, I was asked to add a new service to a web application and to make it all work with an existing OpenID Connect provider.
OpenID Connect was new to me, and I soon learned it extends another specification (OAuth2). I got caught between narrowly focused or platform-specific articles on one hand, and overwhelmingly broad specifications on the other. It was hard enough to tell which parts of a spec were relevant, if I could even figure out which spec to read. I just needed one source to tell me how it all works, so I could figure out how to use the new service with the existing accounts.
This article is the first of a series that is meant to do just that. It introduces workflows—in a platform-independent manner—that are commonly used by webapps to sign in and authorize API use, and it provides links to other articles that can help fill in the details. This first article will focus on the early steps in the process, in which a user signs in and asks for permission to do something.
OAuth2 in Theory
While the goal of this series will be to develop an understanding of OpenID Connect, I found it helpful to take a step back and learn how OAuth2 works. This made it easier to understand how OAuth2 workflows are extended in a few, minor ways to implement the features in OpenID Connect.
So what is OAuth2, and what does it do? The opening of RFC 6749 can be paraphrased like this:
OAuth2 is a widely-used authorization framework that enables a third-party application to obtain limited access to an HTTP service.
For web applications, this means a user signs in to an Authorization Server and grants access to resources it owns on any number of Resource Servers. Grants may be made back to the original client (the web application) or to a Resource Server (service) that is acting on behalf of the user.
Back to the motivating scenario: we can now say the user wants to authorize a new service to access resources it controls (its own user profile) on a Resource Server (the OAuth2 / OpenID Connect provider).
Keeping that in mind, we will have to take one more step back to discuss authentication, for a practical reason: an OAuth2 server will not grant permission for anything interesting, until it can trust that a user is who it says it is.
Authentication vs. Authorization
Before figuring out what resources you have access to (authorization; the scope of OAuth2), you must first authenticate (say who you are). This is a two-step process:
- The user presents claims about its identity, such as a username, email address, or session (which can be used to look up who the user is).
- These claims are verified, using information that only that user is expected to have:
  - Passwords are widely used and will have to be entered (again), unless the user has signed in recently from that browser or app (more on that later).
  - Additional authentication codes may also be required, for Authorization Servers that use Two-/Multi-Factor Authentication (2FA/MFA) workflows. Users are often required to enter these codes periodically, or on the first use of a new device or browser, at a minimum.
A formal definition of authentication and a blog post summarize the two nicely:
Authentication means confirming your own identity, whereas authorization means being allowed access to the system.
Authorization is the process to determine whether the authenticated user has access to [the] particular resources.
Authentication in OAuth2
Several forms of authentication are commonly used with OAuth2:
- Authentication is often first done by filling out a login form with a username and password and POSTing that data to an endpoint on the Authorization Server.
- Some Authorization Servers set a cookie when responding to authenticated requests, which can be used to authenticate future requests. The cookie must not have expired, and the session still needs to be active on the Authorization Server, for authentication to work.
- Some Authorization Servers respond with a JSON Web Token ("JWT" or "JWT token") that can be included in an Authorization header in future requests.
Authentication with cookies is most often a form of stateful authentication, meaning the server must track active sessions (identified by cookies) and decide how long to keep them active. JWT tokens, on the other hand, are signed by the Authorization Server, which makes them practically (although not theoretically) impossible to forge. Because a server only has to check the signature, this enables a form of stateless authentication. This is often thought to scale better (with caveats), as the user's identity is already contained within the JWT token and does not require further persistence on the server.
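To make the stateless idea concrete, here is a minimal sketch of signing and verifying a JWT-shaped token using only the Python standard library. It assumes a shared-secret HS256 signature purely to stay dependency-free; real Authorization Servers often sign with an asymmetric key instead, and a maintained JWT library is the better choice outside of a demo. The function names (sign_token, verify_token) are made up for this example.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time


def b64url_encode(raw: bytes) -> str:
    # JWTs use URL-safe base64 with the "=" padding stripped.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that b64url_encode removed.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, key: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256 (assumed scheme)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_token(token: str, key: bytes, now=None) -> dict:
    """Return the claims if signature and expiry check out; raise ValueError otherwise."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = b64url_encode(hmac.new(key, signing_input, hashlib.sha256).digest())
    # Constant-time comparison; no session lookup is needed at any point.
    if not hmac.compare_digest(expected.encode("ascii"), signature_b64.encode("ascii")):
        raise ValueError("signature mismatch")
    header = json.loads(b64url_decode(header_b64))
    claims = json.loads(b64url_decode(payload_b64))
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    if not isinstance(claims, dict):
        raise ValueError("claims must be a JSON object")
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if exp is not None and now >= exp:
        raise ValueError("token expired")
    return claims


# Usage: the verifier needs only the key, not a session store.
demo_key = secrets.token_bytes(32)
demo_token = sign_token({"sub": "alice", "exp": int(time.time()) + 300}, demo_key)
print(verify_token(demo_token, demo_key)["sub"])  # alice
```

The caveat about scaling shows up here too: nothing in this sketch can revoke a token before its exp time without adding server-side state back in.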
Remember, browsers only send cookies back on the domain that issued them, subject to further restrictions by the server that issued them. Browsers will automatically send whatever unexpired cookies they have on a domain, when navigating to a page. Async requests like XMLHttpRequest and the Fetch API might need to be configured to send credentials like cookies and Authorization headers, if the request is made to a different domain than the one that originally issued them.
JWT Tokens
JWT tokens can be used both for authentication and for authorization; including a token in an Authorization header can be used to fulfill both purposes.
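As a small illustration, this is roughly what attaching a token to a request looks like from Python. It assumes the widely used Bearer scheme for the Authorization header; the URL and token value are placeholders, and the request is only constructed here, never sent.

```python
import urllib.request


def authorized_request(url: str, access_token: str) -> urllib.request.Request:
    # With the Bearer scheme (an assumption here), presenting the token
    # is all the caller has to do; the server decides what it allows.
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


# Placeholder URL and token, for illustration only.
request = authorized_request("https://api.example.com/profile", "example-token")
print(request.get_header("Authorization"))  # Bearer example-token
```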
Note, however, that using JWT tokens—especially for session management—might introduce a number of security vulnerabilities or add unnecessary complexity. The OAuth2 spec does not strictly define what kind of tokens to use, so it might be possible to avoid these vulnerabilities by using OAuth2 with a different token scheme (such as stateful authorization with a pseudo-random number) or by using a different authorization mechanism altogether.
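For comparison, a stateful scheme with an opaque random token might look like the sketch below. The SessionStore class is hypothetical and keeps everything in memory; a real deployment would need shared, persistent storage. The point is that the token carries no information of its own, so it can be revoked simply by deleting the server-side entry.

```python
import secrets
import time


class SessionStore:
    """In-memory map from opaque tokens to users (hypothetical, for illustration)."""

    def __init__(self, ttl_seconds: float = 3600):
        self._ttl = ttl_seconds
        self._sessions = {}  # token -> (user_id, expires_at)

    def issue(self, user_id: str, now=None) -> str:
        now = time.time() if now is None else now
        # The token is just an unguessable random string.
        token = secrets.token_urlsafe(32)
        self._sessions[token] = (user_id, now + self._ttl)
        return token

    def lookup(self, token: str, now=None):
        """Return the user id for a live token, or None."""
        now = time.time() if now is None else now
        entry = self._sessions.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if now >= expires_at:
            del self._sessions[token]  # expired: forget it
            return None
        return user_id

    def revoke(self, token: str) -> None:
        # Revocation is immediate, unlike a signed token that stays valid until it expires.
        self._sessions.pop(token, None)


store = SessionStore(ttl_seconds=60)
session_token = store.issue("alice")
print(store.lookup(session_token))  # alice
store.revoke(session_token)
print(store.lookup(session_token))  # None
```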
Asking for Permission: Requesting Authorization
Using an OAuth2 token gives clients a way to say who the user is and—depending on what's in the token—what that user is allowed to do. However, clients still need a standardized workflow for requesting authorization from Authorization Servers. For that, we need some kind of grant (an OAuth2 term) to request or a flow (an OpenID Connect term) to initiate.
Grants and Flows
OAuth2 defines an authorization endpoint for users to request access to one or more resources, using one or more OAuth2 grants. A few commonly used OAuth2 grants are further extended by OpenID Connect flows. These articles cover them in more detail:
- Demystifying OAuth 2.0 and OpenId Connect (and SAML) summarizes workflows and terminology.
- Single-Page Apps provides more details about the authorization requests.
- When To Use Which (OAuth2) Grants and (OIDC) Flows can help you determine which grant/flow is appropriate.
To recap: users requesting authorization must authenticate with the Authorization Server, using one of the methods described earlier. If the user is allowed to delegate authorization to the specified client for the specified operation(s), the Authorization Server responds with a redirect back to the requested URI. This redirect contains the authorization code and/or access token, depending upon the request. If the user needs to sign in first, the Authorization Server redirects the client to the login page and then resumes the workflow after the user authenticates.
Web applications commonly use two grants/flows to request authorization:
- In the authorization code grant/flow, clients obtain an authorization code for a specific operation and pass it to a server-side component (such as a dedicated web server, or a service acting on behalf of the user). That server then exchanges the authorization code and a shared secret (client_secret) with the Authorization Server, on a secure channel, for an access token. This is typically used for native mobile applications or for web applications that have a dedicated web server.
- The implicit grant/flow is intended for situations like Single Page Applications, where there is not a secure piece of infrastructure that can share a secret with the Authorization Server but also hide it from everyone else. In this case, the client receives the access token directly from the Authorization Server.
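The exchange step of the authorization code grant can be sketched as a form-encoded POST to the Authorization Server's token endpoint. The parameter names below follow RFC 6749; the endpoint URL, client values, and the build_token_request helper are placeholders made up for this example. Sending client_secret in the body is one option that some servers accept; others expect HTTP Basic authentication instead. The request is only built here, not sent.

```python
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, code, redirect_uri, client_id, client_secret):
    """Build (but do not send) the code-for-token exchange request."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "authorization_code",
            "code": code,
            # Must match the redirect_uri used in the authorization request.
            "redirect_uri": redirect_uri,
            "client_id": client_id,
            # The shared secret never passes through the browser.
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values, for illustration only.
token_request = build_token_request(
    "https://auth.example.com/token",
    code="code-from-redirect",
    redirect_uri="https://app.example.com/callback",
    client_id="my-app",
    client_secret="keep-this-on-the-server",
)
print(token_request.get_method(), token_request.full_url)
```

Passing the result to urllib.request.urlopen would perform the exchange; the response is typically a JSON document containing the access token.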
Making the Request
Using the selected workflow, the user needs to tell the Authorization Server what it (or a service acting on its behalf) wants to do. This is done by adding a few parameters to the authorization request (see here for an example):
- client_id: the application or service for whom the user is requesting authorization.
- redirect_uri: where tokens and/or authorization codes should be delivered.
- response_type says what the Authorization Server should return, such as an authorization code (code) or an access token (token). The result is delivered to the requested address as query parameters or in a URL fragment, depending on the response type (or on the optional response_mode parameter, where it is supported).
- scope may be used to say what sort(s) of access are being requested. The Authorization Server may grant all, some, or none of the requested scopes.
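Put together, the parameters above end up in the query string of a URL on the Authorization Server. Here is a minimal sketch; the endpoint, client_id, and scope values are placeholders, and build_authorization_url is a helper invented for this example.

```python
import urllib.parse


def build_authorization_url(authorization_endpoint, client_id, redirect_uri,
                            response_type, scopes):
    """Return the URL the browser is sent to in order to request authorization."""
    query = urllib.parse.urlencode(
        {
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "response_type": response_type,
            # Multiple scopes are sent as one space-separated string.
            "scope": " ".join(scopes),
        }
    )
    return f"{authorization_endpoint}?{query}"


# Placeholder values, for illustration only.
authorization_url = build_authorization_url(
    "https://auth.example.com/authorize",
    client_id="my-app",
    redirect_uri="https://app.example.com/callback",
    response_type="code",
    scopes=["openid", "profile"],
)
print(authorization_url)
```

The application then redirects the user's browser to this URL, and the Authorization Server takes over from there.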
It is also important to provide a few safeguards by adding some extra parameters:
- Clients using the Authorization Code flow should add a unique, unguessable state that is routed through the workflow and returned to the client. A client receiving a state that does not match the value it originally sent should assume that the response is malicious and stop the workflow. This prevents Cross Site Request Forgery attacks, where the malicious response was either forged, or came from an unintended request that the user was tricked into making.
- Clients using the implicit flow must include a cryptographic nonce parameter to prevent the same signed token from being valid when presented more than once (a kind of Replay Attack).
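Both safeguards boil down to "generate an unguessable value, remember it, and compare it when the response comes back". The sketch below shows one way that could look. The helper names are invented for this example, and it assumes the nonce comes back as a nonce claim inside a token whose signature has already been verified, which is how OpenID Connect returns it.

```python
import hmac
import secrets
import urllib.parse


def new_unguessable_value() -> str:
    # Suitable for either state or nonce; store it in the user's session.
    return secrets.token_urlsafe(32)


def check_callback(callback_url: str, expected_state: str) -> str:
    """Return the authorization code if state matches; raise ValueError otherwise."""
    query = urllib.parse.parse_qs(urllib.parse.urlsplit(callback_url).query)
    returned_state = query.get("state", [""])[0]
    if not hmac.compare_digest(
        returned_state.encode("utf-8"), expected_state.encode("utf-8")
    ):
        raise ValueError("state mismatch: treat the response as malicious and stop")
    if "code" not in query:
        raise ValueError("no authorization code in the response")
    return query["code"][0]


def check_nonce(token_claims: dict, expected_nonce: str) -> None:
    """Raise ValueError unless the (already verified) token carries our nonce."""
    returned_nonce = str(token_claims.get("nonce", ""))
    if not hmac.compare_digest(
        returned_nonce.encode("utf-8"), expected_nonce.encode("utf-8")
    ):
        raise ValueError("nonce mismatch: possible replayed token")


# Placeholder callback URL, for illustration only.
state = new_unguessable_value()
callback = f"https://app.example.com/callback?code=abc123&state={state}"
print(check_callback(callback, state))  # abc123
```

Each value should be used for one authorization request only and discarded once the response has been checked.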
A successful authorization request ends with an access token that can be presented to a Resource Server (a protected service or API endpoint). The Resource Server can then use the token to grant or deny authorization for a request that the user is making, or that a designated client is making on its behalf. Access tokens are often (but not always) delivered in the form of JWT tokens.
Single Sign On (SSO)
Now that you know how a user requests authorization and how the Authorization Server is able to verify the user's identity throughout a session (with cookies, or some other token), you have a basis for understanding how Single Sign On works. By that, I mean a user who signs in to use one service or application is then able to use additional services and applications without entering their password again.
How does this work, for apps and services on distinct domains? Won't the user's browser refuse to send cookies from one service on domain A, to another service on domain B?
That's true, and each application is free to (and often does) maintain its own session tokens on its own domain. If an application does not think the user is signed in, it redirects the user to the Authorization Server to sign in. But at that point, it's tokens on the Authorization Server's domain that determine whether the user has to enter a password again. So any applications and services that use the same Authorization Server will effectively share that session, because they all redirect to the same domain to request authorization.
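A toy simulation can make this easier to see. Everything below is invented for illustration: the AuthorizationServer class, the domain names, and the plain-text password table (which no real server should use). The cookie jar is a dictionary keyed by domain, standing in for the way a browser scopes cookies.

```python
import secrets


class AuthorizationServer:
    """Toy model: one session per browser, shared by every app that redirects here."""

    def __init__(self, passwords):
        self._passwords = passwords  # username -> password (plain text for brevity only)
        self._sessions = {}  # session cookie value -> username
        self.password_prompts = 0

    def authorize(self, client_id, session_cookie=None, username=None, password=None):
        """Return (authorization_code, session_cookie) for the requesting app."""
        user = self._sessions.get(session_cookie) if session_cookie else None
        if user is None:
            # No live session on this domain: the user has to enter a password.
            self.password_prompts += 1
            if username not in self._passwords or self._passwords[username] != password:
                raise PermissionError("sign-in failed")
            user = username
            session_cookie = secrets.token_urlsafe(16)
            self._sessions[session_cookie] = user
        code = f"code-for-{client_id}-{secrets.token_urlsafe(8)}"
        return code, session_cookie


cookie_jar = {}  # domain -> cookie value, as a browser would scope it
server = AuthorizationServer({"alice": "correct horse"})

# App A redirects to the Authorization Server; there is no cookie yet.
code_a, cookie_jar["auth.example.com"] = server.authorize(
    "app-a", username="alice", password="correct horse"
)

# App B, on a different domain, redirects to the same place. The browser
# sends the auth.example.com cookie, so no password is needed.
code_b, _ = server.authorize("app-b", session_cookie=cookie_jar.get("auth.example.com"))

print(server.password_prompts)  # 1
```

Neither app ever sees the other's cookies; the only shared state is the session on the Authorization Server's own domain.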
Signing Out When SSO is Enabled
In one moment of face-palming during my journey, I had a bit of an epiphany:
If you modify how to sign in, you might also need to modify signing out.
This all became clear when—after enabling single sign-on—users could sign out and back in again, without being challenged for their password.
To understand why, take a moment to think about what is happening during an SSO workflow: signing in results in storing tokens from multiple domains in the browser (one for each app/service domain, and one for the Authorization Server's domain). In this case, cookies were being cleared from the application's domain but not from the Authorization Server's domain. So users could still authenticate with the Authorization Server, as soon as the app directed them to request authorization.
If you want signing out to be meaningful, there are a variety of options to help clear out the additional state:
- Tokens on the Authorization Server's domain can be cleared by making a request to an endpoint on that server, which signs the user out and clears session tokens. This request may be made asynchronously in the background, or it can be done in the foreground by navigating to the page that ends the session and (possibly) redirecting back to the app from which signing out was initiated.
- Sessions on each other application or service domain may need to be cleared or invalidated, as well. Single Sign Out workflows are being formalized with OpenID Connect, by defining an end_session_endpoint. Apps go to varying lengths to sign users out, if they have been signed out from the Authorization Server: some poll the Authorization Server and sign users out right away, while others may simply wait until an authorized request fails.
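For the foreground option, the app can navigate the browser to the Authorization Server's end-session URL. The sketch below assumes the query parameters described in the OpenID Connect RP-Initiated Logout specification (id_token_hint, post_logout_redirect_uri, state); provider support varies, so check what your Authorization Server actually accepts. The URLs, token value, and helper name are placeholders.

```python
import urllib.parse


def build_end_session_url(end_session_endpoint, id_token_hint,
                          post_logout_redirect_uri, state):
    """Return the URL that ends the session on the Authorization Server's domain."""
    query = urllib.parse.urlencode(
        {
            # Tells the server which user and session is being ended.
            "id_token_hint": id_token_hint,
            # Where to send the browser once the session is gone.
            "post_logout_redirect_uri": post_logout_redirect_uri,
            # Echoed back so the app can match the response to its request.
            "state": state,
        }
    )
    return f"{end_session_endpoint}?{query}"


# Placeholder values, for illustration only.
logout_url = build_end_session_url(
    "https://auth.example.com/logout",
    id_token_hint="example-id-token",
    post_logout_redirect_uri="https://app.example.com/signed-out",
    state="opaque-value",
)
print(logout_url)
```

The app still has to clear its own session on its own domain; this request only takes care of the Authorization Server's side.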
Wrapping Up
Granting an application access to data you own has become so ubiquitous in our lives that it's easy to take it for granted, but it is often not as simple as it appears on the surface. This is further complicated by the existence of so many means of application delivery (native mobile apps, Single Page Apps, server-based apps) and the necessity for the underlying OAuth2 specifications to cover all these architectures.
Despite this complexity, we have now gotten far enough to sign in (authenticate) and to request authorization. The next article will help you complete the workflow so you can actually do something useful with that hard-won authorization.
Acknowledgements
I gratefully acknowledge Brad Ediger and Colin Jones for their contributions on the finer points of authentication and JWT tokens, respectively, and Brad Ediger, Heather You and Stacey Boeke for their thoughtful and detailed reviews of earlier drafts of this article.