things that you need to know in order to build web APIs that can scale as your API grows and gains wide adoption. Keep in mind the focus of this series is on design aspects, so I won’t be touching on any implementation details here. But if that’s something you’re interested in, let me know down in the comments below. Now in the first two videos, I covered how you should go about picking the right API standard. If you missed those, I highly recommend that you catch up. I’ve added the links down below. Go check them out. In this video, I’m going to cover API security: the things the industry used to do, what was wrong with them, and what we are doing today. So let’s get started.
All right, so the most important thing to start off with is authentication and authorization. These are two different things, but you’d be surprised at how many people get these mixed up. Authentication is about verifying who you are. For example, when you log in to a web application using your username and password, you are letting the application identify who you are. Authorization, on the other hand, is about verifying what you’re allowed to do and what resources you’re allowed to access. So once you’ve logged in and identified yourself, you’d have some role or access rights assigned to you. Let’s take Reddit for example. Once you’ve logged in, you can create, edit, and delete your posts, but you don’t have the rights to edit or delete other people’s posts. The way that works is through authorization in those APIs. These are two elements of security that are crucial to understand from the get-go.
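To make the distinction concrete, here is a minimal Python sketch of the Reddit-style example. Everything in it is an assumption made for illustration: the `USERS` store, the `Post` type, and the `authenticate` and `can_modify` functions are hypothetical names, and the plain-text passwords are there only to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory user store. Plain-text passwords are for illustration
# only; a real service would store salted password hashes.
USERS = {"alice": "correct-horse", "bob": "battery-staple"}


@dataclass
class Post:
    post_id: int
    author: str
    body: str


def authenticate(username: str, password: str) -> Optional[str]:
    """Authentication: verify who the caller is."""
    if username in USERS and USERS[username] == password:
        return username
    return None


def can_modify(user: str, post: Post) -> bool:
    """Authorization: decide what an already identified user may do."""
    return post.author == user


alice = authenticate("alice", "correct-horse")
own_post = Post(1, "alice", "hello")
other_post = Post(2, "bob", "hi there")
print(can_modify(alice, own_post))    # True
print(can_modify(alice, other_post))  # False
```

The login step answers "who are you?" once, while the ownership check runs on every edit or delete and answers "are you allowed to do this?".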
So now that we’ve gotten that out of the way, let’s talk about something called basic authentication. For a long time, accessing internet resources was done through basic authentication, and even today there are some use cases for it because it has the benefit of being simple. You take the username and password, Base64-encode them, and add the result to the header of your request. Base64 is an encoding, not encryption, so anyone who sees the header can decode it. And so for this reason, it was always combined with SSL.
While this is simple, there are a couple of problems with this approach. The most obvious one is that the username and password are sent with each request, leading to a much higher chance of an attacker compromising this data.
On top of this, integration with other applications is just not great. Let’s say your API is pretty successful. Let’s say your API is Twitter, for example. So it’d have a large user base and maybe some third-party applications that would like to build on top of your APIs. These apps might want to post tweets on behalf of users. To do this with basic authentication would mean that the users would need to share their credentials with these applications. This is bad because applications have to store these credentials in clear text or in a way that they can decrypt them. If a bug or leak in the application exposed this data, an attacker could gain access to these credentials. Given that most people stick to the same username and password for multiple accounts, this is bad in so many ways.
This also leads to another issue. Let’s say a user has authenticated against multiple applications. In this case, there’s no way for the user to revoke access to a single application, because changing the password would revoke access to all the applications. And this is obviously not desirable.
And finally, applications get full access to user data. There’s no way to limit an application’s access to selected resources. For example, these applications might only want to post tweets on behalf of the user, but technically they have access to the user’s direct messages, timeline, and all the other data that the user has.
So for these reasons, basic authentication has become a legacy form of authentication. Many big API providers have decided to move on from it to more modern approaches. The reason why I used Twitter in this example is because Twitter actually used to use basic authentication. And for all the reasons that we just talked about, Twitter decided that basic authentication was just not going to work for them anymore, and they moved on to something called OAuth to solve all these problems.
So what exactly is OAuth? Well, OAuth is a standard that was introduced in 2007 to tackle some of the problems that we just discussed. Basically, it allows you to grant access to applications without having to share passwords with them.
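For reference, the basic authentication header that OAuth moved away from can be sketched in a few lines of Python. The helper names `basic_auth_header` and `decode_basic_auth` are made up for this illustration, and the credentials are the well-known sample pair from the HTTP Basic specification. The point of the second function is to show that anyone who sees the header can recover the password, which is why the scheme only makes sense over an encrypted connection.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header used by HTTP basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def decode_basic_auth(header_value: str) -> tuple:
    """Recover the credentials from a header value; no key is needed."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first colon only, since a password may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


headers = basic_auth_header("Aladdin", "open sesame")
print(headers["Authorization"])  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(decode_basic_auth(headers["Authorization"]))  # ('Aladdin', 'open sesame')
```

Since the same header travels with every request, every request is another chance for the password itself to leak.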
Granting access without sharing passwords also leads to two more benefits. Each application gets registered with your API, and the user has the opportunity to revoke access to each of those applications individually. It also means that users can grant granular access to resources. In fact, OAuth is all about limiting access to resources.
Now, OAuth can be a very complicated standard. There are different ways in which OAuth can be set up, but for this video, I’ll stick to talking about how OAuth works with something called the authorization code flow. So let’s continue with the Twitter example. You’ve got your API and you’ve got your users. There’s a third-party application that wants to integrate with your API on behalf of your users. The application starts out on what I call the front channel, because this flow usually happens across a front channel and a back channel. The front channel is the code that runs in a browser or some other less secure interface. As you might already know, anyone can just open up the browser tools and have a look at the code and the requests and responses going over the network. So basically, you don’t want to have sensitive data in the front channel. The back channel is a secure server, something that you have full control over, unlike the front channel.
So back to the example. The user wants to use a feature of the third-party application that relies on your API. To make this work, the third-party application will redirect the user to what is known as an authorization server. In this case, it’s Twitter’s authorization server. Notice how we pass some key pieces of information along with this request. We indicate something called a redirect URI. This is where the authorization server will call back when the user’s interaction is done. There’s also something called the response type. This is where we indicate the OAuth flow that we want to use, in this case, the authorization code flow, as indicated by code. Finally, we also indicate the scopes needed. In this case, it’s write access.
So we’re basically saying: all right, this user would like to authorize my application to use your APIs with write access. This then prompts the user to log in and consent to providing the requested access rights.
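The redirect to the authorization server described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the endpoint URL, client ID, and callback address are placeholders, and `build_authorization_url` is a hypothetical helper. The `client_id` and `state` parameters are not mentioned in the walkthrough; they are standard OAuth 2.0 request parameters added here so the sketch resembles a realistic request.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; a real provider documents its own authorization URL.
AUTHORIZE_ENDPOINT = "https://auth.example.com/oauth/authorize"


def build_authorization_url(client_id, redirect_uri, scopes, state):
    """Build the front-channel URL the user's browser is redirected to."""
    params = {
        "response_type": "code",       # selects the authorization code flow
        "client_id": client_id,        # identifies the registered application
        "redirect_uri": redirect_uri,  # where the server calls back afterwards
        "scope": " ".join(scopes),     # requested access, space separated
        "state": state,                # random value echoed back to the app
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


url = build_authorization_url(
    client_id="my-client-id",
    redirect_uri="https://app.example.com/callback",
    scopes=["write"],
    state=secrets.token_urlsafe(16),
)
query = parse_qs(urlsplit(url).query)
print(query["response_type"], query["scope"])  # ['code'] ['write']
```

Nothing secret appears in this URL, which is what makes it acceptable to send through the browser.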
Once the user consents, the application gets what is called an authorization code. The app then uses this code from the back channel, exchanging it with the authorization server to get what is called an access token. This access token is then used to make requests to the API to perform any action that the user consented to. So in this example, the app can now post tweets on behalf of the user.
If you’re wondering why we get an authorization code and then exchange it for an access token, well, the short answer is security. Remember, the browser can’t be fully trusted, so we don’t want to expose an access token in the browser unless you’re absolutely sure that not much harm can come from that access token being stolen. Even if an authorization code is stolen, an attacker can’t directly exchange it for an access token. This is because the back channel also passes a secret during the exchange, which only the authorization server and the back channel know about. So basically, this is how OAuth allows applications to integrate with APIs in a secure manner, which also allows granular access to resources.
Now, when OAuth came out, it got serious adoption all over the world. This wide adoption also meant that people started misusing OAuth. They started using it for authentication rather than the delegated authorization it was meant for. That meant it was used to identify users rather than focusing on access rights. But the end result of OAuth is an access token, which was never meant to hold any user information like name, email address, or anything else that relates to the identity of the user. What’s actually bad is that everyone started using their own hacks to get information about the user back. And the whole point of OAuth was to standardize things. So when everyone’s doing their own thing, that kind of defeats the purpose of the standard.
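Going back to the code-for-token exchange, here is a toy, in-memory sketch of why a stolen authorization code is not enough on its own. `ToyAuthorizationServer` and its methods are invented for this illustration and do not correspond to any real provider or library; an actual exchange is an HTTPS request from the back channel to the provider's token endpoint.

```python
import secrets


class ToyAuthorizationServer:
    """In-memory stand-in for an authorization server (illustration only)."""

    def __init__(self):
        self._client_secrets = {}  # client_id -> client secret
        self._codes = {}           # authorization code -> (client_id, scope)

    def register_client(self, client_id):
        """Registration hands the application a secret for the back channel."""
        secret = secrets.token_urlsafe(16)
        self._client_secrets[client_id] = secret
        return secret

    def issue_code(self, client_id, scope):
        """Called after the user consents; the code travels via the browser."""
        code = secrets.token_urlsafe(16)
        self._codes[code] = (client_id, scope)
        return code

    def exchange_code(self, code, client_id, client_secret):
        """Back-channel exchange: the code alone is not enough."""
        expected = self._client_secrets.get(client_id)
        if expected is None or not secrets.compare_digest(expected, client_secret):
            raise PermissionError("unknown client or wrong client secret")
        owner, scope = self._codes.get(code, (None, None))
        if owner != client_id:
            raise PermissionError("invalid authorization code")
        del self._codes[code]  # a code can be exchanged only once
        return {"access_token": secrets.token_urlsafe(24), "scope": scope}


server = ToyAuthorizationServer()
app_secret = server.register_client("twitter-client-app")
code = server.issue_code("twitter-client-app", "write")

try:
    server.exchange_code(code, "twitter-client-app", "attacker-guess")
except PermissionError as err:
    print("attacker rejected:", err)

token = server.exchange_code(code, "twitter-client-app", app_secret)
print(token["scope"])  # write
```

The access token is only ever returned to a caller that knows the client secret, and that secret never passes through the browser.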
This is where OpenID Connect comes in. It is a simple identity layer on top of the OAuth protocol, which tackles the identity problems that many people tried to solve using OAuth. So the key thing here is, it’s not a separate protocol. It’s meant to work together with OAuth and standardize authentication and authorization. So, let’s look at how OpenID Connect integrates with OAuth. When we make the request from the front channel to the authorization server, in the scopes we mention one additional piece of information called openid. This tells the authorization server that we are interested in getting back something called an ID token. That’s it. It’s as simple as that. The ID token is meant to carry information about the user’s identity. The access token is meant for authorization. And that’s basically what OpenID Connect is. It’s something to help standardize authentication and authorization.
All right, so we talked about OpenID Connect and OAuth and all of that stuff. We also touched a bit on scopes. But let’s dig a little bit more into this subject. OAuth scopes are used to limit an application’s access to user data. During the authorization request, the API provider will display all the requested scopes to the user. This way, the user is able to understand exactly what access rights they are providing to the application. But this is where things get interesting for you as an API developer. Let’s say you provide a few simple scopes: read, write, and a combo of read and write. Now, at the time of releasing your API, you might feel like these scopes should be more than enough, and that might be the case. But oftentimes, having such a small number of scopes can come back to bite you. So let’s look at a case where the application wants to access the user’s followers information, and they just need a read scope. And that’s what you’ve provided. So, with a read scope, they can access the timeline information. They can access followers, which is what they really needed.
But they can also access messages and tweets. And messages might be really sensitive information for your user. You don’t really want to provide access to messages when all the application needs is data about followers. So think about having a more specific scope.
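To see the difference a more specific scope makes, here is a small Python sketch that compares one broad read scope with per-resource scopes. The endpoint paths, the scope names such as `followers:read`, and the `is_allowed` helper are all hypothetical; they are not taken from any real API.

```python
# Hypothetical policy with one broad scope: every endpoint needs just "read".
BROAD_POLICY = {
    "/timeline": "read",
    "/followers": "read",
    "/messages": "read",
    "/tweets": "read",
}

# Hypothetical policy with one scope per kind of resource.
NARROW_POLICY = {
    "/timeline": "timeline:read",
    "/followers": "followers:read",
    "/messages": "messages:read",
    "/tweets": "tweets:read",
}


def is_allowed(path, token_scopes, policy):
    """Allow a request only if the token carries the scope the path requires."""
    required = policy.get(path)
    return required is not None and required in token_scopes


# An app that only needs followers, under the broad policy:
print(is_allowed("/followers", {"read"}, BROAD_POLICY))  # True
print(is_allowed("/messages", {"read"}, BROAD_POLICY))   # True (more than it needs)

# The same app under the narrow policy:
print(is_allowed("/followers", {"followers:read"}, NARROW_POLICY))  # True
print(is_allowed("/messages", {"followers:read"}, NARROW_POLICY))   # False
```

With per-resource scopes, the token the user consented to simply cannot reach the messages endpoint.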
Let’s say something like retweets. So in this case, if the application has only a retweet scope attached to their access token, well, they can’t get timeline information or followers, and no messages either. But if they make a request to the tweets with the retweet scope, well, now they have access to it. So the main idea here is to have enough scopes that allow you to protect the sensitive data in your application. So the first tip when it comes to adding scopes: make sure you have scopes in place for protecting sensitive information. Secondly, ensure that you have scopes for different kinds of resources. Giving read and write access to all resources is never a good idea. And the final tip is, don’t go overboard with the scopes either. You don’t want to have hundreds and hundreds of scopes, so you need to find the right balance.
Now at this point, you know what an access token is, but it’s never a good idea to have an access token that can live forever. To tackle this problem, the OAuth protocol allows limiting the validity of an access token. This adds another layer of security in case the token is compromised. Many API providers choose to set expirations of anywhere from a few minutes to hours, or even days. The expiry is really up to you as an API provider and can depend on the use case of your API. Now, if you provide access tokens with an expiry, then you need to let the client get a new token whenever one expires. Typically, you don’t want the end user to be involved in this renewal process, because it would be super frustrating for the end user to keep logging in whenever something expires. So it has to be done behind the scenes. The way this is done is by issuing what is known as a refresh token. It’s a special kind of token which can be used to request new access tokens whenever the current one expires. To use it, applications need to provide the client ID and the secret. This is the information that an application gets back when it registers with an API.
You pass this information along with the refresh token to get a new access token. If a refresh token gets compromised, it will be pretty useless on its own, as it means nothing without the client ID and secret. So keep in mind that it’s not a good idea to do this on your front channel.
So let’s look at a quick example of how refresh tokens actually work. In the OAuth flow, an authorization server that provides an access token with a limited validity will also provide a refresh token. The application stores this refresh token but continues to make requests with the access token. At some point, the access token will expire. This is when the application uses the refresh token, exchanging it with the authorization server for a new access token, and then makes a new request, which will succeed. So this is basically how refresh tokens work.
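The refresh sequence just described can be sketched with a toy, in-memory server. `ToyTokenServer`, its method names, and the 60-second lifetime are assumptions made for this example, and time is passed in as a plain number so the expiry is easy to follow. A real refresh is an HTTPS request from the back channel to the provider's token endpoint.

```python
import secrets


class ToyTokenServer:
    """In-memory stand-in for an authorization server (illustration only)."""

    def __init__(self, client_id, client_secret, ttl_seconds=60):
        self._client = (client_id, client_secret)
        self._ttl = ttl_seconds
        self._access_expiry = {}      # access token -> expiry timestamp
        self._refresh_tokens = set()  # refresh tokens that are still usable

    def _new_access_token(self, now):
        access = secrets.token_urlsafe(16)
        self._access_expiry[access] = now + self._ttl
        return access

    def issue_tokens(self, now):
        """Issue a short-lived access token together with a refresh token."""
        refresh = secrets.token_urlsafe(16)
        self._refresh_tokens.add(refresh)
        return self._new_access_token(now), refresh

    def is_valid(self, access_token, now):
        expiry = self._access_expiry.get(access_token)
        return expiry is not None and now < expiry

    def refresh(self, refresh_token, client_id, client_secret, now):
        """A refresh token only works together with the client ID and secret."""
        if (client_id, client_secret) != self._client:
            raise PermissionError("client ID or secret is wrong")
        if refresh_token not in self._refresh_tokens:
            raise PermissionError("unknown refresh token")
        return self._new_access_token(now)


server = ToyTokenServer("app", "app-secret", ttl_seconds=60)
access, refresh_token = server.issue_tokens(now=0)
print(server.is_valid(access, now=30))  # True
print(server.is_valid(access, now=61))  # False, the access token has expired

new_access = server.refresh(refresh_token, "app", "app-secret", now=61)
print(server.is_valid(new_access, now=61))  # True, without the user logging in
```

The user is never involved in the renewal, and a refresh token presented without the matching client ID and secret is rejected.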