The Hypermedia Maturity Model

This is part three in a series about hypermedia and REST. You don't need to read the previous posts to follow this one, but if you enjoy it and want to read more, or would prefer to start at the beginning, check out the full series.

When people first started applying the REST architectural style to their APIs, there was a lot of confusion about what a REST API should look like. The Richardson Maturity Model (RMM) has been a valuable set of guideposts on our journey toward understanding REST. RMM is a classification model that defines four levels, 0 through 3. As you go up the levels, more of the formal definition of REST is taken into account. Here they are, very briefly.

Level 0 APIs pipe all requests through one URI and one HTTP method (usually POST).
Level 1 APIs employ many URIs, but still only use one HTTP method (usually POST).
Level 2 APIs have URIs that represent resources and use the HTTP methods to perform CRUD operations on those resources.
Level 3 APIs use self-descriptive messages that include hypermedia controls to drive the API.

Today, we've pretty much settled on what a Level 2 REST API should look like, but when it comes to Level 3 REST APIs, there's still a lot of uncertainty. People hear about Hypermedia APIs and want to implement hypermedia, but they don't know what a Hypermedia API should look like. The Hypermedia Maturity Model (HMM) takes RMM Level 3 and splits it into four additional levels. As you go up the levels, the API becomes more and more self-descriptive.

An API is self-descriptive if you don't need to consult documentation to know how to use the API. Like an HTML browser, it should be possible to navigate a Hypermedia API without any out-of-band information such as documentation.

HMM Level 0

At this level, a representation may include URIs that reference other resources, but they are just strings, with no semantics that would allow a client application (like a web browser) to recognize them as links and process them as such. Documentation is needed to know that these values are links and should be followed. Imagine HTML with no <a> tag. You could have URIs as text on the page, but you would have to manually copy and paste them into the URL bar in order to navigate. An API that uses plain JSON with some of the values being URLs is Level 0.


{
		"id": 23,
		"title": "My Blog Post",
		"content": "...",
		"author": "/api/author/1",
		"comments": "/api/blog/23/comments"
}
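
To make the cost of Level 0 concrete, here is a minimal Python sketch of a client for the document above. The names LINK_FIELDS and extract_links are hypothetical, invented for this illustration. The point is that the list of link-bearing fields has to be copied from documentation, because nothing in the JSON marks them as links.

```python
import json

# Knowledge copied from (hypothetical) API documentation: nothing in the
# JSON itself says that these two fields hold URIs.
LINK_FIELDS = {"author", "comments"}


def extract_links(document: dict) -> dict:
    """Return the fields that the documentation says are links."""
    return {name: value for name, value in document.items() if name in LINK_FIELDS}


post = json.loads("""
{
    "id": 23,
    "title": "My Blog Post",
    "content": "...",
    "author": "/api/author/1",
    "comments": "/api/blog/23/comments"
}
""")

links = extract_links(post)
print(links)
```

If the server adds a new link field, this client will not see it until someone reads the updated documentation and edits LINK_FIELDS.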

HMM Level 1

The next step is to use a media type that defines constructs that allow a computer to recognize certain elements as links. The <a> tag in HTML is an example. Instead of exposing bare URLs to the user, a user agent (web browser) can display these as links that users can click on to get to related resources.

HAL is an example of a Level 1 media type. It's mostly plain JSON, but it adds a few fields that have special meaning. One of those fields is _links, which describes the links a client can choose to follow.

HTTP/1.1 200 OK
Content-Type: application/hal+json

{
		"id": 23,
		"title": "The Hypermedia Maturity Model",
		"content": "...",
		"_links": {
				"self": { "href": "/api/blog/23" },
				"author": { "href": "/api/author/1" },
				"comment": { "href": "/api/blog/23/comment" }
		}
}
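
As a rough sketch of what this buys a client, the Python below discovers every link in the HAL document above without a per-API list of link fields. The function name hal_links is made up for this example, and the sketch handles only the href of each link, ignoring other link attributes such as templated.

```python
import json


def hal_links(document: dict) -> dict:
    """Map each link relation in a HAL document to an href.

    Relies only on the HAL convention that links live under "_links".
    """
    links = {}
    for rel, link in document.get("_links", {}).items():
        # HAL allows a relation to hold one link object or an array of
        # them; this sketch keeps only the first href in the array case.
        candidates = link if isinstance(link, list) else [link]
        if candidates:
            links[rel] = candidates[0]["href"]
    return links


post = json.loads("""
{
    "id": 23,
    "title": "The Hypermedia Maturity Model",
    "content": "...",
    "_links": {
        "self": { "href": "/api/blog/23" },
        "author": { "href": "/api/author/1" },
        "comment": { "href": "/api/blog/23/comment" }
    }
}
""")

links = hal_links(post)
print(links)
```

The same function works unchanged against any other HAL document, which is the difference from Level 0.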

HMM Level 2

Level 1 APIs are sufficiently self-describing for read-only APIs, but they are lacking when user input is required. If you need to send data to the server, you have to consult the documentation to learn how to send the message. Imagine HTML without a <form> tag. You would have to consult the website's documentation in order to know how to construct a request to send data to the server.

To be Level 2, you need a hypermedia format that can describe how to send data to the server. In other words, it should have a feature that at least loosely resembles an HTML <form>. HTML Forms are self-describing in many ways. A <form method="post"> element indicates that the HTTP POST method will be used to make the request and that, by default, the body will be encoded in the application/x-www-form-urlencoded format. The user of the form doesn't need to know any of these details because it's all taken care of by the web browser. The web browser can do this because HTML Forms are based on well-defined standards that govern how they behave.

HTML Forms don't just describe the "how" of a request, though; they also describe the "what." The <input> tags within the <form> tag describe the structure of the data that the server expects, so we don't have to consult the server's documentation to know what data to send.
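
The division of labor can be sketched in a few lines of Python. This is only an illustration of what a browser does on the user's behalf, not how any browser is implemented; the form dictionary, its action URL, and the field names are all made up for the sketch.

```python
from urllib.parse import urlencode

# What a <form method="post" action="/api/blog/23/comment"> with two <input>
# elements tells the browser. The action URL and input names are
# hypothetical, chosen for this sketch.
form = {
    "method": "post",
    "action": "/api/blog/23/comment",
    "inputs": ["author", "text"],
}

# Values typed by the user, keyed by input name.
typed = {"author": "Jane Doe", "text": "Nice post & thanks!"}

# The "what": only the fields the form declares are sent.
# The "how": the HTTP method and the urlencoded request body.
method = form["method"].upper()
body = urlencode({name: typed[name] for name in form["inputs"]})

print(method, form["action"])
print(body)
```

Everything the code needs comes from the form itself and the user's input; no documentation lookup is involved.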

These days there are many examples of Level 2 media types, including JSON Hyper-Schema, Siren, Collection+JSON, and many more. Each of these media types has a different way of being self-descriptive. JSON Hyper-Schema uses JSON Schema, which is good for describing and validating complex request structures. Siren, on the other hand, takes an approach inspired by HTML. In the example below, you will see actions that, if you squint, look a lot like HTML Forms.


{
		"class": "blog",
		"properties": {
				"id": 23,
				"title": "The Hypermedia Maturity Model",
				"content": "..."
		},
		"links": [
				{ "rel": ["self"], "href": "/api/blog/23" },
				{ "rel": ["author"], "href": "/api/author/1" },
				{ "rel": ["comment"], "href": "/api/blog/23/comment" }
		],
		"actions": [
				{
						"name": "add-comment",
						"method": "POST",
						"href": "/api/blog/23/comment",
						"fields": [
								{ "name": "blog_id", "type": "hidden", "value": "23" },
								{ "name": "author", "type": "text" },
								{ "name": "text", "type": "text" }
						]
				}
		]
}
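
Here is a hedged Python sketch of a generic client turning a Siren action into a request description. The function build_request and the shape of its return value are hypothetical, not part of Siren. The two defaults it applies (GET when method is omitted, application/x-www-form-urlencoded when type is omitted and fields are present) follow the Siren specification as I understand it.

```python
from urllib.parse import urlencode


def build_request(entity: dict, action_name: str, user_input: dict) -> dict:
    """Describe the HTTP request for a named Siren action."""
    # Raises StopIteration if the entity does not offer the action.
    action = next(a for a in entity["actions"] if a["name"] == action_name)
    data = {}
    for field in action.get("fields", []):
        name = field["name"]
        if field.get("type") != "hidden" and name in user_input:
            data[name] = user_input[name]
        elif "value" in field:
            # Hidden or pre-filled fields carry their own value.
            data[name] = field["value"]
    return {
        "method": action.get("method", "GET"),
        "url": action["href"],
        "content_type": action.get("type", "application/x-www-form-urlencoded"),
        "body": urlencode(data),
    }


# Only the "actions" part of the entity above is needed for this sketch.
entity = {
    "actions": [
        {
            "name": "add-comment",
            "method": "POST",
            "href": "/api/blog/23/comment",
            "fields": [
                {"name": "blog_id", "type": "hidden", "value": "23"},
                {"name": "author", "type": "text"},
                {"name": "text", "type": "text"},
            ],
        }
    ]
}

request = build_request(entity, "add-comment", {"author": "Jane", "text": "Great post"})
print(request)
```

The caller supplies only the action name and the user's values. The method, URL, encoding, and hidden blog_id all come from the representation itself.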

HMM Level 3

Level 2 is sufficient for most APIs, but it's possible to take the concept of self-descriptiveness to another level. In a Level 3 API, a resource doesn't just describe the actions you can take, it also describes the data itself. Just as a URI is a meaningless string to a computer at Level 0, property names like "title" or "author" have no meaning to a computer either. By using a shared vocabulary, such as the one defined by Schema.org, you give your data meaning that can be understood by other systems using the same vocabulary.

When your data is self-descriptive, it becomes possible to automate interactions between systems that would otherwise require a human to translate between them. A form could be filled in automatically because the system filling it in knows how the data it has matches up to the form data being requested. This concept could be especially useful in IoT, where many independently designed devices generate data and could communicate dynamically because they all speak the same language.

JSON-LD (JSON for Linked Data) is a media type that enables Level 3 style self-description of data. JSON-LD uses vocabularies to give a shared semantic meaning to all of your JSON property names. A fairly large and widely used vocabulary is Schema.org. The following is an example of a JSON-LD document.


{
		"@id": "/api/blog/23",
		"@type": "http://schema.org/BlogPosting",
		"http://schema.org/headline": "The Hypermedia Maturity Model",
		"http://schema.org/articleBody": "...",
		"http://schema.org/author": {
				"@id": "/api/author/1"
		},
		"http://schema.org/comment": {
				"@id": "/api/blog/23/comments"
		}
}
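
To illustrate the kind of automation a shared vocabulary allows, here is a small Python sketch. It is not a JSON-LD processor: it assumes the document uses full property IRIs as keys, as the example above does, and it ignores contexts and compaction entirely. The function fill_fields and the consumer's field names are hypothetical.

```python
SCHEMA = "http://schema.org/"


def fill_fields(wanted: dict, document: dict) -> dict:
    """Fill a consumer's own field names by matching vocabulary IRIs.

    `wanted` maps each local field name to the IRI of the property it means.
    Properties missing from the document are simply left out.
    """
    return {local: document[iri] for local, iri in wanted.items() if iri in document}


post = {
    "@id": "/api/blog/23",
    "@type": SCHEMA + "BlogPosting",
    SCHEMA + "headline": "The Hypermedia Maturity Model",
    SCHEMA + "author": {"@id": "/api/author/1"},
}

# A hypothetical consumer that calls the same concepts "subject" and "writer".
wanted = {"subject": SCHEMA + "headline", "writer": SCHEMA + "author"}

filled = fill_fields(wanted, post)
print(filled)
```

Neither side had to know the other's field names. They only had to agree on the vocabulary, which is what makes the matching something a machine can do without a human translating.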

Although JSON-LD provides a way to link to other JSON-LD documents, it's not a hypermedia format by itself. It's missing much of the richness of a true hypermedia format. That's where Hydra comes in. Hydra is a Level 3 hypermedia format that defines a vocabulary for describing rich hypermedia controls in JSON-LD documents.

You're not doing it wrong

The RMM is a model for the REST architectural style. If your API is not Level 3, then it doesn't fully conform to the REST architectural style. HMM is different. There is no defined architecture we are trying to conform to. In that sense, it's more of a categorization model than a maturity model. You're not doing it wrong because your API is not HMM Level 3. Not every application needs that level of self-descriptiveness. A mostly read-only API doesn't need to use a Level 2-capable media type. If your API will be consumed primarily by humans, as with HTML, there isn't a lot of value in Level 3 self-descriptive data because humans can interpret natural language text just fine.

However, when deciding how much self-descriptiveness you need, remember that the more self-descriptive your API, the more powerful your tools can be. Think of a web browser. We can interact with a website without knowing anything about HTML, HTTP, URIs, and so on, because the site is self-descriptive enough that the browser can abstract away all the technical details. Currently, we don't have tooling at that level for working with APIs, but as the tools mature, hypermedia will become even more valuable.

What's Next

In this series, we've been focusing on hypermedia, but in the next post we'll take a step back and do a deep dive on the REST architectural style.