SEO for Headless WordPress Themes

This is a written version of the “SEO for headless WordPress themes” talk that Luis and Reyes gave at the JavaScript for WordPress Conference. You can find the slides here and watch the full talk on YouTube.

If you want to build a headless WordPress theme, you need to get your SEO right. In this talk, Reyes and Luis walk you through all the SEO challenges you need to consider when building a headless WordPress theme and how to deal with them.

Table of contents:

Introduction
Main SEO challenges
  1. JavaScript and Server Side Rendering
  2. Search Engine Requirements
  3. Important Performance Metrics

Introduction

What is a headless CMS?

Let’s start by explaining this concept as it will help us understand what comes later. These illustrations explain it quite well:

[Image: traditional CMS vs. headless CMS]

On the left, you can see a traditional CMS, which is a database-driven structure where the front and backend functionalities are deeply coupled. It can only push content to web-based applications.

On the right side, you can see a headless CMS architecture, which is sometimes also referred to as decoupled. This is API driven and comes without a frontend out-of-the-box. By decoupling the frontend from the backend functionalities, you can manage both separately.

[Image: headless CMS structure]

On the one hand, a headless CMS allows you to push content to different channels and devices at the same time. For example, a wearable device, a website, or anything connected via Internet of Things.

On the other hand, it allows you to use different technologies for the applications that consume the data via the API and display the content. For example, in WordPress, this headless approach makes it possible to use JavaScript for your theme instead of PHP.

Now you may think… “Okay, I can already include JavaScript files in PHP”, right?

Well, yes. But we are talking about JavaScript UI libraries like React or Vue. And there is one thing you can’t do in PHP, which is Server Side Rendering for JavaScript. But don’t worry, we’re going to explain that later.

In any case, WordPress was not initially intended to be a headless CMS. But in 2016, the merge of the REST API into core made it easier to build headless WordPress themes with JavaScript. Since then, WordPress development has shifted in that direction because of the flexibility provided by these headless solutions.

Example of a headless WordPress installation

This is a typical headless WordPress installation:

[Image: typical headless WordPress installation]

On the left: content creators go to the WordPress dashboard to add or modify content. That’s a PHP server running WordPress.

On the right: visitors go to www.site.com and that’s a separate Node server running React. Data from WordPress is retrieved using the REST API.
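To make the REST API step more concrete, here is a minimal sketch of how the Node server could ask WordPress for a post. The host `wp.example.com` and the function names are hypothetical, and it assumes WordPress is installed at the root of its domain and that a global `fetch` is available (as in recent Node versions). The `/wp-json/wp/v2/posts` route is the standard posts endpoint of the WordPress REST API.

```typescript
// Minimal shape of a post as returned by the WordPress REST API
// (only the fields used in this sketch).
interface WpPost {
  id: number;
  slug: string;
  title: { rendered: string };
  content: { rendered: string };
}

// Build the REST API URL for a single post, looked up by its slug.
// Assumes WordPress lives at the root of wpBaseUrl.
function buildPostUrl(wpBaseUrl: string, slug: string): string {
  const url = new URL("/wp-json/wp/v2/posts", wpBaseUrl);
  url.searchParams.set("slug", slug);
  return url.toString();
}

// Fetch the post from WordPress. The endpoint returns an array,
// even when only one post matches the slug.
async function fetchPostBySlug(
  wpBaseUrl: string,
  slug: string
): Promise<WpPost | undefined> {
  const response = await fetch(buildPostUrl(wpBaseUrl, slug));
  if (!response.ok) {
    throw new Error(`WordPress responded with status ${response.status}`);
  }
  const posts = (await response.json()) as WpPost[];
  return posts[0];
}
```

The frontend server would call something like `fetchPostBySlug("https://wp.example.com", "hello-world")` while rendering the page for that post.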

Benefits of headless WordPress themes

Apart from the flexibility of a headless architecture, a WordPress theme built with JavaScript has many benefits.

⚡️Performance

The first one is performance. A headless WordPress theme can be optimized to get the best possible performance and improve your website’s speed.

A fast website is important not only for SEO (as page speed is a ranking factor for Google searches) but for the user experience as well.

No one likes to stare at an empty screen for more than 3 seconds. That’s why pages with a longer load time tend to have higher bounce rates, lower average time on page and have been shown to negatively affect conversions.

Conclusion: performance is a key factor to make sure users come back to your website and engage with your content.

⚛︎ Modern UI libraries

The second benefit is that this approach sets developers free from the structures of the backend and allows them to use modern UI libraries like React or Vue for the frontend, which are becoming essential to rich user experiences.

🔒 Security and scalability

In addition, by building a headless WordPress theme, the frontend is more scalable and the backend can be locked down to improve security.

🚀 Future-proof

Lastly, a headless WordPress theme is future-proof, as it integrates easily with new technology and devices (as long as the API endpoint is available).

What about SEO?

We have talked about a headless CMS structure and the benefits of building a headless WordPress theme, but what happens with the SEO?

Traditional CMS platforms like WordPress work well with today’s SEO because they are mainly HTML-based. Search crawlers, or spiders, were originally designed for crawling through HTML.

In addition, these platforms have made it easier to optimize a website by installing a number of plugins which tell you how “SEO friendly” your web page is.

With a headless CMS, since it is a backend-only solution, you don’t have the standard SEO functionality that you typically get with a traditional CMS. For example, meta tags are not automatically output (you would need to do this process manually).

This means that if you use WordPress as a headless CMS, you will need to do some additional development to implement SEO.

But don’t worry, there is good news too! First, most traditional SEO rules don’t change. Things like keyword research, link building or regular posting of high quality content still apply.

And second, if you are building a headless WordPress theme don’t give up because of the SEO. There are challenges of course, but there are solutions too, and a lot of things that can be implemented to keep your SEO up.

Main SEO challenges

There are 3 important challenges to keep in mind when developing a headless WordPress theme. We’ll go over each of them in detail:

  1. JavaScript and Server Side Rendering
  2. Search Engine Requirements
  3. Important Performance Metrics

1. JavaScript and Server Side Rendering

1.1. How does Google index your site?

You’ve probably heard that Google is now able to render JavaScript, right?

Well, yes, but no. Let us explain that.

This (see image below) is the normal crawler that’s been around for decades. It only understands HTML and is pretty fast.

[Image: how Google indexes your site, the HTML crawler]

A while ago, Google introduced a new renderer which is also capable of rendering JavaScript. It is still slow and expensive, so they don’t use it very often. That means that it may take several weeks for it to get to your site.

[Image: how Google indexes your site, the JavaScript renderer]

Long story short: if you are serious about SEO, you cannot rely on the JavaScript renderer yet. You still need valid HTML.

1.2. What’s the problem with JavaScript themes?

So, what’s the problem with JavaScript themes? Let’s start from the beginning.

This is a normal PHP theme, living in a PHP server:

[Image: problem with JavaScript themes, part 1]

The PHP server is able to run PHP and to generate an HTML file which is sent to the browser. The browser understands both HTML and JavaScript. This way everything is fine and Google is happy.

But what happens when we want to generate our themes with JavaScript libraries, like React or Vue?

Our PHP server is not able to run JavaScript, so it’s not able to generate the HTML. All it can do is send an empty HTML file and the JavaScript files.

[Image: problem with JavaScript themes, part 2]

The browser receives an empty HTML file (that’s bad) but it understands JavaScript, so it’s able to run the JavaScript files and show the user the final HTML (not that bad).

This approach works but is not ideal because you are sending an empty HTML, which is bad for SEO and not good for the user experience either.

If our rendering logic is in JavaScript, we need to use a JavaScript server, like Node.js.

[Image: problem with JavaScript themes, part 3]

This server is able to run JavaScript, create the HTML files, and send them to the browser. Everything works fine again and Google is happy again!

Before we move on, let’s explain this further from the perspective of the servers and the concepts of server side rendering and client side rendering.

1.3. Server Side Rendering & Client Side Rendering

Static Sites

First, let’s see what a static server looks like.

 

This is the simplest server possible:

  • There’s one HTML file stored for each URL of your site.
  • It is fast, but you may want to use a CDN to be as close as possible to the end user.
  • And what happens when the HTML gets to the browser? Since it’s HTML it works fine, nothing fancy here.

But this approach has two drawbacks:

  • First, when the user navigates through your site, the browser needs to start the process all over again: throw away all the DOM > white screen again > wait for the new HTML > and finally render the new page.
  • Second, it doesn’t work well with dynamic content stored in a database.

To solve that, somebody said “Hey, let’s improve this”, and they invented server side rendering.

Server Side Rendering

Here, instead of HTML files, we have code in the server, logic that gets data from a database and generates an HTML file for each URL.

 

Of course, this is slower than serving static files. But if you add a CDN, it is as fast as the static server.

Drawback: This works great for dynamic content but the process still has to start all over again each time users want to visit a new URL.

To solve that, again, somebody came and said “Hey, we want websites to be like native apps”. This is where the client side rendering comes in.

Client Side Rendering

To make websites be like native apps, all the logic must be downloaded and executed in the client, no server code, only client code.

And what’s the only language browsers can run? YES, JavaScript. So they invented client side rendering and with that, JavaScript UI libraries like React or Vue.

 

As we have seen, this has a major drawback: the HTML file is empty, which is bad for SEO. But when JavaScript takes control of the page it’s good. It feels like a native app and users are able to navigate inside the app without having to reach the server again, only fetch the required data using APIs. The user experience is great.

Server & Client Rendering

Finally, what about server side rendering AND client side rendering? Can we do both?

It turns out that we can! The only thing that we need is a JavaScript server.

This approach works great with dynamic content. And, again, if we use a CDN it is as fast as serving static files. It has all the content in the HTML, so Google is happy again! And once JavaScript takes control of the page, it feels like a native app.
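As a rough, dependency-free sketch of the idea: the JavaScript server sends HTML that already contains the content, plus the data and the script the client needs to take control afterwards. The names `renderDocument`, `PageData` and `window.__INITIAL_STATE__` are hypothetical, and a plain string template stands in for the UI library's own server renderer, which the frameworks below handle for you.

```typescript
interface PageData {
  title: string;
  bodyHtml: string; // assumed to be trusted, already-sanitized HTML from the CMS
}

// Escape text before placing it inside HTML elements or attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Server side: build the complete HTML document for one URL.
function renderDocument(data: PageData, bundleUrl: string): string {
  // Escape "<" so the serialized JSON cannot close the script tag early.
  const serializedState = JSON.stringify(data).replace(/</g, "\\u003c");
  return [
    "<!DOCTYPE html>",
    "<html>",
    `<head><title>${escapeHtml(data.title)}</title></head>`,
    "<body>",
    // The content is in the HTML itself, so crawlers do not need JavaScript.
    `<div id="root"><h1>${escapeHtml(data.title)}</h1>${data.bodyHtml}</div>`,
    // The same data is embedded so the client code can take over
    // without asking the API for it again.
    `<script>window.__INITIAL_STATE__ = ${serializedState};</script>`,
    `<script async src="${escapeHtml(bundleUrl)}"></script>`,
    "</body>",
    "</html>",
  ].join("\n");
}
```

Once the bundle loads in the browser, it can read the embedded state, attach its event handlers to the existing markup, and handle further navigation on the client.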

 

Mission accomplished, right? Well, no…

You are now probably like “OMG! This is super complicated stuff!” 😱

Don’t worry, you can use JavaScript frameworks, which take care of all this for you. Let’s briefly introduce some of them.

1.4. Meet the JavaScript frameworks

Gatsby

Gatsby is a bit special because it’s a static site generator and it works with a data query language called GraphQL.

It’s great because you end up with a static server, which is cheap. But it has the drawback that you need to recreate all the HTML files each time you update content in WordPress.

This framework is not focused on WordPress, but they have great support for this CMS using the GraphQL language.

Next.js

Next.js is a server side rendering solution. Technically, it also has static exports, but there’s no easy solution to use that with the dynamic content of WordPress.

It doesn’t have anything specific for this CMS but you can connect manually using the REST API.

What is great about Next.js is that it works with serverless, which is a kind of super cheap JavaScript server that turns itself on and off automatically.

It is not as cheap as a static server but it’s quite affordable.

Frontity

Frontity does server side rendering and works with serverless as well.

Its differential value is that it’s focused on WordPress, so everything regarding this CMS works out of the box.

Nuxt.js

Nuxt is very similar to Next.js, but made for Vue.js.

2. Search Engine Requirements

This is the second SEO consideration to keep in mind. As part of these requirements we have included metadata, robots.txt and sitemaps.

2.1. Metadata

Metadata is all the information about a page that you send to a search engine that is not visible to your visitors. It is one of the most important practices to optimize your site for SEO.

However, not all metadata is relevant for SEO. Let’s see some of the elements that can influence your rankings. For example, the basic meta tags that go inside the <head> section of your site.

  • Meta title: this is the main title of a page and one of the most important elements for SEO. Titles are the first piece of information that search users see, and they use them to decide whether a page is relevant to their query. As people can use different devices to view your website, remember to keep your <title> descriptive and short.
  • Meta description: this is a short description of the page and can be displayed below the title in Google’s search results. Sometimes Google prefers to display another snippet from your page to match a specific search query. However, this does not mean that you should forget about it. It is important too.
  • Canonical URL: when multiple pages have similar content, search engines consider them duplicate versions of the same page. You should always use this element to prevent duplicate content and point Google to the original source of that content (or tell the search engine which version to crawl).
  • Social metadata: it lets you control the way your articles or pages are shared on social media networks by defining how titles, descriptions, and more appear in social streams.

Platforms like Facebook, LinkedIn or Pinterest use a protocol created by Facebook called Open Graph for the social share preview, while Twitter has its own meta tags. They are similar to the Open Graph protocol, but they use the “twitter” prefix instead of the “og” one.

There are a lot of Open Graph and Twitter meta tags, but you don’t have to use all of them. Here you can see the basic ones that you should keep in mind:

[Image: examples of basic meta tags]
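As a rough illustration of such a basic set, the sketch below renders the title, description and canonical tags together with a handful of commonly used Open Graph and Twitter tags. The function and type names are hypothetical, and the selection of tags is our own assumption of a sensible minimum, not necessarily the exact list from the slide.

```typescript
interface PageMeta {
  title: string;
  description: string;
  canonicalUrl: string;
  imageUrl: string;
}

// Escape a value before placing it inside an HTML attribute or the <title>.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Return the tags that go inside <head>, one string per tag.
function renderMetaTags(meta: PageMeta): string[] {
  const title = escapeAttr(meta.title);
  const description = escapeAttr(meta.description);
  const url = escapeAttr(meta.canonicalUrl);
  const image = escapeAttr(meta.imageUrl);
  return [
    `<title>${title}</title>`,
    `<meta name="description" content="${description}">`,
    `<link rel="canonical" href="${url}">`,
    // Open Graph tags use the "property" attribute and the "og:" prefix.
    `<meta property="og:type" content="article">`,
    `<meta property="og:title" content="${title}">`,
    `<meta property="og:description" content="${description}">`,
    `<meta property="og:url" content="${url}">`,
    `<meta property="og:image" content="${image}">`,
    // Twitter tags use the "name" attribute and the "twitter:" prefix.
    `<meta name="twitter:card" content="summary_large_image">`,
    `<meta name="twitter:title" content="${title}">`,
    `<meta name="twitter:description" content="${description}">`,
    `<meta name="twitter:image" content="${image}">`,
  ];
}
```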

If you want to learn more, you can find a complete list of all the Open Graph and Twitter meta tags in the links below:

How to add metadata manually

As meta tags are not automatically output in a headless WordPress theme, you have to do this process manually.

In Next, Frontity and Gatsby:

You do it inside React with a component called <Head> (or <Helmet> in Gatsby).

[Image: how to add metadata in Frontity, Next and Gatsby]

It’s very simple. You can add that component wherever you want in your app, and it moves whatever you put inside to the <head> tag of your HTML.

In Nuxt.js:

You have to use their head() method with an array of the meta tags you want to include. It is that simple.
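Here is a minimal sketch of the object such a head() method could return. The `buildHead` helper and the `PostSummary` type are hypothetical, and the interfaces are declared locally so the example stands on its own. The `title` and `meta` keys and the `hid` identifier follow the shape Nuxt uses for head(), as far as we know.

```typescript
// Local stand-ins for the shape returned by head(); declared here only
// so the sketch is self-contained.
interface MetaEntry {
  hid: string; // unique id so a page can override a tag set by a layout
  name?: string;
  property?: string;
  content: string;
}

interface HeadConfig {
  title: string;
  meta: MetaEntry[];
}

interface PostSummary {
  title: string;
  excerpt: string;
}

// Build the head configuration for a post.
function buildHead(post: PostSummary): HeadConfig {
  return {
    title: post.title,
    meta: [
      { hid: "description", name: "description", content: post.excerpt },
      { hid: "og:title", property: "og:title", content: post.title },
      { hid: "og:description", property: "og:description", content: post.excerpt },
    ],
  };
}
```

In a page component, the head() method would then simply return `buildHead(this.post)`, assuming the post was loaded into the component's data.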

In case you want to use Yoast:

You need to install an additional plugin called wordpress-API-yoast-meta so the metadata from Yoast is exposed in the REST API.

  • Then, in Gatsby and Next, you can use the same <Head> component.
  • In Frontity, you can use the @frontity/yoast package and everything is added automatically for you.
  • And in Nuxt, you can just use the same head() method again.

2.2. Robots.txt

The robots.txt file is one way of telling a search engine where it can and can’t go on your website, or in other words, which URLs on that site it is allowed to crawl.

Adding robots.txt
  • Gatsby: you can use their gatsby-robots plugin.
  • Frontity: a robots file is added by default (we are also preparing a package to configure it further).
  • Next and Nuxt: you need to create a robots.txt file in the /public (Next) or /static (Nuxt) folder.

As you can see, adding robots is pretty simple!
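If you prefer to generate the file instead of writing it by hand, a small helper like the following could produce its contents. This is only a sketch: `buildRobotsTxt` and its options are hypothetical names, and the `/search/` path is just an example. The `User-agent`, `Disallow` and `Sitemap` lines are the standard robots.txt directives.

```typescript
interface RobotsOptions {
  disallow: string[]; // paths crawlers should stay away from
  sitemapUrl?: string; // absolute URL of the sitemap, if there is one
}

// Build the text of a robots.txt file that applies to all crawlers.
function buildRobotsTxt(options: RobotsOptions): string {
  const lines: string[] = ["User-agent: *"];
  if (options.disallow.length === 0) {
    // An empty Disallow means everything may be crawled.
    lines.push("Disallow:");
  } else {
    for (const path of options.disallow) {
      lines.push(`Disallow: ${path}`);
    }
  }
  if (options.sitemapUrl) {
    lines.push("", `Sitemap: ${options.sitemapUrl}`);
  }
  return lines.join("\n") + "\n";
}
```

For example, `buildRobotsTxt({ disallow: ["/search/"], sitemapUrl: "https://www.site.com/sitemap.xml" })` returns a file that blocks the search pages and points crawlers to the sitemap.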

2.3. Sitemaps

Sitemaps are another important requirement for SEO because they help search engines to easily understand your website structure while crawling it.

Adding sitemaps
  • Gatsby: it has good support for sitemaps with their gatsby-sitemap plugin.
  • Frontity: we want to make use of the WordPress sitemap plugins. There is actually a conversation going on about including the sitemaps in the WordPress core, so we will release a package for sitemaps taking into account all these scenarios.
  • Next and Nuxt: as far as we know, there’s no simple way to generate a dynamic sitemap like this, which means that you will have to create one yourself.
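If you do end up creating one yourself, the XML is simple enough to build from a list of URLs, for example the posts and pages returned by the REST API. The sketch below uses hypothetical names (`buildSitemapXml`, `SitemapEntry`) and follows the sitemaps.org format with the `urlset`, `url`, `loc` and `lastmod` elements. How you collect the entries and serve the result is left out.

```typescript
interface SitemapEntry {
  loc: string; // absolute URL of the page
  lastmod?: string; // last modification date, e.g. "2020-01-31"
}

// Escape the characters that are not allowed as-is inside XML text.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap document with one <url> element per entry.
function buildSitemapXml(entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? `<lastmod>${escapeXml(entry.lastmod)}</lastmod>`
      : "";
    return `  <url><loc>${escapeXml(entry.loc)}</loc>${lastmod}</url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}
```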

3. Important Performance Metrics

By now, we have seen two of the three SEO challenges (JavaScript and Server Side Rendering and then the Search Engine Requirements). Let’s go over the last one: performance metrics.

As previously mentioned, performance is very important, not only for the search engine optimization but for the user experience as well.

3.1. FCP, FMP and TTI metrics

As we can see from the image below, one of the most important performance metrics after the blank screen is the First Contentful Paint (FCP).

The First Contentful Paint (FCP) is triggered when any content from the DOM is painted. This could be text, an image, an SVG, or a canvas render. This metric is important for users as it provides feedback that the page is actually loading.

[Image: performance metrics]

Then we have the First Meaningful Paint (FMP). This is the time when the primary content of a page appears on the screen.

Definitions of primary content differ depending on pages. For example, for blog articles, it would be the headline + the above the fold text. But if an image is critical to your page (e.g. if you have an e-commerce product page), then the First Meaningful Paint requires that image to be visible.

This is going to be our primary metric for the user-perceived loading experience.

Finally, we have to consider the Time to Interactive (TTI) metric, which measures how long it takes a page to become interactive. Or in other words, when the users get the real thing and can start using it.

3.2. How to improve the First Meaningful Paint

Technically, the First Meaningful Paint is the time between the moment a user enters the URL of your site, or clicks on a link that points to your site, and the moment that the browser renders the HTML of your site.

If you do proper server side rendering, your contentful and meaningful times should be pretty similar.

There are two things you can do to improve them:

  • First, use a CDN to serve your HTML files as fast as possible, that’s a must.
  • And second, make sure your HTML is ready for render. What do we mean by this?

HTML files should be ready for render
  1. Use server side rendering to generate proper HTML files (we hope that’s clear now after this talk!).
  2. For CSS, you may want to use a library like Emotion or Styled-components because they put all the CSS in the <head> tag of your HTML files. This way the browser doesn’t need to download any additional CSS file. You may say “Well, that sounds kind of tricky”, but that’s what Google itself is doing in their own Google AMP format. Believe us, it works and it is fast.
  3. Fonts: try to avoid them as much as possible. If you absolutely can’t, use font-display: swap so you don’t block the HTML render.
  4. And last but not least, make sure you always load JavaScript asynchronously, again so you don’t block the HTML render.
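Points 2 to 4 can be pictured as the <head> your server sends. The sketch below uses hypothetical names (`renderHead`, `HeadOptions`) and assumes all values are trusted, build-time strings. In practice a library like Emotion would produce the inlined CSS for you. `font-display: swap` and the `async` attribute are standard CSS and HTML features.

```typescript
interface HeadOptions {
  criticalCss: string; // CSS to inline, so no extra stylesheet request is needed
  fontFamily: string;
  fontUrl: string; // URL of a WOFF2 font file
  scriptUrl: string; // URL of the JavaScript bundle
}

// Build a <head> that does not block the first render.
function renderHead(options: HeadOptions): string {
  return [
    "<head>",
    "<style>",
    // font-display: swap shows fallback text until the web font arrives.
    `@font-face { font-family: "${options.fontFamily}"; src: url("${options.fontUrl}") format("woff2"); font-display: swap; }`,
    options.criticalCss,
    "</style>",
    // async lets the browser keep parsing the HTML while the script downloads.
    `<script async src="${options.scriptUrl}"></script>`,
    "</head>",
  ].join("\n");
}
```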

3.3. How to improve the Time to Interactive

The Time to Interactive is the time between the moment the browser renders the HTML and the moment the browser finishes the execution of JavaScript.

In order to improve this metric, the most important thing you can do is to reduce the JavaScript bundle size.

How to reduce JavaScript bundle size

What techniques can we use to reduce the bundle size?

  1. First, remove all the unused modules. We’ll show you later how you can inspect your modules.
  2. Second, make sure tree-shaking is active in your framework. Tree-shaking is a new technique that removes unused code. When you create your JavaScript bundle it works automatically, but it must be supported by your framework.
  3. Third, you should ship different code for new browsers that support ES6 (that is, modern JavaScript). This is because code that works in old browsers is bigger due to the transpilation and the polyfills it needs.
  4. And finally, use code-splitting. Let’s see this point in detail.

Code-splitting

Code-splitting means that instead of sending the same JavaScript bundle to all the URLs, we are going to split our JavaScript bundle into smaller chunks and send only what’s necessary.

For example, if we are in a home page, we only send the JavaScript needed for the home page. If we are in a post, we only send the JavaScript needed for the posts.

Then, once everything is loaded, if the user navigates to other parts of the site you can load that code. Actually, you can even prefetch that code if you think a user is likely to move to another part of the site.

Again, this looks super complicated. But all these tools are included in the main frameworks with APIs that are easy to use and understand.
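To give an idea of what happens under the hood, here is a small sketch of a per-route chunk loader with prefetching. All names are hypothetical. In a real project each loader would be a dynamic import such as `() => import("./Post")`, which the bundler turns into a separate file. Here, resolved promises stand in for those chunks so the example is self-contained.

```typescript
type Chunk = { render: () => string };
type ChunkLoader = () => Promise<Chunk>;

// Create a loader that fetches each route's chunk at most once.
function createRouteLoader(loaders: Record<string, ChunkLoader>) {
  const cache = new Map<string, Promise<Chunk>>();

  function load(route: string): Promise<Chunk> {
    const cached = cache.get(route);
    if (cached) return cached;
    const loader = loaders[route];
    if (!loader) {
      return Promise.reject(new Error(`No chunk registered for route "${route}"`));
    }
    const pending = loader(); // the download starts here
    cache.set(route, pending);
    return pending;
  }

  // Start the download early, without waiting for the result.
  function prefetch(route: string): void {
    void load(route).catch(() => undefined);
  }

  return { load, prefetch };
}

// Stand-ins for real chunks; the counter only shows how often "post" is requested.
let postChunkRequests = 0;
const routes = createRouteLoader({
  home: () => Promise.resolve({ render: () => "<h1>Home</h1>" }),
  post: () => {
    postChunkRequests += 1;
    return Promise.resolve({ render: () => "<article>Post</article>" });
  },
});
```

On the home page only the `home` chunk is loaded. Calling `routes.prefetch("post")` when a user hovers a link to a post starts that download early, and a later `routes.load("post")` reuses the same request.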

[Image: how code-splitting works]

React Concurrent Mode

Finally, there’s another thing coming that we believe will help with Time to Interactive: React Concurrent.

Right now, the current React (called React Sync) does its rendering in a single chunk. But React Concurrent divides the work into smaller chunks, so they never block the CPU.

[Image: React Concurrent explained]

The React team has been working on this for a while now, so it will probably be released later this year.

We’re sure Vue will eventually get something like this too.

3.4. Useful tools

To conclude this talk, these are some of the Google tools which you can use to analyze and optimize your metrics:

And these ones are useful to analyze JavaScript bundle size:

Luis went into more detail about these tools during our talk. Feel free to check it out (starts at 30:29).

Closing words

We hope this post gives you a better understanding of the SEO challenges that come along with building a headless WordPress theme, how to solve them, and most importantly, how to keep your SEO up with the right tools and knowledge base.

If you have any questions, hit us up on Twitter or our community forum. We will be happy to help!
