Automate the Canonical <link>
Overview
Search engines often index multiple pages on your site that contain the same content. This can happen for a variety of reasons including the following taken directly from the Google webpage on canonicalization.
- Region variants: for example, a piece of content for the USA and the UK, accessible from different URLs, but essentially the same content in the same language
- Device variants: for example, a page with both a mobile and a desktop version
- Protocol variants: for example, the HTTP and HTTPS versions of a site
- Site functions: for example, the results of sorting and filtering functions of a category page
- Accidental variants: for example, the demo version of the site is accidentally left accessible to crawlers
But there are a few ways to tell search engines which URL they should consider as canonical. Now, that doesn't mean that the search engine will use that URL, but it's helpful; specifying the canonical URL is a hint, not a rule. However, let's make it easy to do.
Google specifies three ways to indicate which URL you want to specify as the canoncial one. This article shows how to use SSI to automatically insert a <link>
element in the <head>
section of your document.
Solution
With Server-Side Includes (SSI), it's easy have have this information automatically created for each page on your website. Simply add the following line of code to the <head>
section of your document.
<link href='<!--#echo var="REQUEST_SCHEME" -->://<!--#echo var="SERVER_NAME" --><!--#echo var="REQUEST_URI" -->' rel='canonical' />
Explanation
The <!--#echo var="REQUEST_SCHEME" -->
returns http or https. The <!--#echo var="SERVER_NAME" -->
returns the server name, and <!--#echo var="REQUEST_URI" -->
returns path to the file and filename.
Used on This Site
I use this on every page on this site. If you view the source of this page you'll see this, <link href='https://www.buildingwebprojects.com/articles/automate-the-canonical-link/index.shtml' rel='canonical' />