Thursday, May 30, 2024

Optimizing SEO Headers for Digital Assets — Adobe Experience Manager (AEM)

Search Engine Optimization (SEO) is crucial for ensuring that your digital content is easily discoverable by search engines and, consequently, your target audience. While much attention is often given to optimizing content pages, it’s equally important to focus on the SEO of digital assets such as images, videos, and documents. This is especially true for platforms like Adobe Experience Manager (AEM), where managing and delivering a vast range of digital assets is a core function. In this blog, we’ll explore the best practices for applying SEO headers to assets in AEM, focusing on the use of noindex and hreflang.

When it comes to digital assets, adding SEO headers such as noindex or hreflang can be more complex than for standard content pages. Unlike content pages, assets often serve a supporting role and may not need to be indexed by search engines. However, when they do, precise handling is required to ensure proper indexing and language targeting.

SEO Configurations for Content Pages:

For content pages, there are two primary methods to enable SEO configurations:

1. HTML Meta Tags:

SEO settings like noindex and hreflang can be added within the HTML head of content pages: noindex as a robots meta tag, and hreflang as an attribute on a link element with rel="alternate". These tags are included directly in the page’s HTML, either manually or through content authoring tools that automate their insertion.

2. HTTP Headers:

SEO directives such as noindex and hreflang can also be set at the server level using HTTP headers. This method provides an alternative to HTML meta tags for controlling SEO settings, allowing for centralized management of SEO configurations.

SEO Configurations for Assets:

For assets, the only viable method to implement SEO configurations is through HTTP headers, as assets do not have HTML structures where meta tags can be embedded:

1. HTTP Headers for Assets:

HTTP headers such as X-Robots-Tag for noindex and Link for hreflang can be used to manage SEO settings for digital assets like images, PDFs, and videos. These headers must be configured at the server or dispatcher level, ensuring that the appropriate SEO directives are applied to assets.

Example Syntax for Assets:

noindex:

The noindex directive tells search engines not to index a particular asset. This can be crucial for preventing duplicate content issues or for keeping non-essential assets out of search engine results.

X-Robots-Tag: noindex

Selective Application: Apply noindex to assets that do not provide direct value in search results, such as decorative images, icons, or internal documentation files.
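
As a rough illustration of selective application, the rule can be expressed as a list of folder prefixes whose assets should carry the noindex header. This is a minimal sketch, not an AEM feature: the class name NoindexRule, the method shouldNoindex, and the /content/dam/example/... folder names are all hypothetical placeholders.

```java
import java.util.List;

// Hypothetical rule set: the folder names below are placeholders, not AEM defaults.
class NoindexRule {

    private static final List<String> NOINDEX_PREFIXES = List.of(
        "/content/dam/example/icons/",
        "/content/dam/example/decorative/",
        "/content/dam/example/internal-docs/");

    // True when the asset path sits under one of the folders marked as noindex.
    static boolean shouldNoindex(String assetPath) {
        if (assetPath == null) {
            return false;
        }
        for (String prefix : NOINDEX_PREFIXES) {
            if (assetPath.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(shouldNoindex("/content/dam/example/icons/arrow.svg"));
        System.out.println(shouldNoindex("/content/dam/example/brochures/product.pdf"));
    }
}
```

Running main prints true for the icon and false for the brochure. Keeping the rule as a folder list mirrors how pattern-based server rules are usually organized, so the same list can later be translated into web server configuration.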

hreflang:

The hreflang attribute is essential for assets that have multiple language versions. It helps search engines understand which version of an asset to serve based on the user's language preference.

Link: <https://example.com/assets/test-en.pdf>; rel="alternate"; hreflang="en", <https://example.com/assets/test-es.pdf>; rel="alternate"; hreflang="es"

Consistent Implementation: Ensure that hreflang tags are consistently applied across all versions of an asset.
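
One way to check this consistency is to compare the language codes advertised by each version of an asset. The sketch below is a standalone illustration under stated assumptions: the class HreflangConsistencyCheck, the method findInconsistent, and the input shape (asset URL mapped to the set of language codes found in its Link header) are hypothetical and not part of AEM.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: not part of AEM or any library.
class HreflangConsistencyCheck {

    // Returns the URLs whose advertised language set differs from the union
    // of all language codes seen across the versions.
    static List<String> findInconsistent(Map<String, Set<String>> languagesByUrl) {
        Set<String> allLanguages = new LinkedHashSet<>();
        for (Set<String> languages : languagesByUrl.values()) {
            allLanguages.addAll(languages);
        }
        List<String> inconsistent = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : languagesByUrl.entrySet()) {
            if (!entry.getValue().equals(allLanguages)) {
                inconsistent.add(entry.getKey());
            }
        }
        return inconsistent;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> languagesByUrl = new LinkedHashMap<>();
        // The English version lists both alternates; the Spanish one omits "es".
        languagesByUrl.put("https://example.com/assets/test-en.pdf", Set.of("en", "es"));
        languagesByUrl.put("https://example.com/assets/test-es.pdf", Set.of("en"));
        System.out.println(findInconsistent(languagesByUrl));
    }
}
```

With the sample data, main prints [https://example.com/assets/test-es.pdf], because that version does not advertise the full set of alternates.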

Implementing SEO Headers for Assets in AEM:

Implementing these headers in AEM involves a combination of configuration and customization:

1. Dispatcher/CDN Configuration:

The Dispatcher/CDN configuration can be used to enable SEO headers based on specific patterns or directories. This configuration is usually done at the web server level (e.g., Apache) and is effective for broad, pattern-based rules.

Apache Configuration Examples

Folder Level Noindex Header:

<Directory /path/to/your/noindex-directory>
    Header set X-Robots-Tag "noindex"
</Directory>

File Level Noindex and Hreflang Headers:

# <Files> matches the file name only, not a full path.
# Wrap it in a <Directory> block to limit it to one folder.
<Files "specific-file.html">
    Header set X-Robots-Tag "noindex"
    Header set Link "<http://example.com/specific-file-en.html>; rel=\"alternate\"; hreflang=\"en\",<http://example.com/specific-file-es.html>; rel=\"alternate\"; hreflang=\"es\""
</Files>

2. Custom Approach through Authoring

For more granular control, such as enabling SEO headers at the individual asset level, a custom approach is required. This involves using AEM’s metadata schema to allow authors to control SEO headers and implementing a custom Java filter to apply these headers to asset responses.

Step-by-Step Implementation

Step 1: Metadata Schema Customization

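1. Create or Edit Metadata Schema:

  • Navigate to Tools > Assets > Metadata Schemas in AEM.
  • Create a new schema or edit an existing one.
  • Add fields for noindex and hreflang, mapped to the noindex and hreflang properties on the asset's jcr:content/metadata node, which is where the filter in Step 2 reads them. For example:
      • noindex: Checkbox
      • hreflang: Multi-value text field with one entry per language in the form language:URL, for example en:https://example.com/assets/test-en.pdf

2. Apply Metadata Schema:

  • Apply the schema to the relevant asset folders or types.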

Step 2: Custom Java Filter

Create a custom Sling filter to read the metadata and set the appropriate headers.

1. Create a Sling Filter:

package com.example.aem.filters;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ValueMap;
import org.apache.sling.engine.EngineConstants;
import org.osgi.service.component.annotations.Component;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Runs for requests under /content/dam and adds SEO headers from asset metadata
@Component(service = Filter.class, property = {
    EngineConstants.SLING_FILTER_SCOPE + "=" + EngineConstants.FILTER_SCOPE_REQUEST,
    EngineConstants.SLING_FILTER_PATTERN + "=/content/dam/.*"
})
public class SEOHeadersFilter implements Filter {

    private static final Logger LOG = LoggerFactory.getLogger(SEOHeadersFilter.class);

    @Override
    public void doFilter(final ServletRequest request, final ServletResponse response,
            final FilterChain filterChain) throws IOException, ServletException {

        final SlingHttpServletRequest slingRequest = (SlingHttpServletRequest) request;
        final SlingHttpServletResponse slingResponse = (SlingHttpServletResponse) response;

        // Author-maintained SEO settings live on the asset's metadata node
        String metadataPath = slingRequest.getResource().getPath() + "/jcr:content/metadata";
        Resource metadata = slingRequest.getResourceResolver().getResource(metadataPath);

        if (metadata != null) {
            ValueMap properties = metadata.getValueMap();
            String noindex = properties.get("noindex", String.class);
            String[] hreflangArray = properties.get("hreflang", String[].class);

            if ("true".equals(noindex)) {
                slingResponse.setHeader("X-Robots-Tag", "noindex");
            }

            if (hreflangArray != null && hreflangArray.length > 0) {
                List<String> hreflangList = new ArrayList<>();
                for (String hreflangEntry : hreflangArray) {
                    // Each entry is "language:URL"; split on the first colon only
                    String[] parts = hreflangEntry.split(":", 2);
                    if (parts.length == 2) {
                        String lang = parts[0].trim();
                        String url = parts[1].trim();
                        hreflangList.add(
                            String.format("<%s>; rel=\"alternate\"; hreflang=\"%s\"", url, lang));
                    }
                }
                if (!hreflangList.isEmpty()) {
                    String hreflangHeader = String.join(", ", hreflangList);
                    LOG.debug("Setting Link header: {}", hreflangHeader);
                    slingResponse.setHeader("Link", hreflangHeader);
                }
            }
        }

        // Headers are set before the rest of the chain writes the response
        filterChain.doFilter(request, response);
    }

    @Override
    public void init(FilterConfig filterConfig) {
        // No initialization required
    }

    @Override
    public void destroy() {
        // No cleanup required
    }
}
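
The header assembly inside the filter can be tried without an AEM instance by pulling it into a plain static method. The following is a sketch under that assumption; the class name HreflangLinkHeader and the method build are hypothetical, and the input is assumed to follow the language:URL entry format that the filter splits on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone helper mirroring the filter's Link header assembly.
class HreflangLinkHeader {

    // Each entry is expected as "language:URL", e.g. "en:https://example.com/a.pdf".
    // Malformed entries are skipped; returns null when no valid entry remains.
    static String build(String[] entries) {
        if (entries == null) {
            return null;
        }
        List<String> links = new ArrayList<>();
        for (String entry : entries) {
            if (entry == null) {
                continue;
            }
            // Split on the first colon only, so the colon in "https:" is preserved
            String[] parts = entry.split(":", 2);
            if (parts.length == 2) {
                String lang = parts[0].trim();
                String url = parts[1].trim();
                if (!lang.isEmpty() && !url.isEmpty()) {
                    links.add(String.format("<%s>; rel=\"alternate\"; hreflang=\"%s\"", url, lang));
                }
            }
        }
        return links.isEmpty() ? null : String.join(", ", links);
    }

    public static void main(String[] args) {
        String header = build(new String[] {
            "en:https://example.com/assets/test-en.pdf",
            "es:https://example.com/assets/test-es.pdf"
        });
        System.out.println("Link: " + header);
    }
}
```

For the two sample entries, main prints the same Link header shown in the hreflang example earlier in this post. Isolating the logic this way makes it straightforward to cover malformed entries in a unit test.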

2. Deploy the Filter:

  • Deploy the custom filter bundle to your AEM instance.

3. Cache SEO Headers on the Dispatcher:

By default, the Dispatcher caches only the response body, so an asset served from cache would lose the headers set by the filter. To make the SEO headers available to subsequent requests, add X-Robots-Tag and Link to the /headers list in the /cache section of the Dispatcher farm file:

/headers {
"Cache-Control"
"Content-Type"
"Expires"
"Last-Modified"
"X-Content-Type-Options"
"X-Robots-Tag"
"Link"
}

Authors can now enable the noindex or hreflang configuration for individual assets through the asset's metadata properties.
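
To confirm that a response still carries the expected hreflang alternates, the Link header value can be parsed back into language and URL pairs. This is a minimal sketch with assumptions: LinkHeaderParser and parse are hypothetical names, and the pattern only handles the exact parameter order used in this post (rel first, then hreflang), not every valid Link header.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical verification helper; not part of AEM or the Dispatcher.
class LinkHeaderParser {

    // Matches one link value of the form: <url>; rel="alternate"; hreflang="xx"
    private static final Pattern ALTERNATE = Pattern.compile(
        "<([^>]+)>\\s*;\\s*rel=\"alternate\"\\s*;\\s*hreflang=\"([^\"]+)\"");

    // Returns language code -> URL, in the order the entries appear.
    static Map<String, String> parse(String linkHeaderValue) {
        Map<String, String> urlsByLanguage = new LinkedHashMap<>();
        if (linkHeaderValue == null) {
            return urlsByLanguage;
        }
        Matcher matcher = ALTERNATE.matcher(linkHeaderValue);
        while (matcher.find()) {
            urlsByLanguage.put(matcher.group(2), matcher.group(1));
        }
        return urlsByLanguage;
    }

    public static void main(String[] args) {
        String sample = "<https://example.com/assets/test-en.pdf>; rel=\"alternate\"; hreflang=\"en\", "
            + "<https://example.com/assets/test-es.pdf>; rel=\"alternate\"; hreflang=\"es\"";
        System.out.println(parse(sample));
    }
}
```

For the sample value, main prints {en=https://example.com/assets/test-en.pdf, es=https://example.com/assets/test-es.pdf}. In practice the input would come from the Link header of a response fetched through the Dispatcher, requested twice so that the second response is served from cache.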

[Note: In AEM as a Cloud Service, the out-of-the-box Fastly CDN blocks custom headers set from AEM and allows only certain standard headers. While the X-Robots-Tag header is currently supported, the Link header is not. Please contact Adobe through a support ticket to enable the Link header for your environments.]

Conclusion:

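Optimizing SEO headers for assets in AEM is a nuanced process that differs significantly from optimizing content pages. Because assets cannot carry meta tags, the noindex and hreflang directives have to travel in the X-Robots-Tag and Link HTTP headers, set either through pattern-based Dispatcher rules or per asset through metadata and a custom filter. Used together, these help ensure that your digital assets are correctly indexed, served in appropriate languages, and managed efficiently.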
