Thursday, May 30, 2024

Optimizing SEO Headers for Digital Assets — Adobe Experience Manager (AEM)

 Search Engine Optimization (SEO) is crucial for ensuring that your digital content is easily discoverable by search engines and, consequently, your target audience. While much attention is often given to optimizing content pages, it’s equally important to focus on the SEO of digital assets such as images, videos, and documents. This is especially true for platforms like Adobe Experience Manager (AEM), where managing and delivering a vast range of digital assets is a core function. In this blog, we’ll explore the best practices for applying SEO headers to assets in AEM, focusing on the use of noindex and hreflang.

When it comes to digital assets, adding SEO headers such as noindex or hreflang can be more complex than for standard content pages. Unlike content pages, assets often serve a supporting role and may not need to be indexed by search engines. However, when they do, precise handling is required to ensure proper indexing and language targeting.

SEO Configurations for Content Pages:

For content pages, there are two primary methods to enable SEO configurations:

1. HTML Meta Tags:

SEO settings like noindex and hreflang can be added within the HTML head of content pages: noindex as a robots meta tag, and hreflang as link elements with rel="alternate". These tags are included directly in the page’s HTML, either manually or through content authoring tools that automate their insertion.

2. HTTP Headers:

SEO directives such as noindex and hreflang can also be set at the server level using HTTP headers. This method provides an alternative to HTML meta tags for controlling SEO settings, allowing for centralized management of SEO configurations.

SEO Configurations for Assets:

For assets, the only viable method to implement SEO configurations is through HTTP headers, as assets do not have HTML structures where meta tags can be embedded:

1. HTTP Headers for Assets:

HTTP headers such as X-Robots-Tag for noindex and Link for hreflang can be used to manage SEO settings for digital assets like images, PDFs, and videos. These headers must be configured at the server or dispatcher level, ensuring that the appropriate SEO directives are applied to assets.

Example Syntax for Assets:

noindex:

The noindex directive tells search engines not to index a particular asset. This can be crucial for preventing duplicate content issues or for keeping non-essential assets out of search engine results.

X-Robots-Tag: noindex

Selective Application: Apply noindex to assets that do not provide direct value in search results, such as decorative images, icons, or internal documentation files.

hreflang:

The hreflang attribute is essential for assets that have multiple language versions. It helps search engines understand which version of an asset to serve based on the user's language preference.

Link: <https://example.com/assets/test-en.pdf>; rel="alternate"; hreflang="en"
Link: <https://example.com/assets/test-es.pdf>; rel="alternate"; hreflang="es"

Consistent Implementation: Ensure that hreflang headers are applied consistently across all language versions of an asset, so that every version lists the same set of alternates.
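One way to keep the headers consistent is to generate them from a single list of language versions. The following is a minimal sketch, not part of any AEM API: the class name HreflangLinks and its methods are hypothetical, and the URLs are the example ones used above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds the Link header values for every language version of an asset.
final class HreflangLinks {

    // Formats one Link header value for a single language alternate.
    static String linkValue(String lang, String url) {
        return String.format("<%s>; rel=\"alternate\"; hreflang=\"%s\"", url, lang);
    }

    // Returns one header value per language, in the iteration order of the map.
    static List<String> build(Map<String, String> urlsByLang) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> entry : urlsByLang.entrySet()) {
            values.add(linkValue(entry.getKey(), entry.getValue()));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> versions = new LinkedHashMap<>();
        versions.put("en", "https://example.com/assets/test-en.pdf");
        versions.put("es", "https://example.com/assets/test-es.pdf");
        // The same set of values would be sent with every language version of the asset.
        for (String value : build(versions)) {
            System.out.println("Link: " + value);
        }
    }
}
```

Because every version is served with the output of the same build call, the English and Spanish PDFs cannot drift apart in the alternates they advertise.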

Implementing SEO Headers for Assets in AEM:

Implementing these headers in AEM involves a combination of configuration and customization:

1. Dispatcher/CDN Configuration

The Dispatcher/CDN configuration can be used to enable SEO headers based on specific patterns or directories. This configuration is usually done at the web server level (e.g., Apache) and is effective for broad, pattern-based rules.

Apache Configuration Examples

Folder Level Noindex Header:

<Directory /path/to/your/noindex-directory>
Header set X-Robots-Tag "noindex"
</Directory>

File Level Noindex and Hreflang Headers:

# <Files> matches on the file name only; nest it inside a <Directory> block to limit it to one path
<Files "specific-file.html">
Header set X-Robots-Tag "noindex"
Header set Link "<http://example.com/specific-file-en.html>; rel=\"alternate\"; hreflang=\"en\""
Header add Link "<http://example.com/specific-file-es.html>; rel=\"alternate\"; hreflang=\"es\""
</Files>

2. Custom Approach through Authoring

For more granular control, such as enabling SEO headers at the individual asset level, a custom approach is required. This involves using AEM’s metadata schema to allow authors to control SEO headers and implementing a custom Java filter to apply these headers to asset responses.

Step-by-Step Implementation

Step 1: Metadata Schema Customization

  1. Create or Edit Metadata Schema:
  • Navigate to Tools > Assets > Metadata Schemas in AEM.
  • Create a new schema or edit an existing one.
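  • Add fields for noindex and hreflang. For example:
    • noindex: Checkbox
    • hreflang: Text field or multi-value field for multiple languages, with each value entered as lang:URL (for example, en:https://example.com/assets/test-en.pdf), the format the filter in Step 2 parses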

2. Apply Metadata Schema:

  • Apply the schema to the relevant asset folders or types.
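The filter in Step 2 splits each hreflang value on the first colon, so authors need to enter values as language:URL. The sketch below isolates that parsing rule so it can be checked on its own; the class name HreflangEntry is hypothetical and not part of AEM.

```java
import java.util.Optional;

// Hypothetical value type for one authored hreflang entry such as "en:https://example.com/a.pdf".
final class HreflangEntry {
    final String lang;
    final String url;

    private HreflangEntry(String lang, String url) {
        this.lang = lang;
        this.url = url;
    }

    // Splits on the first colon only, so the colon inside "https://" stays part of the URL.
    static Optional<HreflangEntry> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String[] parts = raw.split(":", 2);
        if (parts.length != 2) {
            return Optional.empty();
        }
        String lang = parts[0].trim();
        String url = parts[1].trim();
        if (lang.isEmpty() || url.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new HreflangEntry(lang, url));
    }

    public static void main(String[] args) {
        parse("es: https://example.com/assets/test-es.pdf")
                .ifPresent(e -> System.out.println(e.lang + " -> " + e.url));
    }
}
```

Entries without a colon, or with an empty language or URL, are rejected here rather than turned into a malformed Link header.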

Step 2: Custom Java Filter

Create a custom Sling filter to read the metadata and set the appropriate headers.

1. Create a Sling Filter:

package com.example.aem.filters;

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ValueMap;
import org.apache.sling.engine.EngineConstants;
import org.osgi.service.component.annotations.Component;

// Sling filter that copies the SEO metadata authored on a DAM asset into response headers.
@Component(service = Filter.class,
    property = {
        EngineConstants.SLING_FILTER_SCOPE + "=" + EngineConstants.FILTER_SCOPE_REQUEST,
        EngineConstants.SLING_FILTER_PATTERN + "=/content/dam/.*"
    })
public class SEOHeadersFilter implements Filter {

    @Override
    public void doFilter(final ServletRequest request, final ServletResponse response,
            final FilterChain filterChain) throws IOException, ServletException {

        final SlingHttpServletRequest slingRequest = (SlingHttpServletRequest) request;
        final SlingHttpServletResponse slingResponse = (SlingHttpServletResponse) response;

        // Asset metadata is stored under <asset path>/jcr:content/metadata.
        String metadataPath = slingRequest.getResource().getPath() + "/jcr:content/metadata";
        Resource metadata = slingRequest.getResourceResolver().getResource(metadataPath);

        if (metadata != null) {
            ValueMap properties = metadata.getValueMap();
            String noindex = properties.get("noindex", String.class);
            String[] hreflangArray = properties.get("hreflang", String[].class);

            if ("true".equals(noindex)) {
                slingResponse.setHeader("X-Robots-Tag", "noindex");
            }

            if (hreflangArray != null && hreflangArray.length > 0) {
                for (String hreflangEntry : hreflangArray) {
                    // Each entry is authored as lang:URL; split on the first colon only.
                    String[] parts = hreflangEntry.split(":", 2);
                    if (parts.length == 2) {
                        String lang = parts[0].trim();
                        String url = parts[1].trim();
                        String hreflangUrl = String.format("<%s>; rel=\"alternate\"; hreflang=\"%s\"", url, lang);
                        // addHeader keeps one Link header per language version.
                        slingResponse.addHeader("Link", hreflangUrl);
                    }
                }
            }
        }

        // Headers are set before the rest of the chain writes the response.
        filterChain.doFilter(request, response);
    }

    @Override
    public void init(FilterConfig filterConfig) {}

    @Override
    public void destroy() {}
}

2. Deploy the Filter:

  • Deploy the custom filter bundle to your AEM instance.

3. Caching SEO headers on dispatcher:

To make SEO headers available to subsequent requests after caching, the headers need to be cached at the dispatcher. This involves adding X-Robots-Tag and Link headers to the cache headers in the Dispatcher farm file configurations.

/headers {
"Cache-Control"
"Content-Type"
"Expires"
"Last-Modified"
"X-Content-Type-Options"
"X-Robots-Tag"
"Link"
}

Authors can now enable the noindex or hreflang settings on the required assets through the asset metadata.
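To confirm that the headers survive Dispatcher caching, the same asset can be requested twice and the response headers compared. The following is a rough sketch using the JDK's java.net.http client (Java 11 or later); the class name SeoHeaderCheck is hypothetical, and the asset URL is supplied by you on the command line.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpHeaders;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.Locale;

// Hypothetical check: prints the SEO headers returned for an asset URL on two consecutive requests.
final class SeoHeaderCheck {

    // True when any X-Robots-Tag value contains the noindex directive.
    static boolean hasNoindex(HttpHeaders headers) {
        return headers.allValues("X-Robots-Tag").stream()
                .anyMatch(v -> v.toLowerCase(Locale.ROOT).contains("noindex"));
    }

    // All Link header values, one per hreflang alternate if the filter added them separately.
    static List<String> linkValues(HttpHeaders headers) {
        return headers.allValues("Link");
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java SeoHeaderCheck.java <asset-url>");
            return;
        }
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(args[0])).GET().build();
        // The first request may be served by AEM, the second from the Dispatcher cache.
        for (int attempt = 1; attempt <= 2; attempt++) {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            System.out.println("Attempt " + attempt + ": noindex=" + hasNoindex(response.headers())
                    + ", Link=" + linkValues(response.headers()));
        }
    }
}
```

If the second attempt shows fewer headers than the first, the /headers section of the farm file is the first place to look.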

[The AEM as a Cloud Service OOTB Fastly CDN blocks all custom headers set from AEM and allows only some standard headers. While the X-Robots-Tag header is currently supported, the Link header is not. Please contact Adobe through a support ticket to enable the Link header for your environments.]

Conclusion:

Optimizing SEO headers for assets in AEM is a nuanced process that differs significantly from optimizing content pages. By utilizing headers like noindex and hreflang, you can ensure that your digital assets are correctly indexed, served in appropriate languages, and managed efficiently.

Wednesday, May 22, 2024

GeoLocation Redirection in AEM as a Cloud

 In this post, we will explore how to enable GeoLocation redirection in AEM as a Cloud Service.

Sometimes, we may need to enable visitor country-based redirects to direct users to the appropriate page when they access a domain. Multiple approaches are possible to handle geolocation redirection. For example, client-side redirects using the Google Geocoder API or other Geocoder APIs can identify the visitor’s country. Additionally, server-side redirects can be enabled, such as using Apache in conjunction with geocoder services. Most CDN services also provide geo headers that capture the visitor’s country, which can be used to enable redirects.

Let’s now explore different approaches in AEM as a Cloud:

Option 1: Redirect through Apache Using CDN Geo Country Header

AEM as a Cloud uses Fastly CDN out of the box (OOTB). The CDN provides geo headers, such as x-aem-client-country, with every request, supplying an Alpha-2 country code (e.g., US, AR, etc.). This header can be utilized in Apache (Dispatcher) to redirect users to the appropriate country-specific URL. For instance, if a user’s request originates from the US for the domain www.test.com, they can be redirected to the corresponding country-specific homepage URL. To prevent the caching of redirects in the CDN and browser, caching should be disabled for the root path with max-age=0, no-cache, and no-store.

RewriteCond %{REQUEST_URI} ^/$
RewriteCond %{HTTP:x-aem-client-country} ^US$
RewriteRule ^.*$ https://www.test.com/us/en/home.html [R=301,L]

Option 2: Redirect through CDN Using clientCountry

Another option is to handle the redirect directly through the CDN using the clientCountry request property. The AEM as a Cloud Service OOTB CDN now enables multiple capabilities that can be managed by the customer, such as origin selectors and request and response transformation. For more details, refer to "A Deep Dive into CDN Capabilities Within AEM as a Cloud" by Albin Issac (Tech Learnings, Medium, May 2024).

The latest addition to these capabilities is client redirects. The CDN now allows customers to configure different URL redirects like 301 and 302. This client redirect feature can be combined with the existing geo-location capability to redirect users to country-specific URLs. Note that this client redirect feature is not yet generally available. To join the early-adopter program, email [email protected].

You can add the following rule configuration to the cdn.yml file and add additional rules to meet your criteria:

experimental_redirects:
  rules:
    - name: country-redirect-us
      when:
        allOf:
          - { reqProperty: clientCountry, equals: "US" }
          - { reqProperty: domain, equals: "www.test.com" }
          - { reqProperty: path, equals: "/" }
      action:
        type: redirect
        location: https://www.test.com/us/en/home.html

    - name: country-redirect-ca
      when:
        allOf:
          - { reqProperty: clientCountry, equals: "CA" }
          - { reqProperty: domain, equals: "www.test.com" }
          - { reqProperty: path, equals: "/" }
      action:
        type: redirect
        location: https://www.test.com/ca/en/home.html

    - name: country-redirect-default
      when:
        allOf:
          - { reqProperty: domain, equals: "www.test.com" }
          - { reqProperty: path, equals: "/" }
      action:
        type: redirect
        location: https://www.test.com/default/en/home.html

This configuration sets up rules to redirect users based on their country. For example, users from the US accessing www.test.com will be redirected to the country-specific URL https://www.test.com/us/en/home.html, and users from Canada accessing www.test.com will be redirected to the country-specific URL https://www.test.com/ca/en/home.html. The default rule sends visitors from the rest of the countries to the default website URL https://www.test.com/default/en/home.html. You can add more rules to the cdn.yml file to meet other redirection criteria.
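The decision these rules describe can be written out as a small lookup with a fallback, which is handy for reasoning about or unit-testing the mapping before adding more countries. This is only an illustrative sketch in plain Java, not how the CDN evaluates its rules; the class name CountryRedirect is hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical model of the redirect decision: country-specific home page, else the default.
final class CountryRedirect {

    static final String DOMAIN = "www.test.com";
    static final String DEFAULT_HOME = "https://www.test.com/default/en/home.html";

    static final Map<String, String> HOME_BY_COUNTRY = Map.of(
            "US", "https://www.test.com/us/en/home.html",
            "CA", "https://www.test.com/ca/en/home.html");

    // Returns the redirect target, or empty when the request is not for the root of the domain.
    static Optional<String> resolve(String domain, String path, String country) {
        if (!DOMAIN.equals(domain) || !"/".equals(path)) {
            return Optional.empty();
        }
        // Map.of rejects null keys, so a missing country is looked up as an empty string.
        String key = country == null ? "" : country;
        return Optional.of(HOME_BY_COUNTRY.getOrDefault(key, DEFAULT_HOME));
    }

    public static void main(String[] args) {
        System.out.println(resolve("www.test.com", "/", "US").orElse("no redirect"));
        System.out.println(resolve("www.test.com", "/", "FR").orElse("no redirect"));
        System.out.println(resolve("www.test.com", "/products", "US").orElse("no redirect"));
    }
}
```

Only requests for the root path are redirected; deeper paths fall through untouched, mirroring the path condition in each rule.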

Please note that since the redirect is an experimental feature, experimental_redirects: is used in the configuration. The experimental_ prefix should be removed once this capability becomes generally available (GA).