Friday, January 31, 2020

Efficient Error Handling in AEM (Adobe Experience Manager)

This tutorial explains how to configure efficient error handling in AEM (Adobe Experience Manager).

By default, AEM uses Sling’s error handler (/libs/sling/servlet/errorhandler, overlaid under /apps/sling/servlet/errorhandler) to handle error scenarios, and the publisher returns the error pages for Not Found resources.

The publisher sends the default error page with a 404 status to the Dispatcher, and the Dispatcher sends the error response with the 404 status code straight back to the user without caching it (it spools error responses directly to the client).

With the default Sling error handler, every website displays the same default error page with minimal, standard content.

The default Sling error handler can be overlaid to enable site-specific, content-rich, localized error pages.

This can cause performance problems because the 404 pages are not cached in the Dispatcher: the publishers must process and render the error page for every Not Found resource, and a large number of such requests can degrade publisher performance. Attackers can exploit this for a DDoS attack by sending many bad URLs; the publisher has to render a 404 page for each request, which can make the platform unresponsive for its intended users.

This behavior can be changed by letting the web server (Apache) handle the error scenarios.

The publisher sends the default error page with a 404 status to the Dispatcher for every Not Found resource, and the Dispatcher hands the error (404) processing over to the web server (Apache). The web server checks whether the error page is in the cache; if it is not, it requests the custom error page from the publisher based on the ErrorDocument configuration and returns the error response to the user with the required 404 status code.

The error page response is cached in the web server, which returns the error page from the cache for subsequent Not Found (404) requests.

This improves overall performance because the web server returns the cached 404 page for Not Found resources; the publisher renders the custom error page only for the first request.
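
The difference between the two flows can be illustrated with a small Python simulation. This is only a sketch: the Publisher class, the serve_not_found function, and the dictionary used as a cache are hypothetical stand-ins for the AEM publisher and the Apache/Dispatcher cache, not real AEM or Dispatcher APIs.

```python
class Publisher:
    """Hypothetical stand-in for an AEM publisher; counts custom error page renders."""

    def __init__(self):
        self.custom_page_renders = 0

    def render_custom_error_page(self, page_path):
        self.custom_page_renders += 1
        return f"<html>custom 404 content from {page_path}</html>"


def serve_not_found(publisher, cache, use_error_document,
                    error_doc="/us/en/404-page.html"):
    """Answer one request for a missing resource and return (status, body)."""
    if not use_error_document:
        # Overlaid Sling error handler: the publisher renders the rich
        # error page again for every bad URL.
        return 404, publisher.render_custom_error_page(error_doc)
    # ErrorDocument flow: the error page is fetched once and then
    # served from the cache.
    if error_doc not in cache:
        cache[error_doc] = publisher.render_custom_error_page(error_doc)
    return 404, cache[error_doc]


def count_renders(use_error_document, bad_requests=1000):
    """Send a burst of bad requests and report how often the publisher rendered."""
    publisher = Publisher()
    cache = {}
    for _ in range(bad_requests):
        serve_not_found(publisher, cache, use_error_document)
    return publisher.custom_page_renders


print(count_renders(use_error_document=False))
print(count_renders(use_error_document=True))
```

In this model the render count grows with the number of bad requests in the first case and stays at one in the second, which is the effect the cached error page is meant to achieve.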

Let us enable the required configurations to handle the error scenarios from the web server (Apache).

As a first step, create a site-specific error page named 404-page (use the same name across all the websites) under the language node, e.g. /content/we-retail/us/en, with the required components and data.

As a next step, create an error handler file, errorhandler.conf, with the required configurations and place it under /etc/httpd/conf/errorhandlers. You can create multiple error handlers as required.

SetEnvIfNoCase Request_URI "^/([^/]+)/([^/]+)" LOCALE=$1/$2
<If "tolower(%{ENV:LOCALE}) in { 'us/en','us/es'}">
    ErrorDocument 404 /%{ENV:LOCALE}/404-page.html
</If>
<Else>
    ErrorDocument 404 /us/en/404-page.html
</Else>

The incoming content path in the above configuration is assumed to be a shortened URL, e.g. /us/en, with /content/we-retail hidden (Apache always receives the page requests without /content/we-retail). Change the regex based on your content structure.

You can define multiple <If> conditions if the URL pattern is completely different for different sites. The <If> and <Else> directives are supported in Apache 2.4.

The error handler can be modified to handle other error codes such as 500 and 403.
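
To trace how the locale matching in errorhandler.conf picks an error page, here is a minimal Python sketch of the same decision. It only mirrors the regex and the locale list from the configuration above; the function name error_document_for is made up for the illustration, and Apache itself does not run anything like this.

```python
import re

# Same pattern as the SetEnvIfNoCase line: the first two path segments.
LOCALE_PATTERN = re.compile(r"^/([^/]+)/([^/]+)", re.IGNORECASE)
# Same locale list as the <If> condition.
SUPPORTED_LOCALES = {"us/en", "us/es"}
DEFAULT_ERROR_PAGE = "/us/en/404-page.html"


def error_document_for(request_uri):
    """Return the error page path that would be chosen for a request URI."""
    match = LOCALE_PATTERN.match(request_uri)
    if match:
        locale = f"{match.group(1)}/{match.group(2)}"
        # The comparison is lowercased, as with tolower(...) in the <If>,
        # but the path is built from the locale as it was captured.
        if locale.lower() in SUPPORTED_LOCALES:
            return f"/{locale}/404-page.html"
    # Unknown or missing locale: fall back to the <Else> branch.
    return DEFAULT_ERROR_PAGE


print(error_document_for("/us/es/a.html"))
print(error_document_for("/fr/fr/a.html"))
```

A request under /us/es resolves to the Spanish error page, while a locale that is not in the list falls back to /us/en/404-page.html.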

Now include the error handler (errorhandler.conf) in the individual virtual hosts.

<VirtualHost *:80>
    ServerName localhost
    ServerAlias 127.0.0.1
    DocumentRoot /var/www/html

    Include /etc/httpd/conf/errorhandlers/errorhandler.conf

    RewriteEngine on
    # Redirect long content URLs to the shortened form
    RewriteRule ^/content/we-retail/(.*)$ %{HTTP:X-Forwarded-Proto}://%{SERVER_NAME}/$1 [L,R=301]
    # Map shortened URLs to the content path, except for system paths
    RewriteCond %{REQUEST_URI} !^/apps [NC]
    RewriteCond %{REQUEST_URI} !^/etc [NC]
    RewriteCond %{REQUEST_URI} !^/libs [NC]
    RewriteCond %{REQUEST_URI} !^/content [NC]
    RewriteCond %{REQUEST_URI} !^/system [NC]
    RewriteCond %{REQUEST_URI} !^/dam [NC]
    RewriteCond %{REQUEST_URI} !^/conf [NC]
    RewriteRule ^/(.*) /content/we-retail/$1 [PT]

    <Directory />
        <IfModule disp_apache2.c>
            ModMimeUsePathInfo On
            SetHandler dispatcher-handler
        </IfModule>
        Options Indexes FollowSymLinks Includes
        AllowOverride None
    </Directory>
</VirtualHost>
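
The URL mapping done by the rewrite rules in this virtual host can be followed with a short Python sketch. It is a simplified model under a few assumptions: the function name map_request is hypothetical, the redirect target is reduced to its path (the real rule also prepends the protocol and server name), and the content root is taken to be /content/we-retail as in the rest of this tutorial.

```python
import re

CONTENT_ROOT = "/content/we-retail"
# Prefixes excluded by the RewriteCond lines ([NC] makes them case-insensitive).
PASS_THROUGH_PREFIXES = ("/apps", "/etc", "/libs", "/content",
                         "/system", "/dam", "/conf")


def map_request(uri):
    """Return (action, target path) for a request URI."""
    # Long content URLs are redirected to the shortened form (R=301).
    long_form = re.match(r"^/content/we-retail/(.*)$", uri)
    if long_form:
        return "redirect", "/" + long_form.group(1)
    # System paths are left untouched.
    if uri.lower().startswith(PASS_THROUGH_PREFIXES):
        return "pass", uri
    # Everything else is mapped internally under the content root (PT).
    return "rewrite", CONTENT_ROOT + uri


print(map_request("/us/en/a.html"))
print(map_request("/content/we-retail/us/en/a.html"))
print(map_request("/etc/designs/site.css"))
```

The shortened URL /us/en/a.html is what Apache sees, which is why the error handler regex works on the first two path segments.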

Let us now enable a Dispatcher configuration that hands the error handling over to the web server (Apache). By default, the Dispatcher handles the error codes itself and sends the response back to the user. With this configuration, Apache handles the error scenarios based on the ErrorDocument configuration.

Change the value of DispatcherPassError to 1 in the Dispatcher module configuration; in most cases this configuration is placed in the httpd.conf file.

<IfModule disp_apache2.c>
# location of the configuration file. eg: 'conf/dispatcher.any'
DispatcherConfig conf/dispatcher.any
# location of the dispatcher log file. eg: 'logs/dispatcher.log'
DispatcherLog logs/dispatcher.log
# log level for the dispatcher log
# 0 Errors
# 1 Warnings
# 2 Infos
# 3 Debug
DispatcherLogLevel 3
# if turned to 1, the dispatcher looks like a normal module
DispatcherNoServerHeader 0
# if turned to 1, request to / are not handled by the dispatcher
# use the mod_alias then for the correct mapping
DispatcherDeclineRoot 0
# if turned to 1, the dispatcher uses the URL already processed
# by handlers preceding the dispatcher (i.e. mod_rewrite)
# instead of the original one passed to the web server.
DispatcherUseProcessedURL 1
# Defines how to support 40x error codes for ErrorDocument handling:
# 0 - the dispatcher spools all error responses to the client.
# 1 - the dispatcher does not spool an error response to the client (where the status code is greater than or equal to 400)
# but passes the status code to Apache, which e.g. allows an ErrorDocument directive to process such a status code.
DispatcherPassError 1
</IfModule>

Now you should receive the site-specific error pages. Accessing http://localhost/us/en/a.html returns the 404 page specific to the /us/en website; in the same way, other sites return their corresponding error pages based on the configuration.

The error pages are now cached in the Dispatcher.

This improves overall platform performance and also mitigates some security issues, such as DDoS attacks, by caching the 404 error pages in the Dispatcher.



Tuesday, January 21, 2020

SameSite Cookie Attribute

This video explains the details of the recently introduced SameSite cookie attribute and its values:
  • SameSite=Strict
  • SameSite=Lax
  • SameSite=None
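
As a rough illustration of the three values listed above, the following Python sketch builds Set-Cookie header values with the standard library's http.cookies module (the samesite attribute needs Python 3.8 or later). The function name build_set_cookie and the cookie name and value are made up for the example.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, samesite):
    """Return a Set-Cookie header value with the given SameSite setting."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["samesite"] = samesite
    # Browsers only accept SameSite=None on cookies that are also Secure.
    if samesite == "None":
        cookie[name]["secure"] = True
    return cookie[name].OutputString()


for mode in ("Strict", "Lax", "None"):
    print(build_set_cookie("session", "abc123", mode))
```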

Sling Dynamic Include — Deep Dive | Dynamically Include Page Components in AEM

This tutorial takes a deep dive into Apache Sling Dynamic Include (SDI) and explains how to use the different include types to dynamically include page components in AEM.

SDI (Sling Dynamic Include) Introduction

  • In most cases, AEM sites contain static data that can be cached in the Dispatcher, a CDN, or other caching layers.

SDI Include Types

Include tags help to add dynamic data before the page is returned to the client. Different include tag types are processed at different layers between the Dispatcher and the client.

SSI (Server Side Include): the Apache (web server)/Dispatcher layer processes the includes and replaces them with real content.

ESI (Edge Side Include): the CDN layer processes the includes and replaces them with real content.

JavaScript Include: the browser processes the includes and replaces them with real content through an Ajax call. JavaScript Include may impact the page load time because a separate Ajax call is needed to replace the placeholder with real content, and it will not work in browsers that do not support JavaScript.

SSI Flow — Apache and Dispatcher Module

This is the SSI flow through the Apache web server and the Dispatcher module: the Apache layer processes the includes and replaces them with the real content before sending the response to the client. When a user requests SDI-enabled content, the Dispatcher requests the content from the AEM server and caches the response with the include placeholder in it. On every request, the Apache/Dispatcher module processes the include placeholder and replaces it with the real content before sending the response to the client. The same flow applies to the other include types, ESI and JavaScript include (JSI); the include tags are simply processed and replaced with real content at different layers.
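
The replacement step in this flow can be sketched in a few lines of Python. This is only a model of what Apache's mod_include does with the cached page, not how it is implemented; the function name process_ssi, the component path in the placeholder, and the per-user fetch function are assumptions made for the illustration.

```python
import re

# SSI include directive as it would appear in the cached page.
SSI_INCLUDE = re.compile(r'<!--#include virtual="([^"]+)"\s*-->')

# Cached once; the dynamic component is only a placeholder (hypothetical path).
CACHED_PAGE = (
    "<body><p>Static content</p>"
    '<!--#include virtual="/content/site/page/_jcr_content/dynamicdata.nocache.html" -->'
    "</body>"
)


def process_ssi(cached_html, fetch):
    """Replace every include placeholder with the content returned by fetch(path)."""
    return SSI_INCLUDE.sub(lambda match: fetch(match.group(1)), cached_html)


def fetch_for(user):
    """Hypothetical stand-in for the uncached request that renders the component."""
    return lambda path: f"<h1>{user}</h1>"


print(process_ssi(CACHED_PAGE, fetch_for("alice")))
print(process_ssi(CACHED_PAGE, fetch_for("bob")))
```

The cached page never changes; only the included fragment is resolved again for each request, so each user sees their own data.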

Configure SDI

As a first step, install the “Sling Dynamic Include” bundle; it can be downloaded from https://sling.apache.org/downloads.cgi

Install the bundle in AEM and ensure it is in the Active state.

I have enabled a dynamic component named “dynamicdata” that displays the user currently logged in to the system.

dynamicdata.html

<h1>${currentSession.userID}</h1>

Let us now enable the required configurations in Apache/Dispatcher.

As a first step, load the mod_include.so module through httpd.conf or whichever configuration file loads the modules.

LoadModule include_module modules/mod_include.so

Configure the virtual host to process the includes.

<VirtualHost *:8080>
ServerAdmin [email protected]
DocumentRoot "C:\opt\communique\dispatcher\cache"
ServerName demo.albinsblog.com
ServerAlias cdn.albinsblog.com

RewriteEngine On
#RewriteRule ^/$ /en.html [R=301]
#RewriteRule ^/content/wknd/ca/(.*)$ /$1 [NE,L,R=301]

<Directory />
<IfModule disp_apache2.c>
SetHandler dispatcher-handler
</IfModule>
Options Indexes FollowSymLinks Includes
# Set includes to process .html files
AddOutputFilter INCLUDES .html
AddOutputFilterByType INCLUDES text/html

AllowOverride None
</Directory>
</VirtualHost>

Now disable caching for the “nocache” selector in the Dispatcher farm file, under the cache rules section.

/rules
{
    /0000
    {
        /glob "*"
        /type "allow"
    }
    /0001
    {
        /glob "/libs/cq/security/userinfo.*"
        /type "deny"
    }
    /0004
    {
        /glob "*.nocache.html*"
        /type "deny"
    }
}
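
To check which URLs these cache rules would exclude, the evaluation can be approximated in Python. This sketch assumes that the last matching rule decides and that Dispatcher globs behave like the patterns of Python's fnmatch module for these simple cases; the function name is_cacheable, the deny default when nothing matches, and the sample URLs are made up for the illustration.

```python
from fnmatch import fnmatchcase

# The three cache rules from the farm file, in order.
CACHE_RULES = [
    ("*", "allow"),
    ("/libs/cq/security/userinfo.*", "deny"),
    ("*.nocache.html*", "deny"),
]


def is_cacheable(url, rules=CACHE_RULES):
    """Return True if the last rule matching the URL is an allow rule."""
    decision = "deny"  # assumption for the sketch: nothing matched means no caching
    for glob, rule_type in rules:
        if fnmatchcase(url, glob):
            decision = rule_type  # a later matching rule overrides earlier ones
    return decision == "allow"


print(is_cacheable("/content/site/page.html"))
print(is_cacheable("/content/site/page/_jcr_content/dynamicdata.nocache.html"))
```

The page itself stays cacheable, while any request carrying the nocache selector always goes to the publisher.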

SSI Include

Let us now enable the SDI configuration for SSI include. In this scenario, Apache/Dispatcher replaces the include placeholder with the real content.

  • enabled: set it to true to enable SDI
  • path: the SDI configuration is enabled only for this path
  • resource-types: the components that should be replaced with include tags
  • include-type: the type of include tag (Apache SSI, ESI, or JavaScript)
  • Filter selector: the selector added to the HTTP request for the particular component; it is used to get the actual content

Now access the page that contains the “dynamicdata” component; it displays the logged-in user name.

If you view the page source, the dynamic data has already been replaced by Apache/Dispatcher before the response was sent to the client.

Let us review the cached content in Apache/Dispatcher: the cached content contains the include placeholder, which is replaced with the real content on every request.

JavaScript Include

Let us now enable the SDI configuration for JavaScript include. In this scenario, the browser replaces the include placeholder with the real content through an additional Ajax call after the initial page load.

Now access the page that contains the “dynamicdata” component; it displays the logged-in user name.

The include is rendered as JavaScript that loads the dynamic data through an Ajax call.

Let us review the cached content in Apache/Dispatcher: the cached content contains the JavaScript that loads the dynamic data through the Ajax call.

ESI Include

Let us now enable the SDI configuration for ESI include. In this scenario, the CDN replaces the include placeholder with the real content.

I am going to enable ESI through Cloudflare. Akamai supports ESI once the relevant configuration is enabled; CloudFront does not support ESI by default.

Create a worker and assign it to a route in Cloudflare to process the ESI includes and replace them with the real content before sending the response to the client.
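
The logic such a worker needs is small. The sketch below models it in Python purely for illustration; it is not Cloudflare Worker code and not the script used in this post. The function name process_esi, the example host, the component path, and the fetch function are all assumptions.

```python
import re
from urllib.parse import urljoin

# ESI include tag as it would appear in the page served from the origin.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/?>')


def process_esi(page_url, html, fetch):
    """Resolve each ESI include against the page URL and inline the fetched content."""
    def resolve(match):
        # Include paths are relative to the site, so build an absolute URL first.
        return fetch(urljoin(page_url, match.group(1)))
    return ESI_INCLUDE.sub(resolve, html)


# Hypothetical page and fragment request, for illustration only.
page = ('<body><esi:include '
        'src="/content/site/page/_jcr_content/dynamicdata.nocache.html"/></body>')
requested = []


def fake_fetch(url):
    """Stand-in for the request the edge layer sends to the origin."""
    requested.append(url)
    return "<h1>alice</h1>"


print(process_esi("https://demo.example.com/content/site/page.html", page, fake_fetch))
print(requested)
```

In a real worker the fetch step would be a request to the origin for the fragment, which is why the nocache selector must stay uncached in the Dispatcher.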

Now access the page that contains the “dynamicdata” component; it displays the logged-in user name. The CDN processes the ESI include placeholder and replaces it with the real content.

Let us review the cached content in Apache/Dispatcher: the cached content contains the ESI include tag, which the CDN replaces with the dynamic data.

Sling Dynamic Include supports different include options (SSI, ESI, and JavaScript include), and each option has its own pros and cons. Select the include option based on the use case and the project requirements.

References