Friday, January 31, 2020

Efficient Error Handling in AEM (Adobe Experience Manager)

This tutorial explains how to configure efficient error handling in AEM (Adobe Experience Manager).

By default, AEM uses Sling's error handler (/apps/sling/servlet/errorhandler) to handle error scenarios, and the publisher returns the error pages for resources that are not found.

The publisher sends the default error page with a 404 status to the Dispatcher. The Dispatcher does not cache the error response; it spools it directly to the client with the 404 status code.


With the default Sling error handler, every website displays the same default error page with minimal, standard content.


The default Sling error handler can be overlaid to provide site-specific, content-rich, and localized error pages.

However, this can cause performance problems because 404 pages are not cached in the Dispatcher: the publisher has to process and render the error page for every resource that is not found, and a large number of such requests can overload the AEM publishers. It also exposes the platform to DDoS attacks: an attacker can send many requests for bad URLs, and because the publisher has to render the 404 page for each of them, the platform can become unresponsive for its intended users.

This behavior can be changed by letting the web server (Apache) handle the error scenarios.


The publisher sends the default error page with a 404 status to the Dispatcher for every resource that is not found, and the Dispatcher hands over the 404 processing to the web server (Apache). The web server checks whether the error page is in the cache; if it is not, it requests the custom error page from the publisher based on the ErrorDocument configuration and returns it to the user with the required 404 status code.

The error page response is cached on the web server, which then returns the error page from the cache for subsequent Not Found (404) requests.


This improves overall performance because the web server returns the cached 404 page for resources that are not found; only the first request for the custom error page is sent to the publisher.

Let us enable the required configurations to handle the error scenarios from the web server (Apache).

As a first step, create a site-specific error page named 404-page (use the same name across all the websites) under the language node, e.g. /content/we-retail/us/en, with the required components and data.


As a next step, create an error handler file, errorhandler.conf, with the required configurations and place it under /etc/httpd/conf/errorhandlers. You can create multiple error handlers as required.

SetEnvIfNoCase Request_URI "^/([^/]+)/([^/]+)" LOCALE=$1/$2
<If "tolower(%{ENV:LOCALE}) in { 'us/en','us/es'}">
ErrorDocument 404 /%{ENV:LOCALE}/404-page.html
</If>
<Else>
ErrorDocument 404 /us/en/404-page.html
</Else>

The configuration above assumes that incoming paths are shortened URLs, e.g. /us/en, with /content/we-retail hidden (Apache always receives page requests without /content/we-retail). Change the regex based on your content structure.
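As an illustration of adjusting the regex, here is a minimal sketch for a setup where URLs are not shortened and requests reach Apache with the full content path. That setup is an assumption for this example only; it is not the one used in the rest of this tutorial.

```apache
# Assumption: requests arrive as /content/we-retail/<country>/<language>/...
# Capture the two path segments after the content root as the locale.
SetEnvIfNoCase Request_URI "^/content/we-retail/([^/]+)/([^/]+)" LOCALE=$1/$2

<If "tolower(%{ENV:LOCALE}) in { 'us/en', 'us/es' }">
    # Locale-specific error page under the full content path
    ErrorDocument 404 /content/we-retail/%{ENV:LOCALE}/404-page.html
</If>
<Else>
    # Fallback error page for any other path
    ErrorDocument 404 /content/we-retail/us/en/404-page.html
</Else>
```

Only the regex and the ErrorDocument paths change; the locale check itself stays the same.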

You can define multiple <If> conditions if the URL pattern is completely different for different sites. The <If> and <Else> directives are supported in Apache 2.4 and later.
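A minimal sketch of handling two different URL patterns in one error handler is shown here. The second site, its /<language>-<country>/ URL scheme, the LOCALE_B variable, and the fr-ca and en-ca locales are all hypothetical and only illustrate the idea.

```apache
# Site A: /<country>/<language>/..., e.g. /us/en/page.html
SetEnvIfNoCase Request_URI "^/([^/]+)/([^/]+)" LOCALE=$1/$2
# Hypothetical site B: /<language>-<country>/..., e.g. /fr-ca/page.html
SetEnvIfNoCase Request_URI "^/([a-z]{2}-[a-z]{2})/" LOCALE_B=$1

<If "tolower(%{ENV:LOCALE}) in { 'us/en', 'us/es' }">
    ErrorDocument 404 /%{ENV:LOCALE}/404-page.html
</If>
<ElseIf "tolower(%{ENV:LOCALE_B}) in { 'fr-ca', 'en-ca' }">
    ErrorDocument 404 /%{ENV:LOCALE_B}/404-page.html
</ElseIf>
<Else>
    # Fallback when no known locale matches
    ErrorDocument 404 /us/en/404-page.html
</Else>
```

The conditions are evaluated in order, so the first matching block decides which error page is served.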

The error handler can be modified to handle other error codes such as 500 and 403.
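For example, the handler could be extended along the following lines. The page names 403-page and 500-page are assumptions; the corresponding pages would have to be created under each language node in the same way as 404-page.

```apache
SetEnvIfNoCase Request_URI "^/([^/]+)/([^/]+)" LOCALE=$1/$2

<If "tolower(%{ENV:LOCALE}) in { 'us/en', 'us/es' }">
    ErrorDocument 404 /%{ENV:LOCALE}/404-page.html
    # Hypothetical pages for other status codes
    ErrorDocument 403 /%{ENV:LOCALE}/403-page.html
    ErrorDocument 500 /%{ENV:LOCALE}/500-page.html
</If>
<Else>
    ErrorDocument 404 /us/en/404-page.html
    ErrorDocument 403 /us/en/403-page.html
    ErrorDocument 500 /us/en/500-page.html
</Else>
```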

Now include the error handler (errorhandler.conf) in the individual virtual hosts.

<VirtualHost *:80>
ServerName localhost
ServerAlias 127.0.0.1
DocumentRoot /var/www/html

Include /etc/httpd/conf/errorhandlers/errorhandler.conf

RewriteEngine on
RewriteRule ^/content/we-retail/(.*)$ %{HTTP:X-Forwarded-Proto}://%{SERVER_NAME}/$1 [L,R=301]
RewriteCond %{REQUEST_URI} !^/apps [NC]
RewriteCond %{REQUEST_URI} !^/etc [NC]
RewriteCond %{REQUEST_URI} !^/libs [NC]
RewriteCond %{REQUEST_URI} !^/content [NC]
RewriteCond %{REQUEST_URI} !^/system [NC]
RewriteCond %{REQUEST_URI} !^/dam [NC]
RewriteCond %{REQUEST_URI} !^/conf [NC]
RewriteRule ^/(.*) /content/we-retail/$1 [PT]
<Directory />
<IfModule disp_apache2.c>
ModMimeUsePathInfo On
SetHandler dispatcher-handler
</IfModule>
Options Indexes FollowSymLinks Includes
AllowOverride None

</Directory>
</VirtualHost>

Let us now enable a Dispatcher configuration to hand over error handling to the web server (Apache). By default, the Dispatcher handles the error codes and sends the response back to the users. With this configuration, Apache handles the error scenarios based on the ErrorDocument configurations.

Change the value of DispatcherPassError to 1 in the Dispatcher module configuration; in most cases this configuration is placed in the httpd.conf file.

<IfModule disp_apache2.c>
# location of the configuration file. eg: 'conf/dispatcher.any'
DispatcherConfig conf/dispatcher.any
# location of the dispatcher log file. eg: 'logs/dispatcher.log'
DispatcherLog logs/dispatcher.log
# log level for the dispatcher log
# 0 Errors
# 1 Warnings
# 2 Infos
# 3 Debug
DispatcherLogLevel 3
# if turned to 1, the dispatcher looks like a normal module
DispatcherNoServerHeader 0
# if turned to 1, request to / are not handled by the dispatcher
# use the mod_alias then for the correct mapping
DispatcherDeclineRoot 0
# if turned to 1, the dispatcher uses the URL already processed
# by handlers preceding the dispatcher (i.e. mod_rewrite)
# instead of the original one passed to the web server.
DispatcherUseProcessedURL 1
# Defines how to support 40x error codes for ErrorDocument handling:
# 0 - the dispatcher spools all error responses to the client.
# 1 - the dispatcher does not spool an error response to the client (where the status code is greater than or equal to 400)
# but passes the status code to Apache, which e.g. allows an ErrorDocument directive to process such a status code.
DispatcherPassError 1
</IfModule>
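If you only want Apache to handle selected status codes, DispatcherPassError can, as far as I know from the Dispatcher documentation, also take a list of code ranges instead of 0 or 1. The ranges below are only an example; verify the syntax against the documentation for your Dispatcher version before relying on it.

```apache
<IfModule disp_apache2.c>
    # Assumption: code-range syntax is supported by the installed Dispatcher version.
    # Responses with 403-404 and 500-503 are passed to Apache (ErrorDocument);
    # all other error responses are spooled to the client by the Dispatcher.
    DispatcherPassError 403-404,500-503
</IfModule>
```

This keeps the default Dispatcher behavior for error codes that have no custom error page.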

Now you should receive the site-specific error pages. Accessing http://localhost/us/en/a.html returns the 404 page specific to the /us/en website; other sites return their corresponding error pages in the same way, based on the configurations.


The error pages are now cached in the Dispatcher.


This improves overall platform performance and also mitigates some security issues, such as DDoS attacks, by caching the 404 error pages in the Dispatcher.


