Friday, April 26, 2024

Revolutionizing Onsite Search: The Power of Generative AI | Generative Answering

 In this post, we will explore how Generative AI is transforming on-site search capabilities, enhancing user experiences, and streamlining access to information. We’ll also delve into some of the challenges that come with integrating this advanced technology into existing systems.

Introduction to Onsite Search

Onsite search refers to the functionality on websites that allows users to enter search queries and retrieve information or products within that particular website. This tool is crucial for enhancing user experience, especially on e-commerce platforms where finding products quickly and efficiently can significantly impact customer satisfaction and conversion rates. Effective onsite search can mimic the ease and precision of in-person shopping experiences by quickly guiding users to their desired products or information.

However, traditional search functionalities often face challenges such as dealing with synonyms, spelling errors, and understanding the intent behind a user’s query. These limitations can lead to irrelevant search results, a frustrating user experience, and potentially lost sales. Moreover, traditional searches might struggle with indexing and retrieving content in a way that aligns with how users think and search.

For a deeper dive into enhancing onsite search, see my previous articles:

  • Onsite Search: The Must-Have Features for a Seamless User Experience | by Albin Issac | Tech Learnings | Medium
  • Selecting a Search Engine for on-site Search — Open Source vs Search as a Service | by Albin Issac | Medium

Overview of Generative AI

Generative AI refers to a subset of artificial intelligence technologies that can generate new content and ideas based on the data they have been trained on. This contrasts with other forms of AI that typically focus on classifying or predicting outcomes based on input data. Generative AI involves advanced machine learning techniques, including natural language processing (NLP) and neural networks, to produce outputs that are not explicitly programmed into the system.

Technologies behind generative AI, such as NLP, enable the AI to understand, interpret, and manipulate human language. Neural networks, particularly deep learning models, allow the AI to process large amounts of data, recognize patterns, and make decisions with minimal human intervention. These capabilities make generative AI particularly valuable for enhancing onsite search because it can understand and interpret the complexities and nuances of human language, predict what users are searching for even with vague or incomplete queries, and generate relevant results even from sparse data inputs.

By leveraging generative AI, websites can overcome many of the traditional challenges associated with onsite search, leading to a more intuitive and satisfying user experience. This is particularly beneficial in e-commerce settings, where the ability to quickly and accurately display products based on user queries can directly influence buying decisions and overall business success.

Search engines have long utilized AI technologies to enhance user experience and deliver more relevant content. Here are several AI-driven features that have been key in revolutionizing search functionalities:

  • Smart Snippets: Provide concise previews of content relevant to search queries, offering users quick insights into the content’s relevance without having to click through to the page.
  • Automatic Relevance Tuning (ART): This leverages user interaction data to dynamically adjust and refine the relevance of search results, ensuring that users find the most pertinent information based on collective user behaviors.
  • Query Suggestions: By analyzing aggregated search data, this feature suggests related queries in real-time as users type, helping them formulate more effective searches and discover content more efficiently.
  • Content Recommendations: Utilizing user behavior, previous search histories, and browsing patterns, AI suggests additional relevant content, potentially increasing engagement and time spent on a site.
  • Dynamic Navigation Experience: AI algorithms optimize and personalize navigation interfaces based on common user journeys and behaviors, making site navigation more intuitive and user-friendly.
  • Image and Video Recognition: Advanced visual recognition technologies enable search engines to index and retrieve images and videos based not just on metadata but on the visual content itself, enhancing the search capabilities for multimedia.
  • Document Search: AI enhances the ability to search within documents by recognizing and indexing text from various document formats. This technology enables users to perform deep content searches, finding specific information within large documents or across a collection of documents. AI-driven features like optical character recognition (OCR) and natural language understanding (NLU) are often employed to interpret the content accurately and return relevant results based on the content’s context rather than just keyword matches.
  • Sentiment Analysis: Employed by some search engines, this feature analyzes the emotional tone of content, useful for businesses monitoring brand reputation or media sites gauging public sentiment.
  • Personalization Engines: These tailor search results and content displays to individual users based on their unique preferences and past interactions, creating a more personalized and relevant browsing experience.

Generative Answering with AI in Search

Generative AI is transforming the landscape of search technology by introducing the advanced capability of generative answering. This innovative approach transcends the conventional mechanisms of search that primarily focus on retrieving data. Instead, it involves the understanding and generation of responses that directly address a user’s queries.

What is Generative Answering?

Generative answering is the capability of AI systems to compose a response to a query instead of only returning a list of matching documents, based on the extensive data they have been trained on. Generative models such as GPT (Generative Pre-trained Transformer) produce the answer text, while encoder models such as BERT (Bidirectional Encoder Representations from Transformers) are typically used to interpret queries and content rather than to generate text. Together, these models grasp queries in a detailed and nuanced manner and deliver precise, context-aware answers.

Image from Coveo

Key Benefits of Generative Answering in Search

  • Direct Answers: Moving beyond traditional search engines that list documents or links, generative answering provides a direct and succinct response to queries, synthesizing information from multiple sources or extrapolating from existing data.
  • Natural Language Interaction: This capability fosters a more conversational interaction, allowing users to engage with the AI in a dialogue format where the system understands and responds to follow-up questions or requests for clarification.
  • Personalized Responses: The AI customizes responses based on individual user history or preferences, greatly enhancing the relevance and personalization of the search experience.
  • Efficiency in Information Discovery: By quickly delivering the specific information users seek, generative answering significantly reduces the time spent navigating through irrelevant results.

Expanded Applications of Generative AI in Onsite Search

  • Customer Support: Provides immediate, accurate answers to customer inquiries, streamlining customer service operations without the need for extensive human intervention.
  • E-commerce: Enhances product discovery and customer shopping experience by offering detailed product information and tailored recommendations directly in response to user queries.
  • Education: Supports an interactive learning environment by instantly responding to student queries with detailed explanations, promoting a more engaging educational experience.
  • Content-Based Sites: Significantly improves search and discovery on content-rich platforms such as news sites, blogs, and digital libraries. Generative AI can swiftly analyze large datasets, enabling it to deliver precise answers, recommend related content, and generate content summaries that aid users in efficiently navigating and consuming information.
  • AI-Powered Chatbots: These sophisticated chatbots utilize indexed data to conduct meaningful conversations, providing support and information directly through interactive dialogues.
  • Content Creation and Augmentation: AI not only helps in creating new content that aligns with current trends and user interests but also optimizes existing content to enhance its discoverability and engagement.

Generative answering, along with these applications, illustrates the profound impact Generative AI has on enhancing the functionality and user experience of search systems. As this technology continues to evolve, its integration into search platforms is expected to redefine the norms of how information is queried and retrieved, making searches more intuitive, interactive, and effective.

Most search engines now support Generative AI capabilities to enable generative answering and chatbots. These search engines already have customer data indexed through the traditional approach, and they have now started supporting the Retrieval Augmented Generation (RAG) approach on top of it. The content indexing and retrieval process is extended with additional steps to support generative answering.

High-Level Steps in Indexing and Retrieval to Support Generative Answering:

1. Document Indexing and Embedding Generation: Before any search queries are processed, documents must be indexed. As part of this indexing process, each document is processed to generate embeddings. These embeddings, which are dense vector representations of the document’s content, capture the semantic essence of the text. Using AI models such as BERT or GPT, the search engine converts the text of each document into these embeddings and stores them in the search index. This allows for efficient retrieval based on semantic similarity rather than mere keyword matching.

2. Query Reception and Preprocessing: When a query is received, the first step is preprocessing. This involves normalizing the text by converting it to lowercase, removing punctuation, and correcting any typos. Such preprocessing simplifies the text to ensure more effective subsequent analysis.

3. Query Embedding Generation: The preprocessed query is then converted into an embedding using the same AI models used during document indexing. This ensures that the query can be semantically compared to the indexed documents.

4. Chunking and Contextual Analysis: If the query is complex or contains multiple elements, it is broken down into smaller chunks. This chunking isolates specific aspects of the query for more targeted processing. Contextual analysis is also performed, considering previously indexed information, user’s search history, or the specific session context.

5. Retrieval from Index: The query embedding is used to retrieve the most relevant documents from the search index. This is done using similarity metrics that compare the query’s embedding with the embeddings of documents stored in the index, ensuring the retrieval of content that matches the semantic intent of the query.

6. Prompt Generation: After relevant information is retrieved, the search engine constructs a detailed prompt for the generative model. This prompt includes the query, relevant context from the retrieved documents, and additional pertinent information that can guide the AI in generating an accurate response.

7. Send Prompt to Generative Model: This prompt is sent to a generative AI model, such as GPT-3 or GPT-4, which uses it to generate a response that directly answers the user’s query. The generative model synthesizes the information, producing coherent, context-aware text.

8. Answer Generation and Post-Processing: The generated answer undergoes post-processing to ensure it meets quality standards, refining it for clarity and appropriateness. If necessary, the prompt may be adjusted and the generation process rerun.

9. Delivery to User: The refined answer is then delivered to the user through an interface that may allow further interactions, such as query refinement, follow-up questions, or feedback. This feedback is crucial for continuous learning and system improvement.
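The indexing, retrieval, and prompt-building steps above can be sketched in a few lines of JavaScript. This is a toy illustration under stated assumptions, not how any particular search engine implements it: `embed` is a hypothetical stand-in that counts words from a tiny fixed vocabulary (a real system would call an embedding model), the sample documents are invented, and the final call to a generative model is left out.

```javascript
// Toy vocabulary; a real system would use a learned embedding model instead.
const VOCAB = ['return', 'refund', 'shipping', 'delivery', 'warranty', 'repair'];

// Steps 2-3: normalize the text and turn it into a (very crude) vector.
function embed(text) {
  const words = text.toLowerCase().replace(/[^a-z0-9\s]/g, ' ').split(/\s+/);
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

// Cosine similarity between two vectors of equal length.
function cosine(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Step 1: store an embedding next to every document at indexing time.
function buildIndex(docs) {
  return docs.map((doc) => ({ ...doc, vector: embed(doc.text) }));
}

// Step 5: return the k documents most similar to the query.
function retrieve(index, query, k) {
  const q = embed(query);
  return index
    .map((doc) => ({ doc, score: cosine(q, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((hit) => hit.doc);
}

// Step 6: combine the query and the retrieved context into a prompt.
function buildPrompt(query, docs) {
  const context = docs.map((d) => `[${d.id}] ${d.text}`).join('\n');
  return `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${query}`;
}

// Invented sample content, for illustration only.
const index = buildIndex([
  { id: 'returns', text: 'You can return items within 30 days for a full refund.' },
  { id: 'shipping', text: 'Standard shipping takes 3 to 5 days; express delivery is next day.' },
  { id: 'warranty', text: 'The warranty covers repair for two years.' },
]);

const query = 'How do I get a refund?';
const prompt = buildPrompt(query, retrieve(index, query, 1));
console.log(prompt); // Step 7 would send this prompt to a generative model.
```

Keeping retrieval and prompt construction as separate functions mirrors the step list: either one can be swapped (a different similarity metric, a different prompt template) without touching the other.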

The integration of Generative AI into search engines represents a paradigm shift away from traditional keyword-centric approaches towards more dynamic, context-sensitive systems. This advancement enriches the search experience, providing users with direct answers, engaging conversations, and tailored content, thereby refining the information discovery process.

Most onsite search engines have now adopted Generative AI capabilities. The advantage here is that, since the data has already been indexed using traditional methods, enabling Generative AI is relatively straightforward: there is no need for extensive content reprocessing or for setting up a separate Retrieval Augmented Generation (RAG) workflow yourself. Furthermore, these search engines offer APIs that facilitate the retrieval of data for both traditional search results and generative answering, making it possible to display information through various interfaces like search pages, chatbots, and more.

A concern with this approach, however, is that many search engine systems offer limited flexibility in selecting embedding models, configuring chunking, or choosing the GPT model. They often rely on predefined proprietary or open-source models, configurations, and prompts. This can be problematic for those who require more control over data processing and the customization of prompts and models.

The capacity for Generative AI to deliver quick and direct answers to customers is an impressive feature that significantly enhances user interaction. Looking forward, there is an anticipation that search engines will begin to provide greater flexibility and options to meet custom requirements for generative answering, catering to a wider range of use cases and further innovating the field of search.

To understand how Coveo supports Generative Answering, refer to About Relevance Generative Answering (RGA) | Coveo Machine Learning.



Friday, March 15, 2024

Responsive Authoring Issue in AEM as a Cloud

 


While migrating our websites from AMS (AEM 6.5) to AEM as a Cloud, we noticed that responsive authoring for certain Editable templates wasn’t functioning as expected. Interestingly, these same templates were working flawlessly in the AMS environment.

Additionally, selecting the breakpoints was not functioning correctly.

This issue was not present in the AMS (6.5) environment, where both responsive authoring and breakpoint selection were functioning as expected.

After analysis, the root cause turned out to be a logic difference between AEM 6.5 and AEM as a Cloud Service in the responsive authoring code in /libs/cq/gui/components/authoring/editors/clientlibs/core.lc-0e2523ebda58d68c5bc85efa684b50e6-lc.min.js (the hash may vary if additional changes are introduced).

In AEM 6.5, the code checks whether the configured breakpoint width is greater than or equal to the device width:

if(cfg[bp].width>=deviceWidth

getDeviceBreakpoint:function(deviceWidth){var cfg=this.getBreakpoints(),closestBp;for(var bp in cfg)if(cfg[bp].width>=deviceWidth&&(!closestBp||cfg[bp].width<=cfg[closestBp].width))closestBp=bp;return closestBp}

In AEM as a Cloud Service, the code checks whether the configured breakpoint width is strictly greater than the device width. When the emulator width equals the breakpoint width, this selects the next larger breakpoint instead. Consequently, changes that should apply to the actual breakpoint, like small, end up being applied to a different one, such as medium.

if(cfg[bp].width>deviceWidth

getDeviceBreakpoint:function(deviceWidth){var cfg=this.getBreakpoints(),closestBp;for(var bp in cfg)if(cfg[bp].width>deviceWidth&&(!closestBp||cfg[bp].width<=cfg[closestBp].width))closestBp=bp;return closestBp}

To address the issue, set the breakpoint width in the emulator configuration to one less than the standard breakpoint value (standard breakpoint value - 1), and keep the breakpoint in the editable template structure’s responsive configuration at the standard breakpoint value.

For instance, in the configuration of the emulator for a large breakpoint: /apps/<Project>/emulators/bootstrap/large/cq:emulatorConfig, (width:1199)

In the responsive configuration of the template structure for large breakpoints (for better management enable these configurations through the template-type): /conf/<project>/settings/wcm/templates/<template>/structure/jcr:content/cq:responsive/breakpoints/large (width:1200)
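The effect of the two comparisons can be reproduced with a small, readable version of the breakpoint lookup. This is a sketch for illustration: the `inclusive` flag and the sample breakpoint widths (768, 992, 1200) are assumptions and not part of the AEM code, which hard-codes one comparison or the other.

```javascript
// Readable rewrite of the lookup; `inclusive` switches between the
// AEM 6.5 comparison (>=) and the AEM as a Cloud Service comparison (>).
function getDeviceBreakpoint(cfg, deviceWidth, inclusive) {
  let closestBp;
  for (const bp in cfg) {
    const fits = inclusive
      ? cfg[bp].width >= deviceWidth
      : cfg[bp].width > deviceWidth;
    // Keep the smallest breakpoint that still fits the device width.
    if (fits && (!closestBp || cfg[bp].width <= cfg[closestBp].width)) {
      closestBp = bp;
    }
  }
  return closestBp;
}

// Hypothetical template breakpoints, for illustration only.
const breakpoints = {
  small: { width: 768 },
  medium: { width: 992 },
  large: { width: 1200 },
};

console.log(getDeviceBreakpoint(breakpoints, 768, true));  // 6.5 logic, emulator width 768
console.log(getDeviceBreakpoint(breakpoints, 768, false)); // Cloud logic, emulator width 768
console.log(getDeviceBreakpoint(breakpoints, 767, false)); // Cloud logic, emulator width 767
```

With the emulator at 768, the inclusive check picks small while the strict check skips it and picks medium; lowering the emulator width to 767 makes the strict check pick small again, which is why the emulator value is set to the breakpoint value minus one.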

You can examine the emulator and responsive configuration of a page using the PageInfo servlet, for example: https://aemhost/libs/wcm/core/content/pageinfo.json?path=%2Fcontent%2Ftest%2Fus%2Fen%2Ftest-page

Now, you can start editing the page to make it work well on different devices.



Thursday, February 22, 2024

ReferenceError: XMLHttpRequest is not defined - Ollama JavaScript

I encountered the following error while using Ollama JavaScript on my local system.

ReferenceError: XMLHttpRequest is not defined
    at \Development\ollama\js\node_modules\whatwg-fetch\dist\fetch.umd.js:540:17
    at new Promise (<anonymous>)
    at fetch (\Development\ollama\js\node_modules\whatwg-fetch\dist\fetch.umd.js:533:12)
    at file:////Development/ollama/js/node_modules/ollama/dist/utils.js:77:28
    at Generator.next (<anonymous>)
    at file:////Development/ollama/js/node_modules/ollama/dist/utils.js:7:71
    at new Promise (<anonymous>)
    at __awaiter (file:////Development/ollama/js/node_modules/ollama/dist/utils.js:3:12)
    at Module.post (file:////Development/ollama/js/node_modules/ollama/dist/utils.js:72:53)
    at Ollama.<anonymous> (file:////Development/ollama/js/node_modules/ollama/dist/index.js:59:42)

The actual issue was with the Node.js and npm versions. Ensure that a recent Node.js/npm version is installed before running Ollama JS on your local system:

  • Node v20.11.0+
  • NPM v10.2.4+
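A quick way to confirm the runtime before debugging further is to compare the running Node.js version against the minimum listed above. The helper names below are hypothetical and written for this sketch; only `process.version` comes from Node.js itself.

```javascript
// Split a version string such as "v20.11.0" into numeric parts.
function parseVersion(version) {
  return version.replace(/^v/, '').split('.').map(Number);
}

// True when `current` is the same as or newer than `minimum`.
function isAtLeast(current, minimum) {
  const a = parseVersion(current);
  const b = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x > y) return true;
    if (x < y) return false;
  }
  return true;
}

const MIN_NODE = '20.11.0';
if (isAtLeast(process.version, MIN_NODE)) {
  console.log(`Node ${process.version} meets the minimum of ${MIN_NODE}.`);
} else {
  console.log(`Node ${process.version} is older than ${MIN_NODE}; upgrade before running Ollama JS.`);
}
```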


Error: Cannot find module '@npmcli/config'

I encountered the following error while executing any npm command, such as 'npm -v', on a Windows system.

\AppData\Roaming\nvm\v20.11.1\node_modules\npm\lib\es6\validate-engines.js:31
    throw err
    ^
Error: Cannot find module '@npmcli/config'
Require stack:
- \AppData\Roaming\nvm\v20.11.1\node_modules\npm\lib\npm.js
- \AppData\Roaming\nvm\v20.11.1\node_modules\npm\lib\cli-entry.js
- \AppData\Roaming\nvm\v20.11.1\node_modules\npm\lib\cli.js
- AppData\Roaming\nvm\v20.11.1\node_modules\npm\bin\npm-cli.js
    at Module._resolveFilename (node:internal/modules/cjs/loader:1144:15)
    at Module._load (node:internal/modules/cjs/loader:985:27)
    at Module.require (node:internal/modules/cjs/loader:1235:19)
    at require (node:internal/modules/helpers:176:18)
    at Object.<anonymous> (\AppData\Roaming\nvm\v20.11.1\node_modules\npm\lib\npm.js:2:16)
    at Module._compile (node:internal/modules/cjs/loader:1376:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1435:10)
    at Module.load (node:internal/modules/cjs/loader:1207:32)
    at Module._load (node:internal/modules/cjs/loader:1023:12)
    at Module.require (node:internal/modules/cjs/loader:1235:19) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    '\\AppData\\Roaming\\nvm\\v20.11.1\\node_modules\\npm\\lib\\npm.js',
    '\AppData\\Roaming\\nvm\\v20.11.1\\node_modules\\npm\\lib\\cli-entry.js',
    '\\albin\\AppData\\Roaming\\nvm\\v20.11.1\\node_modules\\npm\\lib\\cli.js',
    '\\albin\\AppData\\Roaming\\nvm\\v20.11.1\\node_modules\\npm\\bin\\npm-cli.js'
  ]
}

 I was managing Node.js using NVM, and the issue only occurred with versions 20.11.0 and above, while earlier versions worked perfectly. The PATH setup appeared to be correct. Ultimately, I resolved the issue by directly downloading and installing Node.js on the system.



Friday, February 9, 2024

Exploring Security Features in Adobe Experience Manager for Cloud Environments

 In this post, let us explore some of the security-related setup/configurations available on AEM as a Cloud platform to protect the platform.

Traffic Filter Rules:

Traffic filter rules can be used to block or allow requests at the CDN layer (Fastly). These traffic filter rules are available out of the box (OOTB) to all AEM as a Cloud Service Sites and Forms customers.

The traffic filter rules can be used for multiple scenarios.

  • Rate limit requests based on client IPs.
  • Block traffic based on IP addresses, Request Path, Query String, method, domain, reqHeader, reqCookie, postParam, etc.
  • Block traffic from specific countries.

The traffic rules can be targeted to the Author tier, the Publish tier, or both. Conditions support operators such as equals, doesNotEqual, in, and matches. If a request matches a rule whose action is block, the CDN responds with a 406 return code.

The traffic rules are managed through a YAML file (cdn.yaml) and deployed separately through the Cloud Manager Config pipeline to non-production and production environments. Create one cdn.yaml for the Dev environments and another for the Stage/Prod environments, and keep them in separate folders, e.g., Config-Dev/cdn.yaml and Config-Prod/cdn.yaml. The config files can be managed in a separate repository or within the Dispatcher module repository. Then create a Config pipeline for Dev and another for Stage/Prod, each pointing to the corresponding config folder along with the repository, branch, and other settings.

For more details, refer to these documents:

  • Traffic Filter Rules including WAF Rules | Adobe Experience Manager
  • Examples and result analysis of Traffic Filter rules including WAF rules | Adobe Experience Manager

WAF (Web Application Firewall):

A WAF helps protect web applications by filtering and monitoring HTTP traffic between a web application and the Internet. It typically protects web applications from attacks such as DDoS, cross-site request forgery (CSRF), cross-site scripting (XSS), file inclusion, and SQL injection. Deploying a WAF in front of a web application places a shield between the application and the Internet. The WAF operates through a set of rules that aim to protect against vulnerabilities in the application by filtering out malicious traffic.

The WAF rules/flags, e.g., XSS, SQLI, LOG4J-JNDI, can be enabled along with the other traffic filter rules explained above. The WAF rules can be enabled through the same cdn.yaml file and the Config pipeline.

The WAF traffic filter rules require either an Enhanced Security license or a WAF-DDoS Protection license.

For more details, refer to this document — Traffic Filter Rules including WAF Rules | Adobe Experience Manager.

Mod_Security:

Mod_security is an Apache module that helps protect your website from various attacks. Mod_security acts as a Web Application Firewall (WAF) that filters and blocks known malicious HTTP requests. Blocked HTTP requests include many, but not all, forms of Brute Force, Cross-Site Scripting (XSS), Remote File Inclusion (RFI), Remote Execution, and SQL injection (SQLi) attacks. By default, the mod security module is enabled on AEM as a Cloud Dispatcher (Apache), but the required rules and configurations can be enabled based on your needs.

For more details, refer to this document — Use ModSecurity to protect your AEM site from DoS Attack | Adobe Experience Manager (mktossl.com)

IP Allow List:

IP allowlisting is a way of giving trusted individuals access to the business network. With an IP allow list, the network administrator can allow specific IP addresses to access your files, applications, and software remotely.

AEM as a cloud service is, by default, accessible via the internet. While security is handled through user authentication and authorization, IP allow-listing is a way to limit access only to trusted IP addresses. Cloud Manager’s IP allowlists can be used to limit and control access only to such trusted IP addresses.

Cloud Manager users with appropriate permissions can create allowlists of trusted IP addresses from which their site’s users can access their AEM domains. An allowlist contains individual IP addresses or CIDR blocks and, once created, can be applied to or unapplied from the Author and/or Publish service of an environment as a single unit, as many times as needed. For instance, this is helpful if you wish to allow access to your Author environment from your company’s network, VPN, or VDI but block external access.
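To make the idea of matching a client against IP addresses and CIDR blocks concrete, here is a minimal IPv4-only sketch. It illustrates the concept only and is not how Cloud Manager or the CDN implements the check; the function names and the documentation-range addresses are assumptions made for this example.

```javascript
// Convert a dotted IPv4 address to an unsigned 32-bit number.
function ipv4ToInt(ip) {
  const parts = ip.split('.').map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    throw new Error(`Invalid IPv4 address: ${ip}`);
  }
  return parts.reduce((acc, p) => acc * 256 + p, 0);
}

// True when `ip` falls inside `cidr`; a bare address is treated as /32.
function inCidr(ip, cidr) {
  const [base, bitsText = '32'] = cidr.split('/');
  const blockSize = 2 ** (32 - Number(bitsText));
  return Math.floor(ipv4ToInt(ip) / blockSize) === Math.floor(ipv4ToInt(base) / blockSize);
}

// A request is allowed when its IP matches any entry of the allowlist.
function isAllowed(ip, allowList) {
  return allowList.some((entry) => inCidr(ip, entry));
}

// Hypothetical allowlist: one office network block and one single address.
const allowList = ['203.0.113.0/24', '198.51.100.7'];

console.log(isAllowed('203.0.113.42', allowList));
console.log(isAllowed('192.0.2.10', allowList));
```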

For more details, refer to these documents:

  • Adding IP Allow Lists | Adobe Experience Manager
  • Applying and Un-Applying IP Allow Lists | Adobe Experience Manager

CDN uses the IP Allowlists defined in Cloud Manager to block the incoming requests for a specific environment/tier. The IP Allow lists defined in Cloud Manager take precedence over Traffic Filters Rules.

Advanced Networking:

AEM as a Cloud Service provides advanced networking features that allow for precise management of connections to and from AEM as a Cloud Service program.

AEM as a Cloud Service supports various networking configurations, including flexible port egress, dedicated egress IP address, and VPN.

Virtual Private Network (VPN) allows an AEM as a Cloud Service customer to connect the AEM environments within a Cloud Manager Program to an existing, supported VPN. This allows secure and controlled connections between AEM as a Cloud Service and services within the customer’s network.

Flexible port egress allows for custom, specific port forwarding rules to be attached to AEM as a Cloud Service, allowing connections from AEM to external services to be made.

The dedicated egress IP address allows requests from AEM as a Cloud Service to use a dedicated IP address, allowing the external services to filter incoming requests by this IP address.

For more details, refer to these documents:

  • Advanced networking | Adobe Experience Manager
  • Demystifying Dedicated Egress IPs in AEM Cloud Services | by Albin Issac | Tech Learnings | Dec, 2023 | Medium

Dispatcher Filters:

Requests can also be restricted at the Dispatcher layer. The /filter section specifies the HTTP requests that the Dispatcher accepts; all other requests are sent back to the web server with a 404 error code (page not found). If no /filter section exists, all requests are accepted.

The /filter section consists of a series of rules that either deny or allow access to content according to patterns in the request-line part of the HTTP request. Use an allowlist strategy for your /filter section:

  • First, deny access to everything.
  • Allow access to content as needed.
/filter {
/0001 { /glob "*" /type "deny" }
/0002 { /type "allow" /method "POST" /url "/content/[.]*.form.html" }
}

For more details, refer to this document — Configuring Dispatcher | Adobe Experience Manager

HIPAA (Health Insurance Portability and Accountability Act) Compliance:

HIPAA compliance ensures the protection and confidential handling of patient health information, adhering to strict standards set by the Health Insurance Portability and Accountability Act.

Adobe provides healthcare customers with services that are ready to accept PHI, referring to these services as HIPAA-Ready Services. These HIPAA-Ready Services have additional features and functionalities that allow both customers, Covered Entities or Business Associates, and Adobe to comply with their respective HIPAA obligations.

Adobe Experience Manager (AEM) as a Cloud Service is one of the HIPAA-Ready Services provided by Adobe.

Additional licensing is required to enable the HIPAA-Ready Service for AEM as a Cloud Service.

For more details, refer to this document — HIPAA Ready (adobe.com)

Mutual Transport Layer Security (mTLS) authentication from AEM:

AEM supports integrating with the external APIs that require mTLS authentication. The mTLS or two-way TLS authentication enhances the security of the TLS protocol by requiring both the client and the server to authenticate each other. This authentication is done by using digital certificates. It is commonly used in scenarios where strong security and identity verification are critical.

For more details, refer to this document — Mutual Transport Layer Security (mTLS) authentication from AEM | Adobe Experience Manager (mktossl.com)

Server-to-server Token-Based Authentication:

AEM’s Developer Console grants access to Service Credentials, which are used to obtain production-ready service-to-service access tokens so that external applications, systems, and services can interact with AEM Author or Publish services over HTTP programmatically. Developers building integrations that require programmatic access to AEM as a Cloud Service also need a simple, quick way to obtain temporary access tokens for local development. To satisfy this need, the Developer Console allows developers to self-generate temporary Local Development Access Tokens that can be used to access AEM programmatically.
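Once an access token is available, it is sent as a bearer token in the Authorization header of each HTTP request. The sketch below only builds the request; the host name, path, token placeholder, and the `buildAemRequest` helper are all assumptions for illustration, and obtaining the token itself is not shown.

```javascript
// Build the URL and fetch options for a token-authenticated call to AEM.
function buildAemRequest(aemHost, path, accessToken) {
  return {
    url: new URL(path, aemHost).toString(),
    options: {
      method: 'GET',
      headers: {
        Authorization: `Bearer ${accessToken}`,
        Accept: 'application/json',
      },
    },
  };
}

// Hypothetical host, path, and token; replace with real values.
const request = buildAemRequest(
  'https://author.example.com',
  '/content/dam.json',
  'ACCESS_TOKEN_PLACEHOLDER'
);

console.log(request.url);
// On Node 18+ or in a browser: fetch(request.url, request.options)
```

Keeping the token out of the source (for example in an environment variable) avoids committing credentials along with the integration code.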

For more details, refer to these documents:

  • Service credentials | Adobe Experience Manager
  • Local Development Access Token | Adobe Experience Manager

Data encryption:

All data in transit between AEM as a Cloud Service and external components is conducted over secure, encrypted connections using TLS 1.2 or greater. The cloud service provider encrypts all data at rest.

AEM as a Cloud Service also has a FIPS-approved crypto library and support for encryption keys to encrypt all the critical data present in the cloud repository.

For more details, refer to this document — aem-cloud-service-security-overview.pdf (adobe.com)

OAuth2 Support for the Mail Service:

AEM as a Cloud Service offers OAuth2 support for its integrated Mail Service, allowing organizations to adhere to secure email requirements.

For more details, refer to this document — OAuth2 Support for the Mail Service | Adobe Experience Manager

Secret Variable Management through Cloud Manager:

In AEM as a Cloud Service, environment-specific configurations can be enabled using Cloud Manager environment variables. Two value types are available: secret values and standard variables. To improve security, the secret values are managed centrally through the Cloud Manager UI rather than in the code base; the secret variables can then be referenced from OSGi configurations, Java code, etc.

For more details, refer to this document — Support Custom Run Modes in AEM as a Cloud | Environment Specific Variables in AEM as a Cloud | by Albin Issac | Tech Learnings | Medium

Network security:

The AEM as a Cloud Service security model includes tenant and node-level isolation for all services. Each AEM as a Cloud Service tenant exists within its own isolated namespaces, including its own networking policies, computing, and storage.

Reference — aem-cloud-service-security-overview.pdf (adobe.com)

IAM integration:

AEM as a Cloud Service integrates Adobe Identity Management Service (IMS) for user verification. Various other Adobe products, including the Adobe Admin Console, also utilize this IMS authentication method. For AEM Authors in AEM as a cloud service, Adobe IMS authentication is activated, a change from previous AEM versions where identity and access management (IAM) settings had to be implemented individually on each AEM author server. With AEM in the cloud, single sign-on (SSO) configurations for AEM Authors and user and group management are centrally handled through the Adobe Admin Console using Adobe IMS.

For more details, refer to this document — AEM as a Cloud: IMS based SSO Authentication for Authors | by Albin Issac | Tech Learnings | Dec, 2023 | Medium

Also, the IAM system can be integrated with the publishers to enable an authenticated experience for the users who visit AEM websites.

For more details, refer to these documents:

  • Enable User Authentication for AEM Websites — Azure AD B2C OAuth 2.0 | by Albin Issac | Tech Learnings | Medium
  • Enable User Authentication for AEM Websites — Azure AD B2C | SAML Application with Azure AD B2C | by Albin Issac | Tech Learnings | Medium
  • Social Login with Google OAuth2 — Adobe Experience Manager (AEM) | by Albin Issac | Tech Learnings | Medium
  • Social Login with LinkedIn — Adobe Experience Manager (AEM) | by Albin Issac | Tech Learnings | Medium

Security Headers:

Security headers are HTTP response headers that tell the web browser which security precautions to activate or deactivate. The security headers can be enabled at the Dispatcher (Apache) layer; if required, some of them can also be enabled directly on the AEM publisher.
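A simple way to verify the result is to inspect a response and list which commonly recommended headers are missing. The header list below is an assumption chosen for this sketch and should be adapted to your own policy; the sample response headers are invented.

```javascript
// Commonly recommended security headers (an illustrative, not exhaustive, list).
const RECOMMENDED_HEADERS = [
  'strict-transport-security',
  'content-security-policy',
  'x-content-type-options',
  'x-frame-options',
  'referrer-policy',
];

// Return the recommended headers that are absent from a response.
// Header names are compared case-insensitively.
function missingSecurityHeaders(headers) {
  const present = new Set(Object.keys(headers).map((name) => name.toLowerCase()));
  return RECOMMENDED_HEADERS.filter((name) => !present.has(name));
}

// Invented response headers, for illustration only.
const sampleHeaders = {
  'Content-Type': 'text/html',
  'Strict-Transport-Security': 'max-age=31536000; includeSubDomains',
  'X-Content-Type-Options': 'nosniff',
};

console.log(missingSecurityHeaders(sampleHeaders));
```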

For more details, refer to this document — Adobe Experience Manager(AEM): HTTP Security Headers for Websites | by Albin Issac | Medium

Protect against Cross-Site Scripting (XSS)

Cross-site scripting (XSS) allows attackers to inject code into web pages viewed by other users. Malicious web users can exploit this security vulnerability to bypass access controls.

AEM applies the principle of filtering all user-supplied content upon output. Preventing XSS is given the highest priority during both development and testing.

The XSS protection mechanism provided by AEM is based on the AntiSamy Java™ Library provided by OWASP (The Open Web Application Security Project). The default AntiSamy configuration can be found at

/libs/cq/xssprotection/config.xml

It is important that you adapt this configuration to your own security needs by overlaying the configuration file. The official AntiSamy documentation provides you with all the information you need to implement your security requirements.
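The principle of encoding user-supplied content on output can be illustrated with a few lines of JavaScript. This is a generic sketch of HTML escaping, not AEM's own mechanism; in AEM, the platform's XSS protection should be used instead of hand-written helpers like the hypothetical `escapeHtml` below.

```javascript
// Characters that must not reach HTML output unescaped.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Encode a user-supplied value so it is rendered as text, not markup.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const userInput = '<script>alert("x")</script>';
console.log(escapeHtml(userInput));
```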

Reference — How to Protect AEM Websites from Cross-Site Scripting(XSS) (youtube.com)

Protect against Cross-Site Request Forgery Attacks

Cross-site request forgery (CSRF) is a web vulnerability that lets a malicious hacker trick the victim into submitting a request that allows the attacker to perform state-changing actions on behalf of the victim.

AEM protects websites from CSRF attacks with CSRF tokens and the Sling Referrer Filter. The Referrer Filter checks where state-changing requests originate and can be configured to allow access from trusted third-party hosts.
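For custom client-side code that posts to AEM, the token flow looks roughly like the sketch below: fetch a token, then send it with the state-changing request. The token endpoint `/libs/granite/csrf/token.json` and the `CSRF-Token` header follow AEM's CSRF protection framework; the `postWithCsrfToken` helper, the injected `fetchImpl` parameter, and the `/bin/example` path are assumptions for illustration.

```javascript
// Fetch a CSRF token, then send it along with a POST request.
// `fetchImpl` is injected so the function works with any fetch-compatible API.
async function postWithCsrfToken(fetchImpl, url, body) {
  const tokenResponse = await fetchImpl('/libs/granite/csrf/token.json');
  const { token } = await tokenResponse.json();
  return fetchImpl(url, {
    method: 'POST',
    headers: {
      'CSRF-Token': token,
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    body,
  });
}

// In a browser on an AEM page (hypothetical servlet path):
// postWithCsrfToken(window.fetch.bind(window), '/bin/example', 'name=value');
```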

For more details, refer to these documents:

  • CSRF protection | Adobe Experience Manager
  • Referrer Filter configuration with AEM Headless | Adobe Experience Manager

In summary, security is essential for any platform or website, and every platform must implement the measures needed to safeguard both the platform and its user data. As a cloud service, AEM offers multiple layers of security configuration that can be enabled as needed to protect the platform.