Monday, May 13, 2024

Key Learnings from Migration to AEM as a Cloud

In this post, we will explore some of my key learnings from migrating to AEM as a Cloud Service. In a previous post, ‘Key Considerations for Migrating to AEM as a Cloud Service,’ we discussed the essential factors to consider when moving to this platform.


Content/Asset Migration:

Content and asset migration to AEM as a Cloud is a complex task, especially depending on the size and nature of your existing repository. While the Content Transfer tool from Cloud Acceleration Manager can assist, the first hurdle is ensuring adequate infrastructure, such as sufficient disk space, CPU, and disk I/O, to handle the migration process.

If you are migrating from AMS, one strategy is to set up a new dedicated AMS environment specifically for migration. This approach prevents disruption to ongoing work on the current platform but does incur additional costs. Alternatively, repurposing one of the existing environments might be more cost-effective, though it may still require upgrades to disk space, environment size, and disk I/O capabilities to meet migration demands.

The recommended steps are to clone your current production environment to this dedicated migration environment, upgrade it to meet your capacity needs (mainly disk space and disk I/O; higher disk I/O is critical for extracting and ingesting high-volume content seamlessly), and then use it to extract and ingest content and assets into the AEM Cloud environments. This ensures a smoother transition and helps manage the load effectively during the migration.

Configuring Environment-Specific ETC MAPs in AEM Cloud:

Custom run modes are not supported in AEM as a Cloud; only ‘dev’, ‘stage’, and ‘prod’ run modes are available. This makes enabling environment-specific variables challenging, especially when multiple development environments are used for different purposes. However, this issue can be addressed by using Cloud Manager to support custom run modes. For more details, see my article: Support Custom Run Modes in AEM as a Cloud.

Another challenge is enabling environment-specific ETC MAPs for resource resolving, particularly when defining site-specific test domains in different environments. Although AEM does not support run mode-specific ETC MAPs directly, this can be managed by configuring the Resource Resolver Factory to point to environment-specific ETC MAP folders. In the cloud, where all development environments are typically set to ‘dev’ run mode, this can be managed using Cloud Manager environment variables. For more information, refer to my article: How to Configure Environment-Specific ETC MAPs for AEM as a Cloud Service?.

Migration of existing users and groups:

User migration is a critical component in AEM as a Cloud, where user management is facilitated through IMS. First, create the necessary groups in IMS and associate the existing users with the corresponding groups. For efficiency, you may consider using a bulk CSV upload to map current users to the appropriate IMS groups. User permissions can be managed through local AEM groups by linking them with IMS groups. If needed, local group ACLs can be defined through repo init scripts. For more information on this process, refer to ‘AEM as a Cloud: IMS based SSO Authentication for Authors’ by Albin Issac on Tech Learnings.
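The bulk mapping step can be scripted. Below is a minimal sketch that turns an export of (user, current AEM group) pairs into a CSV for bulk upload; the GROUP_MAPPING values and the CSV column names are illustrative assumptions, so check them against the exact bulk-upload template your Admin Console expects before using this.

```python
import csv
import io

# Assumed mapping from existing AEM groups to target IMS user groups;
# adjust to your own group inventory.
GROUP_MAPPING = {
    "content-authors": "AEM-Authors",
    "site-approvers": "AEM-Approvers",
}

def build_bulk_upload_csv(users):
    """users: iterable of (email, current_aem_group) pairs exported from
    the existing platform; returns CSV text mapping each user to the
    corresponding IMS group. Column names are hypothetical."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["Email", "User Groups"])
    writer.writeheader()
    for email, aem_group in users:
        ims_group = GROUP_MAPPING.get(aem_group)
        if ims_group is None:
            continue  # unmapped groups need a manual decision
        writer.writerow({"Email": email, "User Groups": ims_group})
    return out.getvalue()
```

Users whose current group has no IMS counterpart are skipped rather than guessed, so the gaps surface early for a manual review.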

If CUG (Closed User Group) based users and groups are enabled in your current AEM publishers, use the ACL Packager to migrate the users to the cloud publisher. Upload the package to the Author environment and then replicate it. Make sure that the CUG users are not installed into the Author environment to avoid any permission overlaps, especially if some users are common between Authors and CUG. Live user synchronization between publish pods is not enabled by default. Create a support ticket to enable user sync across publisher pods.

Environment-specific SAML configurations within AEM for supporting Author login through the customer’s IAM system are no longer necessary. This configuration has now been centralized and is managed globally through IMS.

DAM Asset Update Workflow:

The default DAM Asset Update workflow in previous versions of Experience Manager is no longer available. Instead, asset microservices provide a scalable, readily available service that covers most of the default asset processing (renditions, metadata extraction, and text extraction for indexing). Refer to the configuration and use of asset microservices for more details. To include customized workflow steps in the processing, post-processing workflows can be used.

Synthetic URL Monitoring:

Synthetic URL monitoring is a crucial tool in web performance monitoring, used to assess the availability, functionality, and responsiveness of web applications, services, and APIs. If you currently utilize custom monitoring tools such as New Relic or Datadog, these can be effectively used to monitor URLs and APIs. However, it’s important to note that while New Relic is integrated into AEM as a Cloud, it does not support custom URL monitoring the way it does on AMS platforms.

In AEM as a Cloud, URL monitoring is only available through the Advanced Cloud Support package, which incurs additional costs. If you opt out of Advanced Cloud Support and do not have an existing custom monitoring setup, your only alternative is to find a custom solution that meets your URL monitoring needs.

Supporting Parallel Development and Content Authoring:

Supporting ongoing project development and content authoring on the current platform during the migration is critical. Plan for this in advance, both while selecting System Integrators and during the planning phase, because parallel development and content authoring can alter your project’s timeline and required effort. Develop a clear strategy that covers merging ongoing development code to the cloud, ensuring cloud compatibility, conducting tests, and migrating delta content.

Static Templates and ETC Designs:

Although static templates and ETC/Designs are supported in the cloud, they are not recommended. You should migrate ETC/Designs folders from the current platform to the cloud. However, a challenge arises as there are no miscadmin pages available to manage and publish ETC/Designs. While designs can be modified through design mode, there is no direct method to activate these changes. To activate design changes, you could use one of the following approaches: package the changes and install them on the publisher, or utilize a tree activation workflow.

Splunk Log forwarding:

The AEM logs, including CDN logs, can be forwarded to a customer-owned Splunk instance. To enable this forwarding, please create a support ticket with the necessary details: the Splunk HEC endpoint address, Splunk port, and Splunk HEC token. Three different indexes — dev, stage, and prod — are created; logs from all dev-type instances are sent to the dev index.

SSL and DNS Configurations:

The SSL and DNS configurations can be managed through Cloud Manager as a Self-Service. To support site-specific test domains across various environments, create a test domain strategy. It’s advisable to use a common domain pattern to simplify management, such as site1-dev.test.com, site1-uat.test.com, and site1-stage.test.com.

Enable a separate wildcard SSL, for example, *.test.com, to support various test domains across different environments (site1-dev.test.com, site1-uat.test.com, site1-stage.test.com). Additionally, consider enabling individual or SAN SSL to support live domains in production.

Domains can be directed to either the Preview or Publish environments. The Preview environment allows you to preview changes before they are published. After publishing changes to the Preview server and conducting tests, you can then move the changes to the Publish environment. By default, the Preview service is blocked for all traffic, but you can manage IP allowlisting through Cloud Manager.

CDN Configurations:

The caching headers for the CDN can be enabled through dispatcher configurations. For more details, please refer to the ‘Caching in AEM as a Cloud Service’ section on the Adobe Experience Manager website. It’s important to assess the current CDN cache behavior configurations as well. When migrating from AMS and CloudFront to AEM as a Cloud, I recommend not merely transferring individual configurations from the current CDN directly. Instead, identify your specific caching needs and implement the necessary configurations. This might include common configuration files used in every virtual host to control caching behaviors.

Additionally, you can define other CDN configurations such as rate limiting, filtering rules, and WAF rules (if licensed) through the cdn.yml file in a separate config folder; refer to Traffic Filter Rules including WAF Rules | Adobe Experience Manager for more details. Deploy these configurations separately through the Cloud Manager Config Pipeline. The CDN configuration also supports enabling a Reverse Proxy (Origin Selector) for non-AEM servers, as well as request and response transformers; refer to Configuring Traffic at the CDN | Adobe Experience Manager for more details.

Purge Key: Request the environment-specific purge key from the Adobe team by creating a support ticket.

There are two types of purges available: soft and hard. A soft purge marks the affected object as stale rather than making it completely inaccessible, whereas a hard purge immediately makes the object inaccessible to new requests. To execute a hard purge, simply remove Fastly-Soft-Purge:1 from the request.

curl -X PURGE -H "Fastly-Soft-Purge:1" <URL_TO_BE_PURGED> -H "x-aem-purge-key: <X_AEM_PURGE_KEY>"
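The same purge call can be scripted, for example when invalidating a list of URLs after a deployment. The sketch below builds (but does not send) the request with Python’s standard library, mirroring the curl headers above; the URL and purge key are placeholders you would supply per environment.

```python
import urllib.request

def build_purge_request(url, purge_key, soft=True):
    """Build a Fastly purge request for the AEM CDN.

    soft=True adds Fastly-Soft-Purge:1 so the object is only marked
    stale; soft=False omits it, producing a hard purge that makes the
    object immediately inaccessible."""
    headers = {"x-aem-purge-key": purge_key}
    if soft:
        headers["Fastly-Soft-Purge"] = "1"
    return urllib.request.Request(url, headers=headers, method="PURGE")

# To actually execute the purge:
#   urllib.request.urlopen(build_purge_request("https://site/page.html", key))
```

Keeping request construction separate from sending makes it easy to log or batch the purges before firing them.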

DevOps Process:

Adjusting and updating your current DevOps process to adapt to cloud best practices is critical. The cloud platform enables additional deployment pipelines, such as the Dispatcher pipeline for deploying Dispatcher configurations and the Config pipeline for deploying CDN configurations. If you are not currently using Cloud Manager to manage deployments, you should define an end-to-end DevOps process that includes both your code repository and the Cloud Manager repository, along with the deployment pipelines.

Testing:

Testing is critical for the success of the project. Ensure that a comprehensive test strategy is defined to cover all required tests: Functional, Content, Integration, Regression, Performance, and Security. If any integrations require IP whitelisting, you will need to enable dedicated IP configuration in AEM Cloud and whitelist the IP. For more details, refer to ‘Demystifying Dedicated Egress IPs in AEM Cloud Services’ by Albin Issac on Tech Learnings | Medium.

Another challenge involves functionalities that cannot be tested in non-production environments. You may need to enable test services to assess these functionalities. If it’s not feasible to test ahead of the go-live, consider testing these functionalities by adding host entries on your local machine to point the sites to the AEM Cloud platform.

Adobe Support:

Adobe Support is critical. If you opt for Ultimate Support, you will gain access to focused ticket support. Additionally, Launch Advisory services provide extra guidance from an Adobe Architect and their team during the project, including any necessary reviews and monitoring during the migration. If you decide against Ultimate Support, you will need to plan how to navigate support tickets in case of any blockers. Furthermore, if you currently use AMS, you may need to arrange for the required support from a Customer Success Engineer (CSE) for tasks such as cloning the environment, supporting content migration, or obtaining details from the current platform, such as CDN configurations.

Restrict release orchestration for Production:

You can raise a support ticket to prevent the production environment from automatically upgrading during your go-live window, thereby avoiding any unexpected issues. The request should be submitted at least one week in advance, specifying the period during which the release orchestration for production should be disabled (i.e., preventing the production environment from receiving updates).

Maintenance Tasks:

In the cloud, Adobe handles some maintenance tasks, but customers are responsible for others. You will need to schedule these tasks according to your specific needs. For more information, refer to the Maintenance Tasks in AEM as a Cloud Service on the Adobe Experience Manager website.

Content Back Sync:

If you have enabled scheduled content back-sync from production to non-production environments in AMS or your own AEM platform, be aware that this functionality is not supported in AEM as a Cloud. Instead, AEM as a Cloud allows you to create content sets and migrate content between Author instances as needed, but this is not scheduled, nor does it extend to the Publisher. You should consider implementing a package-based batch content sync from Author to Publisher, for example using the ACS Commons MCP (Manage Controlled Processes) tool.

Operational Changes:

Some of the operations previously performed by CSE on the AMS platform can now be managed directly by customers. These include managing deployments, installing mutable content packages through the Package Manager, provisioning new environments, and configuring IP whitelisting, SSL, and DNS, among others. Additionally, the repositories on stage and production can be accessed through the repository browser available in the Developer Console.

Configure Notification Services:

Configure Incident Notifications and Proactive Notifications; refer to Notification Profiles | Adobe Experience Manager for more details.

Post Go Live Reviews:

Review the CDN cache hit ratio through the Cloud Manager, which also provides recommendations for improving this metric.

Issues:

Responsive Authoring Issue in AEM as a Cloud | by Albin Issac | Tech Learnings | Mar, 2024 | Medium

Tuesday, April 30, 2024

How to Configure Environment-Specific ETC MAPs for AEM as a Cloud Service?


In Adobe Experience Manager (AEM), managing environment-specific ETC maps is crucial for defining DNS content mappings to support multiple domains and to enable resource mapping and resolution. For detailed guidance on resource resolution in AEM, refer to the post “Configure Sling Mappings for Resource Resolution in Adobe Experience Manager(AEM) — Deep Dive | by Albin Issac | Medium”.

Key Configurations:

  • ETC Map Usage: The ETC map, used in conjunction with the JCR Resource Resolver, supports resource resolution and mapping across multiple domains. Unfortunately, ETC map configurations do not take run modes into account when applied, so they must be combined with run mode-specific configurations of org.apache.sling.jcr.resource.internal.JcrResourceResolverFactoryImpl.
  • Creating Environment-Specific Folders: Create environment-specific /etc/map folders such as /etc/map.dev.publish, /etc/map.uat.publish, /etc/map.stage.publish, and /etc/map.prod.publish.

Run Mode-Specific JCR Resolver Configurations: These should be set up as follows:

Dev/Publish(config.dev.publish) run modes: org.apache.sling.jcr.resource.internal.JcrResourceResolverFactoryImpl.cfg.json

"resource.resolver.map.location": "/etc/map.dev.publish"

{
"resource.resolver.map.observation": ["/libs", "/apps", "/content", "/etc", "/conf"],
"resource.resolver.searchpath": ["/apps", "/libs", "/apps/foundation/components/primary", "/libs/foundation/components/primary"],
"resource.resolver.required.providers": ["org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProviderFactory"],
"resource.resolver.default.vanity.redirect.status": 302,
"resource.resolver.mapping": ["/-/"],
"resource.resolver.virtual": ["/:/"],
"resource.resolver.vanitypath.whitelist": ["/apps/", "/libs/", "/content/"],
"resource.resolver.manglenamespaces": true,
"resource.resolver.map.location": "/etc/map.dev.publish",
"resource.resolver.log.unclosed": false,
"resource.resolver.allowDirect": true,
"resource.resolver.vanitypath.blacklist": ["/content/usergenerated"]
}

Stage/Publish(config.stage.publish) run modes: org.apache.sling.jcr.resource.internal.JcrResourceResolverFactoryImpl.cfg.json

"resource.resolver.map.location": "/etc/map.stage.publish"

{
"resource.resolver.map.observation": ["/libs", "/apps", "/content", "/etc", "/conf"],
"resource.resolver.searchpath": ["/apps", "/libs", "/apps/foundation/components/primary", "/libs/foundation/components/primary"],
"resource.resolver.required.providers": ["org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProviderFactory"],
"resource.resolver.default.vanity.redirect.status": 302,
"resource.resolver.mapping": ["/-/"],
"resource.resolver.virtual": ["/:/"],
"resource.resolver.vanitypath.whitelist": ["/apps/", "/libs/", "/content/"],
"resource.resolver.manglenamespaces": true,
"resource.resolver.map.location": "/etc/map.stage.publish",
"resource.resolver.log.unclosed": false,
"resource.resolver.allowDirect": true,
"resource.resolver.vanitypath.blacklist": ["/content/usergenerated"]
}

Prod/Publish(config.prod.publish) run modes: org.apache.sling.jcr.resource.internal.JcrResourceResolverFactoryImpl.cfg.json

"resource.resolver.map.location": "/etc/map.prod.publish"

{
"resource.resolver.map.observation": ["/libs", "/apps", "/content", "/etc", "/conf"],
"resource.resolver.searchpath": ["/apps", "/libs", "/apps/foundation/components/primary", "/libs/foundation/components/primary"],
"resource.resolver.required.providers": ["org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProviderFactory"],
"resource.resolver.default.vanity.redirect.status": 302,
"resource.resolver.mapping": ["/-/"],
"resource.resolver.virtual": ["/:/"],
"resource.resolver.vanitypath.whitelist": ["/apps/", "/libs/", "/content/"],
"resource.resolver.manglenamespaces": true,
"resource.resolver.map.location": "/etc/map.prod.publish",
"resource.resolver.log.unclosed": false,
"resource.resolver.allowDirect": true,
"resource.resolver.vanitypath.blacklist": ["/content/usergenerated"]
}

Challenges and Solutions for Dev Environments:

Although Stage and Prod environments load specific ETC maps properly, the challenge arises in Dev environments. These are always enabled with a Dev run mode but can be designated for various purposes like Development or UAT. This mismatch can cause disruptions if not handled correctly.

Integrating with Cloud Manager Variables:

To address this, combine the approach with Cloud Manager Variables. Cloud Manager allows you to set environment-specific variables that can be referenced in Java code, OSGi configurations, and Dispatcher configurations. Refer to Support Custom Run Modes in AEM as a Cloud | Environment Specific Variables in AEM as a Cloud | by Albin Issac | Tech Learnings | Medium for more details.

Define an ETC_MAP_PATH variable in Cloud Manager corresponding to the environment-specific ETC MAP path (e.g., /etc/map.dev.publish or /etc/map.uat.publish) and apply this variable to the Publish environment. This approach ensures uniformity across all environments, including Stage and Prod.

Applying the Configuration: (config.publish) — org.apache.sling.jcr.resource.internal.JcrResourceResolverFactoryImpl.cfg.json

"resource.resolver.map.location": "$[env:ETC_MAP_PATH]"

{
"resource.resolver.map.observation": ["/libs", "/apps", "/content", "/etc", "/conf"],
"resource.resolver.searchpath": ["/apps", "/libs", "/apps/foundation/components/primary", "/libs/foundation/components/primary"],
"resource.resolver.required.providers": ["org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProviderFactory"],
"resource.resolver.default.vanity.redirect.status": 302,
"resource.resolver.mapping": ["/-/"],
"resource.resolver.virtual": ["/:/"],
"resource.resolver.vanitypath.whitelist": ["/apps/", "/libs/", "/content/"],
"resource.resolver.manglenamespaces": true,
"resource.resolver.map.location": "$[env:ETC_MAP_PATH]",
"resource.resolver.log.unclosed": false,
"resource.resolver.allowDirect": true,
"resource.resolver.vanitypath.blacklist": ["/content/usergenerated"]
}

This refined approach ensures each cloud environment loads the ETC map based on the path specified in the Cloud Manager variable ETC_MAP_PATH, allowing for precise environment-specific resource mapping.

Friday, April 26, 2024

Revolutionizing Onsite Search: The Power of Generative AI | Generative Answering

 In this post, we will explore how Generative AI is transforming on-site search capabilities, enhancing user experiences, and streamlining access to information. We’ll also delve into some of the challenges that come with integrating this advanced technology into existing systems.

Introduction to Onsite Search

Onsite search refers to the functionality on websites that allows users to enter search queries and retrieve information or products within that particular website. This tool is crucial for enhancing user experience, especially on e-commerce platforms where finding products quickly and efficiently can significantly impact customer satisfaction and conversion rates. Effective onsite search can mimic the ease and precision of in-person shopping experiences by quickly guiding users to their desired products or information.

However, traditional search functionalities often face challenges such as dealing with synonyms, spelling errors, and understanding the intent behind a user’s query. These limitations can lead to irrelevant search results, a frustrating user experience, and potentially lost sales. Moreover, traditional searches might struggle with indexing and retrieving content in a way that aligns with how users think and search.

For a deeper dive into enhancing onsite search, see my previous articles: ‘Onsite Search: The Must-Have Features for a Seamless User Experience | by Albin Issac | Tech Learnings | Medium’ and ‘Selecting a Search Engine for on-site Search — Open Source vs Search as a Service | by Albin Issac | Medium’.

Overview of Generative AI

Generative AI refers to a subset of artificial intelligence technologies that can generate new content and ideas based on the data it has been trained on. This contrasts with other forms of AI that typically focus on classifying or predicting outcomes based on input data. Generative AI involves advanced machine learning techniques, including natural language processing (NLP) and neural networks, to produce outputs that are not explicitly programmed into the system.

Technologies behind generative AI, such as NLP, enable the AI to understand, interpret, and manipulate human language. Neural networks, particularly deep learning models, allow the AI to process large amounts of data, recognize patterns, and make decisions with minimal human intervention. These capabilities make generative AI particularly valuable for enhancing onsite search because it can understand and interpret the complexities and nuances of human language, predict what users are searching for even with vague or incomplete queries, and generate relevant results even from sparse data inputs.

By leveraging generative AI, websites can overcome many of the traditional challenges associated with onsite search, leading to a more intuitive and satisfying user experience. This is particularly beneficial in e-commerce settings, where the ability to quickly and accurately display products based on user queries can directly influence buying decisions and overall business success.

Search engines have long utilized AI technologies to enhance user experience and deliver more relevant content. Here are several AI-driven features that have been key in revolutionizing search functionalities:

  • Smart Snippets: Provide concise previews of content relevant to search queries, offering users quick insights into the content’s relevance without having to click through to the page.
  • Automatic Relevance Tuning (ART): This leverages user interaction data to dynamically adjust and refine the relevance of search results, ensuring that users find the most pertinent information based on collective user behaviors.
  • Query Suggestions: By analyzing aggregated search data, this feature suggests related queries in real-time as users type, helping them formulate more effective searches and discover content more efficiently.
  • Content Recommendations: Utilizing user behavior, previous search histories, and browsing patterns, AI suggests additional relevant content, potentially increasing engagement and time spent on a site.
  • Dynamic Navigation Experience: AI algorithms optimize and personalize navigation interfaces based on common user journeys and behaviors, making site navigation more intuitive and user-friendly.
  • Image and Video Recognition: Advanced visual recognition technologies enable search engines to index and retrieve images and videos based not just on metadata but on the visual content itself, enhancing the search capabilities for multimedia.
  • Document Search: AI enhances the ability to search within documents by recognizing and indexing text from various document formats. This technology enables users to perform deep content searches, finding specific information within large documents or across a collection of documents. AI-driven features like optical character recognition (OCR) and natural language understanding (NLU) are often employed to interpret the content accurately and return relevant results based on the content’s context rather than just keyword matches.
  • Sentiment Analysis: Employed by some search engines, this feature analyzes the emotional tone of content, useful for businesses monitoring brand reputation or media sites gauging public sentiment.
  • Personalization Engines: These tailor search results and content displays to individual users based on their unique preferences and past interactions, creating a more personalized and relevant browsing experience.

Generative Answering with AI in Search

Generative AI is transforming the landscape of search technology by introducing the advanced capability of generative answering. This innovative approach transcends the conventional mechanisms of search that primarily focus on retrieving data. Instead, it involves the understanding and generation of responses that directly address a user’s queries.

What is Generative Answering?

Generative answering is the capability of AI systems to construct responses from scratch, based on the extensive data they have been trained on. Utilizing models such as GPT (Generative Pre-trained Transformer) and BERT (Bidirectional Encoder Representations from Transformers), this technology grasps queries in a detailed and nuanced manner, delivering precise, context-aware answers.

Image from Coveo

Key Benefits of Generative Answering in Search

  • Direct Answers: Moving beyond traditional search engines that list documents or links, generative answering provides a direct and succinct response to queries, synthesizing information from multiple sources or extrapolating from existing data.
  • Natural Language Interaction: This capability fosters a more conversational interaction, allowing users to engage with the AI in a dialogue format where the system understands and responds to follow-up questions or requests for clarification.
  • Personalized Responses: The AI customizes responses based on individual user history or preferences, greatly enhancing the relevance and personalization of the search experience.
  • Efficiency in Information Discovery: By quickly delivering the specific information users seek, generative answering significantly reduces the time spent navigating through irrelevant results.

Expanded Applications of Generative AI in Onsite Search

  • Customer Support: Provides immediate, accurate answers to customer inquiries, streamlining customer service operations without the need for extensive human intervention.
  • E-commerce: Enhances product discovery and customer shopping experience by offering detailed product information and tailored recommendations directly in response to user queries.
  • Education: Supports an interactive learning environment by instantly responding to student queries with detailed explanations, promoting a more engaging educational experience.
  • Content-Based Sites: Significantly improves search and discovery on content-rich platforms such as news sites, blogs, and digital libraries. Generative AI can swiftly analyze large datasets, enabling it to deliver precise answers, recommend related content, and generate content summaries that aid users in efficiently navigating and consuming information.
  • AI-Powered Chatbots: These sophisticated chatbots utilize indexed data to conduct meaningful conversations, providing support and information directly through interactive dialogues.
  • Content Creation and Augmentation: AI not only helps in creating new content that aligns with current trends and user interests but also optimizes existing content to enhance its discoverability and engagement.

Generative answering, along with these applications, illustrates the profound impact Generative AI has on enhancing the functionality and user experience of search systems. As this technology continues to evolve, its integration into search platforms is expected to redefine the norms of how information is queried and retrieved, making searches more intuitive, interactive, and effective.

Most search engines now support Generative AI capabilities to enable generative answering and chatbots. These engines already have customer data indexed through the traditional approach, and they have now started supporting the Retrieval-Augmented Generation (RAG) approach to enable generative AI capabilities. The content indexing and retrieval process is extended with additional steps to support generative answering.

High-Level Steps in Indexing and Retrieval to Support Generative Answering:

1. Document Indexing and Embedding Generation: Before any search queries are processed, documents must be indexed. As part of this indexing process, each document is processed to generate embeddings. These embeddings, which are dense vector representations of the document’s content, capture the semantic essence of the text. Using AI models such as BERT or GPT, the search engine converts the text of each document into these embeddings and stores them in the search index. This allows for efficient retrieval based on semantic similarity rather than mere keyword matching.

2. Query Reception and Preprocessing: When a query is received, the first step is preprocessing. This involves normalizing the text by converting it to lowercase, removing punctuation, and correcting any typos. Such preprocessing simplifies the text to ensure more effective subsequent analysis.

3. Query Embedding Generation: The preprocessed query is then converted into an embedding using the same AI models used during document indexing. This ensures that the query can be semantically compared to the indexed documents.

4. Chunking and Contextual Analysis: If the query is complex or contains multiple elements, it is broken down into smaller chunks. This chunking isolates specific aspects of the query for more targeted processing. Contextual analysis is also performed, considering previously indexed information, user’s search history, or the specific session context.

5. Retrieval from Index: The query embedding is used to retrieve the most relevant documents from the search index. This is done using similarity metrics that compare the query’s embedding with the embeddings of documents stored in the index, ensuring the retrieval of content that matches the semantic intent of the query.

6. Generate Prompting: After relevant information is retrieved, the search engine constructs a detailed prompt for the generative model. This prompt includes the query, relevant context from the retrieved documents, and additional pertinent information that can guide the AI in generating an accurate response.

7. Send Prompt to Generative Model: This prompt is sent to a generative AI model, such as GPT-3 or GPT-4, which uses it to generate a response that directly answers the user’s query. The generative model synthesizes the information, producing coherent, context-aware text.

8. Answer Generation and Post-Processing: The generated answer undergoes post-processing to ensure it meets quality standards, refining it for clarity and appropriateness. If necessary, the prompt may be adjusted and the generation process rerun.

9. Delivery to User: The refined answer is then delivered to the user through an interface that may allow further interactions, such as query refinement, follow-up questions, or feedback. This feedback is crucial for continuous learning and system improvement.
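Steps 1 through 6 above can be sketched end to end in a few lines. This is a toy illustration only: a real system would use a trained embedding model (e.g. BERT) and a vector index, whereas the term-frequency “embedding” and the sample documents below are purely illustrative stand-ins.

```python
import math
import string

def preprocess(text):
    """Step 2: normalize the query -- lowercase, strip punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))

def embed(text):
    """Steps 1/3: stand-in embedding -- a term-frequency vector.
    A production system would call an embedding model here."""
    vec = {}
    for token in preprocess(text).split():
        vec[token] = vec.get(token, 0) + 1
    return vec

def cosine(a, b):
    """Similarity metric used for retrieval (step 5)."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Step 5: rank indexed documents by similarity to the query embedding."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, context_docs):
    """Step 6: combine the query with retrieved context for the generative model."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer the question using only this context:\n{context}\nQuestion: {query}"

docs = [
    "Free shipping is available on orders over 50 dollars.",
    "You can return an order within 30 days of delivery.",
    "Gift cards never expire and can be used online.",
]
top = retrieve("Can I return my order?", docs, k=1)
prompt = build_prompt("Can I return my order?", top)
# The prompt would then be sent to a generative model (steps 7-9).
```

The key design point the sketch shows is that the generative model never searches the index itself; retrieval narrows the corpus first, and the model only synthesizes an answer from the retrieved context.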

The integration of Generative AI into search engines represents a paradigm shift away from traditional keyword-centric approaches towards more dynamic, context-sensitive systems. This advancement enriches the search experience, providing users with direct answers, engaging conversations, and tailored content, thereby refining the information discovery process.

Most onsite search engines have now adopted Generative AI capabilities. The advantage here is that since data has already been indexed using traditional methods, implementing Generative AI is relatively straightforward, avoiding the need for extensive content reprocessing and the setup of Retrieval Augmented Generation (RAG) workflows. Furthermore, these search engines offer APIs that facilitate the retrieval of data for both traditional search results and generative answering, making it possible to display information through various interfaces like search pages, chatbots, and more.

A concern with this approach, however, is that many search engine systems offer limited flexibility in selecting embedding models, configuring chunking, or choosing the GPT model. They often rely on predefined proprietary or open-source models, configurations, and prompts. This can be problematic for those who require more control over data processing and the customization of prompts and models.

The capacity for Generative AI to deliver quick and direct answers to customers is an impressive feature that significantly enhances user interaction. Looking forward, there is an anticipation that search engines will begin to provide greater flexibility and options to meet custom requirements for generative answering, catering to a wider range of use cases and further innovating the field of search.

Refer to About Relevance Generative Answering (RGA) | Coveo Machine Learning to understand how Coveo supports Generative Answering.