Sunday, June 28, 2020

How to implement autocompletion and search suggestion in AEM through Lucene | Predictive Search in AEM | AEM Search Suggestions



This tutorial explains how to implement autocompletion and search suggestions in AEM through Lucene.

When you start typing in a search form, most applications help by suggesting data that matches your search term.

The purpose of autocomplete is to resolve a partial query, i.e., to search within a controlled vocabulary for items matching a given character string.

Starting with AEM 6.1, suggestions are available through the suggest module of Lucene. Prior to AEM 6.1, all possible combinations of the words had to be indexed to support autocompletion.

The Lucene suggest module provides a dedicated, optimized data structure that allows the engine to offer autocompletion and suggestions without indexing all possible combinations of a word.

A specific suggester (AnalyzingInfixSuggester) loads the completion values from the indexed data and then builds the optimized structure in memory for fast lookup.
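To make the idea concrete, here is a toy sketch in plain Java of a lookup that matches the typed text against the prefix of any word in the indexed values. It is only an illustration of the concept and is not how Lucene implements AnalyzingInfixSuggester; the class name ToyInfixSuggester and its methods are made up for this example, and the simple tokenization only handles ASCII words.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeMap;

// Toy illustration only: NOT Lucene's AnalyzingInfixSuggester.
final class ToyInfixSuggester {

    // Sorted map from lower-cased word to the indexed values that contain it.
    private final TreeMap<String, Set<String>> wordToValues = new TreeMap<>();

    // Index one value by each of its words.
    void add(String value) {
        for (String word : value.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                wordToValues.computeIfAbsent(word, k -> new LinkedHashSet<>()).add(value);
            }
        }
    }

    // Return every indexed value that has at least one word starting with the prefix.
    List<String> suggest(String prefix) {
        String from = prefix.toLowerCase(Locale.ROOT);
        if (from.isEmpty()) {
            return new ArrayList<>();
        }
        // All keys in [from, from + '\uffff') start with the prefix.
        String to = from + Character.MAX_VALUE;
        Set<String> hits = new LinkedHashSet<>();
        for (Set<String> values : wordToValues.subMap(from, true, to, false).values()) {
            hits.addAll(values);
        }
        return new ArrayList<>(hits);
    }
}
```

Because the words are kept in a sorted map, a prefix lookup is a single range scan; no list of all possible word fragments has to be stored.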

To implement autosuggestion, define an index of type lucene; for each property X of the nodes you are indexing, add the property useInSuggest to tell the engine to use X for suggesting queries to the user.

Refer to the following URL for details on enabling a custom index - https://www.albinsblog.com/2020/04/oak-lucene-index-improve-query-in-aem-configure-lucene-index.html

I have already enabled a custom Lucene index (testindex) for the property "id". Add the property "useInSuggest" with the value true to that property definition to tell the engine to use id for suggesting queries to the user.


An additional property, suggestUpdateFrequencyMinutes, defines how often the indexed suggestions are updated. This is useful to mitigate performance issues that can arise if indexed properties are frequently updated by the users of your application. The default value is 10 minutes, but the value can be modified as required.

To set the property "suggestUpdateFrequencyMinutes", create a node named "suggest" of type "nt:unstructured" under "testindex", add the property to it, and set the value as required.


To use a Lucene index for search suggestions, the index definition node (the one of type oak:QueryIndexDefinition) needs to have compatVersion set to 2.


Let us now execute a query to find the suggestions; either of the queries below can be used.

The testindex was defined for the content path "/content/sampledata", so both queries are served by "testindex"; the first query names the index explicitly.


SELECT [rep:suggest()] FROM [nt:unstructured] WHERE SUGGEST('te') OPTION(INDEX NAME [testindex]) /* oak-internal */ 

SELECT [rep:suggest()] FROM [nt:unstructured] WHERE SUGGEST('te') AND ISDESCENDANTNODE('/content/sampledata')

The second query uses a path restriction to filter the data; this requires the evaluatePathRestrictions property to be set to true on the index definition.
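In a real search box the term comes from the user rather than being fixed to 'te'. The sketch below shows one way to build the path-restricted query for an arbitrary term. SuggestQueryBuilder is a hypothetical helper written for this tutorial, not an AEM or Oak API; it relies on the JCR-SQL2 rule that a single quote inside a string literal is escaped by doubling it.

```java
// Hypothetical helper (not part of AEM or Oak): builds the suggest query for a user-supplied term.
final class SuggestQueryBuilder {

    private SuggestQueryBuilder() {
    }

    // JCR-SQL2 string literals escape a single quote by doubling it.
    static String escape(String value) {
        return value.replace("'", "''");
    }

    // Same shape as the second query above, with the term and path filled in.
    static String forPath(String term, String path) {
        return "SELECT [rep:suggest()] FROM [nt:unstructured] WHERE SUGGEST('" + escape(term)
                + "') AND ISDESCENDANTNODE('" + escape(path) + "')";
    }
}
```

For example, SuggestQueryBuilder.forPath("te", "/content/sampledata") produces the second query shown above, and the resulting string can be passed to QueryManager.createQuery with Query.JCR_SQL2.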



The Query tool shows the total number of unique suggestions matching the search term, but it won't display the matching node details.


The servlet below can be used to fetch the suggestion data through the QueryManager API:
import java.io.IOException;

import javax.jcr.RepositoryException;
import javax.jcr.Session;
import javax.jcr.query.InvalidQueryException;
import javax.jcr.query.Query;
import javax.jcr.query.QueryManager;
import javax.jcr.query.QueryResult;
import javax.jcr.query.Row;
import javax.jcr.query.RowIterator;
import javax.servlet.Servlet;
import javax.servlet.ServletException;

import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.servlets.HttpConstants;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
import org.json.JSONArray;
import org.osgi.framework.Constants;
import org.osgi.service.component.annotations.Component;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@Component(immediate = true, service = Servlet.class, property = {
		Constants.SERVICE_DESCRIPTION + "=Custom Root Mapping", "sling.servlet.methods=" + HttpConstants.METHOD_GET,
		"sling.servlet.paths=" + "/bin/getSuggestions", "service.ranking=" + 100001 })
public class SuggestionData extends SlingSafeMethodsServlet {

	private static final long serialVersionUID = 1L;

	private static final Logger LOGGER = LoggerFactory.getLogger(SuggestionData.class);

	// The search term is hard-coded to 'te' for the demo
	private static final String QUERY_STRING = "SELECT [rep:suggest()] FROM [nt:unstructured] WHERE "
			+ "SUGGEST('te') OPTION(INDEX NAME [testindex]) /* oak-internal */";

	@Override
	protected void doGet(final SlingHttpServletRequest req, final SlingHttpServletResponse resp)
			throws ServletException, IOException {

		LOGGER.debug("Fetching search suggestions");

		final JSONArray suggestions = new JSONArray();

		// The session belongs to the request's resource resolver and is closed by Sling,
		// so it must not be logged out here
		final Session session = req.getResourceResolver().adaptTo(Session.class);

		if (session != null) {
			try {
				QueryManager queryManager = session.getWorkspace().getQueryManager();
				Query query = queryManager.createQuery(QUERY_STRING, Query.JCR_SQL2);
				QueryResult result = query.execute();
				RowIterator rows = result.getRows();

				// Each row carries one suggestion in the rep:suggest() column
				while (rows.hasNext()) {
					Row row = rows.nextRow();
					suggestions.put(row.getValue("rep:suggest()").getString());
				}

			} catch (InvalidQueryException e) {
				LOGGER.error("Invalid suggestion query: {}", QUERY_STRING, e);
			} catch (RepositoryException e) {
				LOGGER.error("Error while fetching suggestions", e);
			}
		}

		resp.setContentType("application/json");
		resp.getWriter().write(suggestions.toString());

	}

}



This suggestion data can be used to display search autocompletion/suggestions to the website users.





Sunday, June 21, 2020

Social Login with Google OAuth2— Adobe Experience Manager (AEM)



Social login is the ability to let a site visitor sign in with a social account such as Facebook, Twitter, or LinkedIn. AEM supports Facebook and Twitter social logins OOTB, but Google login is not supported OOTB; a custom provider must be built to support the login flow for websites.

AEM internally uses the scribejava module to support the social login flows; scribejava supports multiple providers and both the OAuth 1.0 and OAuth 2.0 protocols.

This tutorial explains the steps and customization required to support Google login in AEM as a Cloud Service; the same approach should work with minimal changes for other AEM versions.

Prerequisites

  • Google Account
  • AEM as a Cloud Service Publisher
  • WKND Sample Website
  • Git Terminal
  • Maven

Google Login Flow


AEM Login URL


http://localhost:4503/j_security_check?configid=google

Auth Page URL


https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=%s&redirect_uri=%s&scope=%s

Access Token URL(POST)


https://oauth2.googleapis.com/token?grant_type=authorization_code&client_id=&client_secret=&code=<authorization code received>&redirect_uri=
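The %s placeholders in the Auth Page URL have to be filled with URL-encoded values. The sketch below shows how the two requests could be assembled in plain Java. GoogleOAuthUrls is a hypothetical helper for illustration only and is not part of the provider bundle (which delegates this work to scribejava); the token parameters are written as a form-encoded body, which is how OAuth 2.0 token requests are normally sent with POST.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; the provider bundle itself relies on scribejava.
final class GoogleOAuthUrls {

    // Template taken from the Auth Page URL above.
    static final String AUTH_URL_TEMPLATE =
            "https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=%s&redirect_uri=%s&scope=%s";

    private GoogleOAuthUrls() {
    }

    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // URL the browser is redirected to so the user can sign in with Google.
    static String authorizationUrl(String clientId, String redirectUri, String scope) {
        return String.format(AUTH_URL_TEMPLATE, enc(clientId), enc(redirectUri), enc(scope));
    }

    // Form-encoded body for the POST that exchanges the authorization code for an access token.
    static String tokenRequestBody(String clientId, String clientSecret, String code, String redirectUri) {
        return "grant_type=authorization_code"
                + "&client_id=" + enc(clientId)
                + "&client_secret=" + enc(clientSecret)
                + "&code=" + enc(code)
                + "&redirect_uri=" + enc(redirectUri);
    }
}
```

The redirect URI passed to both methods must be the same value that is registered as the authorized redirect URI in the Google OAuth client.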



Steps

  • Create Project in Google Developers Console
  • Setup Custom Google OAuth Provider
  • Configure Service User
  • Configure OAuth Application and Provider
  • Enable OAuth Authentication
  • Test the Login Flow
  • Encapsulated Token Support
  • Sling Distribution user’s synchronization


Create Project in Google Developers Console


As a first step, create a new project in Google to set up the OAuth client; access https://console.developers.google.com/cloud-resource-manager

Click on “Create Project”

Enter a project name and click on Create

The project is created now

Let us now configure the OAuth client; access the settings

Search for “APIs & Services” and click on “APIs & Services”

Click on Credentials

Click on “Create Credentials”, then OAuth client ID

The consent screen must be configured before the OAuth client ID can be created

Select the User Type “Internal” or “External” based on your requirement; “Internal” is only available for G-Suite users

Enter the application name, application logo, and support email

The scopes “email”, “profile” and “openid” are added by default; the “profile” scope is enough for basic authentication.

Save the configurations now

Now click again on “Create Credentials”, then OAuth client ID

Select the application type “Web Application” and enter the application name

“Authorized JavaScript origins”: the URL that initiates the login; I am going with localhost for the demo

“Authorized redirect URIs”: the URL to be invoked on successful login, http://localhost:4503/callback/j_security_check (use a valid domain for real authentication)

Click on the Create button

The OAuth client is created now; copy the Client ID and Client Secret, as these values are required to enable the OAuth Authentication Handler in AEM.

To use the client in production, the OAuth consent screen should be submitted for approval.

Click on “Configure Consent Screen” again

Enter the required values (“Authorized domains”, “Application Home Page Link”, “Application Privacy Policy Link”) and submit for approval

The approval may take days or weeks; meanwhile the project can be used for development

The Google project is now ready to be used to test the login flow

Configure Service User


Enable a service user with the permissions required to manage users in the system. You can use one of the existing service users with the required access; I chose to define a new service user (oauth-google-service, the name referred to in GoogleOAuth2ProviderImpl.java; change the name if required).

Create a system user with the name oauth-google-service: navigate to http://localhost:4503/crx/explorer/index.jsp, log in as an admin user, and click on User Administration

Now enable the required permissions for the user: navigate to http://localhost:4503/useradmin (somehow I am still comfortable with the useradmin UI for permission management)

Now enable the service user mapping for the provider bundle: add the entry google.oauth.provider:oauth-google-service=oauth-google-service to the Apache Sling Service User Mapper Service Amendment

Setup Custom Google OAuth Provider


As mentioned earlier, AEM does not support Google authentication OOTB, so a new provider must be defined to support authentication with Google.

The custom Google Provider can be downloaded from — https://github.com/techforum-repo/bundles/tree/master/google-oauth-provider

GoogleOAuth2ProviderImpl.java — Provider class to support the Google authentication

GoogleOAuth2Api.java — API class extended from default scribe DefaultApi20 to support Google OAuth 2.0 API integration

GoogleOauth2ServiceImpl.java — Service class to get the Access Token from Google service response

The provider bundle is built against the aem-sdk-api jar for AEM as a Cloud Service; other AEM versions can use the same bundle by changing aem-sdk-api to the uber jar.

Clone the repository — git clone https://github.com/techforum-repo/bundles.git

Deploy google-oauth-provider bundle — change the directory to bundles\google-oauth-provider and execute mvn clean install -PautoInstallBundle -Daem.port=4503

Here I am going to enable the authentication for publisher websites, change the port number and deploy to Author if required.

After a successful deployment, you should be able to see the Google provider in the config manager.

The oauth.provider.id can be changed, but the same value should be used while configuring “Adobe Granite OAuth Application and Provider”.

Configure OAuth Application and Provider


Let us now enable the “Adobe Granite OAuth Application and Provider” for Google

Config ID - Enter a unique value; this value should be used while invoking the AEM login URL
Client ID - Copy the Client ID value from the Google OAuth client
Client Secret - Copy the Client Secret value from the Google OAuth client
Scope - “profile”
Provider ID - google
Create users - Select the check box to create AEM users for Google profiles
Callback URL - The same value configured in the Google OAuth client (http://localhost:4503/callback/j_security_check)

Enable OAuth Authentication


The “Adobe Granite OAuth Authentication Handler” is not enabled by default; it can be enabled by opening its configuration and saving it without making any changes.



Test the Login Flow


Now that the configurations are ready, let us initiate the login: access http://localhost:4503/j_security_check?configid=google from a browser (in a real scenario you can provide a link or button pointing to this URL). This takes the user to the Google sign-in screen

After a successful login on the Google sign-in page, you will be logged in to the WKND website

The user profile is created in AEM

Whenever profile data (e.g. family_name and given_name) changes in the Google account, the change is reflected in AEM on a subsequent login, based on the “Apache Jackrabbit Oak Default Sync Handler” configuration.

AEM creates an “Apache Jackrabbit Oak Default Sync Handler” configuration specific to each OAuth provider implementation.

The sync handler syncs the user profile data between the external authentication system and the AEM repository.

The user profile data is synced based on the User Expiration Time setting: the data is synced on the first login after the previously synced user data has expired (default is 1 hr).

Modify the configurations based on the requirement.


Encapsulated Token Support


By default, the authentication token is persisted in the repository under the user’s profile, which means the authentication mechanism is stateful. Encapsulated Token is the way to configure stateless authentication: it ensures that the cookie can be validated without accessing the repository, but the user must still be available on all publishers in a farm configuration.
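To illustrate what "validated without accessing the repository" means, here is a conceptual sketch of a self-contained signed token in plain Java. It is not the token format AEM uses; the class StatelessTokenSketch, the payload layout, and the choice of HmacSHA256 are assumptions made for this example. The point is that any publisher holding the same key can verify the token on its own.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Conceptual sketch only: this is NOT the token format AEM uses.
final class StatelessTokenSketch {

    // Must be identical on every publisher that validates tokens.
    private final byte[] sharedKey;

    StatelessTokenSketch(byte[] sharedKey) {
        this.sharedKey = sharedKey.clone();
    }

    // Token = base64url(payload) + "." + base64url(HMAC(payload)).
    String issue(String userId, long expiresAtEpochSeconds) {
        String payload = userId + "|" + expiresAtEpochSeconds;
        Base64.Encoder encoder = Base64.getUrlEncoder().withoutPadding();
        return encoder.encodeToString(payload.getBytes(StandardCharsets.UTF_8))
                + "." + encoder.encodeToString(sign(payload));
    }

    // Validation needs only the key and the clock, no repository lookup.
    boolean isValid(String token, long nowEpochSeconds) {
        String[] parts = token.split("\\.");
        if (parts.length != 2) {
            return false;
        }
        try {
            Base64.Decoder decoder = Base64.getUrlDecoder();
            String payload = new String(decoder.decode(parts[0]), StandardCharsets.UTF_8);
            // Constant-time comparison of the expected and the presented signature.
            if (!MessageDigest.isEqual(sign(payload), decoder.decode(parts[1]))) {
                return false;
            }
            long expiresAt = Long.parseLong(payload.substring(payload.lastIndexOf('|') + 1));
            return nowEpochSeconds < expiresAt;
        } catch (IllegalArgumentException e) { // malformed Base64 or expiry value
            return false;
        }
    }

    private byte[] sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(sharedKey, "HmacSHA256"));
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }
}
```

A token issued on one publisher validates on another only if both hold the same key, which is why a shared key across the farm matters for stateless authentication.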

Refer to https://docs.adobe.com/content/help/en/experience-manager-65/administering/security/encapsulated-token.html#StatelessAuthenticationwiththeEncapsulatedToken for more details on Encapsulated Token support

Enable Encapsulated Token support in the “Adobe Granite Token Authentication Handler”

Sling Distribution user’s synchronization


Users created on one publisher should be synced to all other publishers in the farm to support seamless authentication. I have not found a good reference document explaining user sync in AEM as a Cloud Service (the AEM Communities features are not enabled in AEM as a Cloud Service; in other AEM versions user sync was enabled through the Communities components), so I plan to cover user sync in another tutorial.

Conclusion


This tutorial mainly focused on authenticating website users through a Google account, but the same solution can be used with small changes to support other providers. Feel free to share your feedback and changes on the provider bundle.


Saturday, June 13, 2020

Sling Content Distribution in AEM (Part 2) — Reverse Distribution

This tutorial is the continuation of the earlier tutorial on Sling Content Distribution in AEM; refer to the following URL for the part 1 tutorial - https://www.albinsblog.com/2020/05/sling-content-distribution-sync-use.html


In this tutorial, let us look at Sling reverse distribution in AEM.

REVERSE DISTRIBUTION - DEFINITION

  • A reverse distribution setup allows one to transfer content from a farm of source instances (publishers) to a target instance (author).
  • That is done by pulling the content from the source instances (publishers) into the target instance (author).

This helps us sync the data generated on a farm of publishers into the author instance.

REVERSE DISTRIBUTION - CONFIGURATIONS


  • Configure a “queue” agent and a package exporter on the publisher (source instance)
org.apache.sling.distribution.agent.impl.QueueDistributionAgentFactory-reverse.json
    name="reverse"

org.apache.sling.distribution.packaging.impl.exporter.AgentDistributionPackageExporterFactory-reverse
    name="reverse"
    agent.target="(name=reverse)"
  • Configure a “reverse” agent on the author (target instance) pointing to the URL of the exporter on publish; multiple publisher endpoints can be configured
org.apache.sling.distribution.agent.impl.ReverseDistributionAgentFactory-reverse.json
    name="reverse"
    packageExporter.endpoints=["http://localhost:4503/libs/sling/distribution/services/exporters/reverse"]

REVERSE DISTRIBUTION - DEMO

  • Configure Reverse Agent in Author
  • Configure Queue agent  and exporter on Publisher
  • Enable Triggers – Scheduled/JCREvent
  • Test – CURL/Triggers

Configure Reverse Agent in Author


Configure a Reverse Agent in Author that will pull distribution content from the publisher endpoints based on the configuration.

Access http://localhost:4502/aem/start.html, Tools - Deployments - Distribution




Create a new Distribution agent of type - Reverse Distribution

Enter a name - "reverse"
Title - "reverse"
Check "Enabled"
Service Name - Service name is optional; if required, create a service user with the required permissions
Change the log level if required
Add exporter endpoint URLs - the URLs point to the publishers; multiple endpoint URLs can be configured, e.g. http://localhost:4503/libs/sling/distribution/services/exporters/reverse (reverse is the name of the Queue Distribution agent on the publisher)



Save the configurations; the agent is created now, but its status is paused

Resume the agent by clicking on the resume button on the agent detail page.

The agent is now ready to Pull the distribution content from publisher.

Configure Queue agent and exporter on Publisher

Configure a queue agent that places the changes into the queues and an exporter that exports packages from the queue agent.

Access http://localhost:4503/aem/start.html, Tools - Deployments - Distribution

Create a new Distribution agent of type - Queue
Enter a name - "reverse"
Title - "reverse"
Check "Enabled"
Service Name - Service name is optional; if required, create a service user with the required permissions
Change the log level if required
Allowed Roots - Add the root paths the agent is responsible for distributing, e.g. /content/we-retail (multiple root paths can be configured if required)

Save the configurations; the Queue Distribution Agent is enabled now


Let us now configure the exporter


Enter name - "reverse"
The target reference for the DistributionAgent that will be used to export packages - "(name=reverse)"; here "reverse" is the name of the queue agent configured in the previous step


Now the initial configurations are ready; let us test the reverse distribution scenario through curl commands

Modify some content under the /content/we-retail node on the publisher

Execute the curl commands below.

Add the modified content to the publisher distribution queue:

curl -u admin:admin http://localhost:4503/libs/sling/distribution/services/agents/reverse -d "action=ADD" -d "path=/content/we-retail/jcr:content"

Now the content is queued in the publisher distribution queue.

Pull the content from the publisher queue to author:

curl -u admin:admin http://localhost:4502/libs/sling/distribution/services/agents/reverse -d "action=PULL"

 Now the content is pulled to author
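If you would rather trigger the same actions from Java than from curl, the sketch below builds equivalent HTTP requests with the JDK's java.net.http API (Java 11+). DistributionRequests is a hypothetical helper written for this tutorial; the URLs, the admin:admin credentials, and the action/path parameters are the ones used in the curl commands above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds the same POST requests as the curl commands above.
final class DistributionRequests {

    private DistributionRequests() {
    }

    // Encode the parameters the way an HTML form (or curl -d) submits them.
    static String formEncode(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joiner.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    // Build (but do not send) a POST with basic authentication, like curl -u user:password -d ...
    static HttpRequest post(String agentUrl, String user, String password, Map<String, String> params) {
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(agentUrl))
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(params)))
                .build();
    }
}
```

A request built this way can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Use a LinkedHashMap for the parameters if their order should match the curl command.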


Let us now see how to automate the reverse distribution through triggers

Configure a JCR Event Trigger in Publisher


Configure a JCR Event Trigger in Publisher to add the JCR changes under the configured path to the Distribution queue

Access http://localhost:4503/system/console/configMgr/org.apache.sling.distribution.trigger.impl.JcrEventDistributionTriggerFactory

Enter name - "reverse-sync"
Path for which the changes are distributed - "/content/we-retail"
Service Name - Enter a service name with the required access; I am using the default one for the demo (socialpubsync-distributionService). The trigger will not be activated without configuring the service user
Use deep distribution - Enable this if you want to distribute the subtree of the configured node on any event

Now link the trigger to the "Apache Sling Distribution Agent - Queue Agents Factory" configuration created with the name "reverse" in the earlier step: Triggers - (name=reverse-sync)

Configure a Scheduled Event Trigger in Author

Configure a Scheduled Event Trigger in Author to pull the content from the publishers' distribution queues


Enter name - "reverse-sync"
Distribution Type - "PULL"
Distributed Path, the path to be distributed periodically - "/content/we-retail"
Service Name - Enter a service name with the required access; I am using the default one for the demo (socialpubsync-distributionService). The trigger will not be activated without configuring the service user
Interval in Seconds - The number of seconds between distribution requests. Default 30 seconds

Now link the trigger to the "Apache Sling Distribution Agent - Reverse Agents Factory" configuration created with the name "reverse" in the earlier step: Triggers - (name=reverse-sync)


Now content modifications made on the publisher under the /content/we-retail node will be synced to author every 30 seconds


This concludes the reverse distribution configuration between publisher and author: the content changes from the publisher are pulled to author. We can configure multiple publisher endpoints in the author's reverse distribution agent to pull content changes from each of them. Triggers can be configured on author and publishers to completely automate the reverse distribution of content. Let us continue with distribution sync in the next tutorial.