Thursday, December 6, 2018

How to use Attribute Loader in Adobe Search and Promote



This tutorial explains how to use the Attribute Loader in Adobe Search and Promote.

The Attribute Loader lets us attach additional metadata to the URLs crawled from a website.

For example, PDF documents crawled from the website carry no additional metadata of their own, but that metadata can be loaded through the Attribute Loader.

When a PDF document is crawled from the website, only the PDF URL and file name are available; details such as title and description are not. This additional metadata can be provided via the Attribute Loader.

The values are merged during indexing, using the primary key value to match records.

PDF URL crawled from the website: https://www.example.com/test/Albin.pdf

Additional metadata:

URL: https://www.example.com/test/Albin.pdf (primary key)
Title: test PDF
Description: test PDF

The Attribute Loader is executed before the actual indexing, and its metadata values are merged into the documents during indexing, based on the primary key.
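The merge can be pictured as a join on the primary key. The following is a minimal Python sketch of that idea, not Adobe's implementation: the dictionaries, the `filename` field and the rule that crawled values win on conflict are all assumptions made for illustration.

```python
# Hypothetical in-memory stand-ins for the crawl results and the loader feed.
crawled_documents = [
    {"url": "https://www.example.com/test/Albin.pdf", "filename": "Albin.pdf"},
    {"url": "https://www.example.com/home.html", "filename": "home.html"},
]

loader_records = [
    {
        "url": "https://www.example.com/test/Albin.pdf",
        "title": "test PDF",
        "description": "test PDF",
    },
]


def merge_by_primary_key(documents, records, key="url"):
    """Return copies of the documents enriched with the record sharing the key."""
    records_by_key = {record[key]: record for record in records}
    merged = []
    for document in documents:
        extra = records_by_key.get(document[key], {})
        # Crawled values win on conflict; this precedence is an assumption.
        merged.append({**extra, **document})
    return merged


merged_documents = merge_by_primary_key(crawled_documents, loader_records)
for document in merged_documents:
    print(document)
```

Documents without a matching record, such as home.html here, pass through unchanged.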

The Attribute Loader option is available under Settings → Metadata → Attribute Loader.

Add a new Attribute Loader definition.

Below is a sample feed XML with the additional metadata. The feed must be reachable through one of the following channels: HTTP(S), FTP, SFTP, or File.

The metadata for each PDF document is represented by an Item tag in the XML data.

<attributes xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
<channel>
<title>Attribute Loader Feed</title>
<Item>
<title>test PDF1</title>
<desc>test PDF1</desc>
<url>https://www.example.com/test/Albin1.pdf</url>
</Item>
<Item>
<title>test PDF2</title>
<desc>test PDF2</desc>
<url>https://www.example.com/test/Albin2.pdf</url>
</Item>
</channel>
</attributes>
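Before pointing the Attribute Loader at a feed, it can help to check locally that the file is well-formed and that every Item carries the expected fields. Here is a small Python sketch using the standard library; the function name `read_items` is made up for this example, and the feed is embedded as a string only to keep the sketch self-contained.

```python
import xml.etree.ElementTree as ET

# The sample feed from above, embedded as a string for the sake of the example.
FEED_XML = """\
<attributes xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
  <channel>
    <title>Attribute Loader Feed</title>
    <Item>
      <title>test PDF1</title>
      <desc>test PDF1</desc>
      <url>https://www.example.com/test/Albin1.pdf</url>
    </Item>
    <Item>
      <title>test PDF2</title>
      <desc>test PDF2</desc>
      <url>https://www.example.com/test/Albin2.pdf</url>
    </Item>
  </channel>
</attributes>
"""


def read_items(feed_xml, item_tag="Item"):
    """Return one dict per item tag, keyed by the child tag names."""
    root = ET.fromstring(feed_xml)
    return [
        {child.tag: (child.text or "").strip() for child in item}
        for item in root.iter(item_tag)
    ]


items = read_items(FEED_XML)
for item in items:
    print(item["url"], "->", item["title"])
```

A parse error from `ET.fromstring` at this stage usually points to a malformed feed, for example typographic quotes around attribute values.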

Enter the feed details in the Attribute Loader definition.

Map the feed data fields to the Adobe S&P metadata definitions, and specify a primary key that links the Attribute Loader data to the data crawled from other channels, e.g. the website. Here the PDF URL serves as the primary key, because the URL is available both from the website crawl and from the Attribute Loader feed.

The Attribute Loader data is merged into the document during indexing, based on the primary key.
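Conceptually, the mapping step renames feed fields to metadata fields and insists that the primary key is present. The sketch below shows that idea in Python; the mapping table, the metadata field name `description` and the function `map_record` are hypothetical and only illustrate the configuration made in the UI.

```python
# Hypothetical mapping from feed tag names to metadata field names.
FIELD_MAPPING = {"title": "title", "desc": "description", "url": "url"}
PRIMARY_KEY = "url"


def map_record(feed_record, mapping=FIELD_MAPPING, primary_key=PRIMARY_KEY):
    """Rename feed fields to metadata fields; unmapped feed fields are dropped."""
    mapped = {
        metadata_field: feed_record[feed_field]
        for feed_field, metadata_field in mapping.items()
        if feed_field in feed_record
    }
    # Without a primary key the record cannot be matched to a crawled document.
    if not mapped.get(primary_key):
        raise ValueError(f"record has no primary key {primary_key!r}: {feed_record}")
    return mapped


record = map_record(
    {
        "title": "test PDF1",
        "desc": "test PDF1",
        "url": "https://www.example.com/test/Albin1.pdf",
    }
)
print(record)
```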

The Attribute Loader data can be previewed after configuration. To preview it, click Load Attribute Loader Data and then Start Load.

Clicking Start Load shows a preview of the data loaded from the feed.

Make sure the Content Types for the required document types (e.g. application/pdf) are selected. The crawler only crawls documents from the website whose Content Types are enabled.

The Content Types can be enabled from Settings → Crawling → Content Types.
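To get a feel for which URLs an enabled content type would let through, here is a rough Python sketch. It guesses the type from the file extension with the standard `mimetypes` module, which is an assumption made for simplicity; a real crawler would more likely rely on the Content-Type reported by the server. The set of enabled types is hypothetical.

```python
import mimetypes

# Hypothetical selection of enabled content types.
ENABLED_CONTENT_TYPES = {"text/html", "application/pdf"}


def is_crawlable(url, enabled_types=ENABLED_CONTENT_TYPES):
    """Guess the content type from the URL's extension and check it is enabled."""
    guessed_type, _encoding = mimetypes.guess_type(url)
    return guessed_type in enabled_types


for url in (
    "https://www.example.com/home.html",
    "https://www.example.com/test/Albin1.pdf",
    "https://www.example.com/test/archive.zip",
):
    print(url, is_crawlable(url))
```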

Configure the URL entrypoint (the website URL from which the documents should be crawled) and the URL mask (the URL pattern that determines which URLs are considered for crawling).

The URL entrypoint can be configured from Settings → Crawling → URL Entrypoints.

The URL mask can be configured from Settings → Crawling → URL Masks.
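The effect of a URL mask can be approximated as a filter over candidate URLs. The Python sketch below treats each mask as a plain URL prefix, which is a simplifying assumption; the actual mask syntax in Search and Promote may offer more than this, and the mask values shown are hypothetical.

```python
# Hypothetical mask list: URL prefixes that should be considered for crawling.
URL_MASKS = [
    "https://www.example.com/home.html",
    "https://www.example.com/test/",
]


def matches_mask(url, masks=URL_MASKS):
    """Return True when the URL starts with at least one of the mask prefixes."""
    return any(url.startswith(mask) for mask in masks)


candidates = [
    "https://www.example.com/test/Albin1.pdf",
    "https://www.example.com/private/report.pdf",
]
to_crawl = [url for url in candidates if matches_mask(url)]
print(to_crawl)
```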

Sample page: https://www.example.com/home.html

<html>
<body>
<a href="https://www.example.com/test/Albin1.pdf">test1 pdf</a>
<a href="https://www.example.com/test/Albin2.pdf">test2 pdf</a>
</body>
</html>
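To check which PDF links a crawler could discover on such a page, the links can be extracted locally. This is a small Python sketch with the standard `html.parser` module; the class name `LinkCollector` is made up for the example, and it says nothing about how the Search and Promote crawler itself parses pages.

```python
from html.parser import HTMLParser

SAMPLE_PAGE = """\
<html>
<body>
<a href="https://www.example.com/test/Albin1.pdf">test1 pdf</a>
<a href="https://www.example.com/test/Albin2.pdf">test2 pdf</a>
</body>
</html>
"""


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag that is fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


collector = LinkCollector()
collector.feed(SAMPLE_PAGE)
pdf_links = [link for link in collector.links if link.lower().endswith(".pdf")]
print(pdf_links)
```

These are the URLs that need a matching record in the Attribute Loader feed for the extra metadata to be merged.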

Run the live index once the website URL entrypoint that references the PDF documents is configured. The search results now display the metadata provided by the Attribute Loader for the PDF documents.

The Attribute Loader is not enabled by default; it must be enabled in the S&P account by your Adobe account representative or by Adobe Support.

The Attribute Loader makes it possible to provide additional metadata for documents crawled through a channel that exposes only limited data.

Tuesday, December 4, 2018

How to use Index Connector in Adobe Search and Promote


This tutorial explains how to use the Index Connector in Adobe Search and Promote.


Index Connector


The Index Connector enables us to define additional input sources for indexing XML pages or any kind of feed.

Search and Promote allows us to add website URLs as entry points for crawling pages for indexing; URLs can also be crawled and indexed through the Index Connector. URL entry points and Index Connectors can be defined together for crawling and indexing.

For example, the Index Connector can be used to index a large volume of product data from eCommerce systems, reducing the crawling and indexing time. The Index Connector approach provides better crawling and indexing performance.

An XML data feed consists of records, each corresponding to an individual document that can be added to the index.

A text data feed contains new-line-delimited records, each corresponding to an individual document that can be added to the index.

Mappings can be defined to map the feed data to the metadata fields in the resulting index.

Multiple protocols can be used to connect to the input feed sources from the Index Connector: HTTP(S), FTP, SFTP, and FILE.
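As an illustration of the text feed variant, the Python sketch below reads new-line-delimited records into dictionaries. The tab delimiter and the header row naming the fields are assumptions made for this example; the post only states that records are separated by new lines, and the field names are borrowed from the product feed used later on.

```python
import csv
import io

# Hypothetical text feed: one record per line, tab-separated, with a header row.
TEXT_FEED = (
    "ProductId\ttitle\tproductType\n"
    "123\tproduct-title\tResearch\n"
    "1234\tproduct-title\tResearch\n"
)


def read_text_feed(feed_text, delimiter="\t"):
    """Return one dict per line, keyed by the names found in the header row."""
    reader = csv.DictReader(io.StringIO(feed_text), delimiter=delimiter)
    return [dict(row) for row in reader]


text_records = read_text_feed(TEXT_FEED)
for text_record in text_records:
    print(text_record)
```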



The Index Connector is not enabled by default in an S&P account; it must be enabled by the Adobe S&P account team.

Define Index Connector



After the Index Connector has been enabled for the account, it can be accessed from Settings → Crawling → Index Connector.




As a first step, add an Index Connector.



Sample product feed file (XML)

<feed xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
<channel>
<title>Product Feed</title>
<Item>
<title>
<![CDATA[product-title]]>
</title>
<pubDate>05/09/2011</pubDate>
<pubYear>2011</pubYear>
<description>
<![CDATA[<p>product description</p>]]>
</description>
<productType>Research</productType>
<category>
<![CDATA[Financial Planning|Financial Planners|Research]]>
</category>
<ProductId>123</ProductId>
<imageUrl>/content/dam/Images/product/123.jpg</imageUrl>
</Item>
<Item>
<title>
<![CDATA[product-title]]>
</title>
<pubDate>05/09/2011</pubDate>
<pubYear>2011</pubYear>
<description>
<![CDATA[<p>product description</p>]]>
</description>
<productType>Research</productType>
<category>
<![CDATA[Financial Planning|Financial Planners|Research]]>
</category>
<ProductId>1234</ProductId>
<imageUrl>/content/dam/Images/product/1234.jpg</imageUrl>
</Item>
<Item>
<link>https://www.example.com/product-title/p/12345</link>
<title>
<![CDATA[product-title]]>
</title>
<pubDate>05/09/2011</pubDate>
<pubYear>2011</pubYear>
<description>
<![CDATA[<p>product description</p>]]>
</description>
<productType>Research</productType>
<category>
<![CDATA[Financial Planning|Financial Planners|Research]]>
</category>
<ProductId>12345</ProductId>
<imageUrl>/content/dam/Images/product/12345.jpg</imageUrl>
</Item>
</channel>
</feed>
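The product feed wraps several values in CDATA sections. A quick local parse shows what a consumer of the feed sees once the CDATA wrappers and surrounding whitespace are removed. This Python sketch uses a shortened, single-item copy of the feed; treating the pipe character in category as a separator between multiple values is an assumption, and `read_products` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample product feed, with a single Item.
PRODUCT_FEED = """\
<feed xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
  <channel>
    <title>Product Feed</title>
    <Item>
      <title>
        <![CDATA[product-title]]>
      </title>
      <description>
        <![CDATA[<p>product description</p>]]>
      </description>
      <category>
        <![CDATA[Financial Planning|Financial Planners|Research]]>
      </category>
      <ProductId>123</ProductId>
      <imageUrl>/content/dam/Images/product/123.jpg</imageUrl>
    </Item>
  </channel>
</feed>
"""


def read_products(feed_xml):
    """Return one dict per Item; CDATA content arrives as ordinary text."""
    products = []
    for item in ET.fromstring(feed_xml).iter("Item"):
        product = {child.tag: (child.text or "").strip() for child in item}
        # Splitting on "|" assumes the pipe separates multiple category values.
        raw_category = product.get("category", "")
        product["category"] = [part for part in raw_category.split("|") if part]
        products.append(product)
    return products


products = read_products(PRODUCT_FEED)
print(products[0]["ProductId"], products[0]["category"])
```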

The feed file is available through an HTTP(S) URL, www.example.com/product/feed.xml. The Index Connector can also access the feed through the FTP, SFTP, and FILE protocols.

Enter a name for the Index Connector

Select Type as Feed

Select Enabled

Configure Host Address and File Path

Select the appropriate Protocol

Configure the Timeout and Retries as required

Set Itemtag to the tag that represents an individual record (Item in the sample feed)
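The Timeout and Retries settings above describe a familiar pattern: give each download attempt a time limit and try again a fixed number of times before giving up. Below is a generic Python sketch of that pattern, not the Index Connector's own logic; the function names and default values are assumptions, and the demonstration uses a fake fetcher so that it runs without network access.

```python
import urllib.request


def download(url, timeout_seconds):
    """Fetch the URL over HTTP(S) and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.read()


def fetch_with_retries(url, timeout_seconds=30, retries=3, fetch=download):
    """Try once, then retry up to `retries` more times on network errors."""
    if retries < 0:
        raise ValueError("retries must not be negative")
    last_error = None
    for _attempt in range(1 + retries):
        try:
            return fetch(url, timeout_seconds)
        except OSError as error:  # URLError and timeouts are OSError subclasses
            last_error = error
    raise last_error


def fake_fetch(url, timeout_seconds):
    """Stand-in for a real download so the example runs offline."""
    return b"<feed/>"


body = fetch_with_retries(
    "https://www.example.com/product/feed.xml", retries=2, fetch=fake_fetch
)
print(body)
```

Passing the fetcher as a parameter keeps the retry logic testable; leaving out the `fetch` argument uses the real `download` function.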



Map the fields from the feed file to the defined metadata fields, and define a primary key value that uniquely identifies each record.
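Since the primary key has to identify each record uniquely, it is worth checking the feed for repeated or missing keys before indexing. Here is a short Python sketch of such a check; using ProductId as the key follows the sample feed, while the function name and the sample records are invented for illustration.

```python
from collections import Counter


def check_primary_keys(records, primary_key="ProductId"):
    """Return (duplicates, missing): repeated key values and records without a key."""
    missing = [record for record in records if not record.get(primary_key)]
    counts = Counter(
        record[primary_key] for record in records if record.get(primary_key)
    )
    duplicates = sorted(key for key, count in counts.items() if count > 1)
    return duplicates, missing


# Hypothetical records, as they might look after parsing a feed.
sample_records = [
    {"ProductId": "123", "title": "product-title"},
    {"ProductId": "1234", "title": "product-title"},
    {"ProductId": "123", "title": "product-title"},
    {"title": "record without an id"},
]

duplicates, missing = check_primary_keys(sample_records)
print("duplicate keys:", duplicates)
print("records without key:", missing)
```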



The configuration can be previewed before adding the Index Connector by clicking the Preview button.



The Index Connector configuration is now ready. Enable the Index Connector as a URL entry point for crawling and indexing under Settings → Crawling → URL Entrypoints.



Select the Index Connector defined in the previous step from the drop-down list "Add Index Connector Configurations".



Now that the configuration is ready, run a full live index so that the new records are reflected in the search results.

The Index Connector provides an easy way to index documents from feed data and offers better performance during crawling and indexing. It can be used to index large volumes of data from eCommerce systems.