Tuesday, December 4, 2018

Search&Promote – Crawling(IndexConnector)



IndexConnector:


The IndexConnector lets you define additional input sources for indexing XML pages or any kind of feed.

The IndexConnector can be used to index product data from e-commerce systems with a large number of products, which reduces crawling and indexing time.

An XML data source consists of XML records that contain information corresponding to individual documents that can be added to the index.

A text data feed contains individual newline-delimited records that correspond to individual documents that can be added to the index.

A mapping can be defined that specifies how each record's items are used to populate the metadata fields in the resulting index.

The IndexConnector can connect to the input sources over multiple protocols: HTTP(S), FTP, SFTP, and FILE.
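
To make the text feed shape concrete, here is a minimal Python sketch that splits a newline-delimited feed into records, one per line. The tab delimiter and the column names are assumptions for illustration only; the real delimiter and fields are whatever the feed and the IndexConnector configuration define.

```python
import csv
import io

# Hypothetical text feed: one record per line, tab-delimited fields (assumed).
SAMPLE_TEXT_FEED = (
    "https://www.example.com/product-title/p/123\tproduct-title\t123\n"
    "https://www.example.com/product-title/p/1234\tproduct-title\t1234\n"
)

# Hypothetical column names; a real feed defines its own.
COLUMNS = ["link", "title", "ProductId"]


def read_text_feed(text, columns=COLUMNS, delimiter="\t"):
    """Return one dict per non-empty line, keyed by column name."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [dict(zip(columns, row)) for row in reader if row]


records = read_text_feed(SAMPLE_TEXT_FEED)
for record in records:
    print(record["ProductId"], record["link"])
```

Each dictionary here stands for one document that could be added to the index.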


The IndexConnector is not enabled by default in an S&P account; it has to be enabled by the Adobe S&P account team.

Defining IndexConnector:



Sample product feed file (XML)

<feed
    xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
    <channel>
        <title>Product Feed</title>
        <Item>
            <link>https://www.example.com/product-title/p/123</link>
            <title>
                <![CDATA[product-title]]>
            </title>
            <pubDate>05/09/2011</pubDate>
            <pubYear>2011</pubYear>
            <description>
                <![CDATA[<p>product description</p>]]>
            </description>
            <productType>Research</productType>
            <category>
                <![CDATA[Financial Planning|Financial Planners|Research]]>
            </category>
            <ProductId>123</ProductId>
            <imageUrl>/content/dam/Images/product/123.jpg</imageUrl>
        </Item>
        <Item>
            <link>https://www.example.com/product-title/p/1234</link>
            <title>
                <![CDATA[product-title]]>
            </title>
            <pubDate>05/09/2011</pubDate>
            <pubYear>2011</pubYear>
            <description>
                <![CDATA[<p>product description</p>]]>
            </description>
            <productType>Research</productType>
            <category>
                <![CDATA[Financial Planning|Financial Planners|Research]]>
            </category>
            <ProductId>1234</ProductId>
            <imageUrl>/content/dam/Images/product/1234.jpg</imageUrl>
        </Item>
        <Item>
            <link>https://www.example.com/product-title/p/12345</link>
            <title>
                <![CDATA[product-title]]>
            </title>
            <pubDate>05/09/2011</pubDate>
            <pubYear>2011</pubYear>
            <description>
                <![CDATA[<p>product description</p>]]>
            </description>
            <productType>Research</productType>
            <category>
                <![CDATA[Financial Planning|Financial Planners|Research]]>
            </category>
            <ProductId>12345</ProductId>
            <imageUrl>/content/dam/Images/product/12345.jpg</imageUrl>
        </Item>
    </channel>
</feed>
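
Before pointing the IndexConnector at a feed like this, it can help to check locally that the file is well-formed and that the item tag matches what you expect. The following is a minimal Python sketch, not part of Search&Promote itself; it uses a shortened copy of the sample feed, and the function name `list_items` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample feed (two items, fewer fields).
SAMPLE_FEED = """<feed xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
    <channel>
        <title>Product Feed</title>
        <Item>
            <link>https://www.example.com/product-title/p/123</link>
            <title><![CDATA[product-title]]></title>
            <ProductId>123</ProductId>
        </Item>
        <Item>
            <link>https://www.example.com/product-title/p/1234</link>
            <title><![CDATA[product-title]]></title>
            <ProductId>1234</ProductId>
        </Item>
    </channel>
</feed>"""


def list_items(feed_xml, item_tag="Item"):
    """Parse the feed and return one dict of child tag -> text per item."""
    root = ET.fromstring(feed_xml)  # raises ParseError if not well-formed
    items = []
    for item in root.iter(item_tag):
        items.append({child.tag: (child.text or "").strip() for child in item})
    return items


items = list_items(SAMPLE_FEED)
print(len(items), "records found")
for item in items:
    print(item["ProductId"], item["link"])
```

If the count printed here is zero, the item tag passed in does not match the element that wraps each record.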

Configure the feed file location and the Itemtag (Item in the sample feed above)


Map the fields from the feed file to the defined metadata fields, and define a primary key that uniquely identifies each record.
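
The mapping itself is configured in the S&P UI, but the idea can be sketched in a few lines of Python. The metadata field names (`url`, `desc`, `product_id`) and the choice of `ProductId` as primary key are assumptions for illustration; use the metadata fields defined in your own account.

```python
from collections import Counter

# Hypothetical mapping: feed tag -> metadata field name defined in the account.
FIELD_MAP = {
    "link": "url",
    "title": "title",
    "description": "desc",
    "ProductId": "product_id",
}
PRIMARY_KEY = "ProductId"  # assumed choice; any unique feed field would do


def map_record(record, field_map=FIELD_MAP):
    """Rename feed tags to metadata field names, dropping unmapped tags."""
    return {meta: record[tag] for tag, meta in field_map.items() if tag in record}


def duplicate_keys(records, key=PRIMARY_KEY):
    """Return primary-key values that occur more than once."""
    counts = Counter(r[key] for r in records if key in r)
    return sorted(k for k, n in counts.items() if n > 1)


records = [
    {"link": "https://www.example.com/product-title/p/123",
     "title": "product-title", "ProductId": "123"},
    {"link": "https://www.example.com/product-title/p/1234",
     "title": "product-title", "ProductId": "1234"},
]
mapped = [map_record(r) for r in records]
print(mapped[0])
print("duplicate keys:", duplicate_keys(records))
```

A non-empty list of duplicate keys means the chosen field does not identify each record uniquely and is a poor primary key.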


Preview the configuration


Define the IndexConnector as a URL entrypoint for crawling



Now run the full live index; the new records will appear in the search results once indexing completes.


