Wednesday, April 6, 2016

How to restrict crawling/indexing of specific URLs in Adobe Search and Promote (Adobe S&P)


In some cases we may need to index only specific types of URLs from a website and exclude all the others.

URL masks can be used in Adobe S&P to achieve this.

A URL mask defines rules that include or exclude specific URLs during indexing.

Two kinds of rule can be defined:

Include - a pattern specifying the URLs that will be indexed.
Exclude - a pattern specifying the URLs that will be excluded from indexing.
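
To make the include/exclude idea concrete, here is a minimal Python sketch of how an ordered list of such rules could be evaluated. It is not Adobe S&P code: the function name, the rule format, the "archive" path, and the assumptions that the first matching rule wins and that unmatched URLs are skipped are all made up for illustration, so check the product documentation for the actual precedence.

```python
from typing import List, Tuple

# Hypothetical rule format: (action, url_prefix), action is "include" or "exclude".
Rule = Tuple[str, str]


def should_index(url: str, rules: List[Rule], default: bool = False) -> bool:
    """Return True if the URL would be indexed under the given rules.

    Assumption for this sketch: rules are checked in order and the
    first rule whose prefix matches decides the outcome.
    """
    for action, prefix in rules:
        if url.startswith(prefix):
            return action == "include"
    # No rule matched: fall back to the caller-supplied default.
    return default


# Hypothetical rule set: skip an "archive" area, index the rest of /content/doc.
rules = [
    ("exclude", "https://server.com/content/doc/archive"),
    ("include", "https://server.com/content/doc"),
]

for candidate in (
    "https://server.com/content/doc/sample.html",
    "https://server.com/content/doc/archive/old.html",
    "https://server.com/other/page.html",
):
    print(candidate, should_index(candidate, rules))
```

Putting the narrower exclude rule first matters in this sketch, because a broader include rule listed earlier would match before the exclude is ever checked.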

To index the URLs that start with a given mask:


The crawler will index all the URLs that start with https://server.com/content/doc
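
The "starts with" behaviour described above amounts to a plain prefix test. A small Python sketch of that idea follows; the function name and the second test URL are hypothetical, and this only illustrates the matching logic, not how Adobe S&P implements it.

```python
def matches_prefix_mask(url: str, mask: str) -> bool:
    """Return True if the URL begins with the mask (simple prefix match)."""
    return url.startswith(mask)


MASK = "https://server.com/content/doc"

# A URL under /content/doc matches, a URL elsewhere on the site does not.
print(matches_prefix_mask("https://server.com/content/doc/sample.html", MASK))
print(matches_prefix_mask("https://server.com/content/other/page.html", MASK))
```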


To index the URLs that follow a particular format:

The crawler will index all the URLs matching https://server.com/content/doc/*.html?id=*

e.g. https://server.com/content/doc/sample.html?id=123
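
To check which URLs a wildcard pattern like this one would cover, the pattern can be translated into a regular expression. The Python sketch below assumes that `*` stands for any run of characters and that every other character, including `?`, is literal; that reading and the helper name are assumptions made for illustration, not documented Adobe S&P behaviour.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a '*' wildcard pattern into a compiled regex.

    Everything except '*' is escaped, so '.' and '?' match literally.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))


PATTERN = "https://server.com/content/doc/*.html?id=*"
compiled = wildcard_to_regex(PATTERN)


def matches_wildcard(url: str) -> bool:
    # fullmatch requires the whole URL to fit the pattern.
    return compiled.fullmatch(url) is not None


print(matches_wildcard("https://server.com/content/doc/sample.html?id=123"))
print(matches_wildcard("https://server.com/content/doc/sample.html"))
```

The standard library's `fnmatch` is not used here because it treats `?` as a single-character wildcard, which would blur the literal query-string separator in this pattern.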

A regex can also be used to match the URLs for indexing:


The crawler will index all the URLs matching the regex ^.*/content/doc/.*\.html$
e.g. https://server.com/content/doc/sample.html
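
Before putting a regex into a mask, it can help to try it against a few sample URLs. The Python sketch below does that with the standard `re` module; the helper name and the non-matching URLs are hypothetical, and Python's regex dialect may differ in details from the one Adobe S&P uses.

```python
import re

# The regex from the example above: any URL containing /content/doc/
# and ending in .html.
DOC_HTML = re.compile(r"^.*/content/doc/.*\.html$")


def matches_regex_mask(url: str) -> bool:
    """Return True if the URL matches the regex mask."""
    return DOC_HTML.search(url) is not None


for candidate in (
    "https://server.com/content/doc/sample.html",         # ends in .html
    "https://server.com/content/doc/sample.html?id=123",  # query string after .html
    "https://server.com/content/other/sample.html",       # outside /content/doc/
):
    print(candidate, matches_regex_mask(candidate))
```

Because the regex is anchored with `$` right after `\.html`, a URL carrying a query string does not match, which is worth keeping in mind if both forms of URL exist on the site.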

