rmerizalde edited this page Mar 4, 2013 · 14 revisions

The index schema varies from one application to another. The default OpenCommerceSearch (OCS) schema can be customized for different use cases.

OCS uses two schemas: one for the product catalog and one for the business rules. This page describes the former.

Multiple Locales

The default schema is ready to support multiple locales. Each language has its own collection, which means that each product is indexed once for every supported language. This approach increases the space requirement because fields that are not localized are indexed once per language.

All product catalog collections (and the rules collections as well) share the same configuration. The configuration is uploaded into Zookeeper and then each configuration is linked to the collections.

Here is an excerpt from the English schema.

  <xi:include href="xinclude/fields.xml" parse="xml"
    xmlns:xi="http://www.w3.org/2001/XInclude">
  </xi:include>

  <fields>
    
    <field name="text"           type="text_en_splitting" indexed="true" stored="false" multiValued="true"  omitNorms="false" />
    <field name="textSpellCheck" type="text_en_splitting" indexed="true" stored="false" multiValued="true"  omitNorms="true" />
    <field name="keyword"        type="text_en_splitting" indexed="true" stored="false" multiValued="false" omitNorms="true" />

    
    <field name="highest" type="text_en_splitting" indexed="true"  stored="false" required="false" multiValued="true" omitNorms="false" />
    <field name="high"    type="text_en_splitting" indexed="true"  stored="false" required="false" multiValued="true" omitNorms="false" />
    <field name="medium"  type="text_en_splitting" indexed="true"  stored="false" required="false" multiValued="true" omitNorms="false" />
    <field name="low"     type="text_en_splitting" indexed="true"  stored="false" required="false" multiValued="true" omitNorms="false" />
  </fields>
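
Because each language has its own collection, the schema for another locale would presumably declare the same field names with a language-specific field type. The fragment below is only a sketch of that idea: the type name text_fr is an assumption made for illustration and is not taken from the OCS sources.

  <fields>
    <!-- Same field names as the English excerpt; only the analysis type changes. -->
    <!-- "text_fr" is a hypothetical field type name. -->
    <field name="text"    type="text_fr" indexed="true" stored="false" multiValued="true"  omitNorms="false" />
    <field name="keyword" type="text_fr" indexed="true" stored="false" multiValued="false" omitNorms="true" />
    <field name="highest" type="text_fr" indexed="true" stored="false" required="false" multiValued="true" omitNorms="false" />
  </fields>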

Some fields may be country specific. In such cases there is one field per country. Each of these fields is defined as a dynamic field, and the country code is appended to the field name while indexing the content (e.g. stockLevelUS). Here are a few examples:

  <dynamicField name="stockLevel*" type="int"     indexed="true"  stored="true" required="false" multiValued="false" omitNorms="true" />
  <dynamicField name="onsale*"     type="boolean" indexed="true"  stored="true" required="false" multiValued="false" omitNorms="true" />
  <dynamicField name="freeGift*"   type="boolean" indexed="true"  stored="true" required="false" multiValued="false" omitNorms="false" />
  <dynamicField name="seoUrl*"     type="string"  indexed="false" stored="true" required="true"  multiValued="false" omitNorms="true" />
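
To illustrate how these dynamic fields are populated, here is a minimal sketch of a document in Solr's XML update format. It follows the stockLevelUS naming used in the default field list. The id, the country codes and all values are made up for the example and do not come from an actual OCS feed.

  <add>
    <doc>
      <!-- Hypothetical sku id -->
      <field name="id">SKU-123</field>
      <!-- One concrete field per country, each matched by a dynamic field pattern -->
      <field name="stockLevelUS">12</field>
      <field name="stockLevelCA">0</field>
      <field name="onsaleUS">true</field>
      <field name="seoUrlUS">/example-product</field>
    </doc>
  </add>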

To review the full schema, check out the following links:

Multiple Sites

The OpenCommerceSearch framework supports multi-site implementations. Site assignment is determined by the category fields. For more details see Search Feed.

Analyzers

The current version supports English and French out of the box. Adding support for more languages is straightforward because the analyzers use out-of-the-box components provided by Lucene/Solr.
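
As a rough sketch of what adding a language could look like, the field type below chains stock Solr analysis components for Spanish. The name text_es and the choice of filters are assumptions made for illustration. The field types shipped with OCS may be configured differently.

  <!-- Hypothetical Spanish field type built only from stock Solr factories -->
  <fieldType name="text_es" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory" />
      <filter class="solr.LowerCaseFilterFactory" />
      <!-- Snowball stemmer for Spanish -->
      <filter class="solr.SnowballPorterFilterFactory" language="Spanish" />
    </analyzer>
  </fieldType>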

Synonyms, Protected Words & Stop Words

OpenCommerceSearch provides a business tool where business users can manage synonyms, protected words and stop words via ATG BCC. Using the BCC, users can create new assets and preview them before deploying their projects to production.

Every time a user deploys a project, a deployment listener is notified of the changes and OCS sends the new configuration files to SolrCloud/Zookeeper.

Each language has its own set of files. Each schema definition must reference the appropriate localized file.
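
The sketch below shows how a schema definition might reference its localized files through standard Solr filters. The file names (stopwords_fr.txt, synonyms_fr.txt, protwords_fr.txt) and the type name text_fr are assumptions. The names actually generated by OCS may differ.

  <fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory" />
      <filter class="solr.LowerCaseFilterFactory" />
      <!-- Stop words for this language (hypothetical file name) -->
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_fr.txt" />
      <!-- Synonyms for this language (hypothetical file name) -->
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms_fr.txt" ignoreCase="true" expand="true" />
      <!-- Protected words are marked as keywords so the stemmer below leaves them untouched -->
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords_fr.txt" />
      <filter class="solr.SnowballPorterFilterFactory" language="French" />
    </analyzer>
  </fieldType>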

Default fields

The default schema supports the following fields:

  • id: generally the sku/style id of the product. Must be unique.
  • productId: the product id of a set of skus. Multiple skus will share the same product id. This field is used by OCS to collapse/group documents in the search results.
  • title: the product's title
  • image: the sku's image URL
  • brand: the product's brand name
  • brandId: the product's brand id
  • year: the sku's year
  • season: a custom value representing the sku's season
  • categoryLeaves: a multi-value field with product's parent category names
  • categoryNodes: a multi-value field with the ancestor category names of the product's parent category
  • size: the sku's size
  • scale: the sku's size scale. This field is used when product sizes in the same result set use different scales
  • colorFamily: the sku's color family (e.g. red, blue, black, etc.). A product may have multiple colors
  • country: a list of countries where the sku is available
  • stockLevel*: the sku's inventory per country (e.g. stockLevelUS)
  • reviewAverage*: the product's review average per site (e.g. reviewAverageSite1)
  • reviews*: the product's review count per site (e.g. reviewsSite1)
  • freeGift*: a flag to indicate if the product has a free gift associated with it. This setting is per site.
  • url*: the product's URL. This setting is per site.
  • listPrice*: the sku's list price per country
  • salePrice*: the sku's sale price per country
  • discountPercent*: the sku's discount per country
  • isToos: indicates if the product is temporarily out of stock
  • isPastSeason: indicates if the product is from a previous season
  • feature_*: a catch all field for product specifications (e.g. feature_material=Carbon Fiber)
  • attr_*: a catch-all field for values that are intended to be used as facets (e.g. attr_recommendeduse=Road Biking)
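
The grouping role of productId can be sketched with Solr's standard result grouping parameters. The fragment below sets them as request handler defaults in solrconfig.xml. This is an assumption made for illustration only, since OCS may pass these parameters at query time instead.

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <!-- Collapse sku documents that share the same product id -->
      <str name="group">true</str>
      <str name="group.field">productId</str>
      <!-- Return one representative sku per product -->
      <str name="group.limit">1</str>
      <!-- Report the number of distinct products matched -->
      <str name="group.ngroups">true</str>
    </lst>
  </requestHandler>
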
The schema also includes other fields, used for spell checking and boosting, for instance. For more details on searchable fields, see Rank Profile.