In this post I want to show how to restore a feature that was part of the “old” Lucene index implementation but is missing from the new one: adding custom fields to the index. The example below customizes the Lucene search configuration to make this possible.
First off, let’s create a configuration that allows us to add extra fields to the indexed data.
<index id="News" type="Sitecore.Search.Index, Sitecore.Kernel">
  <param desc="name">$(id)</param>
  <param desc="folder">_news</param>
  <Analyzer ref="search/analyzer" />
  <locations hint="list:AddCrawler">
    <examples-news type="LuceneExamples.DatabaseCrawler,LuceneExamples">
      <Database>web</Database>
      <Root>/sitecore/content</Root>
      <IndexAllFields>true</IndexAllFields>
      <include hint="list:IncludeTemplate">
        <news>{788EF1BE-B71E-4D59-9276-50519BD4F641}</news>
        <tag>{4DD970FB-2695-4E50-96F3-A766F7D6CAF1}</tag>
      </include>
      <fields hint="raw:AddCustomField">
        <field luceneName="author" storageType="no" indexType="tokenized">__updated by</field>
        <field luceneName="changed" storageType="yes" indexType="untokenized">__updated</field>
      </fields>
    </examples-news>
  </locations>
</index>
This example introduces a new configuration section: the <fields> section, which defines two fields, “author” and “changed”. These fields are added to the fields collection of each indexed item. The raw:AddCustomField hint means that the AddCustomField method is called once for every <field> entry, so that each entry is registered as a custom field to add to that collection.
Description of the configuration attributes:
- luceneName is the name of the field as it appears in the Lucene index.
- storageType is the storage type of the Lucene field. It can have the following values:
  - no
  - yes
  - compress
- indexType is the index type of the Lucene field. It can have the following values:
  - no
  - tokenized
  - untokenized
  - nonorms
Refer to the Lucene documentation on Field.Store and Field.Index to find out what each of these options means.
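To make the attribute combinations more concrete, here is a small sketch of a <fields> section that mixes them. The “teaser” entry and its Teaser source field are hypothetical and only there for illustration; the other two entries are the ones from the configuration above.

```xml
<fields hint="raw:AddCustomField">
  <!-- Searchable, but the value cannot be read back from a search hit -->
  <field luceneName="author" storageType="no" indexType="tokenized">__updated by</field>
  <!-- Hypothetical field: readable from a search hit, but not searchable -->
  <field luceneName="teaser" storageType="yes" indexType="no">Teaser</field>
  <!-- Indexed as one exact term and stored, which suits dates and identifiers -->
  <field luceneName="changed" storageType="yes" indexType="untokenized">__updated</field>
</fields>
```

With the parsing code shown later in this post, an entry that omits storageType falls back to “no”, and one that omits indexType falls back to “tokenized”.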
Now all you need to do is loop through the collection of custom fields in the overridden AddAllFields method and add them to the indexed data.
I created a class called CustomField that helps manage the custom field entries. Below you will find this class, followed by the additional methods for the extended DatabaseCrawler. Since the code for the DatabaseCrawler itself was already published in a previous blog post, I’m not going to duplicate it here.
Here is the code for the CustomField class.
using System.Xml;
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.Xml;
using Lucene.Net.Documents;

namespace LuceneExamples
{
    public class CustomField
    {
        public CustomField()
        {
            FieldID = ID.Null;
            FieldName = "";
            LuceneFieldName = "";
        }

        // Sitecore field ID; set when the config entry contains an ID
        public ID FieldID { get; private set; }

        // Sitecore field name; set when the config entry contains a name
        public string FieldName { get; private set; }

        public Field.Store StorageType { get; set; }

        public Field.Index IndexType { get; set; }

        // Name of the field in the Lucene index
        public string LuceneFieldName { get; private set; }

        // Builds a CustomField from a <field> config node; returns null if the entry is invalid
        public static CustomField ParseConfigNode(XmlNode node)
        {
            CustomField field = new CustomField();
            string fieldName = XmlUtil.GetValue(node);

            // The node value is either a field ID or a field name
            if (ID.IsID(fieldName))
            {
                field.FieldID = ID.Parse(fieldName);
            }
            else
            {
                field.FieldName = fieldName;
            }

            field.LuceneFieldName = XmlUtil.GetAttribute("luceneName", node);
            field.StorageType = GetStorageType(node);
            field.IndexType = GetIndexType(node);

            if (!IsValidField(field))
            {
                return null;
            }

            return field;
        }

        // Reads the field value from the item, by ID if available, otherwise by name
        public string GetFieldValue(Item item)
        {
            if (!ID.IsNullOrEmpty(FieldID))
            {
                // FieldID is already an ID, so no parsing is needed
                return item[FieldID];
            }

            if (!string.IsNullOrEmpty(FieldName))
            {
                return item[FieldName];
            }

            return string.Empty;
        }

        // A field is valid when it has a source (name or ID) and a Lucene field name
        private static bool IsValidField(CustomField field)
        {
            bool hasSource = !string.IsNullOrEmpty(field.FieldName) || !ID.IsNullOrEmpty(field.FieldID);
            return hasSource && !string.IsNullOrEmpty(field.LuceneFieldName);
        }

        // Maps the indexType attribute to Field.Index; defaults to TOKENIZED
        private static Field.Index GetIndexType(XmlNode node)
        {
            string indexType = XmlUtil.GetAttribute("indexType", node);
            if (!string.IsNullOrEmpty(indexType))
            {
                switch (indexType.ToLowerInvariant())
                {
                    case "no":
                        return Field.Index.NO;
                    case "tokenized":
                        return Field.Index.TOKENIZED;
                    case "untokenized":
                        return Field.Index.UN_TOKENIZED;
                    case "nonorms":
                        return Field.Index.NO_NORMS;
                }
            }

            return Field.Index.TOKENIZED;
        }

        // Maps the storageType attribute to Field.Store; defaults to NO
        private static Field.Store GetStorageType(XmlNode node)
        {
            string storage = XmlUtil.GetAttribute("storageType", node);
            if (!string.IsNullOrEmpty(storage))
            {
                switch (storage.ToLowerInvariant())
                {
                    case "no":
                        return Field.Store.NO;
                    case "yes":
                        return Field.Store.YES;
                    case "compress":
                        return Field.Store.COMPRESS;
                }
            }

            return Field.Store.NO;
        }
    }
}
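Because ParseConfigNode checks ID.IsID before falling back to a field name, a <field> entry should also be able to reference the source field by its Sitecore ID instead of its name. A sketch of such an entry is below; the GUID is a placeholder rather than a real field ID, and the “summary” Lucene field name is made up for illustration.

```xml
<fields hint="raw:AddCustomField">
  <!-- Source field given by name -->
  <field luceneName="changed" storageType="yes" indexType="untokenized">__updated</field>
  <!-- Source field given by ID (placeholder GUID; replace with the ID of a real field) -->
  <field luceneName="summary" storageType="yes" indexType="tokenized">{11111111-1111-1111-1111-111111111111}</field>
</fields>
```

Referencing the field by ID keeps the index configuration working if the field is later renamed in its template.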
And here is the code for the additional DatabaseCrawler methods.
// Custom fields registered through AddCustomField. The declaration is an assumption:
// any collection of CustomField works (this one requires System.Collections.Generic).
private readonly List<CustomField> _customFields = new List<CustomField>();

/// <summary>
/// Loops through the collection of custom fields and adds them to the fields collection of each indexed item.
/// </summary>
/// <param name="document">Lucene document</param>
/// <param name="item">Sitecore data item</param>
private void AddCustomFields(Document document, Item item)
{
    foreach (CustomField field in _customFields)
    {
        document.Add(CreateField(field.LuceneFieldName, field.GetFieldValue(item), field.StorageType, field.IndexType, Boost));
    }
}

/// <summary>
/// Creates a Lucene field.
/// </summary>
/// <param name="fieldKey">Field name</param>
/// <param name="fieldValue">Field value</param>
/// <param name="storeType">Storage option</param>
/// <param name="indexType">Index type</param>
/// <param name="boost">Boosting parameter</param>
/// <returns>The created Lucene field</returns>
private Fieldable CreateField(string fieldKey, string fieldValue, Field.Store storeType, Field.Index indexType, float boost)
{
    Field field = new Field(fieldKey, fieldValue, storeType, indexType);
    field.SetBoost(boost);
    return field;
}

/// <summary>
/// Parses a configuration entry for a custom field and adds it to the collection of custom fields.
/// </summary>
/// <param name="node">Configuration entry</param>
public void AddCustomField(XmlNode node)
{
    CustomField field = CustomField.ParseConfigNode(node);
    if (field == null)
    {
        throw new InvalidOperationException("Could not parse custom field entry: " + node.OuterXml);
    }

    _customFields.Add(field);
}
The last thing left to do is to call the AddCustomFields method from AddAllFields.
protected override void AddAllFields(Document document, Item item, bool versionSpecific)
{
    ………………………………………
    AddCustomFields(document, item);
}
You can take it even further and add support for a field interpreter on each field configuration entry.
I hope you find it useful.