Sunday 14 June 2015

Querying Lucene index with Sitecore ContentSearch API on Date field

Last month I was implementing the Sitecore ContentSearch API for a project and ran into a situation where I had to index some Sitecore items and query them on a Sitecore Date field. The data template has the fields BlogTitle, BlogDate and BlogContent, as mapped in the class below.
The BlogDate field is created with the type ‘Date’. I had to get all the blog items within a specific year. The Sitecore ContentSearch API lets us write search queries as LINQ statements. I wrote the code below to get all items that are created from the Blog template and belong to a specific year:

POCO Class
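using System;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;

// Maps the indexed blog fields onto a strongly typed search result.
public class BlogItem : SearchResultItem
{
    [IndexField("BlogTitle")]
    public string BlogTitle { get; set; }

    [IndexField("BlogDate")]
    public DateTime BlogDate { get; set; }

    [IndexField("BlogContent")]
    public string BlogContent { get; set; }
}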
Lucene Index Configuration
<field fieldName="BlogTitle" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />

<field fieldName="BlogDate" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.DateTime" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />

<field fieldName="BlogContent" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
C# Code
using System.Collections.Generic;
using System.Linq;
using Sitecore.Data;

public List<BlogItem> GetItemsByYear(int year)
{
    ID blogTemplateId = new ID("{48587868-4A19-48BB-8FC9-F1DD31CB6C8E}");
    var index = Sitecore.ContentSearch.ContentSearchManager.GetIndex("custom_index");
    using (Sitecore.ContentSearch.IProviderSearchContext context = index.CreateSearchContext())
    {
        // Filters on BlogDate.Year; this is the comparison that did not behave as expected.
        return context.GetQueryable<BlogItem>()
                      .Where(s => s.TemplateId == blogTemplateId && s.BlogDate.Year == year)
                      .ToList();
    }
}
Unfortunately, the above code didn’t return all the items that are created from the Blog template and belong to the specific year, and I couldn’t debug the lambda expression to see the exact value of BlogDate.Year, so I started troubleshooting. Sitecore stores a date field value as a compact ISO 8601 date-time string (yyyyMMddTHHmmss). This can be verified by looking at the field’s raw value.
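As a rough illustration of that raw format, here is a minimal sketch that parses such a string using only the .NET base library. The class name RawDateExample, the method ParseRawDate and the sample value are hypothetical and not part of the original project; the sketch also assumes the raw value carries no time zone suffix.

```csharp
using System;
using System.Globalization;

public static class RawDateExample
{
    // Parses a compact ISO 8601 value such as "20150614T000000".
    // Assumption: the value has no trailing time zone marker.
    public static DateTime ParseRawDate(string raw)
    {
        return DateTime.ParseExact(raw, "yyyyMMdd'T'HHmmss", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        DateTime parsed = ParseRawDate("20150614T000000");
        Console.WriteLine(parsed.Year);   // prints 2015
    }
}
```

The year is only available after the whole string has been parsed into a DateTime; the raw text itself is a single value, which matters for the query problem described next.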
I also checked the format of the date field in the Lucene index using Luke. Are you new to Luke? No problem. I strongly suggest reading this article by Dan to understand how to debug Lucene-based ContentSearch in Sitecore. Luke shows the format in which the date field is stored in the Lucene index.
So the value of the date field was being indexed in yyyyMMddTHHmmss format. I modified the Lucene index configuration for the BlogDate field as below, adding format="yyyyMMdd":
<field fieldName="BlogDate" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.DateTime" format="yyyyMMdd" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
I rebuilt the Lucene index, and the date field is now stored in yyyyMMdd format.
One important thing to note here is that Lucene indexes a date field as a single whole value, i.e. the year, month and day are stored together in the same field. It means that I can’t compare the date field value with only a year, only a month or only a day. So if you want to query against a date field in a Sitecore LINQ query, you must compare it with a DateTime value. Therefore I modified my code as below to filter blog items by a particular year:
using System;
using System.Collections.Generic;
using System.Linq;
using Sitecore.Data;

public List<BlogItem> GetItemsByYear(int year)
{
    // Half-open range: from 1 January of the year up to (not including) 1 January of the next year.
    DateTime startDate = new DateTime(year, 1, 1);
    DateTime endDate = new DateTime(year + 1, 1, 1);
    ID blogTemplateId = new ID("{48587868-4A19-48BB-8FC9-F1DD31CB6C8E}");
    var index = Sitecore.ContentSearch.ContentSearchManager.GetIndex("custom_index");
    using (Sitecore.ContentSearch.IProviderSearchContext context = index.CreateSearchContext())
    {
        return context.GetQueryable<BlogItem>()
                      .Where(s => s.TemplateId == blogTemplateId && s.BlogDate >= startDate && s.BlogDate < endDate)
                      .ToList();
    }
}
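To see why a range comparison suits a date stored as one whole value, here is a small sketch that does not touch Sitecore or Lucene at all. It only shows that strings in yyyyMMdd form sort in chronological order, so a lower and an upper bound are enough to pick out a year. The names DateTermExample, ToTerm and InRange are made up for illustration, and this is not a description of how the search provider evaluates the query internally.

```csharp
using System;
using System.Globalization;

public static class DateTermExample
{
    // Formats a date as yyyyMMdd, the format configured for BlogDate above.
    public static string ToTerm(DateTime value)
    {
        return value.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
    }

    // True when term lies in the half-open range [lower, upper), compared as plain text.
    public static bool InRange(string term, string lower, string upper)
    {
        return string.CompareOrdinal(term, lower) >= 0
            && string.CompareOrdinal(term, upper) < 0;
    }

    public static void Main()
    {
        string lower = ToTerm(new DateTime(2015, 1, 1));   // "20150101"
        string upper = ToTerm(new DateTime(2016, 1, 1));   // "20160101"

        Console.WriteLine(InRange(ToTerm(new DateTime(2015, 6, 14)), lower, upper)); // True
        Console.WriteLine(InRange(ToTerm(new DateTime(2016, 1, 1)), lower, upper));  // False
    }
}
```

The same pair of bounds would work for a month or a single day by changing the start and end dates, which is why comparing against DateTime values is more flexible than comparing a single date part.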
The range-based query solved the problem, and I am now getting the blog items for a specific year. Comments and suggestions are most welcome. Happy coding!
