SearchExtensions: Multiple property search support with ranking in c#

3 min read

The updates are coming thick and fast. I am pleased to say that I have extended NinjaNye.SearchExtensions to include support for searching with ranked results for both IQueryable and IEnumerable.

The SearchExtensions nuget package is available by running the following in your package manage console

<p class="nuget-badge"><code>PM> Install-Package NinjaNye.SearchExtensions</code></p>

Performing a ranked search

Performing a ranked search imitates the current Search functionality but instead returns an IQueryable<IRanked<T>>

// Retrieve ranked search results where FirstName or LastName contains "john"
var result = context.Users.RankedSearch("john", u => u.FirstName, u => u.LastName);

So what is IRanked<T>. Well, It is simply as follows:

public interface IRanked<out T>
{
    int Hits { get; }
    T Item { get; }
}

Meaning given the follwing users...

<table class="table table-condensed table-striped table-bordered" > <thead> <tr> <th>Id</th><th>FirstName</th><th>MiddleNames</th><th>LastName</th> </tr> </thead> <tbody> <tr><td>53</td><td>John</td><td></td><td>Nye</td></tr> <tr><td>54</td><td>Johnny</td><td>Jimmy</td><td>Johnson</td></tr> <tr><td>55</td><td>Jimmy</td><td>James</td><td>Smith</td></tr> </tbody> </table>

The above RankedSearch() query return the following data:

<table class="table table-condensed table-striped table-bordered" > <thead> <tr> <th>Hits</th><th>Id</th><th>FirstName</th><th>MiddleNames</th><th>LastName</th> </tr> </thead> <tbody> <tr><td>1</td><td>53</td><td>John</td><td></td><td>Nye</td></tr> <tr><td>2</td><td>54</td><td>Johnny</td><td>Jimmy</td><td>Johnson</td></tr> </tbody> </table>

Notice it strips out Jimmy Smith since there are no matches, and instead of simply returning the data as Search function would, a RankedSearch() builds a new IRanked result which includes data about the search, specifically the amount of times a search term was hit.

Order a ranked search by most relevant

The returned IRanked<T> result is still IQueryable meaning you can the sort your ranked search results as you wish. The following example is fairly typical in that we are ordering our results by the most Hits.

// Retrieve ranked search results where FirstName or LastName contains "john"
var result = context.Users.RankedSearch("john", u => u.FirstName, u => u.LastName)
                          .OrderByDescending(r => r.Hits);

The SQL produced (when using a sql provider)

In order to count the number of occurrences (or hits) a term appears in a way that can translate to SQL, I have constructed an expression tree equivalent to the following Lambda for each property and search term

[property].Length - ([property].Replace([searchTerm], "").Length) / [searchTerm].Length

For our user search example this translates to:

([user.FirstName].Length - ([user.FirstName].Replace("john", "").Length) / "john".Length)
+ // Add the two hit counts together for each property
([user.LastName].Length - ([user.LastName].Replace("john", "").Length) / "john".Length)

This, coupled with some null checks produces SQL similar to the following

SELECT 
    [Project1].[Id] AS [Id], 
    [Project1].[C1] AS [C1], 
    [Project1].[FirstName] AS [FirstName], 
    [Project1].[MiddleNames] AS [MiddleNames], 
    [Project1].[LastName] AS [LastName], 
    FROM ( SELECT 
        [Extent1].[Id] AS [Id], 
        [Extent1].[FirstName] AS [FirstName], 
        [Extent1].[MiddleNames] AS [MiddleNames], 
        [Extent1].[LastName] AS [LastName], 
        ((( CAST(LEN(CASE WHEN ([Extent1].[FirstName] IS NULL) THEN N'' ELSE [Extent1].[FirstName] END) AS int)) - ( CAST(LEN(REPLACE(CASE WHEN ([Extent1].[FirstName] IS NULL) THEN N'' ELSE [Extent1].[FirstName] END, N'john', N'')) AS int))) / 4) 
        + 
        ((( CAST(LEN(CASE WHEN ([Extent1].[LastName] IS NULL) THEN N'' ELSE [Extent1].[LastName] END) AS int)) - ( CAST(LEN(REPLACE(CASE WHEN ([Extent1].[LastName] IS NULL) THEN N'' ELSE [Extent1].[LastName] END, N'john', N'')) AS int))) / 4) AS [C1]
        FROM [dbo].[Users] AS [Extent1]
        WHERE (CASE WHEN ([Extent1].[FirstName] IS NULL) THEN N'' ELSE [Extent1].[FirstName] END LIKE N'%john%') OR (CASE WHEN ([Extent1].[LastName] IS NULL) THEN N'' ELSE [Extent1].[LastName] END LIKE N'%john%')
    )  AS [Project1]
    ORDER BY [Project1].[C1] DESC
    

That is all pretty difficult to read but the important part is that as part of the query you can see the following sql for each property and each search term

((( CAST(LEN(                                        -- Get length of property
    CASE WHEN ([Extent1].[FirstName] IS NULL)        -- Check property for null value
         THEN N''                                    -- Substitute null for empty string
         ELSE [Extent1].[FirstName]                  -- Use property value
     END) AS int)) 
-                                                    -- Minus the property without search term
   ( CAST(LEN(                                       -- Get the length of the replaced property
       REPLACE(                                      -- Replace search term in property
           CASE WHEN ([Extent1].[FirstName] IS NULL) --Null check repeated
                THEN N'' 
                ELSE [Extent1].[FirstName] 
           END, N'john', N'')) AS int))) 
       / 4)                                          -- Divide diff by the search term length
       

Hopefully this helps to explain the latest feature of the SearchExtensions project. I look forward to hearing your comments.