MSSQL Levenshtein

5. February 2011 11:31

This is a great extention to have for mssql. There is more information about what the function does on the levenshtein wiki page in short it will give you a number of changes (add / delete / change) to get from the first string to the second string.

For example: "Hello World" and "Hello Worlds" has a distance of 1. This can be great if you are trying to do searches where you want to pick up on typo's or spelling mistakes.

The c# code for the extenstion is

 

[Microsoft.SqlServer.Server.SqlFunction(IsDeterministic = true, IsPrecise = false)]
    public static SqlInt32 Levenshtein(SqlString S1, SqlString S2)
    {
        if (S1.IsNull || S2.IsNull)
            throw (new ArgumentNullException());

        int n = S1.Value.Length;
        int m = S2.Value.Length;

        int[,] d = new int[n + 1, m + 1];
        int cost = 0;

        if (n == 0)
            return m;
        if (m == 0)
            return n;

        for (int i = 0; i <= n; i++)
            d[i, 0] = i;

        for (int j = 0; j <= m; j++)
            d[0, j] = j;

        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= m; j++)
            {
                if (S1.Value[i-1] == S2.Value[j-1])
                    cost = 0;
                else
                    cost = 1;

                d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
            }
        }

        return d[n, m];
    }

 

Once compiled you can create a function for mssql woth the following code.

 

CREATE Function Levenshtein(@S1 nvarchar(512), @S2 nvarchar(512))
	RETURNS int as EXTERNAL NAME SqlExtensions.UserDefinedFunctions.Levenshtein
GO


You can read more about enabling clr support in mssql here

E-mail Kick it! DZone it! del.icio.us Permalink


Pingbacks and trackbacks (1)+

Add comment




  Country flag
biuquote
  • Comment
  • Preview
Loading