Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>I had to use a variety of techniques suggested. Thanks pointing me in the right direction(s). Hopefully, the following will help someone else out with this type of problem to solve.</p> <p><strong>Removing excess characters</strong></p> <pre><code>CREATE FUNCTION [dbo].[fn_StripCharacters] ( @String NVARCHAR(MAX), @MatchExpression VARCHAR(255) ) RETURNS NVARCHAR(MAX) AS BEGIN SET @MatchExpression = '%['+@MatchExpression+']%' WHILE PatIndex(@MatchExpression, @String) &gt; 0 SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '') RETURN @String END </code></pre> <p><em>Usage:</em> </p> <pre><code>--remove all non-alphanumeric and non-white space dbo.fn_StripCharacters(@Value, , '^a-z^0-9 ') </code></pre> <p><strong>Split name into parts</strong></p> <pre><code>CREATE FUNCTION [dbo].[SplitTable] (@sep char(1), @sList StringList READONLY) RETURNS @ResultList TABLE ( [ID] VARCHAR(MAX), [Val] VARCHAR(MAX) ) AS BEGIN declare @OuterCursor cursor declare @ID varchar(max) declare @Val varchar(max) set @OuterCursor = cursor fast_forward for (SELECT * FROM @sList) FOR READ ONLY open @OuterCursor fetch next from @OuterCursor into @ID, @Val while (@@FETCH_STATUS=0) begin INSERT INTO @ResultList (ID, Val) select @ID, split.s from dbo.Split(@sep, @Val) as split where len(split.s) &gt; 0 fetch next from @OuterCursor into @ID, @Val end close @OuterCursor deallocate @OuterCursor CREATE FUNCTION [dbo].[Split] (@sep char(1), @s varchar(8000)) RETURNS table AS RETURN ( WITH Pieces(pn, start, stop) AS ( SELECT 1, 1, CHARINDEX(@sep, @s) UNION ALL SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1) FROM Pieces WHERE stop &gt; 0 ) SELECT pn, LTRIM(RTRIM(SUBSTRING(@s, start, CASE WHEN stop &gt; 0 THEN stop-start ELSE 8000 END))) AS s FROM Pieces ) RETURN </code></pre> <p><em>Usage:</em> </p> <pre><code>--create split name list DECLARE @NameList StringList INSERT INTO @NameList (ID, Val) SELECT id, firstname FROM dbo.[User] u WHERE PATINDEX('%[^a-z]%', u.FirstName) &gt; 0 ----remove split dups select u.ID, COUNT(*) from dbo.import_SplitTable(' ', @NameList) splitList INNER JOIN dbo.[User] u ON splitList.id = u.id </code></pre> <p><strong>Common nicknames:</strong></p> <p>I created a table based on <a href="http://deron.meranda.us/data/nicknames.txt" rel="noreferrer">this list</a> and used it to join on common name equivalents.</p> <p>Usage:</p> <pre><code>SELECT u.id , u.FirstName , u_nickname_maybe.Name AS MaybeNickname , u.LastName , c.ID AS ContactID from FROM dbo.[User] u INNER JOIN nickname u_nickname_match ON u.FirstName = u_nickname_match.Name INNER JOIN nickname u_nickname_maybe ON u_nickname_match.relatedid = u_nickname_maybe.id LEFT OUTER JOIN ( SELECT c.id, c.LastName, c.FirstName, c_nickname_maybe.Name AS MaybeFirstName FROM dbo.Contact c INNER JOIN nickname c_nickname_match ON c.FirstName = c_nickname_match.Name INNER JOIN nickname c_nickname_maybe ON c_nickname_match.relatedid = c_nickname_maybe.id WHERE c_nickname_match.Name &lt;&gt; c_nickname_maybe.Name ) as c ON c.AccountHolderID = ah.ID AND u_nickname_maybe.Name = c.MaybeFirstName AND u.LastName = c.LastName WHERE u_nickname_match.Name &lt;&gt; u_nickname_maybe.Name </code></pre> <p><strong>Phonetic algorithms (Jaro Winkler):</strong></p> <p>The amazing article, <a href="http://anastasiosyal.com/archive/2009/01/11/18.aspx" rel="noreferrer">Beyond SoundEx - Functions for Fuzzy Searching in MS SQL Server</a>, shows how to install and use the <a href="http://sourceforge.net/projects/simmetrics/" rel="noreferrer">SimMetrics</a> library into SQL Server. This library lets you find relative similarity between strings and includes numerous algorithms. I ended up mostly using <a href="http://www.dcs.shef.ac.uk/~sam/stringmetrics.html#jarowinkler" rel="noreferrer">Jaro Winkler</a> to match the names. </p> <p><em>Usage:</em> </p> <pre><code>SELECT u.id AS UserID ,c.id AS ContactID ,u.FirstName ,c.FirstName ,u.LastName ,c.LastName ,maxResult.CombinedScores from ( SELECT u.ID , max( dbo.JaroWinkler(lower(u.FirstName), lower(c.FirstName)) * dbo.JaroWinkler(LOWER(u.LastName), LOWER(c.LastName)) ) AS CombinedScores FROM dbo.[User] u, dbo.[Contact] c WHERE u.ContactID IS NULL GROUP BY u.id ) AS maxResult INNER JOIN dbo.[User] u ON maxResult.id = u.id INNER JOIN dbo.[Contact] c ON maxResult.CombinedScores = dbo.JaroWinkler(lower(u.FirstName), lower(c.FirstName)) * dbo.JaroWinkler(LOWER(u.LastName), LOWER(c.LastName)) </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload