Ureader.com  
Microsoft software help and Community
date: 10 Mar 2007 05:36:08 -0800,    group: microsoft.public.sqlserver.fulltext


Phonetic Searching and Full Text Indexing For Business and Personal Names and Addresses
I am undertaking a project which requires the ability to perform
phonetic searching across business and personal names and addresses.
The data has not been cleansed to date, and our analysis of it so far
suggests that cleansing it would be a tremendous amount of work.  My
thought is that if we can find the right tools, we should be able to
perform matching and searching across the data without implementing
heavy-duty cleansing prior to the search.  When I talk about
uncleansed data in terms of names, I am talking about things like the
following:

Acme, Inc
Acme, Incorp.
Acme Incorporated

**Note the variations in the abbreviation of Inc.  The same thing
occurs with other words such as Company, Association, Limited, DBA,
and others we have not yet identified.

Smith Plumbing
Smyth Plumbing
Kool Runnings
Cool Runnings

**Note variations in phonetic spelling of words

Ultimately, what I am wondering is how much success people have had
with fuzzy searches like this using solely Microsoft's Full Text
Indexing.  I would also be interested to know if SQL Server 2005
improves on the capabilities of fuzzy matching.  In addition, I found
some information regarding SQL Turbo http://www.quest.com/sql_turbo/
which seems to be an extender or different implementation of FTS which
does support phonetic matches, and I would be interested to know if
anyone has used this tool as well.  Any thoughts on how to approach
this matter would be appreciated.
date: 10 Mar 2007 05:36:08 -0800   author:   unknown

Re: Phonetic Searching and Full Text Indexing For Business and Personal Names and Addresses
You need to build a thesaurus with all the alternative spellings to 
implement this in SQL FTS. You could try to cleanse your data before 
indexing it using fuzzy grouping, or by rolling your own Levenshtein edit 
distance. SQL FTS fuzzy search (aka freetext search) will stem your search 
phrase for verb forms but will not do the kind of fuzzy searches you are 
looking for.
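
To illustrate what "rolling your own" edit distance could look like, here 
is a minimal sketch in Python rather than T-SQL. The function names 
levenshtein and is_near_match and the max_distance threshold are 
hypothetical choices made for this example, not part of SQL Server or 
any tool mentioned in this thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein edit distance between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    # previous[j] holds the distance between the prefix of a seen so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep)
            ))
        previous = current
    return previous[-1]


def is_near_match(a: str, b: str, max_distance: int = 1) -> bool:
    """Hypothetical helper: case-insensitive match within max_distance edits."""
    return levenshtein(a.lower(), b.lower()) <= max_distance


if __name__ == "__main__":
    for left, right in [("Smith Plumbing", "Smyth Plumbing"),
                        ("Kool Runnings", "Cool Runnings"),
                        ("Acme, Inc", "Acme Incorporated")]:
        print(left, "|", right, "|", levenshtein(left.lower(), right.lower()))
```

A small threshold catches single-letter variants such as Smith/Smyth, but 
abbreviation variants such as Inc/Incorporated are many edits apart, so 
those would likely still need a thesaurus or a suffix-normalization pass.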

There was a time when SQL Turbo was actually faster than SQL 2000 FTS. 
SQL 2005 is now faster than that version, but I have not yet compared it 
against the new version of SQL Turbo.

-- 
Hilary Cotter

Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com



date: Sat, 10 Mar 2007 12:01:40 -0500   author:   Hilary Cotter

Re: Phonetic Searching and Full Text Indexing For Business and Personal Names and Addresses
Have you looked at: http://www.codeproject.com/string/dmetaphone6.asp

The double-metaphone algorithm is available and has been implemented as an 
extended stored procedure.  It is pretty good for people names, but I don't 
know if it would work well for business names.
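
For a feel of how a phonetic key works, here is a minimal sketch in 
Python. Note that this is American Soundex, a much simpler algorithm 
swapped in for illustration, and not double-metaphone or the extended 
stored procedure from the article above. The function name soundex is 
just a name chosen for this example.

```python
# Consonant groups used by American Soundex; vowels, Y, H and W get no code.
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
_CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(word: str) -> str:
    """Return the 4-character American Soundex key, or '' if word has no A-Z letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = [letters[0]]                    # the first letter is kept as-is
    last_code = _CODES.get(letters[0], "")   # its code still suppresses repeats
    for c in letters[1:]:
        if c in "HW":
            continue                         # H and W do not separate equal codes
        code = _CODES.get(c, "")             # vowels and Y map to ""
        if code and code != last_code:
            result.append(code)
        last_code = code                     # a vowel resets the repeat check
        if len(result) == 4:
            break
    return "".join(result).ljust(4, "0")     # pad short keys with zeros


if __name__ == "__main__":
    for name in ["Smith", "Smyth", "Kool", "Cool"]:
        print(name, soundex(name))
```

Smith and Smyth get the same key here, but Kool and Cool do not, because 
Soundex always keeps the first letter. Metaphone-style algorithms 
generally encode a hard C as K, which is one reason they tend to do 
better on variants like that.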

FWIW - RLF

date: Mon, 12 Mar 2007 15:42:54 -0400   author:   Russell Fields
