|
|
|
date: Fri, 18 Jan 2008 09:49:04 -0800 (PST),
group: microsoft.public.sqlserver.fulltext
back
Quoted literal strings won't force a phrase match
Hello all,
From what I've read, SQL Server is supposed to do a phrase match when
you do a full text search that contains quoted literal strings. So,
for example, if I did a full text search on the phrase "time out" and
I put it in quotes, it's supposed to search for the full phrase "time
out" and not just look for rows that contain the words "time" or
"out." However, this isn't working for me.
Here is the query that I'm using :
SELECT *
FROM Content_Items ci
INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time out"') AS ft
ON ci.contentItemId = ft.[KEY]
ORDER BY ft.RANK DESC
What's it's doing is this : it's returning a bunch of rows that have
the words "time" or "out" in the column called hed. It's also
returning rows that have the full phrase "time out", but it's giving
those rows the same rank as rows that only contain the word "time."
In this case, that rank is 180.
Is there anything else I should be doing in my query, or is there some
configuration option I should have turned on?
Thanks.
date: Fri, 18 Jan 2008 09:49:04 -0800 (PST)
author: unknown
Re: Quoted literal strings won't force a phrase match
Ok, I've made some progress on this problem. Apparently SQL Server is
ignoring noise words in my phrase match.
For example, I ran this query :
SELECT *
FROM Content_Items ci
INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time capsule"') AS ft
ON ci.contentItemId = ft.[KEY]
ORDER BY ft.RANK DESC
And it did exactly what it was supposed to do, since neither "time"
nor "capsule" is a noise word.
My impression was that noise words aren't stripped out of a full text
search if the search phrase is a quoted literal. Thus, my search for
"time out" should look for the full phrase "time out", and not just
the word "time."
Does anybody know why SQL Server is removing my noise word from the
phrase match?
On Jan 18, 12:49 pm, Afrobla...@gmail.com wrote:
> Hello all,
>
> From what I've read, SQL Server is supposed to do a phrase match when
> you do a full text search that contains quoted literal strings. So,
> for example, if I did a full text search on the phrase "time out" and
> I put it in quotes, it's supposed to search for the full phrase "time
> out" and not just look for rows that contain the words "time" or
> "out." However, this isn't working for me.
>
> Here is the query that I'm using :
>
> SELECT *
> FROM Content_Items ci
> INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time out"') AS ft
> ON ci.contentItemId = ft.[KEY]
> ORDER BY ft.RANK DESC
>
> What's it's doing is this : it's returning a bunch of rows that have
> the words "time" or "out" in the column called hed. It's also
> returning rows that have the full phrase "time out", but it's giving
> those rows the same rank as rows that only contain the word "time."
> In this case, that rank is 180.
>
> Is there anything else I should be doing in my query, or is there some
> configuration option I should have turned on?
>
> Thanks.
date: Fri, 18 Jan 2008 12:48:45 -0800 (PST)
author: Lepidopterist
Re: Quoted literal strings won't force a phrase match
A noise word is always a noise word. Noise words are applied to the
building of the index, so the full-text search has nothing to find.
Therefore, if you change the noise word list, you must rebuild the index
before you can search for the former noise word. (It is common to run with
either a single blank or a single nonsense word in the noise word file, so
as to get no noise words.)
Of course, you can do a string search for '%time out%' in addition to the
full-text query.
RLF
"Lepidopterist" wrote in message
news:19bd6a5a-c6b0-486b-a69a-45fc1d5b9e92@f47g2000hsd.googlegroups.com...
> Ok, I've made some progress on this problem. Apparently SQL Server is
> ignoring noise words in my phrase match.
>
> For example, I ran this query :
>
> SELECT *
> FROM Content_Items ci
> INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time capsule"') AS ft
> ON ci.contentItemId = ft.[KEY]
> ORDER BY ft.RANK DESC
>
> And it did exactly what it was supposed to do, since neither "time"
> nor "capsule" is a noise word.
>
> My impression was that noise words aren't stripped out of a full text
> search if the search phrase is a quoted literal. Thus, my search for
> "time out" should look for the full phrase "time out", and not just
> the word "time."
>
> Does anybody know why SQL Server is removing my noise word from the
> phrase match?
>
> On Jan 18, 12:49 pm, Afrobla...@gmail.com wrote:
>> Hello all,
>>
>> From what I've read, SQL Server is supposed to do a phrase match when
>> you do a full text search that contains quoted literal strings. So,
>> for example, if I did a full text search on the phrase "time out" and
>> I put it in quotes, it's supposed to search for the full phrase "time
>> out" and not just look for rows that contain the words "time" or
>> "out." However, this isn't working for me.
>>
>> Here is the query that I'm using :
>>
>> SELECT *
>> FROM Content_Items ci
>> INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time out"') AS ft
>> ON ci.contentItemId = ft.[KEY]
>> ORDER BY ft.RANK DESC
>>
>> What's it's doing is this : it's returning a bunch of rows that have
>> the words "time" or "out" in the column called hed. It's also
>> returning rows that have the full phrase "time out", but it's giving
>> those rows the same rank as rows that only contain the word "time."
>> In this case, that rank is 180.
>>
>> Is there anything else I should be doing in my query, or is there some
>> configuration option I should have turned on?
>>
>> Thanks.
>
date: Sun, 20 Jan 2008 13:42:56 -0500
author: Russell Fields
Re: Quoted literal strings won't force a phrase match
A couple of points, in SQL 2005 if your search phrase is wrapped in
double quotes it is a literal search for both FreeText and Contains.
In SQL 2000, wrapping your search phrase in double quotes for a
FreeText search has no effect - it is stemmed.
The second point it that if SQL FTS detects the presence of noise
words in your search phrase it throws them away but respects the
positional placement of them.
So a search a FreeText search on University of California (where of is
a noise word) would result in hits to University of California,
University at California and University in California ranked higher
than University California or California University.
On Jan 20, 1:42 pm, "Russell Fields" wrote:
> A noise word is always a noise word. Noise words are applied to the
> building of the index, so the full-text search has nothing to find.
> Therefore, if you change the noise word list, you must rebuild the index
> before you can search for the former noise word. (It is common to run with
> either a single blank or a single nonsense word in the noise word file, so> as to get no noise words.)
>
> Of course, you can do a string search for '%time out%' in addition to the
> full-text query.
>
> RLF
>
> "Lepidopterist" wrote in message
>
> news:19bd6a5a-c6b0-486b-a69a-45fc1d5b9e92@f47g2000hsd.googlegroups.com...
>
>
>
> > Ok, I've made some progress on this problem. Apparently SQL Server is> > ignoring noise words in my phrase match.
>
> > For example, I ran this query :
>
> > SELECT *
> > FROM Content_Items ci
> > INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time capsule"') AS ft
> > ON ci.contentItemId = ft.[KEY]
> > ORDER BY ft.RANK DESC
>
> > And it did exactly what it was supposed to do, since neither "time"
> > nor "capsule" is a noise word.
>
> > My impression was that noise words aren't stripped out of a full text
> > search if the search phrase is a quoted literal. Thus, my search for
> > "time out" should look for the full phrase "time out", and not just
> > the word "time."
>
> > Does anybody know why SQL Server is removing my noise word from the
> > phrase match?
>
> > On Jan 18, 12:49 pm, Afrobla...@gmail.com wrote:
> >> Hello all,
>
> >> From what I've read, SQL Server is supposed to do a phrase match when
> >> you do a full text search that contains quoted literal strings. So,
> >> for example, if I did a full text search on the phrase "time out" and
> >> I put it in quotes, it's supposed to search for the full phrase "time
> >> out" and not just look for rows that contain the words "time" or
> >> "out." However, this isn't working for me.
>
> >> Here is the query that I'm using :
>
> >> SELECT *
> >> FROM Content_Items ci
> >> INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time out"') AS ft
> >> ON ci.contentItemId = ft.[KEY]
> >> ORDER BY ft.RANK DESC
>
> >> What's it's doing is this : it's returning a bunch of rows that have
> >> the words "time" or "out" in the column called hed. It's also
> >> returning rows that have the full phrase "time out", but it's giving
> >> those rows the same rank as rows that only contain the word "time."
> >> In this case, that rank is 180.
>
> >> Is there anything else I should be doing in my query, or is there some
> >> configuration option I should have turned on?
>
> >> Thanks.- Hide quoted text -
>
> - Show quoted text -
date: Wed, 23 Jan 2008 09:02:55 -0800 (PST)
author: Hilary Cotter
Re: Quoted literal strings won't force a phrase match
Hey all,
Thanks for your responses, and sorry it took me so long to get back to
you.
In the end, I decided to opt for accuracy over performance, and have
emptied my noise word file. The results so far have been encouraging
- my index went from being 9MB to a very manageable 20MB, and response
times have remained unchanged.
It almost seems to good to be true. I feel like noise word
elimination doesn't really give you that much of a performance
advantage.
Let's imagine for the moment that I'm wholly unconcerned about disk
space usage. What level of search traffic would necessitate noise
word elimination? Does noise word elimination ever make that much of
a difference to response times?
Thanks again for the responses.
On Jan 23, 12:02 pm, Hilary Cotter wrote:
> A couple of points, in SQL 2005 if your search phrase is wrapped in
> double quotes it is a literal search for both FreeText and Contains.
> In SQL 2000, wrapping your search phrase in double quotes for a
> FreeText search has no effect - it is stemmed.
>
> The second point it that if SQL FTS detects the presence of noise
> words in your search phrase it throws them away but respects the
> positional placement of them.
>
> So a search a FreeText search on University of California (where of is
> a noise word) would result in hits to University of California,
> University at California and University in California ranked higher
> than University California or California University.
>
> On Jan 20, 1:42 pm, "Russell Fields" wrote:
>
> > A noise word is always a noise word. Noise words are applied to the
> > building of the index, so the full-text search has nothing to find.
> > Therefore, if you change the noise word list, you must rebuild the index
> > before you can search for the former noise word. (It is common to run with
> > either a single blank or a single nonsense word in the noise word file, so
> > as to get no noise words.)
>
> > Of course, you can do a string search for '%time out%' in addition to the
> > full-text query.
>
> > RLF
>
> > "Lepidopterist" wrote in message
>
> >news:19bd6a5a-c6b0-486b-a69a-45fc1d5b9e92@f47g2000hsd.googlegroups.com...
>
> > > Ok, I've made some progress on this problem. Apparently SQL Server is
> > > ignoring noise words in my phrase match.
>
> > > For example, I ran this query :
>
> > > SELECT *
> > > FROM Content_Items ci
> > > INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time capsule"') AS ft
> > > ON ci.contentItemId = ft.[KEY]
> > > ORDER BY ft.RANK DESC
>
> > > And it did exactly what it was supposed to do, since neither "time"
> > > nor "capsule" is a noise word.
>
> > > My impression was that noise words aren't stripped out of a full text
> > > search if the search phrase is a quoted literal. Thus, my search for
> > > "time out" should look for the full phrase "time out", and not just
> > > the word "time."
>
> > > Does anybody know why SQL Server is removing my noise word from the
> > > phrase match?
>
> > > On Jan 18, 12:49 pm, Afrobla...@gmail.com wrote:
> > >> Hello all,
>
> > >> From what I've read, SQL Server is supposed to do a phrase match when
> > >> you do a full text search that contains quoted literal strings. So,
> > >> for example, if I did a full text search on the phrase "time out" and
> > >> I put it in quotes, it's supposed to search for the full phrase "time
> > >> out" and not just look for rows that contain the words "time" or
> > >> "out." However, this isn't working for me.
>
> > >> Here is the query that I'm using :
>
> > >> SELECT *
> > >> FROM Content_Items ci
> > >> INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time out"') AS ft
> > >> ON ci.contentItemId = ft.[KEY]
> > >> ORDER BY ft.RANK DESC
>
> > >> What's it's doing is this : it's returning a bunch of rows that have
> > >> the words "time" or "out" in the column called hed. It's also
> > >> returning rows that have the full phrase "time out", but it's giving
> > >> those rows the same rank as rows that only contain the word "time."
> > >> In this case, that rank is 180.
>
> > >> Is there anything else I should be doing in my query, or is there some
> > >> configuration option I should have turned on?
>
> > >> Thanks.- Hide quoted text -
>
> > - Show quoted text -
date: Thu, 24 Jan 2008 14:56:40 -0800 (PST)
author: Lepidopterist
Re: Quoted literal strings won't force a phrase match
On one of my clients we ran with 14,000 noise word lists, response
times were much better. We knew in advance that the cleint's customers
would only be searching on limted keywords - like Gold Watch,
Bracelet, earrings, etc.
On Jan 24, 5:56 pm, Lepidopterist wrote:
> Hey all,
>
> Thanks for your responses, and sorry it took me so long to get back to
> you.
>
> In the end, I decided to opt for accuracy over performance, and have
> emptied my noise word file. The results so far have been encouraging
> - my index went from being 9MB to a very manageable 20MB, and response
> times have remained unchanged.
>
> It almost seems to good to be true. I feel like noise word
> elimination doesn't really give you that much of a performance
> advantage.
>
> Let's imagine for the moment that I'm wholly unconcerned about disk
> space usage. What level of search traffic would necessitate noise
> word elimination? Does noise word elimination ever make that much of
> a difference to response times?
>
> Thanks again for the responses.
>
> On Jan 23, 12:02 pm, Hilary Cotter wrote:
>
>
>
> > A couple of points, in SQL 2005 if your search phrase is wrapped in
> > double quotes it is a literal search for both FreeText and Contains.
> > In SQL 2000, wrapping your search phrase in double quotes for a
> > FreeText search has no effect - it is stemmed.
>
> > The second point it that if SQL FTS detects the presence of noise
> > words in your search phrase it throws them away but respects the
> > positional placement of them.
>
> > So a search a FreeText search on University of California (where of is
> > a noise word) would result in hits to University of California,
> > University at California and University in California ranked higher
> > than University California or California University.
>
> > On Jan 20, 1:42 pm, "Russell Fields" wrote:
>
> > > A noise word is always a noise word. Noise words are applied to the> > > building of the index, so the full-text search has nothing to find.
> > > Therefore, if you change the noise word list, you must rebuild the index
> > > before you can search for the former noise word. (It is common to run with
> > > either a single blank or a single nonsense word in the noise word file, so
> > > as to get no noise words.)
>
> > > Of course, you can do a string search for '%time out%' in addition to the
> > > full-text query.
>
> > > RLF
>
> > > "Lepidopterist" wrote in message
>
> > >news:19bd6a5a-c6b0-486b-a69a-45fc1d5b9e92@f47g2000hsd.googlegroups.com.> > > > Ok, I've made some progress on this problem. Apparently SQL Server is
> > > > ignoring noise words in my phrase match.
>
> > > > For example, I ran this query :
>
> > > > SELECT *
> > > > FROM Content_Items ci
> > > > INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time capsule"') AS ft> > > > ON ci.contentItemId = ft.[KEY]
> > > > ORDER BY ft.RANK DESC
>
> > > > And it did exactly what it was supposed to do, since neither "time"
> > > > nor "capsule" is a noise word.
>
> > > > My impression was that noise words aren't stripped out of a full text
> > > > search if the search phrase is a quoted literal. Thus, my search for
> > > > "time out" should look for the full phrase "time out", and not just
> > > > the word "time."
>
> > > > Does anybody know why SQL Server is removing my noise word from the
> > > > phrase match?
>
> > > > On Jan 18, 12:49 pm, Afrobla...@gmail.com wrote:
> > > >> Hello all,
>
> > > >> From what I've read, SQL Server is supposed to do a phrase match when
> > > >> you do a full text search that contains quoted literal strings. So,
> > > >> for example, if I did a full text search on the phrase "time out" and
> > > >> I put it in quotes, it's supposed to search for the full phrase "time
> > > >> out" and not just look for rows that contain the words "time" or
> > > >> "out." However, this isn't working for me.
>
> > > >> Here is the query that I'm using :
>
> > > >> SELECT *
> > > >> FROM Content_Items ci
> > > >> INNER JOIN FREETEXTTABLE(Content_Items, hed, '"time out"') AS ft
> > > >> ON ci.contentItemId = ft.[KEY]
> > > >> ORDER BY ft.RANK DESC
>
> > > >> What's it's doing is this : it's returning a bunch of rows that have
> > > >> the words "time" or "out" in the column called hed. It's also
> > > >> returning rows that have the full phrase "time out", but it's giving
> > > >> those rows the same rank as rows that only contain the word "time."> > > >> In this case, that rank is 180.
>
> > > >> Is there anything else I should be doing in my query, or is there some
> > > >> configuration option I should have turned on?
>
> > > >> Thanks.- Hide quoted text -
>
> > > - Show quoted text -- Hide quoted text -
>
> - Show quoted text -
date: Fri, 25 Jan 2008 09:52:49 -0800 (PST)
author: Hilary Cotter
|
|