|
|
|
date: Mon, 1 Sep 2008 14:57:01 -0700,
group: microsoft.public.access.queries
back
Re: Check for a non exact match - eg serial number with 1 digit differ
On Mon, 1 Sep 2008 14:57:01 -0700, Andrew P.
wrote:
>Hi All
>
>I need to do a search for serial numbers - I need to find exact match (easy)
>but also check for miss types - ie find any numbers that have one or two
>digits different to something thats already in the table. The serial number
>is 14 digits long and is always letters and numbers.
>So if I do a check on:
>BZ1A0831002415, then I would want to know if there was already
>BZ1A0881002415 or anything similar existing in the table.
>
>Not sure the best way to go about this.
>
>Thanks a lot
>Regards
>Andrew
There's no *easy* way. The best way is to prevent them from being entered in
the first place, e.g. by providing the user with a combo box or other tool to
let them *select* a value known to be correct, rather than forcing them to
type a value. I admit that this isn't always practical - might it be in your
context?
You should also be aware that reversed digits (i.e. BZ1A0881004215) is a very
frequent error type, as is "perseverance" - e.g. seeing 344 and typing 334.
One way to find single-character mismatches is to use fourteen criteria in an
OR:
LIKE "?" & Mid([SN], 2) OR LIKE (Left([SN], 1) & "?" & Mid([SN], 3) OR LIKE
(Left([SN], 3) & "?" & Mid([SN], 4) <et cetera et cetera>
but it's going to be really slow and inefficient.
--
John W. Vinson [MVP]
date: Mon, 01 Sep 2008 16:14:58 -0600
author: John W. Vinson
RE: Check for a non exact match - eg serial number with 1 digit differ
I think the easiest way to approach this is to write a function that will
return the "differences" between the two strings. Then you can write a query
that will identify those records that are similar, based on the number of
characters that are different, keeping in mind, that if just two characters
are different, you potentially have 100 "similar" values.
I'm not sure how you would implement this, since someone could type in a
"valid" serial number (one that is actually found in the database), but which
is not the one they are looking for. So you might start out looking for the
record that is an exact match, and if that one is not found, or is not the
serial number you are looking for, then use a query to identify the "similar"
serial numbers and allow the user to select form that list.
The function might look like:
Public Function CompareStrings(Value1 As Variant, Value2 As Variant) As
Variant
Dim intLoop As Integer
Dim strVal1 As String, strVal2 As String
If IsNull(Value1) Or IsNull(Value2) Then
CompareStrings = Null
Exit Function
End If
CompareStrings = 0
'Set the longer of the two strings to strVal1
If Len(Value1) >= Len(Value2) Then
strVal1 = Value1
strVal2 = Value2
Else
strVal1 = Value2
strVal2 = Value1
End If
For intLoop = 1 To Len(strVal1)
If Len(strVal2) < intLoop Then
CompareStrings = CompareStrings + 1
ElseIf Mid(strVal2, intLoop, 1) <> Mid(strVal1, intLoop, 1) Then
CompareStrings = CompareStrings + 1
End If
Next
End Function
and you might use it like:
SELECT SerialNum, ItemDesc
FROM yourTable
WHERE CompareStrings([SerialNum], "BZ1A0831002415") Between 1 and 2
--
HTH
Dale
Don''t forget to rate the post if it was helpful!
email address is invalid
Please reply to newsgroup only.
"Andrew P." wrote:
> Hi All
>
> I need to do a search for serial numbers - I need to find exact match (easy)
> but also check for miss types - ie find any numbers that have one or two
> digits different to something thats already in the table. The serial number
> is 14 digits long and is always letters and numbers.
> So if I do a check on:
> BZ1A0831002415, then I would want to know if there was already
> BZ1A0881002415 or anything similar existing in the table.
>
> Not sure the best way to go about this.
>
> Thanks a lot
> Regards
> Andrew
date: Tue, 2 Sep 2008 05:30:00 -0700
author: Dale Fye
Re: Check for a non exact match - eg serial number with 1 digit di
You might be able to use Levenshtein Distance to find all values that
differed by 1.
Paste the following into a module and save it and then call it as
needed. With large sets of data it may be too slow for practical use.
For instance to return all serial numbers that are within one character
of matching, you could use a query like the following.
SELECT SerialNumber
FROM SomeTable
WHERE LD([Forms]![FormEntry]![txtSerial],SerialNumber) <= 1
'Fuzzy Matching - Levenshtein Distance
'See http://www.merriampark.com/ld.htm#VB
'*******************************
'*** Get minimum of three values
'*******************************
Private Function Minimum(ByVal a As Integer, _
ByVal b As Integer, _
ByVal c As Integer) As Integer
Dim mi As Integer
mi = a
If b < mi Then
mi = b
End If
If c < mi Then
mi = c
End If
Minimum = mi
End Function
'********************************
'*** Compute Levenshtein Distance
'********************************
Public Function LD(ByVal s As String, ByVal t As String) As Integer
Dim d() As Integer ' matrix
Dim m As Integer ' length of t
Dim n As Integer ' length of s
Dim i As Integer ' iterates through s
Dim j As Integer ' iterates through t
Dim s_i As String ' ith character of s
Dim t_j As String ' jth character of t
Dim cost As Integer ' cost
' Step 1
n = Len(s)
m = Len(t)
If n = 0 Then
LD = m
Exit Function
End If
If m = 0 Then
LD = n
Exit Function
End If
ReDim d(0 To n, 0 To m) As Integer
' Step 2
For i = 0 To n
d(i, 0) = i
Next i
For j = 0 To m
d(0, j) = j
Next j
' Step 3
For i = 1 To n
s_i = Mid$(s, i, 1)
' Step 4
For j = 1 To m
t_j = Mid$(t, j, 1)
' Step 5
If s_i = t_j Then
cost = 0
Else
cost = 1
End If
' Step 6
d(i, j) = Minimum(d(i - 1, j) + 1, d(i, j - 1) + 1, d(i - 1, j -
1) + cost)
Next j
Next i
' Step 7
LD = d(n, m)
Erase d
End Function
date: Wed, 03 Sep 2008 09:27:06 -0400
author: John Spencer
|
|