date: Mon, 6 Oct 2008 14:02:13 -0700 (PDT)
group: microsoft.public.word.vba.general
Re: determine document language
Hi Macropod,
> Check out the DetectLanguage Method in Word's VBA help file. There's even a working example there of how the
> method can be used.
>
The one problem with this (vs. using the LanguageID property of a Range or Style definition) is that it requires
that the language be installed in the Windows Control Panel/Regional Settings AND recognized for Office. This works
fine if you can be sure there will be a limited number of languages, but it will cause problems if Word can't find
the language on the system.
Plus, don't forget that this actually works on a Range, and the document may contain multiple languages if what the
user types isn't in the dictionary or is misspelled, or if the user pastes something in (from the Internet,
for example).
Personally, I think this method should *not* be used to determine in which language a document was written. But
there are developers who use it in this manner.
Cindy Meister
INTER-Solutions, Switzerland
http://homepage.swissonline.ch/cindymeister (last update Jun 17 2005)
http://www.word.mvps.org
This reply is posted in the Newsgroup; please post any follow-up question or reply in the newsgroup and not by e-mail :-)
date: Tue, 07 Oct 2008 18:29:30 +0200
author: Cindy M.
Re: determine document language
Cindy,
I appreciate your comments.
How would you solve such a problem?
Marco
date: Tue, 7 Oct 2008 10:39:43 -0700 (PDT)
author: Co
Re: determine document language
On 8 Oct, 12:18, "Klaus Linke" wrote:
> > Cindy,
>
> > appreciate your comments.
> > How would you solve such a problem?
>
> Not Cindy, but start with ActiveDocument.Content.LanguageID?
> If that's wdUndefined (mixed languages), look further to see what language
> is applied to most of the text.
>
> In my experience, the LanguageID tends to be mostly applied properly.
> If you're sure you have docs in which it isn't, you could use the method you
> proposed originally... Maybe the stopword list I mentioned would come in
> handy.
>
> Regards,
> Klaus
Klaus,
Is there a way to retrieve this Word stopword list for, say, English,
French, German, Dutch and Italian?
Marco
date: Wed, 8 Oct 2008 04:27:18 -0700 (PDT)
author: Co
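Klaus's suggestion quoted above (check the LanguageID of the whole content first, and only if it is wdUndefined look at which language covers most of the text) can be sketched roughly as follows. This is a minimal sketch in Python rather than VBA, and it assumes the per-paragraph values have already been read out of Word, for example from each paragraph's Range.LanguageID and its character count. The function name and the sample data are hypothetical. 9999999 is the value of Word's wdUndefined constant; the other numbers are ordinary Windows language IDs.

```python
from collections import Counter

WD_UNDEFINED = 9999999  # value of Word's wdUndefined constant


def dominant_language(content_language_id, paragraphs):
    """Return the language ID applied to most of the text.

    content_language_id: LanguageID reported for the whole document range.
    paragraphs: iterable of (language_id, character_count) pairs, assumed
                to have been collected from Word beforehand.
    """
    # A single, defined ID means one language is applied throughout.
    if content_language_id != WD_UNDEFINED:
        return content_language_id

    # Mixed languages: weight each ID by the amount of text it covers.
    totals = Counter()
    for language_id, char_count in paragraphs:
        if language_id != WD_UNDEFINED:  # skip paragraphs that are mixed themselves
            totals[language_id] += char_count

    if not totals:
        return WD_UNDEFINED
    return totals.most_common(1)[0][0]


# Hypothetical sample: 1033 = English (US), 1031 = German, 1043 = Dutch
sample = [(1043, 420), (1033, 35), (1043, 610), (1031, 80)]
print(dominant_language(WD_UNDEFINED, sample))
```

Weighting by character count rather than by paragraph count keeps a handful of short pasted snippets from outvoting the body text. Note that this only reports which language is applied, not which language the text is actually written in.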
Re: determine document language
My memory probably deceived me: The stopword list in the ORK is a list of
words that aren't indexed:
http://www.microsoft.com/downloads/details.aspx?FamilyID=74B29874-4F1E-4909-8CB3-6473CEC6EE0C&displaylang=en
Still, I'm pretty sure I saw a list of the words used by language
autodetection...
If I find it, I'll post it.
Klaus
date: Wed, 8 Oct 2008 14:56:42 +0200
author: Klaus Linke
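For documents where the applied LanguageID cannot be trusted, the stopword idea mentioned in this thread could look something like the sketch below. It is written in Python for brevity and would need porting to VBA to run inside Word. The word sets are tiny, hand-picked illustrations and are not the list Word's language autodetection uses (that list has not been located in this thread); the function name is hypothetical as well.

```python
import re
from collections import Counter

# Tiny illustrative stopword sets; these are NOT Word's internal lists.
STOPWORDS = {
    "English": {"the", "and", "of", "is", "to", "that", "with"},
    "French": {"le", "la", "les", "et", "est", "que", "dans"},
    "German": {"der", "die", "das", "und", "ist", "nicht", "mit"},
    "Dutch": {"de", "het", "een", "en", "is", "niet", "van"},
    "Italian": {"il", "lo", "gli", "e", "che", "non", "di"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often, or None."""
    # Split into lowercase runs of letters (digits and punctuation are dropped).
    words = re.findall(r"[^\W\d_]+", text.lower())

    scores = Counter()
    for word in words:
        for language, stopwords in STOPWORDS.items():
            if word in stopwords:
                scores[language] += 1

    if not scores:
        return None
    return scores.most_common(1)[0][0]


print(guess_language("Het huis van de buurman is niet groot en het heeft een tuin."))
print(guess_language("Der Hund ist nicht mit der Katze im Garten und das ist gut."))
```

With lists this short the guess is only meaningful for a few sentences or more of running text, and closely related languages share words ("is" appears in both the English and Dutch sets here), so real lists would need to be considerably longer.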
Re: determine document language
On 8 Oct, 13:41, Cindy M. wrote:
> Hi Klaus,
>
> > but start with ActiveDocument.Content.LanguageID?
> > If that's wdUndefined (mixed languages), look further to see what language
> > is applied to most of the text.
>
> Agreed.
>
> If we're talking 2003 or 2007, I might then pick up XML property and parse
> through that, rather than "walk" the object model.
>
> Cindy Meister
> INTER-Solutions, Switzerland
> http://homepage.swissonline.ch/cindymeister (last update Jun 17 2005)
> http://www.word.mvps.org
>
> This reply is posted in the Newsgroup; please post any follow-up question or
> reply in the newsgroup and not by e-mail :-)
Cindy,
What exactly do you mean by this:
"I might then pick up XML property and parse
through that, rather than "walk" the object model"
Could you give me an example?
Marco
date: Wed, 8 Oct 2008 09:10:05 -0700 (PDT)
author: Co
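One possible reading of the idea Marco asks about, offered as a sketch and not as Cindy's own code: instead of looping over paragraphs or runs through the object model, take the document's XML as one string (presumably via the Range.XML property in Word 2003, or Range.WordOpenXML in Word 2007) and scan it for the w:lang elements inside the run properties. The Python below assumes that string has already been obtained; the sample fragment and the function names are made up for illustration. It matches on local element names only, so it does not depend on which WordprocessingML namespace the XML uses.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(qualified):
    """Strip the '{namespace}' prefix ElementTree puts on tags and attributes."""
    return qualified.rsplit("}", 1)[-1]


def language_totals(xml_text):
    """Count characters of run text per w:lang value found in the XML."""
    root = ET.fromstring(xml_text)
    totals = Counter()
    for run in root.iter():
        if local_name(run.tag) != "r":  # only look at text runs
            continue
        language = "(inherited)"  # run has no explicit w:lang of its own
        text_length = 0
        for node in run.iter():
            name = local_name(node.tag)
            if name == "lang":
                for attr, value in node.attrib.items():
                    if local_name(attr) == "val":
                        language = value
            elif name == "t":
                text_length += len(node.text or "")
        totals[language] += text_length
    return totals


# Hypothetical fragment in the Word 2007 namespace.
SAMPLE = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:r><w:rPr><w:lang w:val="nl-NL"/></w:rPr><w:t>Dit is een Nederlandse zin.</w:t></w:r>
      <w:r><w:rPr><w:lang w:val="en-US"/></w:rPr><w:t>Hello</w:t></w:r>
      <w:r><w:t>!</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""

totals = language_totals(SAMPLE)
print(totals.most_common())
```

Runs that carry no w:lang of their own inherit the language from their style or the document defaults, so a large "(inherited)" total means the styles would have to be checked as well. In VBA the same scan could be done by loading the string into an MSXML DOM document.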