# Copyright (c) 2003, WiseGuys Internet B.V.
#
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# - Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# - Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# - Neither the name of the WiseGuys Internet B.V. nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#

# A sample config file for the language models
# provided with Gertjan van Noord's language guesser
# (http://odur.let.rug.nl/~vannoord/TextCat/)
#
# Notes:
# - You may consider eliminating a couple of small languages from this
#   list because they cause false positives with big languages and are
#   bad for performance. (Do you really want to recognize Drents?)
# - Putting the most probable languages at the top of the list
#   improves performance, because this will raise the threshold for
#   likely candidates more quickly.
#
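# A sketch of what a trimmed, reordered list could look like. It assumes,
# purely for illustration, that most input is English, German, French or
# Dutch; the choice and order of languages here is hypothetical and should
# be adapted to your own documents:
#
#   english.lm                           en--utf8
#   german.lm                            de--utf8
#   french.lm                            fr--utf8
#   dutch.lm                             nl--utf8
#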

# This file has been modified (for OOo, by Jocelyn MERAND,
# joc.merATgmail.com) to include country and encoding.
# Guess strings are formed as follows: language-country-encoding
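#
# Reading two of the entries below as examples:
#   zh-CN-utf8   language "zh", country "CN", encoding "utf8"
#   af--utf8     language "af", country left empty, encoding "utf8"
# An empty country field is why most entries contain a double hyphen.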

afrikaans.lm                         af--utf8
albanian.lm                          sq--utf8
amharic_utf.lm                       am--utf8
arabic.lm                            ar--utf8
basque.lm                            eu--utf8
belarus.lm                           be--utf8
bosnian.lm                           bs--utf8
breton.lm                            br--utf8
catalan.lm                           ca--utf8
chinese_simplified.lm                zh-CN-utf8
chinese_traditional.lm               zh-TW-utf8
croatian.lm                          hr--utf8
czech.lm                             cs--utf8
danish.lm                            da--utf8
dutch.lm                             nl--utf8
english.lm                           en--utf8
esperanto.lm                         eo--utf8
estonian.lm                          et--utf8
finnish.lm                           fi--utf8
french.lm                            fr--utf8
frisian.lm                           fy--utf8
georgian.lm                          ka--utf8
german.lm                            de--utf8
greek.lm                             el--utf8
hebrew.lm                            he--utf8
hindi.lm                             hi--utf8
hungarian.lm                         hu--utf8
icelandic.lm                         is--utf8
indonesian.lm                        id--utf8
irish_gaelic.lm                      ga--utf8
italian.lm                           it--utf8
japanese.lm                          ja--utf8
korean.lm                            ko--utf8
latin.lm                             la--utf8
latvian.lm                           lv--utf8
lithuanian.lm                        lt--utf8
luxembourgish.lm                     lb--utf8
malay.lm                             ms--utf8
manx_gaelic.lm                       gv--utf8
marathi.lm                           mr--utf8
mongolian_cyrillic.lm                mn--utf8
nepali.lm                            ne--utf8
norwegian.lm                         nb--utf8       # Norwegian (Bokmal)
persian.lm                           fa--utf8       # Farsi
polish.lm                            pl--utf8
portuguese.lm                        pt-PT-utf8
quechua.lm                           qu--utf8
romanian.lm                          ro--utf8
romansh.lm                           rm--utf8
russian.lm                           ru--utf8
sanskrit.lm                          sa--utf8
scots.lm                             sco--utf8
scots_gaelic.lm                      gd--utf8
serbian.lm                           sr--utf-8
serbian-latin.lm                     sh--utf-8
slovak_ascii.lm                      sk-SK-utf8
slovenian.lm                         sl--utf8
spanish.lm                           es--utf8
swahili.lm                           sw--utf8
swedish.lm                           sv--utf8
tagalog.lm                           tl--utf8
tamil.lm                             ta--utf8
thai.lm                              th--utf8
turkish.lm                           tr--utf8
ukrainian.lm                         uk--utf8
vietnamese.lm                        vi--utf8
welsh.lm                             cy--utf8
yiddish_utf.lm                       yi--utf8
zulu.lm                              zu--utf8