Issue 31230 - THAI spell checking
Summary: THAI spell checking
Status: CLOSED FIXED
Alias: None
Product: Infrastructure
Classification: Infrastructure
Component: Website general issues
Version: current
Hardware: All All
Importance: P3 Trivial
Target Milestone: ---
Assignee: Martin Hollmichel
QA Contact: issues@lingucomponent
URL:
Keywords:
Depends on:
Blocks: 41707
Reported: 2004-07-08 12:17 UTC by hin
Modified: 2013-02-24 20:34 UTC
CC List: 8 users

See Also:
Issue Type: FEATURE
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
THAI dictionary and patch file (139.99 KB, application/x-gzip)
2004-07-08 12:19 UTC, hin
new thai dictionary for spell check (147.19 KB, application/x-gzip)
2005-05-30 07:12 UTC, hin
new thai dictionary for spell check, add country/city name and more. (152.45 KB, application/x-compressed)
2005-09-16 03:16 UTC, hin
Edit license for THAI dictionary (152.53 KB, application/x-compressed)
2005-09-20 03:19 UTC, hin
Configure for THAI dictionary. (700 bytes, patch)
2005-09-22 08:14 UTC, hin

Description hin 2004-07-08 12:17:56 UTC
Dear All,

I've implemented a THAI dictionary for THAI spell checking in OpenOffice.org. I
will attach the THAI dictionary and a patch file so it can be used in OpenOffice.org.

Thank you.
Comment 1 hin 2004-07-08 12:19:41 UTC
Created attachment 16310
THAI dictionary and patch file
Comment 2 jjc 2005-02-13 08:28:54 UTC
James->hin
Apart from adding Thai dictionary support, the patch includes a fix to
sw/source/core/txtnode/txtedt.cxx.  I'm not sure exactly what it's doing.  It
looks like it tries to fix an issue with mixed language spell checking; this
ought to be a separate issue.
Comment 3 frank.meies 2005-04-12 07:08:56 UTC
The patch of sw/source/core/txtnode/txtedt.cxx should not be necessary for OOo
2.0 anymore. The patched function SwScanner::NextWord( LanguageType ) has been
replaced with SwScanner::NextWord() which already works correctly for all script
types and languages.
Comment 4 Martin Hollmichel 2005-05-11 16:03:51 UTC
set target to 2.0
Comment 5 hin 2005-05-30 07:09:52 UTC
Hi,

I have now appended a new Thai word list to the th_TH.dic file. I am attaching
the updated file.

Thank you.
Comment 6 hin 2005-05-30 07:12:33 UTC
Created attachment 26670
new thai dictionary for spell check
Comment 7 pavel 2005-07-28 10:10:14 UTC
Set target to OOo 2.0.1 (I have to make more changes in the area of spellchecking
to make it easier to integrate them).

README says:

name   :  th_TH 0.2 version of the thai dictionary
date   :  2005.05.30
License:  LGPL
Copyright 2005 by NECTEC, Thailand

mh: the license is LGPL. OK to integrate? If so, reassign back to me. If any
paperwork is to be done, hin will surely do that.
Comment 8 Martin Hollmichel 2005-08-16 10:51:09 UTC
set target to 2.0
Comment 9 Stephan Bergmann 2005-08-17 10:24:12 UTC
TL made me aware that this issue contains a patch for sal/textenc/tencinfo.c
that collides with the fix of issue 43666.  Please see issue 43666 for a
description of what TIS620-related input to rtl_getTextEncodingFromUnixCharset
is mapped to what output.  If the current behaviour (i.e., including the fix for
issue 43666) is not acceptable, please give a list of what additional inputs
should map to what outputs.
Comment 10 Stephan Bergmann 2005-08-17 10:24:45 UTC
.
Comment 11 Stephan Bergmann 2005-08-17 10:55:28 UTC
OK, to make the spellchecker happy TL and I just added the two mappings (case
insensitive)

  "TIS620-2529" -> RTL_TEXTENCODING_TIS_620
  "TIS620-2533" -> RTL_TEXTENCODING_TIS_620

to rtl_getTextEncodingFromUnixCharset (as suggested by the patch).
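
To illustrate the two mappings described above, here is a minimal Python sketch of
a case-insensitive, exact-match lookup. It is not the code in
sal/textenc/tencinfo.c: the table layout and the helper name
text_encoding_from_unix_charset are hypothetical, and the string constants only
stand in for the RTL_TEXTENCODING_* values.

```python
# Hypothetical stand-ins for the rtl text encoding constants named in this issue.
RTL_TEXTENCODING_UNKNOWN = "RTL_TEXTENCODING_UNKNOWN"
RTL_TEXTENCODING_TIS_620 = "RTL_TEXTENCODING_TIS_620"

# Only the two charset names mentioned in this comment are listed.
# Keys are stored lower-case so the lookup can be case insensitive.
_UNIX_CHARSET_TABLE = {
    "tis620-2529": RTL_TEXTENCODING_TIS_620,
    "tis620-2533": RTL_TEXTENCODING_TIS_620,
}


def text_encoding_from_unix_charset(name: str) -> str:
    """Return the encoding for an exact, case-insensitive charset name match."""
    return _UNIX_CHARSET_TABLE.get(name.lower(), RTL_TEXTENCODING_UNKNOWN)
```

With this sketch, "TIS620-2533" and "tis620-2533" both resolve to the TIS-620
stand-in, while any name that is not in the table resolves to the unknown value.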
Comment 12 thomas.lange 2005-08-17 14:55:49 UTC
.
Comment 13 thomas.lange 2005-08-17 14:57:11 UTC
Fixed in CWS thaidict.

Files changed:
- sal/textenc/tencinfo.c  new revision: 1.26.18.1
- sal/qa/rtl/textenc/rtl_tencinfo.cxx  new revision: 1.3.18.1

- dictionaries/prj/build.lst  new revision: 1.2.104.1
- dictionaries/th_TH/makefile.mk	new file
- dictionaries/th_TH/dictionary.lst	new file
- dictionaries/th_TH/README_th_TH.txt	new file
- dictionaries/th_TH/th_TH.aff		new file
- dictionaries/th_TH/th_TH.dic		new file -kb
Comment 14 thomas.lange 2005-08-18 13:15:14 UTC
Please be aware of issue 53168: if by chance a text font is set that is not
available on the system, the glyph fallback may in some circumstances replace it
with a symbol font. If that happens, spellchecking will not function at all,
since symbol fonts are excluded from spellchecking.
So be sure to have a proper font set!
Comment 15 thomas.lange 2005-08-18 13:33:49 UTC
Note: There are only OOo installation sets.
Comment 16 thomas.lange 2005-08-22 10:04:25 UTC
TL->MH: Since you said you'll probably have to arrange for external testing I'm
giving this one to you.

re-open issue and reassign to mh@openoffice.org
Comment 17 thomas.lange 2005-08-22 10:04:46 UTC
reassign to mh@openoffice.org
Comment 18 thomas.lange 2005-08-22 10:04:54 UTC
reset resolution to FIXED
Comment 19 stefan.baltzer 2005-08-23 11:01:40 UTC
SBA: Verified in CWS thaidict. Spellcheck works in general, but I can not tell
about the correctness of the proposals :-)
Set to verified.
Comment 20 hin 2005-09-16 03:13:29 UTC
Hi all,

  I have added new words to the dictionary file, such as Thai spellings of
country/city names and some basic Thai words. Please update to the new Thai
dictionary file.

Thank you.
Comment 21 hin 2005-09-16 03:13:54 UTC
Hi all,

  I have added new words to the dictionary file, such as Thai spellings of
country/city names and some basic Thai words. Please update to the new Thai
dictionary file.

Thank you.
Comment 22 hin 2005-09-16 03:14:30 UTC
Hi all,

  I have added new words to the dictionary file, such as Thai names for
countries/cities and some basic Thai words. Please update to the new Thai
dictionary file.

Thank you.
Comment 23 hin 2005-09-16 03:16:10 UTC
Created attachment 29590
new thai dictionary for spell check, add country/city name and more.
Comment 24 pavel 2005-09-16 09:09:53 UTC
seen THAI spell checker in m130.

For updates, please file new issue.
We can't update the file every week though...
Please concentrate on QA of it and target your new developments to 2.0.1.
Comment 25 markpeak 2005-09-16 12:49:10 UTC
pjanik,

I'm the one who tested m126 with Thai spellcheck. That dictionary lacks many
common Thai words and is actually unusable for Thai people. Any Thai users who
use 2.0 (with the old, current dictionary) will have a bad experience with it and
blame OOo 2.0 for poor-quality Thai spellcheck.

I discussed this issue with hin and he has already fixed the dictionary (see the
comment above). I think the new dictionary should be included in 2.0, not 2.0.1,
and replacing it won't cost us much in terms of QA.
Comment 26 pavel 2005-09-16 16:40:50 UTC
OK, so file *new issue* to me and I'll check it into 2.0.
Comment 27 Martin Hollmichel 2005-09-19 15:51:22 UTC
began commit to prepforooo20final.

The new README states that the LICENSE is now GPL, it was until now LGPL.

@hin: please clarify license of new dictionary.
Comment 28 hin 2005-09-20 03:14:29 UTC
I'm very sorry. The THAI dictionary license is LGPL. I have corrected the
license and am sending the file to you again.

Thank you.
Comment 29 hin 2005-09-20 03:19:47 UTC
Created attachment 29695
Edit license for THAI dictionary
Comment 30 Martin Hollmichel 2005-09-20 10:33:33 UTC
Committed updated README with LGPL license.
Comment 31 hin 2005-09-22 08:12:27 UTC
@mh: I cannot set the build option for the THAI dictionary in the configure step
(--with-dict="THAI"). Please add the THAI locale to the OOo build configuration.
Comment 32 pavel 2005-09-22 08:13:56 UTC
This is fixed in dicconfigure.

It should be THTH, BTW ;-)
Comment 33 hin 2005-09-22 08:14:14 UTC
Created attachment 29799
Configure for THAI dictionary.
Comment 34 hin 2005-09-22 08:36:59 UTC
@pjanik: I cannot use 'THTH' to build the THAI dictionary. I see the 'DIC_THAI'
variable in 'dictionaries/th_TH/makefile.mk'. Please check again.
Comment 35 pavel 2005-09-22 08:38:26 UTC
hin: please check the cws I mentioned again.
Comment 36 Stephan Bergmann 2006-03-14 14:04:49 UTC
I wrote earlier in this issue:

<quote>
OK, to make the spellchecker happy TL and I just added the two mappings (case
insensitive)
  "TIS620-2529" -> RTL_TEXTENCODING_TIS_620
  "TIS620-2533" -> RTL_TEXTENCODING_TIS_620
to rtl_getTextEncodingFromUnixCharset (as suggested by the patch).
</quote>

Now, I found out that what we actually committed then was a slightly larger
change: for example, rtl_getTextEncodingFromUnixCharset("TIS620-1") or
rtl_getTextEncodingFromUnixCharset("TIS620-nonsense") now also return
RTL_TEXTENCODING_TIS_620 instead of RTL_TEXTENCODING_UNKNOWN, which breaks the
unit tests at sal/qa/rtl/textenc, see issue 61507.  I assume that this was a
mistake, but do not want to break anything when fixing issue 61507.  Who would
be a good QA person to verify that Thai spell checking still works the same way
as before once issue 61507 is fixed (and "TIS620-1", "TIS620-nonsense" etc. map
to RTL_TEXTENCODING_UNKNOWN again)?
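
The difference between the intended and the committed behaviour can be pictured
with a small Python sketch. How the committed C code came to over-match is not
stated in this issue; the prefix comparison below is only an assumed illustration
of that kind of mistake, and all names in the sketch are hypothetical.

```python
# Hypothetical stand-ins for the rtl constants discussed in this comment.
UNKNOWN = "RTL_TEXTENCODING_UNKNOWN"
TIS_620 = "RTL_TEXTENCODING_TIS_620"

# The two names that were meant to be added, stored lower-case.
EXACT_NAMES = ("tis620-2529", "tis620-2533")


def lookup_too_broad(name: str) -> str:
    # Assumed failure mode: anything starting with "tis620-" is accepted.
    return TIS_620 if name.lower().startswith("tis620-") else UNKNOWN


def lookup_exact(name: str) -> str:
    # Intended behaviour per the quoted comment: only the two listed names match.
    return TIS_620 if name.lower() in EXACT_NAMES else UNKNOWN


def mismatches(names):
    """Return the names on which the two lookups disagree."""
    return [n for n in names if lookup_too_broad(n) != lookup_exact(n)]
```

Calling mismatches(["TIS620-2529", "TIS620-1", "TIS620-nonsense"]) returns
["TIS620-1", "TIS620-nonsense"], which are the inputs a QA check of the issue
61507 fix would want to cover alongside the two names the Thai dictionary needs.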
Comment 37 Martin Hollmichel 2006-03-29 07:12:29 UTC
close issue.