Apache OpenOffice (AOO) Bugzilla – Issue 42671
Turn input from Thai user community into proper issues
Last modified: 2012-01-30 19:44:59 UTC
There is a significant Thai OOo user community, mostly using hacked versions of OOo 1.x, specifically OfficeTLE <http://www.opentle.org/officetle/> and Pladao <http://www.pladao.org>. Most Thai developers and users are not comfortable working in English, and therefore input from this community has not been well reported in OOo IssueTracker. This issue is the staging point for turning this input into good quality issues. We're focusing on the 2.0 codeline. Things that are fixed in 2.0 don't need to be turned into proper issues. Here's a numbered list of the items I've collected so far. Many of these probably won't make sense except to those who are already familiar with the issues. In comments please note the item number you are talking to. Feel free to add comments in Thai.
1. Numbers inappropriately formatted using Thai digits Issue #41672
2. Thai line-breaking non-functional Issue #41671
3. Display of invalid Thai combining character sequences broken on Windows Issue #42171
4. Sequence input checking not working Issue #42469
5. Thai justification inaccuracy Issue #41860
6. Need input sequence correction feature Issue #42661
7. Interop of Thai text with .doc broken for Word >= 2000; maybe Issue #23784
8. Need number spellout feature Issue #3702, Issue #17554
9. Interop with Thai MS Excel 97: BAHTTEXT(), "t" format prefix
10. Font size for default fonts needs to be configurable for each font
11. Display of invalid combining sequences broken with some Unicode fonts (e.g Tahoma) Issue #42663
12. Missing dotted circle breaks display of invalid combining sequences Issue #42662
13. Thai locale data incomplete
14. Windows search cannot search inside .sxw (fixed in 2.0) Issue #21108
15. Optional bell/flash when invalid input sequence entered
16. psprint cache makes printing become gradually slower and slower Issue #20287?
17. On OOo Web site, link to Thai OpenOffice goes to Pladao
18. Mail merge should be able to generate one file containing all recipients instead of one for each (maybe fixed in OOo 2.0) Issue #20057
19. Inaccurate cursor positioning for Thai (fixed in OOo 2)
20. Make Pladao fonts conveniently available for OOo
21. In non-English locale, numbering menu has bogus entries (get Sila to explain how to reproduce)
22. ICU Thai word-breaking does not work well in real-life
23. Integrate Thai spell-checking Issue #31230
24. Problem with mixed-language spell-checking that shows up when Thai spell-checking integrated (see patch in Issue #31230 from hin)
25. Thai thesaurus
26. Sequence checking at beginning of paragraph or in between English and Thai (need 4 fixed first)
27. On Windows machine with Thai locale, installation needs to be set up appropriately for Thai (e.g. with CTL enabled with default language of Thai) (fixed already in OOo 2?)
28. Need feature for easy manual override of incorrect word breaking Issue #42660
29. Thai in title bar (fixed in OOo 2?)
30. Showing non-printing characters with font with no middle dot on Windows, characters following middle dot get font that contains middle dot
31. Should have separate font for each script as in Mozilla, instead of Western/Asian/CTL
32. Should have visual feedback for current keyboard layout (e.g. change in cursor shape)
33. UI should not use term "CTL"
34. Font dropdown should be aware of current keyboard layout
35. Thai language pack (get NECTEC to do it?)
36. Need to integrate word-breaking and spell-checking
37. Inconsistent treatment of description property between Word and OOo creates interop problem (from NECTEC)
38. Thai forward word broken (ICU bug; fixed in OOo 2)
39. Need to be able to switch UI language at runtime (fixed in OOo 2)
40. Installation of OOo with Thai UI should install English UI as well (fixed in OOo 2?)
41. Make OfficeTLE non-code extras (e.g. clip art?) available for OOo
42. Problem fixed by ootle112_mapchar_en_th.patch
43. Ignore obsolete chars when numbering using Thai letters
44. Extend number of outline numbering levels from 8 to 16
45. Problem fixed by ootle112_adjust_matric_page_margin.patch
Created attachment 22551 [details] Set of patches from OpenOfficeTLE 1.1.2 (against 1.1.3)
Got it. Will write an issue for each one that has not been submitted.
Item 10 notes. OOo 1.1 hard-codes a default font size of 10 points or 12 points in a number of places. For most Thai fonts 10 points or 12 points is unreadably small. Thai characters have four vertical levels (base, one diacritic below and two diacritics above). When Thai characters are mixed with Latin characters, the relative sizes of Thai and Latin characters are usually chosen so that the bottom of below diacritics is lower than the bottom of Latin descenders, and the top of the second level Thai diacritics is higher than the top of Latin ascenders. The (not unreasonable) normal convention for Thai fonts is that in an N point font the distance from the bottom of the below-diacritic to the top of the second above-diacritic is N points. Thai typesetting would typically use a 16 point default font size. Thai fonts typically contain Latin characters as well: the Latin characters in a 16 point Thai font would be about the same size as the characters in a normal 12-point Western font. Some newer fonts don't follow this convention. Instead, the size of the font is based on the size of the Latin characters. In this case, the font will either (a) have a much larger line-spacing than usual or (b) have to use an unusually vertically compact design for the Thai diacritics. Loma does (a), I believe. Tahoma does (b). (a) is not good if you use the font for English alone. (b) is not good because the characters look ugly. A better solution would be to use an OpenType BASE table to specify a different line-spacing for different scripts. I don't know of any currently available font that does this, but I expect to see them in the future. In particular, work is ongoing on adding Thai characters to Bitstream Vera using this approach. OfficeTLE's solution to this is just to hard-code 16 points instead of 12 points as the default font size.
OfficeTLE also changes the default fonts for Western use to be Thai fonts (which contain Latin characters for which 16 points is a reasonable default point size). This is not so good even for Thai users, because the Latin characters in Thai fonts are usually not very good quality. I'm not exactly sure what the right solution is. I think part of it is probably to allow VCL.xcu to optionally specify a font size for each default font (e.g. <value>Norasi:16;Bitstream Vera Sans:12;Tahoma:12</value>).
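To make the per-font size idea concrete, here is a rough Python sketch of how a "Name:size;Name:size" value could be read. It is only an illustration under assumptions: the "name:size" syntax is the one proposed in the comment above, not an existing VCL.xcu feature, and the function names parse_font_sizes and default_size_for are hypothetical. Entries without a size fall back to a single global default, which is roughly today's behaviour.

```python
def parse_font_sizes(value, fallback_size=12):
    """Parse 'Name:size;Name:size' into an ordered list of (name, size) pairs.

    Hypothetical syntax: an entry without ':size' gets fallback_size.
    """
    result = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last ':' so the font name is everything before it.
        name, sep, size = entry.rpartition(":")
        if sep and size.strip().isdigit():
            result.append((name.strip(), int(size)))
        else:
            # No explicit size: treat the whole entry as the font name.
            result.append((entry, fallback_size))
    return result


def default_size_for(font_name, value, fallback_size=12):
    """Return the default point size configured for font_name, if any."""
    for name, size in parse_font_sizes(value, fallback_size):
        if name.lower() == font_name.lower():
            return size
    return fallback_size


if __name__ == "__main__":
    example = "Norasi:16;Bitstream Vera Sans:12;Tahoma:12"
    print(parse_font_sizes(example))
    print(default_size_for("Norasi", example))
```

Keeping the list ordered preserves the existing meaning of the value (a font preference order), with the size as an optional extra per entry.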
Item 21. On Windows with m74 and a Thai system locale, in Calc in the Format Cells dialog, if I set Category to Currency and Language to Thai, then click the down arrow on Format and scroll through the list of currencies, I see many entries for Indic and Middle Eastern (bidi) scripts with messed up formatting.
James, I get the same behavior even when the locale is set to English.
Item 10. On Windows WordPad defaults to Arial 10pt in a Western locale (e.g. Control Panel|Regional and Language Options|Regional Options set to English US), but Cordia 14pt in a Thai locale. In a Western locale, if I change the keyboard layout to Thai, the font changes to Cordia 14pt.
Item 32. With Notepad on Windows XP, the cursor shape changes from a vertical bar to a sort of very narrow backwards L shape whenever the keyboard layout changes from English to Thai.
Item 33. In Windows, the Control Panel under Supplemental Language Support uses the terms "East Asian languages" and "complex script and right-to-left languages (including Thai)", which is a lot better than OOo's "Asian" and "CTL".
Item 34. On Windows XP, WordPad has a drop down box for the current script, next to the font name and font size drop down boxes; when the keyboard layout is changed, the script in the drop down box changes automatically. The list of scripts shown in the box corresponds to the scripts supported by the font shown in the font drop down box. If the current keyboard layout is English and the current font is Arial (or any other font that doesn't support Thai) and you then change the keyboard layout to Thai, the font shown in the drop-down box will switch to Cordia.
Item 41. The OpenOffice TLE 1.1.2 extras (25M) are at: http://www.softwarebank.org/download.php/49/OOTLE_1.1.2_extras.tar.gz
Does this mean we are going to look at each item in the list, see whether it is valid for the current OOo 1.9.x build, and, if it is, create a new issue in IssueZilla (to bring it to the attention of the wider OOo developer community)?
James->Art: Yes, that's what we're in the process of doing. SIPA has contracted Samphan to do the work. He's done quite a few already, as you can see from the comments and Issue links. The descriptions are intended to be just enough to remind those of us who have been discussing the problems of the various issues we've talked about. In many cases, I fear they may be too short to make sense to those who haven't been involved in the discussions. But if any of the descriptions here make sense and you can help turn them into proper issues, that would be great. Falko (ft) is helping us coordinate with the OpenOffice team, so we're assigning new bugs to him. Also, we have a tracking bug, issue #41707, to which we're adding all our issues as we create them. The plan is for SIPA to provide some significant funding to Sun in Hamburg (and/or anybody else who can help get the job done) to get at least some of these things fixed in OOo 2.x, so that eventually there will be no need to maintain two separate Thai versions (OfficeTLE and Pladao). I am sure you agree that the best long-term solution is for the official version of OOo to have first-rate Thai support. Many of the issues on the list come from the OfficeTLE team at NECTEC. I would like to ensure that we also include any issues that the Pladao team has discovered; I have tried to make contact, but so far without success. I haven't even been able to find the source code for the latest version of Pladao: everything in their CVS Web seems to be about two years old. I think SIPA would be happy to provide some funding to the Pladao team to get their changes into a suitable form for merging into OOo 2.x. Perhaps you could help liaise.
Pladao Office 3.1 source code http://pladao.org/modules.php?name=Downloads&d_op=getit&lid=93
an actual url for the Pladao Office 3.1 source code http://www.pladao.org/files/3.1/PladaoOffice3.1-src.tar.gz
Pladao Office 3.1 readme -- for known issues with Pladao Office 3.1 http://www.pladao.org/files/3.1/readme.html (in Thai language) ---- FYI, Pladao Office 3.1 is OO.o 1.x-based.
Revise the list :-
0. Tracking bug for Thai-related issue Issue #41707
1. Numbers inappropriately formatted using Thai digits Issue #41672
2. Thai line-breaking non-functional Issue #41671
3. Display of invalid Thai combining character sequences broken on Windows Issue #42171
4. Sequence input checking not working Issue #42469
5. Thai justification inaccuracy Issue #41860
6. Need input sequence correction feature Issue #42661
7. Interop of Thai text with .doc broken for Word >= 2000; maybe Issue #23784
8. Need number spellout feature Issue #3702, Issue #17554
9. Interop with Thai MS Excel 97: BAHTTEXT(), "t" format prefix Issue #42727
10. Font size for default fonts needs to be configurable for each font Issue #42725
11. Display of invalid combining sequences broken with some Unicode fonts (e.g Tahoma) Issue #42663
12. Missing dotted circle breaks display of invalid combining sequences Issue #42662
13. Thai locale data incomplete Issue #42723
14. Windows search cannot search inside .sxw (fixed in 2.0) Issue #21108
15. Optional bell/flash when invalid input sequence entered Issue #42728
16. psprint cache makes printing become gradually slower and slower Issue #20287?
17. On OOo Web site, link to Thai OpenOffice goes to Pladao
18. Mail merge should be able to generate one file containing all recipients instead of one for each (maybe fixed in OOo 2.0) Issue #20057
19. Inaccurate cursor positioning for Thai (fixed in OOo 2)
20. Make Pladao and OfficeTLE fonts conveniently available for OOo Issue 42738
21. In non-English locale, numbering menu has bogus entries (get Sila to explain how to reproduce)
22. ICU Thai word-breaking does not work well in real-life
23. Integrate Thai spell-checking Issue #31230
24. Problem with mixed-language spell-checking that shows up when Thai spell-checking integrated (see patch in Issue #31230 from hin)
25. Thai thesaurus
26. Sequence checking at beginning of paragraph or in between English and Thai (need 4 fixed first)
27. On Windows machine with Thai locale, installation needs to be set up appropriately for Thai (e.g. with CTL enabled with default language of Thai) (fixed already in OOo 2?) Issue #42730
28. Need feature for easy manual override of incorrect word breaking Issue #42660
29. Thai in title bar and filename (fixed in OOo 2)
30. Showing non-printing characters with font with no middle dot on Windows, characters following middle dot get font that contains middle dot
31. Should have separate font for each script as in Mozilla, instead of Western/Asian/CTL Issue #42735
32. Should have visual feedback for current keyboard layout (e.g. change in cursor shape) Issue #42731
33. UI should not use term "CTL" Issue #42733
34. Font dropdown should be aware of current keyboard layout Issue #42732
35. Thai language pack (get NECTEC to do it?)
36. Need to integrate word-breaking and spell-checking
37. Inconsistent treatment of description property between Word and OOo creates interop problem (from NECTEC)
38. Thai forward word broken (ICU bug; fixed in OOo 2)
39. Need to be able to switch UI language at runtime (fixed in OOo 2)
40. Installation of OOo with Thai UI should install English UI as well (fixed in OOo 2?)
41. Make OfficeTLE non-code extras (e.g. clip art?) available for OOo
42. Problem fixed by ootle112_mapchar_en_th.patch
43. Ignore obsolete chars when numbering using Thai letters
44. Extend number of outline numbering levels from 8 to 16
45. Problem fixed by ootle112_adjust_matric_page_margin.patch
Another one we discussed that I forgot to add to the list: 46. Don't understand semantics of check-box to control whether Sequence Input Checking is or is not Restricted
I think by far the most important item that we don't yet have a good issue for is 7. Next is 42 (this looks like it is very important as soon as you have a Thai UI).
Current state
Issued:-
0. Tracking bug for Thai-related issue Issue #41707
1. Numbers inappropriately formatted using Thai digits Issue #41672
2. Thai line-breaking non-functional Issue #41671
3. Display of invalid Thai combining character sequences broken on Windows Issue #42171
4. Sequence input checking not working Issue #42469
5. Thai justification inaccuracy Issue #41860
6. Need input sequence correction feature Issue #42661
7. Interop of Thai text with .doc broken for Word >= 2000; maybe Issue #23784
8. Need number spellout feature (old Issue #3702, Issue #17554) Issue 43043
9. Interop with Thai MS Excel 97: BAHTTEXT(), "t" format prefix Issue #42727
10. Font size for default fonts needs to be configurable for each font Issue #42725
11. Display of invalid combining sequences broken with some Unicode fonts (e.g Tahoma) Issue #42663
12. Missing dotted circle breaks display of invalid combining sequences Issue #42662
13. Thai locale data incomplete Issue #42723
15. Optional bell/flash when invalid input sequence entered Issue #42728
20. Make Pladao and OfficeTLE fonts conveniently available for OOo Issue 42738
27. On Windows machine with Thai locale, installation needs to be set up appropriately for Thai (e.g. with CTL enabled with default language of Thai) Issue #42730
28. Need feature for easy manual override of incorrect word breaking Issue #42660
31. Should have separate font for each script as in Mozilla, instead of Western/Asian/CTL Issue #42735
32. Should have visual feedback for current keyboard layout (e.g. change in cursor shape) Issue #42731
33. UI should not use term "CTL" Issue #42733
34. Font dropdown should be aware of current keyboard layout Issue #42732
41. Make OfficeTLE non-code extras (e.g. clip art?) available for OOo Issue 42971, Issue 42738
42. Problem fixed by ootle112_mapchar_en_th.patch Issue 42964
43. Ignore obsolete chars when numbering using Thai letters Issue 43045
44. Extend number of outline numbering levels from 8 to 16 Issue 42965
46. Don't understand semantics of check-box to control whether Sequence Input Checking is or is not Restricted Issue 42967
Fixed:-
14. Windows search cannot search inside .sxw (fixed in 2.0) Issue #21108
16. psprint cache makes printing become gradually slower and slower (fixed in OOo 1.1.1?) Issue #20287
18. Mail merge should be able to generate one file containing all recipients instead of one for each (maybe fixed in OOo 2.0) Issue #20057
19. Inaccurate cursor positioning for Thai (fixed in OOo 2)
29. Thai in title bar and filename (fixed in OOo 2)
38. Thai forward word broken (ICU bug; fixed in OOo 2)
39. Need to be able to switch UI language at runtime (fixed in OOo 2)
40. Installation of OOo with Thai UI should install English UI as well (fixed in OOo 2) Users should install English OOo then Thai language pack instead?
New feature:-
23. Integrate Thai spell-checking Issue #31230
25. Thai thesaurus
35. Thai language pack (NECTEC will do it)
No idea:
17. On OOo Web site, link to Thai OpenOffice goes to Pladao
Need other fix
26. Sequence checking at beginning of paragraph or in between English and Thai (need 4 fixed first)
Can't reproduce
30. Showing non-printing characters with font with no middle dot on Windows, characters following middle dot get font that contains middle dot
To be issued :-
General, non-thai specific
37. Inconsistent treatment of description property between Word and OOo creates interop problem (from NECTEC)
Need to build to understand:
21. In non-English locale, numbering menu has bogus entries
24. Problem with mixed-language spell-checking that shows up when Thai spell-checking integrated (see patch in Issue #31230 from hin)
Need investigation:
45. Problem fixed by ootle112_adjust_matric_page_margin.patch
36. Need to integrate word-breaking and spell-checking
Issue at ICU
22. ICU Thai word-breaking does not work well in real-life
> 17. On OOo Web site, link to Thai OpenOffice goes to Pladao
On which page, please? If you can give me a URL, I can ask the people who take care of the website to fix it. Replace it with http://th.openoffice.org/ ?
Item 17. If you go to: http://download.openoffice.org/1.1.4/index.html Then select Thai. On the page this takes you to, select "Continue to download". Then it takes you to a page that says: คุณสามารถดาวน์โหลด OpenOffice.org รุ่นภาษาไทย ได้จากไซต์เหล่านี้: * http://pladao.org (ประเทศไทย) (in English: "You can download the Thai version of OpenOffice.org from these sites: * http://pladao.org (Thailand)"). We need to make sure it doesn't say this for the 2.x release.
I think it would be helpful to provide some guidance on the relative priority of issues. Fixes for issue 41671 and issue 41672 are in the pipeline. Apart from those, my top issues (from highest to lowest priority) would, I think, be: issue 41860 issue 42909 issue 42469 issue 42725 Anybody else have thoughts on this?
Item 17. URL is http://th.openoffice.org/about-downloads.html
Item 17, fixed. Now points to both Thai-hacked versions (Pladao Office, OfficeTLE) and also standard version (OpenOffice.org). http://th.openoffice.org/about-downloads.html
45. Problem fixed by ootle112_adjust_matric_page_margin.patch The patch exists because Thai uses the metric measurement system, as specified in th_TH.xml, but usually (always?) uses inches when measuring paper documents (e.g. 1.25" margin, 0.5" tab stop). The problem is that the code the patch changes hard-codes some numbers that depend on the locale's measurement system, such as 2 cm vs 1" (margin) and 1.25 cm vs 1/2" (tab stop). The solutions :-
1) Specify in th_TH.xml that Thai uses the US (not metric) measurement system, because Thai uses inches for paper. No change in code.
2) Have another entry in the locale data to specify a 'measurement system for document' that will be reflected in default values for various things and the unit of the rulers.
3) Not hard-code anything. Read all defaults from locale data.
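To illustrate solution 2, here is a small Python sketch of a locale record with an optional, separate measurement system for documents. It is not how OOo locale data is actually structured; the field name document_measurement_system, the LocaleData class and the DEFAULTS table are all made up for this example. The default values are the ones mentioned above (2 cm vs 1" margin, 1.25 cm vs 1/2" tab stop).

```python
from dataclasses import dataclass
from typing import Optional

# Defaults per measurement system, using the values quoted in this comment.
DEFAULTS = {
    "metric": {"unit": "cm", "margin": 2.0, "tab_stop": 1.25},
    "us": {"unit": "inch", "margin": 1.0, "tab_stop": 0.5},
}


@dataclass
class LocaleData:
    """Hypothetical locale record; not the real th_TH.xml schema."""
    name: str
    measurement_system: str
    # Optional override used only for document defaults and rulers.
    document_measurement_system: Optional[str] = None


def document_defaults(locale):
    """Pick document defaults, preferring the document-specific system."""
    system = locale.document_measurement_system or locale.measurement_system
    return DEFAULTS[system]


if __name__ == "__main__":
    plain = LocaleData("th_TH", "metric")
    overridden = LocaleData("th_TH", "metric", document_measurement_system="us")
    print(document_defaults(plain))
    print(document_defaults(overridden))
```

With this shape, a locale that does not set the extra entry behaves exactly as before, so only locales that need the distinction have to change.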
45. I like your solution 2.
45. Just discussed this with people on the TLWG newsgroup and talked with Sila; the conclusion is that there is no standard measurement system for Thai documents. Some people say cm (with left/right margins of 2 cm). Some say inches (with left/right margins of 1"). One says that the inch convention came from the use of MS Word in the past. Sila also suggests not submitting this patch. So I think we should leave it as cm to conform with the national measurement system.
17. On OOo Web site, link to Thai OpenOffice goes to Pladao - Fixed
37. Inconsistent treatment of description property between Word and OOo creates interop problem (from NECTEC) Issue #43588
45. Problem fixed by ootle112_adjust_matric_page_margin.patch - Non-issue
36. Need to integrate word-breaking and spell-checking - Issue #43583
21. In non-English UI, bullet & numbering menu has bogus entries - FIXED Tested by building OOo 680_m79 with Thai UI, then looking at Format->Bullet & Numbering->Options->Numbering (detail from Sila): no strange or blank entries. However, there are some errors in translation which could be fixed by NECTEC in the new Thai language pack.
K Nusorn pointed me to this http://marketing.openoffice.org/conference/presentations-pdf/fri1330/PladaoOffice2003.pdf This is a nice discussion of everything Pladao added. I noticed references to a couple of things that we haven't discussed yet:
- drop caps
- underlining
Created attachment 23108 [details] An example of Thai drop cap text
We can specify the number of characters to drop in the drop-cap feature. If we specify only part of the first cluster (e.g. 1 char when the cluster contains 3 chars), there is a bug (repetition of the first cluster, as shown in the first paragraph of my attached example). But if we specify the number of characters in the whole first cluster, the feature works (2nd paragraph). Drop-cap of the first word also works (3rd paragraph). I think I should file a bug on this (repetition of the first cluster). Do you think it should detect the cluster boundary itself? (Specifying 1-3 characters should then all work in my example.)
I think it should count in clusters. "1" "character" should mean 1 cluster. It's no good to require the user to specify 3 characters for a cluster with consonant+vowel+tone-mark, because you want to be able to have a paragraph style which always drops exactly the first cluster, regardless of whether the cluster has 1, 2 or 3 characters. In any case, it makes no sense to make a base character dropped without also making its following combining characters dropped.
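One way to read "count in clusters" is sketched below in Python. This is a simplification for discussion, not OOo's implementation: a cluster is assumed to be a base character plus any following nonspacing or enclosing marks (Unicode general categories Mn and Me). That covers Thai above/below vowels and tone marks, but not cases such as THAI CHARACTER SARA AM (U+0E33), which is not a nonspacing mark. The function names split_clusters and drop_cap_split are made up for the example.

```python
import unicodedata


def split_clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplified rule (an assumption, not full Unicode segmentation):
    characters in category Mn or Me attach to the preceding cluster.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) in ("Mn", "Me"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def drop_cap_split(text, cluster_count=1):
    """Return (dropped, rest), counting the drop length in clusters."""
    clusters = split_clusters(text)
    dropped = "".join(clusters[:cluster_count])
    rest = "".join(clusters[cluster_count:])
    return dropped, rest


if __name__ == "__main__":
    # Consonant + vowel + tone mark, twice: six code points, two clusters.
    word = "\u0e17\u0e35\u0e48\u0e19\u0e35\u0e48"
    dropped, rest = drop_cap_split(word, 1)
    print(len(dropped), len(rest))
```

With this rule a paragraph style that drops "1" always takes the whole first cluster, whether it has 1, 2 or 3 characters, and a base character is never dropped without its combining characters.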
For underlining, please see more info at Mozilla's Bugzilla: Bug 156881 : Underline should skip character/part-of-character that is below the base line (text-decoration-mode) https://bugzilla.mozilla.org/show_bug.cgi?id=156881
Underlining, also this one, issue 10708 : RFE: implement CSS3 text mode - text decoration http://www.openoffice.org/issues/show_bug.cgi?id=10708
All valid items have their own issue. See issue 41707 - Tracking bug for Thai-related issue. Discussion will be there. 22 - word-breaking will be issued at ICU. Closed.
The Issue you raised has been marked as 'Resolved' and not updated within the last 1 year+. I am therefore setting this issue to 'Verified' as the first step towards Closing it. If you feel this is incorrect, please re-open the issue and add any comments. Many thanks, Andrew Cleaning-up and Closing old Issues ~ The Grand Bug Squash, pre v3 ~ http://marketing.openoffice.org/3.0/announcementbeta.html
As per previous posting: Verified -> Closed. A Closed Issue is a Happy Issue (TM). Regards, Andrew