Apache OpenOffice (AOO) Bugzilla – Issue 97214
[NL] TWo INitial CApitals Exception list for Dutch needed (words starting with "IJ")
Last modified: 2017-05-20 11:13:09 UTC
The "ij" is a vowel in Dutch which is capitalized as "IJ". However, when my autocorrect is on, typed words like "IJs" are autocorrected to "Ijs". The spelling checker recognizes "Ijs" as bad spelling, which is good; however, when viewing the spelling checker suggestions, there is again "Ijs".
@SBA: Please have a look. I found "a" reference: http://www.lingvozone.com/languages/Language%20Information9.htm And in IZ, issue 13981 mentions it briefly.
Tools - AutoCorrect - Options, tab "Exceptions": choose language "Dutch (Netherlands)" -> see that the exception list "Words with TWo INitial CApitals" for NL only has the entry "OOo". Your short-term solution: add the words you need to this list. The sledgehammer solution: on tab "Options", uncheck "Words with TWo INitial CApitals" twice, OK. The real solution: provide a full list of all words and attach it to this issue. Then it can be added to the office by default. Note: on tab "Exceptions", when you select language English (USA), you can see a nice list of MHz, GHz, CDs and PCs that are likely to be useful for Dutch, too. Changing issue type to "Enhancement", adjusting summary.
This is not the desired solution. As I mentioned, the vowel "IJ" in Dutch ALWAYS is written with two capitals. So I would have to add IJs, IJzer, IJzel, IJl, IJdel, IJzig, IJnemuiden, ...
How many words are we talking about? 100, 1000, 10000? Note that it does not necessarily have to be YOU who provides such a list. I just thought you might have one at hand since you have dug into this already. The "list" solution would work without any "real code" change; it is just adding local data. In my opinion it is realistic to get this done for OOo 3.2. Yes, one might think of the AutoCorrection ignoring "IJ" for Dutch as you propose, or of being able to add "wild cards" like "IJ*" to that list. All this sounds "expensive" in terms of performance (each word being checked: "Do you start with IJ?") and of the developer resources needed to do it. I truly believe that such a feature wish (NEW code implemented and tested, maybe even some UI change to control this, then a full-blown specification, new test cases being written and documentation being translated) would get very, very old, because there are "bigger fish to catch" in the foreseeable future. Fixing REAL defects has a higher priority than this feature, which would serve "only" (sorry... :-) the users of Dutch. That means: bring on a developer to do it, or wait a long, long time without any change in the office. Thus I still favor the "list solution", as I see no other realistic one. Put OS and TL on c/c.
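To make the trade-off discussed above concrete, here is a minimal sketch of a "two initial capitals" correction with both an exception list and an optional Dutch "IJ" prefix rule. It is an illustration only and not the actual OpenOffice code; the function name `fix_two_initial_capitals` and the `dutch_ij` flag are hypothetical.

```python
def fix_two_initial_capitals(word, exceptions=frozenset(), dutch_ij=False):
    """Turn 'WOrd' into 'Word' unless the word is exempted.

    Hypothetical sketch; names and parameters are assumptions,
    not the real AutoCorrect implementation.
    """
    # Only words with two leading capitals followed by a lower-case
    # letter are candidates for correction.
    if len(word) < 3:
        return word
    if not (word[0].isupper() and word[1].isupper() and word[2].islower()):
        return word
    # List solution: exact entries such as "OOo" or "MHz" are left alone.
    if word in exceptions:
        return word
    # Prefix rule: for Dutch, a leading "IJ" counts as one capital letter.
    if dutch_ij and word.startswith("IJ"):
        return word
    return word[0] + word[1].lower() + word[2:]


print(fix_two_initial_capitals("IJs"))                       # current behaviour
print(fix_two_initial_capitals("IJs", dutch_ij=True))        # prefix rule
print(fix_two_initial_capitals("MHz", exceptions={"MHz"}))   # list solution
```

In this sketch the prefix test only runs for words that already match the two-capitals pattern, so it adds one short string comparison per candidate word rather than per typed word.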
Note: This issue should probably be split up into two, since one part is the spell checker and the other is auto correction, and both have nothing in common.
I would regard adding "wild cards" to the autocorrection exceptions list as the worst possible solution. As a simple outsider to the OpenOffice code, I don't know what it would cost to change that code for this issue. The structural solution would be to regard "ij" as a single "letter". The quick solution is indeed to add every word starting with "IJ" to the autocorrection exceptions list, as there won't be too many.
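As an illustration of what treating "ij" as a single letter could mean for capitalization, here is a small sketch. The helper `capitalize_dutch` is hypothetical and says nothing about how the office's locale code is organized.

```python
def capitalize_dutch(word):
    """Capitalize a word, treating a leading 'ij' as one letter.

    Hypothetical helper for illustration only.
    """
    # A leading "ij" (in any case) is capitalized as a unit: "IJ".
    if word[:2].lower() == "ij":
        return "IJ" + word[2:].lower()
    # Every other word gets the usual single initial capital.
    return word[:1].upper() + word[1:].lower()


for w in ("ijsland", "Ijzer", "water"):
    print(w, "->", capitalize_dutch(w))
```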
I believe this is some l10n issue!
Changed component to l10n. Reassigned to VA. This issue is about adding a list (see summary). -> The enhancement issue "treat letter combinations differently" must be written separately.
Adding myself, digro and simonbr to the cc. @Simon: please comment on this.
A list of 338 words starting with "IJ" from the OpenTaal word list can be downloaded from here: http://simonbr.xs4all.nl/wiki/index.php/IJ-woorden Note that there may exist other Dutch words that start with IJ, for example compounds formed from one of these. So adding these words to the exception list will not completely fix the problem.
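A list like this can be derived mechanically from any word list. The sketch below assumes a plain-text list with one word per line, which is an assumption about the input and not a description of the OpenTaal format; `ij_exceptions` is a hypothetical name.

```python
def ij_exceptions(words):
    """Return sorted 'IJ...' exception entries for words starting with 'ij'.

    Assumes one word per item; input format is an assumption.
    """
    entries = set()
    for raw in words:
        word = raw.strip()
        # Keep only words that begin with "ij" and have more letters after it.
        if len(word) > 2 and word[:2].lower() == "ij":
            entries.add("IJ" + word[2:])
    return sorted(entries)


sample = ["ijs", "ijzer", "IJsland", "water", "ijzig\n"]
print(ij_exceptions(sample))
```

As noted above, compounds and inflected forms that are missing from the source list will also be missing from the result.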
Just curious: Why are no double letters (ligatures) used in Dutch? It seems that the most important fonts contain these characters: IJ or ij But maybe I am wrong. ;-)
I don't think an exception list is a very elegant solution for this. The letter IJ is a feature of the Dutch language, and it would be best if the spelling engine, or whatever part of the programme controls initial caps correction, would recognize this. I don't think that in MS Word this is solved by an exception list. Also, how do you add items to the exception list? The provisional list referred to here already contains 338 items. Are you supposed to add these one by one through the software's dialogue? A hell of a job that doesn't really deserve the label 'user friendly'. This is not the only Dutch language issue in Open Office, by the way.
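For reference, Unicode does define single code points for this digraph (U+0132 and U+0133), and they are compatibility-equivalent to the two-letter sequences. The short sketch below uses only Python's standard `unicodedata` module to show their names and what NFKC normalization turns them into.

```python
import unicodedata

# U+0132 and U+0133 are the single-character IJ ligatures.
for ch in ("\u0132", "\u0133"):
    name = unicodedata.name(ch)
    # NFKC applies the compatibility decomposition to two plain letters.
    folded = unicodedata.normalize("NFKC", ch)
    print(f"U+{ord(ch):04X} {name} -> {folded!r} ({len(folded)} letters)")
```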
To explain: Simon's list is admirable, but it can never be complete. To take just a few examples: it has the word IJsland and IJslands, but then there's also the word form IJslandse. Similarly for IJzig -- it does have IJzige, but then ijzigst and ijziger and ijzigste are all of them also *possible* word forms. Trying to simply compile a list of "all" possible words is a never ending story. The software just has to recognize that, weird as it is, IJ is a single letter in Dutch and should be treated as such. Come on, guys! If Microsoft can, so can you.
So we did :-) By updating the Dutch (NL) dictionary extension for OOo 3.3 (see issue 114603), the "Capital IJ" issue is fixed. "Ijsselmeer" gets corrected to "IJsselmeer", same for "Ijsland", "Ijslandse"... Looks like this issue can be closed, as it was fixed at the best place, the spell check extension itself. :-) Reassigning issue to me.
Set to "WORKSFORME"
Hmmm... Just re-thinking this one. The new spell checker dictionary alone does not solve the problem. While writing, "IJ..." still gets auto-corrected to "Ij..." (and then marked as misspelled). Thus even an incomplete list would ease the problem. Reopening issue.
Reassigned to VA again.
Can someone provide me with an updated acor_nl-NL.dat, so I can check it in? TIA!
Reset assignee to the default "issues@openoffice.apache.org".