Apache OpenOffice (AOO) Bugzilla – Issue 85381
Heap corruption bug in dmake
Last modified: 2013-08-07 15:34:52 UTC
There is a heap corruption bug in dmake that occurs only on Cygwin. Look at the function gen_path_list_string(). This function takes a list of pathnames in Cygwin (Unix) form and turns them into a single string of pathnames in Windows form separated by spaces. Look at the code where Cygwin paths are turned into Windows paths: tpath = DO_WINPATH(cell->datum); if( tpath == cell->datum ) { len=cell->len; } else { /* ... but only if DO_WINPATH() did something. */ len = strlen(tpath); if( len > slen_rest ) { /* We need more memory. As DOS paths are usually shorter than the * original cygwin POSIX paths (exception mounted paths) this should * rarely happen. */ The condition (len > slen_rest) is bogus. slen_rest is irrelevant here. What len should be tested against is the length of the original Cygwin pathname, cell->len. After that change, and correspondingly for the same reason changing the subexpression len-slen_rest to len->cell-len a couple of lines below, we then also notice that there is no use for the slen_rest variable at all. Also, looking at the following code where the result buffer is reallocated: /* Get more than needed. */ slen += slen + len-cell->len + 128; if((result = realloc( result, (unsigned)(slen*sizeof(char)) ) ) == NULL) No_ram(); we see some odd things. Firstly, slen += slen + len-cell->len + 128. Wtf? Clearly what is meant is slen += len-cell->len + 128. However, there is no real reason for that "extra" 128. It seems that the persons behind this code were aware that there is something wrong with it, but didn't quite know what, so they throw in an "extra" 127 bytes and hope for the best. (Alas, this doesn't help for the crash I observed as the problem was that this "We need more memory" branch isn't taken in the first place even if needed.) There is no need for that "more than needed", just doing slen += (len-cell->len) + 1 works fine.
Also, the sizeof(char) is totally pointless. sizeof(char) is 1 by definition. Pretending to "code carefully" by using sizeof(char) is silly.
Created attachment 50986 [details] Suggested patch
My bug, my issue ;) Thanks for the extensive analysis.
Oh, and did you get enough coffee lately? > ... Firstly, slen += slen + len-cell->len + 128. Wtf? That wouldn't compile if it really were in the code. Ah yes, you clearly understood what "the persons behind this code" were planning. > The condition (len > slen_rest) is bogus. slen_rest is irrelevant here. .. > However, there is no real reason for that "extra" 128. It seems that the > persons behind this code were aware that there is something wrong with it, > but didn't quite know what, so they throw in an "extra" 127 bytes and hope > for the best. There is no hidden agenda, only a bug. If the code would have worked as intended it would have gotten more memory as needed to accommodate for future memory needs of *result and hereby reduced the number of needed realloc() calls at the expense of a little memory overhead - Yes a little questionable but IMHO valid. I will fix this bug and you might sometime have a look into issue 65083.
Created attachment 51040 [details] Patch for make.c
Unfortunately, your patch fails to correct the root cause of the problem, the check for (len > slen_rest). BTW, I *think* that the only situation where the conversion from Cygwin pathname to Windows pathname can cause the pathname length to grow is if one against recommendations haven't used the /cygdrive/... form of pathnames consistently and everywhere when building OOo. I guess that is why this bug usually doesn't bite. I had by accident used a Cygwin path starting with a Cygwin mount point /ooo in the build tree where I noticed the bug. Or is there some situation where even a Cygwin pathname starting with /cygdrive/ can correspond to a longer Windows pathname?
Created attachment 51066 [details] Patch for make.c
Let me answer the second path first, if for example /tmp or /home/.. or another of the "standard" unix directories is used it is well possible that they get longer when they are translated to DOS, e.g. /tmp -> c:/cygwin/tmp. But I think it is a bug if using mounted paths, like your /ooo is failing. Now, I didn't check if I can force the error, but I miss the reason why the if(len > slen_rest) check is failing to find that more memory is needed - except that my patch put another bug in. :( Its fixed with the previous patch (iz85381_2.diff). slen_rest is initialized to the same number as the originally needed amount of memory without the POSIX->DOS conversion. Inside the loop, after copying the buffer, slen_rest is set to the number of remaining bytes in that buffer. That should work, if len is smaller than slen_rest, increase the buffer size plus 128 bytes headroom to reduce possible malloc calls if more elements are longer in DOS than the POSIX version. @mh: Once we get this sorted out, the patch should go into OOH if it is not to late.
The point is that the input to the function is a *list* of pathnames in Cygwin form. slist_rest is initialised to the length required by all the pathnames concatenated, separated by spaces. In the loop each pathname from the list is converted to Windows form. len is set to the length of the converted pathname. To check if the length of the current pathname was increased by the conversion, it is pointless to compare len against slen_rest, which is the length of the concatenation of the original pathnames. One needs to compare to the original length of the current pathname, i.e. cell->len.
Or actually, I guess your way to fix the problem might work, too. I will have to check with dmalloc.
Sorry, thinking about this in the shower, I came to the conclusion your second patch is still buggy. But I had to write a test program to be sure. Try the attached test program. It contains three implementations of gen_path_list_string(): The original one from ooh680-m3, with my patch, and with your second patch. I just added a couple of debugging printouts to them. Additionally, so that the test program won't actually corrupt the heap itself, I allocate a bit more than the program thinks (look for TEST_PROGRAM_SAFETY). Run the program and pay attention to the output lines saying "Allocated n bytes", "Reallocated to n bytes" and "Wrote to result n bytes".
Created attachment 51102 [details] Test program
Created attachment 51232 [details] Patch for make.c
Ouch! Thanks for your excellent test program tml. No wonder that was corrupting the heap. I was only checking for the remaining memory if the DO_WINPATH function actually did something. The previous patch fixes the problem in your testcase. @mh: If tml is OK with it this patch should go in OOH, preferably as a masterfix.
I'm fine with doing this as masterfix, cc'ing obo for that.
Fixed make.c commited on MWS for OOH680_m6.
Yup, that last patch looks fine.
Patch also committed to CWS dmake412.
last comment from tml counts as "verified"
.