Issue 117415 - *.doc file has Field "Fib.nFib" set incorrectly
Summary: *.doc file has Field "Fib.nFib" set incorrectly
Status: UNCONFIRMED
Alias: None
Product: Writer
Classification: Application
Component: save-export
Version: OOo 3.2.1
Hardware: All All
Importance: P3 Normal
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: needhelp
Depends on:
Blocks:
 
Reported: 2011-03-17 00:23 UTC by baggett.patrick
Modified: 2017-05-20 10:44 UTC
CC List: 3 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Trivial *.doc file generated by OOo 3.2.1 that exhibits the problem (9.00 KB, application/octet-stream)
2011-03-17 00:23 UTC, baggett.patrick
The words "Hello World" saved into an ODT file (8.22 KB, application/vnd.oasis.opendocument.text)
2011-03-21 14:44 UTC, baggett.patrick
HelloWord2.doc : AOO created file (9.00 KB, application/msword)
2015-05-01 19:52 UTC, joezbugz
WordDoc.doc : Word 2010 created file (21.50 KB, application/msword)
2015-05-01 19:52 UTC, joezbugz
AOOnFib.jpg : AOO created file binary version shown (247.15 KB, image/jpeg)
2015-05-01 19:53 UTC, joezbugz
WordnFib.jpg : Word 2010 created file binary version shown (241.61 KB, image/jpeg)
2015-05-01 19:54 UTC, joezbugz

Description baggett.patrick 2011-03-17 00:23:01 UTC
Created attachment 76125
Trivial *.doc file generated by OOo 3.2.1 that exhibits the problem

I'm writing code to load *.doc files for a proprietary project and have been reading the "[MS-DOC]" file (from Microsoft Open Specification). For testing, I have sample files saved by both Microsoft Word 2007 and OpenOffice 3.2.1.

ref: http://msdn.microsoft.com/en-us/library/cc313153.aspx

The problem I'm seeing is that OpenOffice saves the field 'FibBase.nFib' (pg 52) with the incorrect value (0x0101) instead of what is specified (0x00C1) in the spec. This doesn't harm anything really since the documentation says the value "SHOULD" be 0x00C1 but not "MUST" be. In fact, when 'FibRgCswNew.nFibNew' is present, that 'FibBase.nFib' isn't used and references to it are replaced with 'FibRgCswNew.nFibNew'.

However, the 'cswNew' field (near bottom of pg 51) is zero, which means the whole 'FibRgCswNew' structure isn't present (and therefore 'FibRgCswNew.nFibNew' isn't either). In that case, 'FibBase.nFib' really should be 0x00C1 (see table at bottom of pg 51): the saved value not only differs from the "SHOULD" value, but the nFib -> cswNew table makes the pairing a "MUST" match (again, see bottom of pg 51).

Though wrong, none of these details seem to confuse OpenOffice or MS-Word, but they are somewhat inconsistent with the spec and files generated by MS-Word. Stricter validators/loaders may reject the files created by OpenOffice.
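
A minimal sketch of the consistency rule described above, for anyone who wants to check their own loader against it. It assumes the three values have already been parsed out of the header; the function names are made up for illustration and do not come from [MS-DOC] or from OpenOffice. It encodes only what this report states: with cswNew equal to zero, FibBase.nFib is the effective version and is expected to be 0x00C1; otherwise FibRgCswNew.nFibNew takes its place.

```python
from typing import List, Optional

# Value this report says FibBase.nFib is expected to carry when cswNew == 0.
NFIB_EXPECTED = 0x00C1


def effective_nfib(nfib_base: int, csw_new: int,
                   nfib_new: Optional[int] = None) -> int:
    """Return the version value a reader should actually use.

    Hypothetical helper: when cswNew is zero, FibRgCswNew is absent and
    FibBase.nFib is the only version available; otherwise nFibNew replaces it.
    """
    if csw_new == 0:
        return nfib_base
    if nfib_new is None:
        raise ValueError("cswNew is nonzero but nFibNew was not supplied")
    return nfib_new


def check_header(nfib_base: int, csw_new: int,
                 nfib_new: Optional[int] = None) -> List[str]:
    """Collect the inconsistencies described in this report (hypothetical)."""
    problems = []
    if csw_new == 0 and nfib_base != NFIB_EXPECTED:
        problems.append(
            f"cswNew is 0 but FibBase.nFib is {nfib_base:#06x}, "
            f"expected {NFIB_EXPECTED:#06x}"
        )
    return problems
```

Calling `check_header(0x0101, 0)` with the values reported for the OOo-saved file yields one problem, while `check_header(0x00C1, 0)` yields none. A strict validator could reject on any non-empty result; a lenient loader could just log it.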
Comment 1 michael.ruess 2011-03-17 09:24:59 UTC
We cannot tell much from the saved doc alone; please also attach a native odt version of the field you inserted into the document. Thanks for your patience.
Comment 2 baggett.patrick 2011-03-17 09:43:32 UTC
Huh? I don't think you understood the report. This has absolutely nothing to do with *.odt format. I didn't insert any fields into anything, I'm reading a *.doc file //saved by OOo 3.2.1// and the output file isn't correct. The only text in the document is "Hello World". If you opened the document, you would see this is the case.

The "fields" I am referring to are binary data blobs saved into the *.doc file. I provided a link to where the document may be found and exact page numbers in that document. Please re-read my initial bug report and follow the links / page numbers I recorded and let me know if you still need more information.
Comment 3 michael.ruess 2011-03-17 11:23:45 UTC
Well, I know that your issue is not about odt format - but an original file without the error is needed to better debug this problem. And native odt is best for this purpose.
I was not able to find a doc file on the mentioned web page - only pdf's...
Please attach the original MS Word created doc file (best would be a one pager having the field in question as sole content).
Comment 4 baggett.patrick 2011-03-17 21:50:28 UTC
Native ODT file? This has absolutely 0 to do with ODT file format. It is how OO Writer saves in the *.doc file format. Having an ODT file will do nothing for you.

See the attachment -> "test.doc"

If you don't want to download it, then open OO Writer, type "Hello World" and then do "Save As" -> "*.doc : Microsoft Word 97 - 2003". That is how trivial the document is. I don't know how I can be more clear.

The link in question (http://msdn.microsoft.com/en-us/library/cc313153.aspx) is to the FILE FORMAT SPECIFICATION FOR *.DOC AS RELEASED BY MICROSOFT. The field in question is not a field like a table, row, or column or anything that a user types in OO Writer. It is part of the raw file's header.

I really hate to sound rude, but if you had downloaded the PDF that link points to, gone to the page number I specified in my original bug log, and read it, you would know this.
Comment 5 michael.ruess 2011-03-18 10:57:34 UTC
MRU->HBRINKM: please have a look if this information is useful to you.
Comment 6 michael.ruess 2011-03-21 08:56:25 UTC
...forgot to reassign.

MRU->baggett.patrick: you don't understand me. I know well that this problem does not deal with odt format at all. But to conveniently reproduce and debug the problem, an odt version of the "non-damaged" document would be extremely helpful.
We then would be able to see immediately what's going wrong when Writer saves in .doc format.
Comment 7 baggett.patrick 2011-03-21 14:44:42 UTC
Created attachment 76163
The words "Hello World" saved into an ODT file
Comment 8 baggett.patrick 2011-03-21 14:56:46 UTC
MRU, I understood you. And I still think it is dumb that you or anyone else reading this doesn't just type the words "Hello World" in a file and save it. Not exactly rocket science. Instead you've just repeated yourself over and over while sounding like a fool. Wow, just wow. You could have just read what I said and typed "Hello World" into a new document, clicked save, and attached it yourself if you wanted, but apparently that's too hard, so I've done it for you. See attachment #2.

I hope I never run into an open source project again that is this annoying to report a bug for. I don't know if there is a language barrier here or what, but I get the feeling that you probably shouldn't be the one following up on technical bug reports. This is my first bug report and likely my last for this project. I'm not going to waste any more time and effort responding to mindless posts. All the information you need to fix the problem was in the first post. Either you fix it or you don't.
Comment 9 Oliver-Rainer Wittmann 2012-06-13 12:19:15 UTC
getting rid of value "enhancement" for field "severity".
For enhancement the field "issue type" shall be used.
Comment 10 joezbugz 2015-05-01 19:51:00 UTC
There seems to be a critical misunderstanding between the original poster and the AOO person examining it.

Re-title: .doc version tag does not comply with binary format specification.

I am testing on AOO 4.1.1 on a Windows7 machine. I also have Word 2010 installed and am using it for comparisons. My comments on using Visual Studio for data examination are based on VS2010 Professional.

The problem is this: When AOO saves a file in .doc format, it sets the version number of the file format to a value that is different from the specification. The specification, [MS-DOC] Word (.doc) Binary File Format, can be found at https://msdn.microsoft.com/en-us/library/cc313153.aspx (same link as the original post). Open or download this file for reference. Using this link instead of attaching a copy will keep the reference to the latest version.

The specification says that the value of the file format version, referred to as FibBase.nFib “should” be 0x00C1. This definition can be found at the bottom of page 52, in section 2.5.2 FibBase. The “should” text links to an addendum, which explains conditions under which the value could be 0x00C0 or 0x00C2.

When AOO saves a file in .doc format it sets this version value to 0x0101, which is contrary to the specification.

Steps to see the issue:

1. Open AOO Writer.
2. Create an empty file.
3. Save as .doc:
    a. Click File: Save As...
    b. Under Save As Type select Microsoft Word 97/2000/XP (.doc) (*.doc)
    c. Enter a name and click Save.
4. Close the file. (see attachment HelloWord2.doc)

At this point you will need to open the saved file in a binary editor. One option is Visual Studio. It's also possible to use Beyond Compare 3, by comparing the file to any other short file and selecting a hex comparison.

Visual Studio:
1. Open Visual Studio.
2. Click File: Open: File...
3. Navigate to and select the document.
4. Click the pulldown arrow on the Open button and select Open With...
5. In the Open With dialog select Binary Editor and click OK.
6. In the file search for A5EC. The binary format is little-endian so the result will look like “EC A5”.
7. The next two bytes are the nFib. They will be 01 01, i.e. 0x0101. (see attachment AOOnFib.jpg)
8. Repeat these steps with a short file created in Word, or the attached WordDoc.doc. The nFib will be 0x00C1 (appearing as C1 00) as in the specification. (see attachment WordnFib.jpg)
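
For anyone without a binary editor, the same lookup can be scripted. The following is only a rough sketch of the manual steps above, not a real .doc parser: it searches the raw file for the first "EC A5" byte pair and reads the next two bytes as a little-endian value. The name find_nfib is made up for this example. A plain byte search could in principle hit an unrelated "EC A5" pair elsewhere in the file, so treat the result as a quick check only.

```python
import struct
from typing import Optional

# 0xA5EC as it appears on disk in little-endian order.
FIB_MAGIC = b"\xEC\xA5"


def find_nfib(data: bytes) -> Optional[int]:
    """Return the 16-bit value following the first EC A5 pair, or None.

    Hypothetical helper mirroring the hex-editor steps: locate the
    marker, then decode the next two bytes as little-endian.
    """
    pos = data.find(FIB_MAGIC)
    if pos < 0 or pos + 4 > len(data):
        return None  # marker missing, or file ends before the nFib bytes
    (nfib,) = struct.unpack_from("<H", data, pos + 2)
    return nfib
```

To try it on the attachments, read a file with `open("HelloWord2.doc", "rb").read()` and pass the bytes to `find_nfib`; formatting the result with `hex()` makes it easy to compare against the 0x0101 and 0x00C1 values described in the steps.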

As noted in the original report, this does not appear to cause any problems when opening the files in either Word or AOO Writer. However, it is a deviation from the specification. It is possible that for other purposes this deviation may cause problems. The linked definition for “should” in the spec is:
	
    SHOULD   This word, or the adjective "RECOMMENDED", mean that there
    may exist valid reasons in particular circumstances to ignore a
    particular item, but the full implications must be understood and
    carefully weighed before choosing a different course.

If there is a known valid reason for the deviation, then perhaps this bug report should be closed with a note of such reason.
Comment 11 joezbugz 2015-05-01 19:52:05 UTC
Created attachment 84723
HelloWord2.doc : AOO created file
Comment 12 joezbugz 2015-05-01 19:52:49 UTC
Created attachment 84724
WordDoc.doc : Word 2010 created file
Comment 13 joezbugz 2015-05-01 19:53:40 UTC
Created attachment 84725
AOOnFib.jpg : AOO created file binary version shown
Comment 14 joezbugz 2015-05-01 19:54:29 UTC
Created attachment 84726
WordnFib.jpg : Word 2010 created file binary version shown
Comment 15 Marcus 2017-05-20 10:44:17 UTC
Reset the assignee to the default "issues@openoffice.apache.org".