Issue 74245 - ODFF: LOOKUP in columns if range is wider than tall
Summary: ODFF: LOOKUP in columns if range is wider than tall
Status: CLOSED FIXED
Alias: None
Product: Calc
Classification: Application
Component: programming (show other issues)
Version: OOo 1.0.0
Hardware: All All
: P3 Trivial (vote)
Target Milestone: ---
Assignee: oc
QA Contact: issues@sc
URL:
Keywords: ms_interoperability
Depends on:
Blocks:
 
Reported: 2007-02-06 23:41 UTC by kweinrich
Modified: 2013-08-07 15:15 UTC (History)
4 users (show)

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
test document (141.00 KB, application/vnd.ms-excel)
2008-02-13 04:11 UTC, kyoshida
no flags Details
patch (20.52 KB, patch)
2008-02-13 04:13 UTC, kyoshida
no flags Details | Diff
revised patch (21.30 KB, patch)
2008-02-13 16:29 UTC, kyoshida
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this issue.
Description kweinrich 2007-02-06 23:41:18 UTC
In Excel (V.X), using the LOOKUP command, after the search criteria are selected, one is allowed to select 
either an array $F$3:$J$4, or one may enter as search vector $F$3:$J$3 plus a result vector $F$4:$J$4.  
Both work equally well.  However, it appears that Open Office does not recognize the array method of 
syntax, nor does it translate it to the search vector + result vector syntax upon import from Excel.  

It seems to me for full compatibility, Open Office needs to do one or the either.  

Thank you.  Karl
Comment 1 frank 2007-03-19 15:59:06 UTC
Hi,

not a bug but an enhancement. Re-flagged and re-assigned.

Frank
Comment 2 ooo 2007-03-19 18:15:51 UTC
Generally both forms work well. Could you please elaborate what exactly doesn't
work? Best if you also attach a test case document to this issue to illustrate.

Thanks
  Eike
Comment 3 ooo 2007-03-20 13:02:29 UTC
Ah, I was blind. This is the "more columns than rows" case, where Excel assumes
that values are to be looked up in columns instead of rows. Nasty syntax and
easily breaks documents in case you insert/delete rows in such ranges. Which
most certainly is the reason that LOOKUP is deprecated, better to use HLOOKUP or
VLOOKUP instead.

Accepting issue. This is not an import problem, the column semantics aren't
implemented in the interpreter engine.
Comment 4 Martin Hollmichel 2007-11-09 16:52:45 UTC
change target from 2.x to 3.x according to
http://wiki.services.openoffice.org/wiki/Target_3x
Comment 5 ooo 2007-11-14 16:26:32 UTC
ODFF relevant.
Comment 6 kyoshida 2008-02-12 01:22:08 UTC
I'm working on this currently.  There are more undocumented behaviors of Excel
than just the column vs row direction.  I'm trying to uncover all that.  Since
this involves a matrix handling, it will be a bit tricky to get it all covered.
Comment 7 kyoshida 2008-02-13 04:11:32 UTC
Created attachment 51484 [details]
test document
Comment 8 kyoshida 2008-02-13 04:13:17 UTC
Created attachment 51485 [details]
patch
Comment 9 kyoshida 2008-02-13 04:17:45 UTC
This patch is against CWS odff (base m244).

Hopefully I've covered all the Excel's behaviors.  That test document covers a
wide range of scenarios, and with this patch, Calc's LOOKUP behave identically
to Excel's under the scenarios tested in the document.
Comment 10 ooo 2008-02-13 12:33:44 UTC
Thank you, Kohei!
You mentioned cases that are probably not covered by the ODFF spec. Could you
please give details, so I can add those to the specification?
Thanks
  Eike
Comment 11 kyoshida 2008-02-13 16:28:31 UTC
Ok, here is my note.  Let me just copy-n-paste.

---------------------------------------------------------------------------
* 2-parameter case ([lookup value], [lookup-result matrix])

  * The search direction, either horizontal-search or vertical-search, 
    is determined from the dimensions of the matrix; if the data 
    matrix is wider than taller, the search direction is horizontal, 
    and if it is taller than wider, the direction is vertical.  If the 
    data matrix is a square, the search direction is vertical.  
    
  * In vertical search mode, only the first column gets searched, and 
    the last column is used as the result array.  Those columns that 
    are not either the first nor the last are simply not used.  In 
    case of a one-column matrix, the search column and the result 
    column are identical.  These rules also apply in horizontal search 
    mode but to rows instead of columns.  
        
* 3-parameter case ([lookup value], [lookup matrix], [result matrix(vector)])

  * The result matrix must be a vector, or it raises an error.

  * The lookup matrix does not need to be a vector.  If it's not a 
    vector only the first column or row gets searched (= search 
    vector).  [NOTE: the ODFF spec requires this to be a vector, but 
    Excel does not.] 
    
  * The search direction is determined from the dimensions of the 
    lookup matrix.  The same rules as the 2-parameter case apply.  

  * The directions of the lookup matrix and the result matrix can 
    differ.  [NOTE: the ODFF spec does not mention this.] 

  * The lengths of the search vector and the result vector do not need 
    to be identical; however, when the match position falls outside 
    the length of the result vector, an error is raised if the result 
    vector is given as an array object.  If it's a cell range, it gets 
    automatically extended to the length of the searched vector.  If 
    the cell range cannot be extended due to size limit, then '0' is 
    displayed.  [NOTE: the ODFF spec requires the search vector and 
    the result vector to be of the same length.] 
---------------------------------------------------------------------------

The spec covers the 2-parameter case pretty well, and my note reveals nothing
new here.  But for the 3-parameter case there are some behaviors of Excel that
are not mentioned or not accurately depicted in the ODFF spec.

Actually I have a new patch to cover the last item of my note.  I'll attach it
here shortly.
Comment 12 kyoshida 2008-02-13 16:29:53 UTC
Created attachment 51498 [details]
revised patch
Comment 13 kweinrich 2008-02-13 18:04:53 UTC
I wish to thank all of you for efforts in resolving this.  We are hoping to migrate to the NeoOffice flavor of 
OpenOffice within the next year.  It is efforts like these that make this more possible.  

Karl
Comment 14 ooo 2008-04-11 19:48:58 UTC
Hi Kohei,

I have some doubt about a detail of the latest patch. In
lcl_CompareMatrix2Query() this

    if (rStr1 == rStr2)
        return 0;

    return rStr1 < rStr2 ? -1 : 1;

changed to

    return rStr1.CompareIgnoreCaseToAscii(rStr2); // case-insensitive

As the method name says that works only for ASCII strings. To make that work
with Unicode, either collation or transliteration have to be used. As the
comparison is done in a binary search on data that is assumed to be sorted,
using ScGlobal::pCollator->compareString() should be appropriate.

lcl_CompareMatrix2Query() is also used in ScInterpreter::ScMatch(), I guess
there shouldn't be any negative impact by these changes, correct?

Thanks
  Eike
Comment 15 kyoshida 2008-04-14 06:00:24 UTC
>I guess there shouldn't be any negative impact by these changes, correct?

I think the change you're suggesting is the right one.  Please go ahead with the
change. :-)

Kohei
Comment 16 ooo 2008-04-23 22:19:38 UTC
In cws odff03:

sc/source/core/tool/interpr1.cxx  1.56.20.6

Several changes were necessary:

Loading the attached test case document in a non-product gave lots of
Error: ScMatrix::GetDouble: dimension error
From File .../odff03/DEV300/src.m3/sc/source/core/tool/scmatrix.cxx at Line 430

because nDelta==-1 if no value was found (already the first item greater
than query) and nFirst==0

    if (nLen == 1) // first and last items are next to each other.
    {
        nDelta = nCmp < 0 ? nLast - 1 : nFirst - 1;

These cases then showed up as not being identical to Excel results.


In case the result array was shorter than the lookup array an off-bounds
memory access happened with pResMat->Get...(nDelta), returning an
arbitrary value if lucky ;-) and pResMat->IsValue(nDelta) returned TRUE
before. Also showed up as not being identical to Excel results in cells
B70:B71 (0 displayed instead of #N/A).

Introduced handling of mixed numeric/string of both query and matrix
element and empty matrix element in lcl_CompareMatrix2Query()

Introduced handling of mixed numeric/string/empty matrix elements in
lcl_GetLastMatch(), also there the result was arbitrary for
a rMat.GetDouble() if the element was a string instead, usually not
identical though to the value to be matched, so it wasn't obvious.
Comment 17 ooo 2008-05-09 17:38:17 UTC
Reassigning to QA for verification.
Comment 18 oc 2008-06-09 12:26:26 UTC
verified in internal build cws_odff3
Comment 19 oc 2008-10-17 10:18:14 UTC
closed because fix available in builds OOO300_m9 and DEV300_m33