Bug 5290 - [XQuery] Unclear meaning of "collation" in order-by clause
Status: CLOSED FIXED
Product: XPath / XQuery / XSLT
Component: XQuery 1.0
Version: Recommendation
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assigned To: Don Chamberlin
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
 
Reported: 2007-11-27 18:38 UTC by Don Chamberlin
Modified: 2008-09-03 19:10 UTC (History)

Description Don Chamberlin 2007-11-27 18:38:48 UTC
In XQuery Section 3.8.3 (Order By and Return clauses), the "greater-than"
relationship between two ordering keys V and W, when a collation C is
specified, is defined by fn:compare(V, W, C). The document does not specify
that this definition applies only if the keys are strings. Consider the
following case:

order by $salary collation "fr-ca" 

If the type of $salary is xs:decimal, an error will result because fn:compare()
requires strings for its first two parameters, and xs:decimal is not promotable
to xs:string.

If this is really the behavior we want, we should add a note that says this
explicitly and designates the error code (XPST0017, No Such Function). If we
think this is a confusing and unhelpful error code, we should invent a new code
for this case. Alternatively, we could say that the collation applies only if
the ordering keys are valid operands of fn:compare (strings or promotable to
strings); otherwise the keys are compared using gt and the collation clause is
ignored.
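
The two behaviours described above can be modelled roughly in Python (not XQuery). This is only an illustrative sketch: the names `compare`, `key_greater_strict`, `key_greater_lenient`, and the `fr_ca` stand-in (here simply `str.casefold`) are assumptions made for the example and do not correspond to any real XQuery implementation or to the real "fr-ca" collation.

```python
from decimal import Decimal


def compare(v, w, collation_key):
    """Rough stand-in for fn:compare: accepts string operands only."""
    if not (isinstance(v, str) and isinstance(w, str)):
        raise TypeError("compare requires string operands")
    kv, kw = collation_key(v), collation_key(w)
    return (kv > kw) - (kv < kw)  # -1, 0, or 1


def key_greater_strict(v, w, collation_key):
    """Existing text: the collation is always handed to compare.

    Returns True when W is greater-than V; raises for non-string keys.
    """
    return compare(v, w, collation_key) < 0


def key_greater_lenient(v, w, collation_key):
    """Alternative: use the collation for strings, otherwise plain gt."""
    if isinstance(v, str) and isinstance(w, str):
        return compare(v, w, collation_key) < 0
    return w > v  # collation ignored for non-string keys


# Hypothetical placeholder for the "fr-ca" collation.
fr_ca = str.casefold

salary_v, salary_w = Decimal("1000.00"), Decimal("2500.00")

try:
    key_greater_strict(salary_v, salary_w, fr_ca)
    strict_outcome = "no error"
except TypeError:
    strict_outcome = "error"

lenient_outcome = key_greater_lenient(salary_v, salary_w, fr_ca)
```

Under this model the strict reading fails for decimal keys, while the lenient reading orders them with an ordinary comparison and still applies the collation to string keys.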
Comment 1 Michael Kay 2007-11-27 22:12:22 UTC
It's worth pointing out that in functions like distinct-values, min, and max,
the collation is ignored if not required. I think that should be the behaviour
for order-by as well. (Indeed, I assumed it was; and making the change in this
direction would definitely involve less disruption for users.)
Comment 2 Don Chamberlin 2008-01-08 19:49:29 UTC
The working group discussed this bug report on 8 Jan 2008. Two alternatives
were discussed:

(1) Close the bug with no changes. In this case, specifying a collation for a
non-string ordering-key is an error. It was observed that this behavior is
inconsistent with max, min, distinct-values, and possibly a future group-by.

(2) Change XQuery Section 3.8.3 as follows:
When two orderspec values V and W are compared to determine their relative
order in the ordering sequence, the "greater-than" relationship is defined as
follows:
When the orderspec specifies "empty least", the following rules are applied in
order:
(a) If V is an empty sequence and W is not an empty sequence, then W
"greater-than" V is true.
(b) If V is NaN and W is neither NaN nor an empty sequence, then W
"greater-than" V is true.
(c) If a specific collation C is specified, and V and W are both of type
xs:string or are convertible to xs:string by subtype substitution and/or type
promotion, and fn:compare(V, W, C) is less than zero, then W "greater-than" V
is true.
(d) If W gt V is true, then W "greater-than" V is true.
(e) If none of the above rules apply, then W "greater-than" V is false.
Analogous changes apply when the orderspec specifies "empty greatest".

If adopted, these changes could be made effective immediately by an erratum to
XQuery 1.0, or could be introduced in XQuery 1.1 as relaxation of an error
condition.

This bug report remains open pending discussion of these alternatives.
--Don Chamberlin (for the Query Working Group)
Comment 3 Don Chamberlin 2008-01-26 22:14:31 UTC
On 23 Jan 2008, the Query Working Group considered this bug report and decided
to accept alternative (2) in Comment #2. This will have the effect that a
specified collation is ignored if the ordering key is not convertible to the
type xs:string. This change will be published in an erratum to XQuery 1.0.
--Don Chamberlin (for the Query Working Group)
Comment 4 Don Chamberlin 2008-09-03 19:10:23 UTC
Mike Kay has pointed out that the text in Comment #2 does not accurately
reflect the intent of the working group that ordering is based on fn:compare()
when a collation is specified and the ordering keys are convertible to strings.
As stated in Comment #2, if fn:compare() returns a non-negative result (Rule
c), the algorithm falls through to Rule d and applies the "gt" operator. This
is a mistake.
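
The fall-through can be illustrated with a small Python model of rules (c) to (e) from Comment #2, restricted to string keys. It is a sketch under stated assumptions: `greater_comment2` is a hypothetical name, the collation is modelled as a key function (`str.casefold` standing in for a case-insensitive collation), and "gt" is modelled as Python's code-point string comparison, which assumes a code-point default collation.

```python
def compare(v, w, collation_key):
    """Rough stand-in for fn:compare on two strings: returns -1, 0, or 1."""
    kv, kw = collation_key(v), collation_key(w)
    return (kv > kw) - (kv < kw)


def greater_comment2(v, w, collation_key):
    """Is W greater-than V under rules (c)-(e) as worded in Comment #2?"""
    if compare(v, w, collation_key) < 0:  # rule (c)
        return True
    if w > v:                             # rule (d): falls through to gt
        return True
    return False                          # rule (e)


# Hypothetical case-insensitive collation.
case_insensitive = str.casefold

# Is "apple" greater-than "Banana"? The collation says no, but rule (d)
# says yes, because "a" has a higher code point than "B".
apple_over_banana = greater_comment2("Banana", "apple", case_insensitive)

# Is "Banana" greater-than "apple"? Rule (c) says yes.
banana_over_apple = greater_comment2("apple", "Banana", case_insensitive)
```

In this model each key is reported as greater than the other, so the relation no longer defines a consistent ordering.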

The corrected rules are as follows:

When the orderspec specifies empty least, the following rules are applied in
order:
(1) If V is an empty sequence and W is not an empty sequence, then W
greater-than V is true.
(2) If V is NaN and W is neither NaN nor an empty sequence, then W greater-than
V is true.
(3) If a specific collation C is specified, and V and W are both of type
xs:string or are convertible to xs:string by subtype substitution and/or type
promotion, then:
If fn:compare(V, W, C) is less than zero, then W greater-than V is true;
otherwise W greater-than V is false.
(4) If none of the above rules apply, then:
If W gt V is true, then W greater-than V is true; otherwise W greater-than V is
false.

An analogous set of rules applies when the orderspec specifies empty greatest.
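
As an illustration, the corrected "empty least" rules can be modelled in Python. This is a sketch, not the normative definition. It assumes that `None` stands for the empty sequence, that a float NaN stands for NaN, that the collation is a key function, and that "convertible to xs:string" is simplified to "is a Python `str`". The names `greater_than` and `order_by` are hypothetical.

```python
import math
from functools import cmp_to_key


def is_nan(x):
    """True only for a float NaN."""
    return isinstance(x, float) and math.isnan(x)


def greater_than(v, w, collation_key=None):
    """Is W greater-than V under the corrected 'empty least' rules?"""
    if v is None and w is not None:                    # rule (1)
        return True
    if is_nan(v) and not is_nan(w) and w is not None:  # rule (2)
        return True
    if (collation_key is not None
            and isinstance(v, str) and isinstance(w, str)):
        # rule (3): the collation alone decides; no fall-through to gt
        return collation_key(v) < collation_key(w)
    if v is None or w is None:
        return False  # gt involving an empty sequence is never true
    return w > v                                       # rule (4)


def order_by(keys, collation_key=None):
    """Sort ordering keys ascending using the relation above."""
    def cmp(a, b):
        if greater_than(a, b, collation_key):
            return -1  # b is greater, so a sorts first
        if greater_than(b, a, collation_key):
            return 1
        return 0
    return sorted(keys, key=cmp_to_key(cmp))


# Hypothetical case-insensitive collation.
case_insensitive = str.casefold

names = order_by(["Banana", "apple", "cherry"], case_insensitive)
salaries = order_by([30, 10, 20], case_insensitive)  # collation ignored
mixed = order_by([3.0, None, float("nan"), 1.0])
```

With string keys the collation decides the order; with numeric keys the collation argument has no effect, which matches the working group's decision that a specified collation is ignored for keys not convertible to xs:string.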

Since this change simply corrects the wording so that it accurately reflects
the decision of the working group, I am leaving this bug report in "Closed" status.

Don Chamberlin

