Bug 6028 - [FO] fn:id is broken -- gives wrong answers for elements of type ID
Status: CLOSED FIXED
Product: XPath / XQuery / XSLT
Component: Functions and Operators 1.0
Version: Recommendation
Hardware: PC Linux
Importance: P1 major
Target Milestone: ---
Assigned To: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL: http://www.w3.org/TR/xquery-operators...
Reported: 2008-09-05 14:02 UTC by Henry S. Thompson
Modified: 2009-10-16 22:35 UTC (History)
Description Henry S. Thompson 2008-09-05 14:02:44 UTC
fn:id is broken -- it gives wrong answers for elements of type ID:  XML Schema
and XPointer both make clear that an element of type ID identifies its
_parent_, not the element itself.

This is a serious bug because it means XPointer cannot be simply implemented
using an XPath2 implementation out of the box.

I am submitting this personally, but I expect the XML Core WG (the keeper of
the XPointer specs) will endorse this comment at their next meeting
(2008-09-10).
Comment 1 Michael Kay 2008-09-05 17:03:46 UTC
Personal response.

I agree the design is wrong. (I say as much in my book XSLT/XPath Programmer's
Reference 4th ed). The question is what we can now do about it.

I think that issuing an erratum to change the behaviour of the existing
function in such an incompatible way would be very damaging. Not just because
it would break existing code, but also because there would be a period of at
least three years where different implementations in active use would give
different results, and also because books (like my own) and other reference and
tutorial materials will be on sale and in use for many years and will be giving
incorrect information to users, leading to general confusion in the user
community. And because the release cycle for some products is long (up to 3
years in the case of major relational database systems), the goal that XPointer
could use an XPath implementation "out of the box" would not be achieved for
several years.

One solution might be to define a new function with the "improved" behaviour,
to exist alongside the old. This isn't ideal either, because there will still
be an interoperability problem and an education problem, but at least there
won't be a compatibility problem, and the interoperability problem will be that
some implementations accept the new function and others don't, which is a lot
better than having them return fundamentally different results.

Another softer approach might be to introduce a new way of calling the existing
function: if the supplied ID value is prefixed with "#" then the function
exhibits the new behaviour. This approach has the benefit that it makes it
easier for applications to handle the coexistence of implementations that do or
don't implement the erratum - the current specification is that if a value such
as "#D001" is passed to the function, no error is reported and nothing is
selected. So the expression

(id("#D001"),id("D001")/..)[1]

would give the desired result on both pre- and post-erratum implementations.

But of course this isn't a panacea - one aim of the function is that you can
use it to follow a link from an IDREF or IDREFS attribute, and this would
require some string manipulation to make it work.
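
To illustrate why the combined expression is stable across both kinds of implementation, here is a minimal Python model. It is only a sketch: the "#" prefix behaviour is the idea floated in this comment, not anything a specification defines, and the sample document, the ID_TYPED_ELEMENTS set (standing in for schema validation) and all function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: elements with these names are treated as having type xs:ID.
# A real processor would learn this from schema validation.
ID_TYPED_ELEMENTS = {"code"}

ROOT = ET.fromstring("<catalog><item><code>D001</code></item></catalog>")
# ElementTree keeps no parent pointers, so build a child -> parent map
# to model the XPath ".." step.
PARENT = {child: parent for parent in ROOT.iter() for child in parent}


def id_elements(value):
    """ID-typed elements whose text equals value (current fn:id behaviour)."""
    return [e for e in ROOT.iter()
            if e.tag in ID_TYPED_ELEMENTS and e.text == value]


def id_pre_erratum(value):
    # "#D001" matches no ID value, so nothing is selected and no error is raised.
    return id_elements(value)


def id_post_erratum(value):
    # Hypothetical "#" convention: return the parent of the ID element.
    if value.startswith("#"):
        return [PARENT[e] for e in id_elements(value[1:])]
    return id_elements(value)


def portable_lookup(id_fn, value):
    """Model of (id("#V"), id("V")/..)[1]."""
    candidates = id_fn("#" + value) + [PARENT[e] for e in id_fn(value)]
    return candidates[0] if candidates else None
```

With either id_pre_erratum or id_post_erratum passed in, portable_lookup returns the item element that contains the ID-typed child.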

I think we should probably see this problem with a sense of perspective. Many
users are reluctant to use id() anyway, because it only works when the input
document is validated, and because some parsers don't report IDness even when
they are performing validation. Under XSLT, many users prefer to use the key()
function. Under XQuery, they prefer to write an expression such as
//A[@id="D001"] and rely on the optimizer to use whatever indexes are
available. Even if it's broken, do we really need to fix it?
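
For comparison, the attribute-predicate style mentioned above needs no ID typing at all. A minimal Python sketch, using the limited XPath subset in the standard library's ElementTree (the sample document is invented for illustration):

```python
import xml.etree.ElementTree as ET

# Invented sample document; the "id" attribute is an ordinary attribute here,
# not one declared as type ID by a DTD or schema.
root = ET.fromstring(
    '<doc><A id="D001">first</A><B><A id="D002">second</A></B></doc>'
)

# ElementTree's equivalent of //A[@id="D001"], evaluated relative to the root.
matches = root.findall(".//A[@id='D001']")
```

This is a plain tree search on attribute values; whether an index is used is up to the implementation.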
Comment 2 David Ezell 2008-09-05 17:18:44 UTC
On 2008-09-05 the XML Schema WG resolved (minutes to be published RSN) to
endorse this issue, and respectfully request that QT accept this issue as
requiring its immediate attention.  The Schema WG believes that the most
satisfactory dispensation would be to open an erratum against F&O.

Thank you.
Comment 3 Michael Kay 2008-11-11 17:43:18 UTC
I was asked today by the joint XQuery/XSL WG meeting to produce a proposal. The
proposal is to add a new section, identical to 15.5.2 fn:id, with the following
differences:

(a) the function name is fn:element-with-id. (It has the same two signatures as
fn:id).

(b) Rule 2, bullet 1, sub-bullet 1 is changed to read:

The element has a child element node whose is-id property (see Section 5.5
is-id Accessor [DM]) is true, and whose typed value is equal to V under the
rules of the eq operator using the Unicode code point collation
(http://www.w3.org/2005/xpath-functions/collation/codepoint).

(c) The existing Notes under fn:id() are referenced rather than being
duplicated.

(d) A new note is added explaining the problem and the difference between the
two functions, and the approach to introducing it as an optional function by
means of an erratum.

(e) The new function is added by Erratum to XPath 2.0/XQuery 1.0, with a
statement that it is an optional feature for this version. To localize the
changes, we can do this, I think, by means of a statement within the F+O
specification of the function that says "Where any statement in the conformance
rules of a host language requires that all functions in this function library
must be provided, such a statement does not apply to the fn:element-with-id
function (in both its variants): processors may provide this function but are
not required to do so."

(f) The function becomes an integral part of the function library for XPath
2.1/XQuery 1.1.
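
The difference between the two functions for ID-typed elements can be sketched in Python. This is an illustrative model only, not the specification text: attributes of type ID are ignored, string equality stands in for eq under the codepoint collation, and the ID_TYPED_ELEMENTS set (a stand-in for the is-id property a validator would set), the sample document and the function names are all assumptions.

```python
import xml.etree.ElementTree as ET

# Assumption: elements with these names have is-id = true.
ID_TYPED_ELEMENTS = {"code"}

DOC = ET.fromstring(
    "<catalog>"
    "<item><code>D001</code><name>Widget</name></item>"
    "<item><code>D002</code><name>Gadget</name></item>"
    "</catalog>"
)


def is_id_child_with_value(element, value):
    """True if element is ID-typed and its value equals the requested ID."""
    return element.tag in ID_TYPED_ELEMENTS and (element.text or "").strip() == value


def fn_id(root, value):
    """Model of fn:id for ID-typed elements: selects the ID element itself."""
    return [e for e in root.iter() if is_id_child_with_value(e, value)]


def fn_element_with_id(root, value):
    """Model of fn:element-with-id: selects the element that has a child
    element whose is-id property is true and whose value equals the ID."""
    return [parent for parent in root.iter()
            if any(is_id_child_with_value(child, value) for child in parent)]
```

For the value "D001", fn_id selects the code element, while fn_element_with_id selects the enclosing item element, which is the behaviour XML Schema and XPointer expect.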
Comment 4 Michael Kay 2009-01-28 20:18:03 UTC
Erratum E31 has been drafted to define the new function fn:element-with-id().
Comment 5 Jim Melton 2009-02-07 01:04:35 UTC
I'm marking this bug FIXED on the grounds that a new function has been added to
satisfy the requirement as stated without "breaking" the existing function. 

If you are satisfied with this resolution, please mark the bug CLOSED. 
Comment 6 Michael Kay 2009-10-16 20:49:28 UTC
I'm marking this as closed, since it has been fixed by erratum (E31) and the
new text is in the master document for both the 1.0 2e and 1.1 branches.
Comment 7 John Cowan 2009-10-16 22:00:31 UTC
I'm posting even though I agree that this bug is properly closed, because
Michael Kay's comment #1 contains a broken meme I am trying to stamp out.

"id() [...] only works when the input document is validated" is not true. 
Conforming non-validating XML parsers MUST process ATTLIST declarations in the
internal subset that appear before the first reference to an external parameter
entity. Conforming processors that read external DTD subsets and parameter
entities, whether validating or non-validating, MUST process all ATTLIST
declarations.  Once an ATTLIST declaration for a particular attribute has been
processed, the XML parser MUST report it to the application.  So if you write:

<!DOCTYPE root [<!ATTLIST root foo ID #REQUIRED>]>
<root foo="FOO">

every conforming XML parser MUST report the IDness of root/@foo.  What is more,
if you write:

<!DOCTYPE root [<!ATTLIST root foo ID #REQUIRED>]>
<root foo="FOO">

every conforming processor MUST still report the IDness of root/@foo, even
though 32 is not a valid ID.
Comment 8 John Cowan 2009-10-16 22:35:09 UTC
Erm, make the second example:

<!DOCTYPE root [<!ATTLIST root foo ID #REQUIRED>]>
<root foo="32">

(thanks to Paul Grosso)
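
As a small check of the point about non-validating parsers, the following Python sketch feeds the corrected example to Expat (a non-validating parser, via the standard library's xml.parsers.expat module) and records what the parser reports for each ATTLIST declaration. The root element is written as an empty element so that the document is well-formed; the helper name is invented for illustration.

```python
import xml.parsers.expat

# The corrected example from this comment, with the root element self-closed
# so the document is well-formed.
DOC = '<!DOCTYPE root [<!ATTLIST root foo ID #REQUIRED>]>\n<root foo="32"/>'


def declared_attribute_types(xml_text):
    """Return {(element name, attribute name): declared type} as reported
    by Expat for ATTLIST declarations."""
    declared = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_attlist(elname, attname, type_, default, required):
        declared[(elname, attname)] = type_

    parser.AttlistDeclHandler = on_attlist
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return declared


types = declared_attribute_types(DOC)
```

The parse succeeds even though "32" is not a valid ID value, and the declared type of root/@foo is still reported to the application as ID.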

