Summary of Requirements Gleaned From Workshop Position Papers
30-November-1998
Editors:
Paul Cotton
(IBM)
<cotton@ca.ibm.com>
Ashok Malhotra
(IBM)
<petsa@us.ibm.com>
Candidate Requirements for XML Query
Nov 30, 1998
Table of Contents
1. Motivation
2. XML Query Requirements
2.1 Query Language and Structure
2.2 Query Language Facilities
2.3 Querying many documents from many, possibly non-XML, sources
2.4 Security
2.5 Using the XML Query Language
2.6 Other Requirements
3. Papers We Did Not Consider
4. Bibliography
We took the position papers that dealt with requirements
for XML Query and were available online to us on November 30, and attempted to extract a list of requirements from
them. The list is presented below. If we have not represented your
position fairly and accurately please accept our apologies and suggest
corrections. If you wrote in support of a particular requirement and
your paper is not cited below it is probably because we felt it more
important to cover all the requirements rather than cite all the papers
that wrote in support of a particular requirement. Although we have
attempted to group related requirements, the grouping and ordering are
not meant to indicate importance or priority.
- Need for non-procedural query language.
Paper 17 (Eisenberg).
- XML Query should use XML syntax.
Paper 5 (Agranat).
- Build upon syntax used by other XML standards.
XPointer: paper 15 (DeRose). XSL: paper 38 (Schach), paper 49 (XSL),
paper 50 (Bosworth) and paper 53 (MathWG).
All: paper 41 (Simeonov).
- Ability to transmit a query as part of a URL.
Implies syntactic constraints. Paper 46 (Vishwanath).
- Several query languages addressing different user sets.
Paper 26 (Malloy).
- Queries should be XLink and XPointer cognizant.
Paper 23 (Maier). Paper 40 (Shea). Paper 29 (Mecca) would like typed
links.
- Query should support namespaces.
Paper 23 (Maier) and paper 41 (Simeonov) discuss some issues related to
namespace support.
- Support for querying data as well as metadata.
Paper 28 (Masuda), paper 31 (Mihaila), paper 47 (Ward).
- Uniform treatment of attributes and elements.
Paper 34 (Olken), paper 35 (Quark), paper 40 (Shea).
- Need to have a GUI for queries.
Paper 35 (Quark).
- Support for standard query operations.
Several papers. Paper 23 (Maier) from a database perspective asks for:
- Selection: Choosing a document or document element based on
content, structure or attributes. Paper 43 (Tompa) argues that the
document structure is important and should not be lost in an abstract view.
- Extraction: Pulling out particular elements of a document.
- Reduction: Removing selected sub-elements of an element.
- Restructuring: Constructing a new set of element instances to
hold queried data. Paper 2 (Baru) discusses ordering which is a
special case of restructuring. Paper 51 (Denenberg) discusses ranked
retrieval and duplicate removal.
Paper 39 (Seligman) wants complex and powerful transformation and
restructuring capabilities. Note that XSL lays claim to some of this territory.
- Combination: Merging two or more elements into one.
Paper 15 (DeRose) contains an excellent exposition of queries in a
hierarchical space with linking.
- "it must express joins."
Paper 18 (Fernandez).
- Support for insert, update and delete operations.
Paper 28 (Masuda). Paper 3 (Beech) asks for
transaction management.
- Support for nested queries.
Paper 40 (Shea).
- Support for full-text queries.
Paper 33 (Murata) discusses word containment, containment in
order, wildcards and proximity queries. Also support for regular
expressions. Paper 20 (Ishikawa) and Paper 31 (Mihaila) discuss wildcards and regular
expressions. Paper 3 (Beech) asks for SQL-MM like facilities.
Paper 37 (Rys) wants a "mixture of exact queries on the structured part and
information retrieval queries on the unstructured part."
- Provide facilities to construct XML documents.
This is controversial! Paper 23 (Maier) states it as its first
requirement citing the benefits of closure. Paper 33 (Murata)
is equivocal: the query language may support construction but it may
also return data that can be used by the environment in which it
executes, such as XSL or a DOM program for construction or
transformation. Paper 3 (Beech) is also equivocal. Paper 5 (Agranat)
wants query to not support construction.
Paper 50 (Bosworth) wants the query language to describe how the
resultant graph is serialized.
- RDF query requirements.
Requirements such as selection based on
property values, navigating over properties, boolean results from
queries and support for alternate representations are discussed in
papers 12 (Decker), 14 (Cranor) and 24 (Malhotra). Paper 52 (Shklar)
discusses integration of full-text query with RDF query.
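The standard query operations listed above (selection, extraction, reduction, restructuring and combination) can be illustrated with a small sketch. It uses Python's standard xml.etree.ElementTree module rather than any proposed XML query language, and the sample document, element names and attribute names are invented for illustration only. It is meant to show what each operation does to a document, not how a query language should express it.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical sample document; all element and attribute names are invented.
CATALOG = """
<catalog>
  <book id="b1" year="1997">
    <title>Data on the Web</title>
    <price>30</price>
    <notes>internal</notes>
  </book>
  <book id="b2" year="1998">
    <title>Querying XML</title>
    <price>45</price>
    <notes>draft</notes>
  </book>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Selection: choose elements based on an attribute value.
selected = root.findall("book[@year='1998']")

# Extraction: pull out particular elements of the document (every title).
titles = [t.text for t in root.iter("title")]

# Reduction: copy an element, then remove selected sub-elements from the copy
# so that the source document is left unchanged.
reduced = copy.deepcopy(root.find("book[@id='b1']"))
for notes in reduced.findall("notes"):
    reduced.remove(notes)

# Restructuring: construct new element instances to hold the queried data,
# here ordered by price, highest first.
summary = ET.Element("summary")
by_price = sorted(root.findall("book"),
                  key=lambda b: int(b.findtext("price")),
                  reverse=True)
for book in by_price:
    entry = ET.SubElement(summary, "entry", {"ref": book.get("id")})
    entry.text = book.findtext("title")

# Combination: merge the children of two elements into one new element.
combined = ET.Element("merged")
for book in root.findall("book"):
    combined.extend(copy.deepcopy(list(book)))
```

Each step here is a separate procedural call. The requirement for a non-procedural language asks that the same results be obtainable from a single declarative query.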
- Ability to query multiple documents.
Paper 34 (Olken), paper 35 (Quark), paper 43 (Tompa).
- Ability to query distributed data stored on websites in a variety of
formats: relational and OO databases, HTML, XML or ASCII. The XML query is
translated into a query or view on the underlying data representation.
Paper 2 (Baru), paper 5 (Agranat), paper 25 (Madnick), paper 30
(Mendelsohn), paper 43 (Tompa), paper 44 (Valkenburg), paper 46
(Vishwanath), paper 50 (Bosworth).
- Create XML schemas from non-XML data sources.
Paper 29 (Mecca) as well as some of the papers that discuss querying
over diverse data sources cited above.
- Support for "live" data: i.e. data that changes while
user is viewing it.
Paper 46 (Vishwanath).
- "Security is essential on document collections, parts of
collections, and on parts of individual documents."
Paper 39
(Seligman). Paper 47 (Ward).
- Authorization on insert, update, delete operations.
Paper 3 (Beech). Paper 8 (Buneman) wants to store information
about the update -- time, date, author.
- Query should be usable on documents without a
schema.
Paper 23 (Maier).
- If a schema is available it should be possible to use it
to check query correctness.
Paper 5 (Agranat), paper 23 (Maier).
- Queries should incorporate variables from a local context.
Paper 23 (Maier).
- It should be possible to run queries from several environments/contexts.
Paper 11 (Cotton), Paper 23 (Maier). Paper 39 (Seligman).
- Ability to name, store and retrieve queries.
Paper 40 (Shea).
- Support for annotating XML documents.
Paper 8 (Buneman).
- Support for constraints on elements.
Paper 8 (Buneman) wants referential integrity. Paper 26 (Malloy)
wants "A language for specifying and enforcing constraints between
arbitrary document elements."
Paper 37 (Rys) wants "constraints specification and triggers." But
isn't this an XML Schema requirement?
- Inference or Semantic Mediation.
Several papers
worry about the problem of semantic mismatch between the query and the
data. Paper 25 (Madnick) speaks of extracting price information where
the price is expressed in different currencies. Paper 46 (Vishwanath)
speaks of finding "related" or "similar" data. Paper 19 (Guha)
and paper 12 (Decker) address the problem of inference mainly from an RDF
perspective.
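The currency example attributed to paper 25 (Madnick) can be made concrete with a minimal sketch. It assumes that each price carries its currency as an attribute and that a conversion table is available. The document layout, the function names and the conversion rates are all hypothetical, and the rates are made-up round numbers rather than real exchange rates. The point is only that a query such as "offers under 12 USD" gives wrong answers unless prices are mediated into a common unit first.

```python
import xml.etree.ElementTree as ET

# Hypothetical conversion rates to USD; values are invented for illustration.
RATES_TO_USD = {"USD": 1.0, "CAD": 0.5, "GBP": 2.0}

# Hypothetical sample data; element and attribute names are invented.
OFFERS = """
<offers>
  <offer vendor="a"><price currency="USD">10</price></offer>
  <offer vendor="b"><price currency="CAD">30</price></offer>
  <offer vendor="c"><price currency="GBP">4</price></offer>
</offers>
"""


def price_in_usd(price_elem):
    """Convert a <price currency="..."> element to a USD amount."""
    rate = RATES_TO_USD[price_elem.get("currency")]
    return float(price_elem.text) * rate


def offers_under(root, limit_usd):
    """Return vendors whose price, after mediation to USD, is below the limit."""
    return [offer.get("vendor")
            for offer in root.findall("offer")
            if price_in_usd(offer.find("price")) < limit_usd]


root = ET.fromstring(OFFERS)
cheap = offers_under(root, 12.0)

# A naive comparison on the raw numbers ignores the currency entirely.
naive = [offer.get("vendor")
         for offer in root.findall("offer")
         if float(offer.findtext("price")) < 12.0]
```

With these invented rates the mediated query returns vendors a and c, and the naive comparison gives the same answer here only by coincidence, because vendor c's raw price of 4 GBP is also below 12.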
Papers were omitted from the above summary for a variety of reasons.
Some (4,27,42,48) were not available to us on November 24. Others
(1,7,9,13,16,20,21,36) proposed solutions in the form of a particular
syntax or discussed specific systems
rather than discussing requirements.
This is perfectly legitimate in a position paper, but our summary only attempted to
extract requirements rather than list proposed solutions.
Paper 6 (Arocena) presents a model of the web that is neither XML nor
RDF.
Paper 10 (Christian) discusses the Global Information Locator Service,
paper 22 (LeVan) discusses online searching from a library perspective,
paper 32 (Mitchell) discusses querying business documents and paper 45
(Vanderbilt) discusses query languages for scientific data.
While these areas are undoubtedly important, they did not provide new
and different requirements for XML query. Perhaps we missed some
subtle distinctions.
1. Serge Abiteboul (INRIA), Jennifer Widom, Tirthankar Lahiri (Stanford University)
"A Unified Approach for Querying Structured Data and XML"
2. C. Baru, B. Ludäscher, Y. Papakonstantinou, P. Velikhov, V. Vianu
"Features and Requirements for an XML View Definition Language: Lessons from XML Information Mediation"
3. David Beech (Oracle) "Position Paper on Query Languages for the Web"
4. Adam Bosworth (Microsoft) "Querying XML"
5. Agranat Systems "Agranat Systems XML QL Position"
6. Gustavo Arocena (IBM Toronto Laboratory), Alberto Mendelzon (University of Toronto), George Mihaila (University of Toronto)
"Query Languages for the Web"
7. Tim Bray (Textuality)
"Element Sets: A Minimal Basis for an XML Query Engine"
8. Peter Buneman, Alin Deutsch, Wenfei Fan, Hartmut Liefke, Arnaud
Sahuguet, Wang-Chiew Tan (University of Pennsylvania)
"Beyond XML Query Languages"
9. Stefano Ceri, Sara Comai, Ernesto Damiani, Piero Fraternali, Stefano Paraboschi, Letizia Tanca (Politecnico di Milano,
Universita' di Milano)
"XML-GL: A Graphical Language for Querying and Reshaping XML Documents"
10. Eliot Christian (United States Geological Survey)
"Experiences with Information Locator Services"
11. Paul Cotton, David Fallside, Ashok Malhotra (IBM)
"Position paper for the W3C Query Languages Workshop"
12. Stefan Decker (University of Karlsruhe), Dan Brickley (University of Bristol), Janne Saarela (W3C), Jurgen Angele
(University of Karlsruhe)
"A Query Service for RDF"
13. Steven J. DeRose (Inso Corporation and Brown University)
"XQuery: A unified syntax for linking and querying general XML documents"
14. Lorrie Faith Cranor (AT&T)
"Requirements for a P3P Query Language"
15. Steven J. DeRose (Inso and Brown University), C. M. Sperberg-McQueen (W3C and University of Illinois at Chicago),
Bill Smith (Sun Microsystems)
"Queries on Links and Hierarchies"
16. Alin Deutsch (University of Pennsylvania), Mary Fernandez (AT&T Labs), Daniela Florescu (INRIA), Alon Levy
(University of Washington), Dan Suciu (AT&T Labs)
"XML-QL"
17. Andrew Eisenberg (Sybase, Inc.) "QL'98 - Position Paper"
18. Mary Fernandez, Dan Suciu (AT&T Labs)
"A Query Language for XML"
19. R.V. Guha (Netscape), Ora Lassila (Nokia), Eric Miller (OCLC), Dan Brickley (Bristol) "Enabling Inferencing"
20. Hiroshi Ishikawa, Kazumi Kubota, Yasuhiko Kanemasa (Fujitsu Laboratories Ltd.)
"XQL: A Query Language for XML Data"
21. David Konopnicki, Oded Shmueli (Technion)
"WWW Data and Services: Querying, Integration and Automation"
22. Ralph LeVan (OCLC Online Computer Library Center, Inc.)
"Library Experience in Online Searching"
23. David Maier (Oregon Graduate Institute)
"Database Desiderata for an XML Query Language"
24. Ashok Malhotra, Neel Sundaresan (IBM)
"RDF Query Specification"
25. Stuart Madnick, Michael Siegel, Thomas Lee (MIT Sloan)
"The COntext INterchange (COIN) Project: Data Extraction and Interpretation from Semi-Structured Web Sources"
26. Mary Ann Malloy, John C. Schneider (The MITRE Corporation)
"Experiences Designing Query Languages for Hierarchically Structured Text Documents"
27. Massimo Marchiori, Janne Saarela (W3C) "Query + Metadata + Logic = Metalog"
28. Isao Masuda (Information Broadcasting Laboratories, Inc.)
"Position Paper for "Query Language" Workshop"
29. Giansalvatore Mecca (Universita' della Basilicata), Paolo Merialdo (Universita' della Basilicata, Universita' di Roma Tre),
Paolo Atzeni (Universita' di Roma Tre)
"Do we really need a new query language for XML?"
30. Noah Mendelsohn (Lotus Development Corp.)
"Query Languages Workshop Position Paper"
31. George Mihaila (University of Toronto), Louiqa Raschid (University of Maryland)
"Locating Data Repositories using XML"
32. Gail Mitchell (GTE Laboratories Incorporated)
"Querying Business Documents"
33. Makoto Murata (Fuji Xerox Information Systems), Jonathan Robie (Texcel Research)
"Observations on Structured Document Query Languages"
34. Frank Olken, John McCarthy (Lawrence Berkeley National Laboratory)
"Requirements and Desiderata for an XML Query Language"
35. Quark, Inc. "Non-Position Paper for Quark, Inc."
36. Jonathan Robie (Texcel), Joe Lapp (webMethods Inc.), David Schach (Microsoft) "XML Query Language (XQL)"
37. Michael Rys (Stanford University)
"Query Languages for XML Documents: A QL '98 Position Paper"
38. David Schach (Microsoft), Joe Lapp (webMethods Inc.), Jonathan Robie (Texcel)
"Querying and Transforming XML"
39. Len Seligman, Arnon Rosenthal (The MITRE Corporation)
"XML Query Language Requirements of Large, Heterogeneous Organizations"
40. William Shea, Paul Kanevsky, Ramesh Lekshmynarayanan (Merrill Lynch)
"QL'98 Position Paper"
41. Simeon Simeonov (Allaire Corporation)
"Position paper for the W3C Query Language Workshop 3-Dec-98"
42. Ralph Swick (W3C, Cambridge, USA)
"RDF, the Resource Description Framework" (Tutorial)
43. Frank Tompa (University of Waterloo)
"Providing flexible access in a query language for XML"
44. Peter Valkenburg (SURFnet), Dan Brickley (University of Bristol)
"Query Languages Issues in a Distributed Indexing Environment"
45. Peter Vanderbilt (NASA) "Query languages for scientific data"
46. Chidambaram Vishwanath, Gerhard Wetzel, Sankar Virdhagriswaran (Crystaliz, Inc.)
"Querying Database-Backed Web Sites"
47. Nigel Ward, Renato Iannella, Hoylen Sue, Rob McArthur, Jane Hunter (DSTC)
"Position Paper: DSTC Requirements for a Web Query Language"
48. Jennifer Widom (Stanford University)
"Querying XML with Lore"
49. W3C XSL Working Group
"The Query Language Position Paper of the XSL Working Group"
LATE ARRIVALS
50. Adam Bosworth (Microsoft) "Querying XML"
51. Ray Denenberg (Library of Congress) "The Library Perspective"
52. Leon Shklar (Pencom Web Works and Rutgers University)
"QL'98 Position Paper"
53. W3C Math Working Group
"The Query Language Position Paper of the Math Working Group"