The document provides an overview of XML (Extensible Markup Language). It describes XML as a text-based markup language derived from SGML that uses tags to identify and organize data rather than display it like HTML. The document outlines key characteristics of XML including that it is extensible, carries data without presenting it, and is an open standard. It also provides examples of XML usage and describes the basic syntax and components of XML documents and elements.
1. UNIT-II XML
Introduction to XML
XML stands for Extensible Markup Language. It is a text-based markup language derived from
Standard Generalized Markup Language (SGML).
XML tags identify the data and are used to store and organize it, whereas HTML tags specify
how the data is displayed. XML is not going to replace HTML in the near future, but it
introduces new possibilities by adopting many successful features of HTML.
There are three important characteristics of XML that make it useful in a variety of systems and
solutions:
XML is extensible: XML allows you to create your own self-descriptive tags, or language, that
suits your application.
XML carries the data, does not present it: XML allows you to store the data irrespective of
how it will be presented.
XML is a public standard: XML was developed by an organization called the World Wide
Web Consortium (W3C) and is available as an open standard.
XML Usage
A short list of XML uses says it all:
XML can work behind the scenes to simplify the creation of HTML documents for large web
sites.
XML can be used to exchange information between organizations and systems.
XML can be used for offloading and reloading of databases.
XML can be used to store and arrange data in a way that suits your data handling needs.
XML can easily be merged with style sheets to create almost any desired output.
Virtually any type of data can be expressed as an XML document.
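As a rough illustration of the offloading and reloading use above, the following Python sketch writes a few records to XML and reads them back with the standard-library xml.etree.ElementTree module. The rows and the element names customers, customer, name and city are hypothetical and stand in for real database content.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, standing in for records read from a database table.
rows = [
    {"id": "1", "name": "Asha", "city": "Hyderabad"},
    {"id": "2", "name": "Ravi", "city": "Pune"},
]

# Offload: build one <customer> element per row under a single root.
root = ET.Element("customers")
for row in rows:
    customer = ET.SubElement(root, "customer", id=row["id"])
    ET.SubElement(customer, "name").text = row["name"]
    ET.SubElement(customer, "city").text = row["city"]
xml_text = ET.tostring(root, encoding="unicode")

# Reload: parse the text again and rebuild the same list of dictionaries.
reloaded = [
    {"id": c.get("id"), "name": c.findtext("name"), "city": c.findtext("city")}
    for c in ET.fromstring(xml_text)
]
print(reloaded == rows)
```

The round trip keeps every value as a string, which is one reason real exchanges usually agree on a DTD or schema as well.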
What is Markup?
XML is a markup language that defines a set of rules for encoding documents in a format that
is both human-readable and machine-readable. So what exactly is a markup language?
Markup is information added to a document that enhances its meaning in certain ways, in
that it identifies the parts and how they relate to each other. More specifically, a markup
language is a set of symbols that can be placed in the text of a document to demarcate and
label the parts of that document.
The following example shows how XML markup looks when embedded in a piece of text:
<message>
<text>Hello, world!</text>
</message>
This snippet includes the markup symbols, or the tags such as
<message>...</message> and <text>...</text>. The tags <message> and
</message> mark the start and the end of the XML code fragment. The tags <text> and
</text> surround the text Hello, world!.
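To see how software reads this markup, here is a minimal Python sketch that uses the standard-library xml.etree.ElementTree module. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

fragment = "<message><text>Hello, world!</text></message>"

root = ET.fromstring(fragment)      # parse the string into an element tree
print(root.tag)                     # name of the outermost element
child = root.find("text")           # first child element named "text"
print(child.tag, "->", child.text)  # the tag and the text it surrounds
```

The parser reports the tags as element names and the text between them as content, which is the labelling role of markup described above.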
2. Is XML a Programming Language?
A programming language consists of grammar rules and its own vocabulary, which are used to
create computer programs. These programs instruct the computer to perform specific tasks.
XML does not qualify as a programming language because it does not perform any computation
or algorithms. It is usually stored in a simple text file and is processed by special
software that is capable of interpreting XML.
Tags and Elements
An XML file is structured by several XML elements, also called XML nodes or XML tags.
The names of XML elements are enclosed in angle brackets < > as shown below:
<element>
Syntax Rules for Tags and Elements
Element Syntax: Each XML element must be closed, either with a matching end tag as shown
below:
<element>....</element>
or, for an element with no content, with a self-closing tag:
<element/>
Nesting of elements: An XML element can contain multiple XML elements as its children,
but the child elements must not overlap, i.e., the end tag of an element must have the same
name as the most recent unmatched start tag.
The following example shows incorrectly nested tags:
<?xml version="1.0"?>
<contact-info>
<company>IARE
</contact-info>
</company>
The following example shows correctly nested tags:
<?xml version="1.0"?>
<contact-info>
<company>IARE</company>
</contact-info>
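The nesting rule can be checked mechanically. The sketch below is one way to do it in Python with the standard-library parser; the helper name is_well_formed is made up for this illustration, and the two strings assume the overlapping and properly nested layouts that the examples above are meant to show.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# </contact-info> arrives while <company> is still open: the elements overlap.
overlapping = "<contact-info><company>IARE</contact-info></company>"
# <company> is closed before its parent: the elements are properly nested.
nested = "<contact-info><company>IARE</company></contact-info>"

print(is_well_formed(overlapping))
print(is_well_formed(nested))
```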
Let us learn about one of the most important parts of XML, the XML tags. XML tags form the
foundation of XML. They define the scope of an element in the XML document. They can also be
used to insert comments, declare settings required by the parsing environment, and insert
special instructions.
We can broadly categorize XML tags as follows:
Start Tag
The beginning of every non-empty XML element is marked by a start-tag. An example of a
start-tag is:
<address>
End Tag
Every element that has a start-tag must end with an end-tag. An example of an end-tag is:
</address>
Note that an end-tag includes a solidus ("/") before the name of the element.
3. Empty Tag
The text that appears between the start-tag and the end-tag is called content. An element
that has no content is termed empty. An empty element can be represented in two ways:
(1) A start-tag immediately followed by an end-tag, as shown below:
<hr></hr>
(2) A single empty-element tag, as shown below:
<hr />
Empty-element tags may be used for any element which has no content.
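The two spellings of an empty element are equivalent to a parser. The following Python sketch, using the standard-library xml.etree.ElementTree module, parses both forms and compares the results; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<hr></hr>")
short_form = ET.fromstring("<hr />")

for element in (long_form, short_form):
    # An empty element has no text (None) and no child elements (length 0).
    print(element.tag, element.text, len(element))

# Both forms serialize to the same text once parsed.
same_output = ET.tostring(long_form) == ET.tostring(short_form)
print(same_output)
```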
XML Tags Rules
The following rules must be followed when using XML tags:
Rule 1
XML tags are case-sensitive. The following line is an example of wrong syntax: the end tag
</Address> differs in case from the start tag <address>, which XML treats as an error.
<address>This is wrong syntax</Address>
The following line shows the correct way, where the start and the end tag use the same case:
<address>This is correct syntax</address>
Rule 2
XML tags must be closed in an appropriate order, i.e., an XML tag opened inside another
element must be closed before the outer element is closed. For example:
<outer_element>
<internal_element>
This tag is closed before the outer_element
</internal_element>
</outer_element>
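Rule 1 can be observed directly with a parser. This Python sketch feeds the two address lines from Rule 1 to the standard-library xml.etree.ElementTree module and records whether each is accepted; the results list is only a convenience for this illustration.

```python
import xml.etree.ElementTree as ET

samples = [
    "<address>This is wrong syntax</Address>",
    "<address>This is correct syntax</address>",
]

results = []
for text in samples:
    try:
        ET.fromstring(text)
        results.append("accepted")
    except ET.ParseError as error:
        # error.position is a (line, column) tuple reported by the parser.
        results.append(f"rejected at line {error.position[0]}")

print(results)
```

The first sample fails because the parser treats Address and address as two different names, so the end tag does not match the open element.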
XML Elements
XML elements can be defined as the building blocks of an XML document. Elements can behave as
containers to hold text, elements, attributes, media objects or all of these.
Each XML document contains one or more elements, the scope of which is either delimited by
start and end tags or, for empty elements, by an empty-element tag.
Syntax
Following is the syntax to write an XML element:
<element-name attribute1 attribute2>
....content
</element-name>
where
element-name is the name of the element. The name and its case must match in the start and
end tags.
attribute1, attribute2 are attributes of the element, separated by white space. An attribute
defines a property of the element. It associates a name with a value, which is a string of
characters. An attribute is written as:
name = "value"
The name is followed by an = sign and a string value inside double (" ") or single (' ')
quotes.
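The name and value pairs described above are what a parser hands back as attributes. A small Python sketch with the standard-library xml.etree.ElementTree module follows; the city attribute is hypothetical and is added only to show single quotes.

```python
import xml.etree.ElementTree as ET

# Double and single quotes are both accepted around attribute values.
element = ET.fromstring("""<address category="residence" city='Pune'>...</address>""")

print(element.attrib)                # all attributes as a dictionary
print(element.get("category"))       # value of one attribute
print(element.get("country", "n/a")) # default for an attribute that is absent
```

Every attribute value arrives as a string, whichever quoting style was used in the document.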
4. Empty Element
An empty element (an element with no content) has the following syntax:
<name attribute1 attribute2.../>
Example of an XML document using various XML elements:
<?xml version="1.0"?>
<contact-info>
<address category="residence">
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
</contact-info>
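The tree structure of this document can be walked programmatically. The sketch below is an illustration in Python using the standard-library parser; it assumes the address element is closed with </address>, and the generator name walk is made up for this example.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0"?>
<contact-info>
  <address category="residence">
    <name>Tanmay Patil</name>
    <company>TutorialsPoint</company>
    <phone>(011) 123-4567</phone>
  </address>
</contact-info>"""

root = ET.fromstring(document)

def walk(element, depth=0):
    """Yield (depth, tag, text) for the element and all of its descendants."""
    text = (element.text or "").strip()
    yield depth, element.tag, text
    for child in element:
        yield from walk(child, depth + 1)

# Print each element indented by its depth in the tree.
for depth, tag, text in walk(root):
    print("  " * depth + tag, text)
```

Container elements such as contact-info and address carry only white space as text, while the leaf elements carry the data.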
XML Elements Rules
The following rules must be followed for XML elements:
An element name can contain any alphanumeric characters. The only punctuation
marks allowed in names are the hyphen (-), underscore (_) and period (.).
Names are case-sensitive. For example, Address, address, and ADDRESS are
different names.
The names in the start and end tags of an element must be identical.
An element, which is a container, can contain text or elements, as seen in the above
example.
Root element: An XML document can have only one root element. For example, the following
is not a well-formed XML document, because both the x and y elements occur at the top level
without a root element:
<x>...</x>
<y>...</y>
The following example shows a correctly formed XML document:
<root>
<x>...</x>
<y>...</y>
</root>
Case sensitivity: The names of XML elements are case-sensitive. That means the names in
the start and end tags must be in exactly the same case.
For example, <contact-info> is different from <Contact-Info>.
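The root-element and case-sensitivity rules can be observed with a short sketch, again assuming the JDK's built-in DOM parser. The class name ElementRules, the helper parse and the sample element names are illustrative only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ElementRules {
    // Parses the text; throws SAXException if it is not well-formed.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setErrorHandler(new DefaultHandler()); // rethrow, do not print
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Names that differ only in case are different elements.
        Document doc = parse("<root><address>a</address><Address>b</Address></root>");
        System.out.println(doc.getElementsByTagName("address").getLength()); // 1
        System.out.println(doc.getElementsByTagName("ADDRESS").getLength()); // 0

        // Two top-level elements without a common root are rejected.
        try {
            parse("<x>...</x><y>...</y>");
            System.out.println("accepted");
        } catch (SAXException e) {
            System.out.println("rejected");
        }
    }
}
```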
XML DTD
What is a DTD?
A DTD is a Document Type Definition.
A DTD defines the structure and the legal elements and attributes of an XML document.
Why Use a DTD?
With a DTD, independent groups of people can agree on a standard DTD for interchanging data.
An application can use a DTD to verify that XML data is valid.
An Internal DTD Declaration
If the DTD is declared inside the XML file, it must be wrapped inside the <!DOCTYPE>
definition:
XML document with an internal DTD
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
The DTD above is interpreted like this:
!DOCTYPE note defines that the root element of this document is note
!ELEMENT note defines that the note element must contain four elements, in this order:
"to,from,heading,body"
!ELEMENT to defines the to element to be of type "#PCDATA"
!ELEMENT from defines the from element to be of type "#PCDATA"
!ELEMENT heading defines the heading element to be of type "#PCDATA"
!ELEMENT body defines the body element to be of type "#PCDATA"
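An application can check a document against this DTD by switching on validation in the parser. The sketch below assumes the JDK's DocumentBuilderFactory; the class name DtdValidation and the helpers withDtd and validate are illustrative names. Validity problems are collected through an ErrorHandler rather than thrown.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class DtdValidation {
    // Prepends the internal DTD from the text to a <note> element.
    static String withDtd(String noteXml) {
        return "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE note ["
            + "<!ELEMENT note (to,from,heading,body)>"
            + "<!ELEMENT to (#PCDATA)>"
            + "<!ELEMENT from (#PCDATA)>"
            + "<!ELEMENT heading (#PCDATA)>"
            + "<!ELEMENT body (#PCDATA)>"
            + "]>" + noteXml;
    }

    // Returns the validity errors reported by the parser (empty if valid).
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { }
            @Override public void error(SAXParseException e) {
                errors.add(e.getMessage()); // a validity error
            }
            @Override public void fatalError(SAXParseException e) throws SAXException {
                throw e; // not well-formed at all
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = "<note><to>Tove</to><from>Jani</from>"
            + "<heading>Reminder</heading><body>Don't forget me</body></note>";
        String missingParts = "<note><to>Tove</to><from>Jani</from></note>";
        System.out.println(validate(withDtd(valid)).isEmpty());        // true
        System.out.println(validate(withDtd(missingParts)).isEmpty()); // false
    }
}
```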
An External DTD Declaration
If the DTD is declared in an external file, the <!DOCTYPE> definition must contain a reference
to the DTD file:
XML document with a reference to an external DTD
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
And here is the file "note.dtd", which contains the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
DTD - XML Building Blocks
The main building blocks of both XML and HTML documents are elements.
The Building Blocks of XML Documents
Seen from a DTD point of view, all XML documents are made up of the following building
blocks:
Elements
Attributes
Entities
PCDATA
CDATA
Elements:
Elements are the main building blocks of both XML and HTML documents.
Examples of HTML elements are "body" and "table". Examples of XML elements could be
"note" and "message". Elements can contain text, other elements, or be empty. Examples of
empty HTML elements are "hr", "br" and "img".
Examples:
<body>some text</body>
<message>some text</message>
Attributes:
Attributes provide extra information about elements.
Attributes are always placed inside the opening tag of an element. Attributes always come in
name/value pairs. The following "img" element has additional information about a source file:
<img src="computer.gif" />
The name of the element is "img". The name of the attribute is "src". The value of the attribute is
"computer.gif". Since the element itself is empty it is closed by a " /".
Entities:
Some characters have a special meaning in XML, like the less-than sign (<) that defines the start
of an XML tag.
Most of you know the HTML entity "&nbsp;". This "non-breaking space" entity is used in
HTML to insert an extra space in a document. Entities are expanded when a document is parsed
by an XML parser.
The following entities are predefined in XML:
Entity References    Character
&lt;                 <
&gt;                 >
&amp;                &
&quot;               "
&apos;               '
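Entity expansion is easy to observe from code. This is a minimal sketch, assuming the JDK's DOM parser; the class name EntityDemo, the helper textOf and the sample element names are illustrative. The text returned by the parser already contains the real characters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class EntityDemo {
    // Parses the text and returns the character data of the root element.
    static String textOf(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The parser replaces each predefined entity with its character.
        System.out.println(textOf("<rule>a &lt; b &amp;&amp; c &gt; d</rule>"));
        // prints: a < b && c > d
        System.out.println(textOf("<said>&quot;hi&quot; &apos;there&apos;</said>"));
        // prints: "hi" 'there'
    }
}
```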
PCDATA:
PCDATA means parsed character data.
Think of character data as the text found between the start tag and the end tag of an XML
element.
PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for
entities and markup.
Tags inside the text will be treated as markup and entities will be expanded.
However, parsed character data should not contain any &, <, or > characters; these need to be
represented by the &amp;, &lt; and &gt; entities, respectively.
CDATA
CDATA means character data.
CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as
markup and entities will not be expanded.
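The difference between the two can be demonstrated with a short sketch, assuming the JDK's DOM parser and the standard <![CDATA[ ... ]]> section syntax. The class name CdataDemo, the helper root and the <example> element are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class CdataDemo {
    // Parses the text and returns the root element.
    static Element root(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        // PCDATA: <b> is treated as markup and &amp; is expanded.
        Element parsed = root("<example><b>bold</b> &amp; more</example>");
        System.out.println(parsed.getElementsByTagName("b").getLength()); // 1
        System.out.println(parsed.getTextContent()); // bold & more

        // CDATA section: the same characters are kept as plain text.
        Element literal = root("<example><![CDATA[<b>bold</b> & more]]></example>");
        System.out.println(literal.getElementsByTagName("b").getLength()); // 0
        System.out.println(literal.getTextContent()); // <b>bold</b> & more
    }
}
```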
XML Schema
An XML Schema describes the structure of an XML document, just like a DTD.
An XML document with correct syntax is called "Well Formed".
An XML document validated against an XML Schema is both "Well Formed" and "Valid".
XML Schema
XML Schema is an XML-based alternative to DTD:
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The Schema above is interpreted like this:
<xs:element name="note"> defines the element called "note"
<xs:complexType> the "note" element is a complex type
<xs:sequence> the complex type is a sequence of elements
<xs:element name="to" type="xs:string"> the element "to" is of type string (text)
<xs:element name="from" type="xs:string"> the element "from" is of type string
<xs:element name="heading" type="xs:string"> the element "heading" is of type string
<xs:element name="body" type="xs:string"> the element "body" is of type string
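Validation against this schema can be sketched with the javax.xml.validation API that ships with the JDK. The class name SchemaValidation and the helper isValid are illustrative names, and the schema fragment is wrapped in an <xs:schema> root element, which is an assumption about the complete schema file since the fragment above omits it.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class SchemaValidation {
    // The schema from the text, wrapped in an xs:schema root (assumed).
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"note\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"to\" type=\"xs:string\"/>"
        + "<xs:element name=\"from\" type=\"xs:string\"/>"
        + "<xs:element name=\"heading\" type=\"xs:string\"/>"
        + "<xs:element name=\"body\" type=\"xs:string\"/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true when the document is valid against XSD.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or not valid
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<note><to>Tove</to><from>Jani</from>"
            + "<heading>Reminder</heading><body>Hello</body></note>")); // true
        // <from> before <to> breaks the declared sequence.
        System.out.println(isValid("<note><from>Jani</from><to>Tove</to>"
            + "<heading>Reminder</heading><body>Hello</body></note>")); // false
    }
}
```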
XML Schemas are More Powerful than DTD
XML Schemas are written in XML
XML Schemas are extensible to additions
XML Schemas support data types
XML Schemas support namespaces
Why Use an XML Schema?
With XML Schema, your XML files can carry a description of their own format.
With XML Schema, independent groups of people can agree on a standard for interchanging
data.
With XML Schema, you can verify data.
XML Schemas Support Data Types
One of the greatest strengths of XML Schemas is the support for data types:
It is easier to describe document content
It is easier to define restrictions on data
It is easier to validate the correctness of data
It is easier to convert data between different data types
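As an illustration of data-type checking, the sketch below validates a single element declared with the built-in type xs:positiveInteger. The element name quantity, the class name TypedSchemaCheck and the helper isValid are hypothetical; the validation calls come from the JDK's javax.xml.validation package.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class TypedSchemaCheck {
    // Hypothetical schema: one element whose content must be an integer > 0.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"quantity\" type=\"xs:positiveInteger\"/>"
        + "</xs:schema>";

    // Returns true when the document satisfies the schema.
    static boolean isValid(String xml) throws Exception {
        Validator validator = SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)))
            .newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // content does not match the declared type
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<quantity>3</quantity>"));     // true
        System.out.println(isValid("<quantity>three</quantity>")); // false
        System.out.println(isValid("<quantity>0</quantity>"));     // false
    }
}
```

A DTD could only declare quantity as #PCDATA, so the last two documents would pass DTD validation.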
XML Schemas use XML Syntax
Another great strength of XML Schemas is that they are written in XML:
You don't have to learn a new language
You can use your XML editor to edit your Schema files
You can use your XML parser to parse your Schema files
You can manipulate your Schemas with the XML DOM
You can transform your Schemas with XSLT
XML DOM
What is the DOM?
The DOM defines a standard for accessing and manipulating documents:
"The W3C Document Object Model (DOM) is a platform and language-neutral interface that
allows programs and scripts to dynamically access and update the content, structure, and style
of a document."
The HTML DOM defines a standard way for accessing and manipulating HTML documents. It
presents an HTML document as a tree-structure.
The XML DOM defines a standard way for accessing and manipulating XML documents. It
presents an XML document as a tree-structure.
Understanding the DOM is a must for anyone working with HTML or XML.
The HTML DOM
All HTML elements can be accessed through the HTML DOM.
This example changes the value of an HTML element with id="demo":
Example
<h1 id="demo">This is a Heading</h1>
<script>
document.getElementById("demo").innerHTML = "Hello World!";
</script>
This example changes the value of the first <h1> element in an HTML document:
Example
<h1>This is a Heading</h1>
<h1>This is a Heading</h1>
<script>
document.getElementsByTagName("h1")[0].innerHTML = "Hello World!";
</script>
Note: Even if the HTML document contains only ONE <h1> element you still have to specify
the index [0], because the getElementsByTagName() method always returns an array-like
collection of elements, never a single element.
The XML DOM
All XML elements can be accessed through the XML DOM.
The XML DOM is:
A standard object model for XML
A standard programming interface for XML
Platform- and language-independent
A W3C standard
In other words: The XML DOM is a standard for how to get, change, add, or delete XML
elements.
Get the Value of an XML Element
This code retrieves the text value of the first <title> element in an XML document:
Example
txt = xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
Loading an XML File
This example reads "books.xml" into xmlDoc and retrieves the text value of the first <title>
element in books.xml:
Example
<!DOCTYPE html>
<html>
<body>
<p id="demo"></p>
<script>
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
if (this.readyState == 4 && this.status == 200) {
myFunction(this);
}
};
xhttp.open("GET", "books.xml", true);
xhttp.send();
function myFunction(xml) {
var xmlDoc = xml.responseXML;
document.getElementById("demo").innerHTML =
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
}
</script>
</body>
</html>
Example Explained
xmlDoc - the XML DOM object created by the parser.
getElementsByTagName("title")[0] - get the first <title> element
childNodes[0] - the first child of the <title> element (the text node)
nodeValue - the value of the node (the text itself)
Loading an XML String
This example loads a text string into an XML DOM object, and extracts the info from it with
JavaScript:
Example
<html>
<body>
<p id="demo"></p>
<script>
var text, parser, xmlDoc;
text = "<bookstore><book>" +
"<title>Everyday Italian</title>" +
"<author>Giada De Laurentiis</author>" +
"<year>2005</year>" +
"</book></bookstore>";
parser = new DOMParser();
xmlDoc = parser.parseFromString(text,"text/xml");
document.getElementById("demo").innerHTML =
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
</script>
</body>
</html>
Programming Interface
The DOM models XML as a set of node objects. The nodes can be accessed with JavaScript or
other programming languages. In this tutorial we use JavaScript.
The programming interface to the DOM is defined by a set of standard properties and methods.
Properties are often referred to as something that is (e.g. nodeName is "book").
Methods are often referred to as something that is done (e.g. delete "book").
XML DOM Properties
These are some typical DOM properties:
x.nodeName - the name of x
x.nodeValue - the value of x
x.parentNode - the parent node of x
x.childNodes - the child nodes of x
x.attributes - the attribute nodes of x
Note: In the list above, x is a node object.
XML DOM Methods
x.getElementsByTagName(name) - get all elements with a specified tag name
x.appendChild(node) - append a child node to x
x.removeChild(node) - remove a child node from x
Note: In the list above, x is a node object.
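Because the DOM is language-neutral, the same properties and methods are available outside JavaScript. The sketch below uses the org.w3c.dom interfaces bundled with the JDK, where properties appear as getter methods (getNodeName(), getNodeValue() and so on). The class name DomNodeOps and the helper parse are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomNodeOps {
    // Parses the text into a DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<bookstore><book><title>Everyday Italian</title></book></bookstore>");
        Node book = doc.getDocumentElement().getFirstChild();
        Node title = book.getFirstChild();

        // Properties: nodeName, nodeValue, parentNode, childNodes.
        System.out.println(title.getNodeName());                  // title
        System.out.println(title.getFirstChild().getNodeValue()); // Everyday Italian
        System.out.println(title.getParentNode().getNodeName());  // book

        // Methods: appendChild adds a node, removeChild takes one away.
        Element year = doc.createElement("year");
        year.setTextContent("2005");
        book.appendChild(year);
        System.out.println(book.getChildNodes().getLength());     // 2
        book.removeChild(title);
        System.out.println(book.getChildNodes().getLength());     // 1
    }
}
```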
The sample XML considered in the examples is:
<employees>
<employee id="111">
<firstName>Rakesh</firstName>
<lastName>Mishra</lastName>
<location>Bangalore</location>
</employee>
<employee id="112">
<firstName>John</firstName>
<lastName>Davis</lastName>
<location>Chennai</location>
</employee>
<employee id="113">
<firstName>Rajesh</firstName>
<lastName>Sharma</lastName>
<location>Pune</location>
</employee>
</employees>
And the object into which the XML content is to be extracted is defined as below:
class Employee {
  String id;
  String firstName;
  String lastName;
  String location;

  @Override
  public String toString() {
    return firstName + " " + lastName + "(" + id + ")" + location;
  }
}
There are 3 main parsers for which I have given sample code:
DOM Parser
SAX Parser
StAX Parser
Using DOM Parser
I am making use of the DOM parser implementation that comes with the JDK and in my
example I am using JDK 7. The DOM Parser loads the complete XML content into a Tree
structure. And we iterate through the Node and NodeList to get the content of the XML. The
code for XML parsing using DOM parser is given below.
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DOMParserDemo {
  public static void main(String[] args) throws Exception {
    //Get the DOM Builder Factory
    DocumentBuilderFactory factory =
        DocumentBuilderFactory.newInstance();
    //Get the DOM Builder
    DocumentBuilder builder = factory.newDocumentBuilder();
    //Load and Parse the XML document
    //document contains the complete XML as a Tree.
    Document document = builder.parse(
        ClassLoader.getSystemResourceAsStream("xml/employee.xml"));
    List<Employee> empList = new ArrayList<>();
    //Iterating through the nodes and extracting the data.
    NodeList nodeList = document.getDocumentElement().getChildNodes();
    for (int i = 0; i < nodeList.getLength(); i++) {
      Node node = nodeList.item(i);
      //Whitespace between tags shows up as text nodes, so only
      //Element nodes are <employee> tags.
      if (node instanceof Element) {
        Employee emp = new Employee();
        emp.id = node.getAttributes().
            getNamedItem("id").getNodeValue();
        NodeList childNodes = node.getChildNodes();
        for (int j = 0; j < childNodes.getLength(); j++) {
          Node cNode = childNodes.item(j);
          //Identifying the child tag of employee encountered.
          if (cNode instanceof Element) {
            //getTextContent() also works for an empty element,
            //where getLastChild() would return null.
            String content = cNode.getTextContent().trim();
            switch (cNode.getNodeName()) {
              case "firstName":
                emp.firstName = content;
                break;
              case "lastName":
                emp.lastName = content;
                break;
              case "location":
                emp.location = content;
                break;
            }
          }
        }
        empList.add(emp);
      }
    }
    //Printing the Employee list populated.
    for (Employee emp : empList) {
      System.out.println(emp);
    }
  }
}

class Employee {
  String id;
  String firstName;
  String lastName;
  String location;

  @Override
  public String toString() {
    return firstName + " " + lastName + "(" + id + ")" + location;
  }
}
The output for the above will be:
Rakesh Mishra(111)Bangalore
John Davis(112)Chennai
Rajesh Sharma(113)Pune
Using SAX Parser
The SAX Parser differs from the DOM Parser in that it doesn’t load the complete
XML into memory. Instead, it reads the XML sequentially, triggering events as and
when it encounters different constructs such as an opening tag, a closing tag or character
data. This is the reason why the SAX Parser is called an event-based parser.
Along with the XML source file, we also register a handler which extends the DefaultHandler
class. The DefaultHandler class provides different callbacks out of which we would be interested
in:
startElement() – called when the start of a tag is encountered.
endElement() – called when the end of a tag is encountered.
characters() – called when some text data is encountered.
The code for parsing the XML using SAX Parser is given below:
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserDemo {
  public static void main(String[] args) throws Exception {
    SAXParserFactory parserFactory = SAXParserFactory.newInstance();
    SAXParser parser = parserFactory.newSAXParser();
    SAXHandler handler = new SAXHandler();
    parser.parse(ClassLoader.getSystemResourceAsStream("xml/employee.xml"),
        handler);
    //Printing the list of employees obtained from XML
    for (Employee emp : handler.empList) {
      System.out.println(emp);
    }
  }
}

/**
 * The Handler for SAX Events.
 */
class SAXHandler extends DefaultHandler {
  List<Employee> empList = new ArrayList<>();
  Employee emp = null;
  //Text of the current element. The parser may deliver one run of
  //text through several characters() calls, so it is accumulated.
  StringBuilder content = new StringBuilder();

  //Triggered when the start of tag is found.
  @Override
  public void startElement(String uri, String localName,
                           String qName, Attributes attributes)
                           throws SAXException {
    //Discard text seen before this tag (e.g. whitespace).
    content.setLength(0);
    switch (qName) {
      //Create a new Employee object when the start tag is found
      case "employee":
        emp = new Employee();
        emp.id = attributes.getValue("id");
        break;
    }
  }

  //Triggered when the end of tag is found.
  @Override
  public void endElement(String uri, String localName,
                         String qName) throws SAXException {
    String text = content.toString().trim();
    switch (qName) {
      //Add the employee to list once end tag is found
      case "employee":
        empList.add(emp);
        break;
      //For all other end tags the employee has to be updated.
      case "firstName":
        emp.firstName = text;
        break;
      case "lastName":
        emp.lastName = text;
        break;
      case "location":
        emp.location = text;
        break;
    }
  }

  //Triggered when text data is found.
  @Override
  public void characters(char[] ch, int start, int length)
      throws SAXException {
    content.append(ch, start, length);
  }
}

class Employee {
  String id;
  String firstName;
  String lastName;
  String location;

  @Override
  public String toString() {
    return firstName + " " + lastName + "(" + id + ")" + location;
  }
}
The output for the above would be:
Rakesh Mishra(111)Bangalore
John Davis(112)Chennai
Rajesh Sharma(113)Pune
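To make the event-based nature of SAX more visible, the following sketch records each callback as it fires instead of building Employee objects. The class name SaxEventTrace, the helper trace and the "start:", "text:" and "end:" labels are made up for illustration; the parser and handler classes are the same JDK classes used above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {
    // Parses the text and returns the callbacks in the order they fired.
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                events.add("start:" + qName);
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                // Ignore whitespace-only text between tags.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }
        });
        return events;
    }

    public static void main(String[] args) throws Exception {
        List<String> events = trace(
            "<employee id=\"111\"><firstName>Rakesh</firstName></employee>");
        for (String event : events) {
            System.out.println(event);
        }
    }
}
```

The start event of <employee> is reported first and its end event last, with the events of <firstName> in between, which is why the handler above can fill in an Employee object between those two callbacks.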
With this I have covered parsing the same XML document and performing the same task of
populating the list of Employee objects using all the three parsers namely:
DOM Parser
SAX Parser