Getting Started with Queries

This section provides information to get you started using queries. It does not provide complete information about how to define a query. Instead, it provides instructions for defining typical queries you might want to run. There are numerous cross-references to later sections that provide complete information about a particular query construct.

Obtaining All Marked-Up Text

When you query a document, you do not usually want to obtain all marked-up text. However, an understanding of queries that return all marked-up text makes it easier to define a query that retrieves just what you want.

The following figure shows a complete query (/bookstore) and the way the XPath processor interprets it:

This query returns the bookstore element. Because the bookstore element is the document element, which contains all elements and attributes in the document, this query returns all marked-up text.

In the query, the initial forward slash (/) instructs the XPath processor to start its search at the root node.

Suppose you run the following query on bookstore.xml:

/book
               

            

This query returns an empty set. It searches the immediate children of the root node for an element named book. Because there is no such element, this query does not return any marked-up text. Note that this query does not return an error. The query runs successfully, but the XPath processor does not find any elements that match the query. All book elements are grandchildren of the root node, and the XPath processor only checks the children of the root node.
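The difference between /bookstore and /book can be tried out with Python's standard xml.etree.ElementTree module. This is only a rough sketch: ElementTree supports a subset of XPath and has no separate root node, so the example emulates the root node with a hypothetical wrapper element named root-node. The XML is a small stand-in for bookstore.xml, not the real sample file.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bookstore.xml (an assumption for this example).
XML = """<bookstore>
  <book><title>Seven Years in Trenton</title></book>
  <book><title>History of Trenton</title></book>
</bookstore>"""

bookstore = ET.fromstring(XML)

# ElementTree has no root node, so emulate one with a wrapper element
# whose only child is the document element.
root_node = ET.Element("root-node")
root_node.append(bookstore)

# Like /bookstore: the document element is a child of the root node.
document_elements = root_node.findall("bookstore")

# Like /book: no book element is an immediate child of the root node.
books_at_top = root_node.findall("book")

# Like /bookstore/book: book elements are grandchildren of the root node.
books = root_node.findall("bookstore/book")

print(len(document_elements), len(books_at_top), len(books))
```

The empty list for the second query is a normal result, not an error, which matches the behavior described above.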

Obtaining a Portion of an XML Document

Usually, you use a query to obtain a portion of an XML document. To obtain the particular elements that you want, you must understand how to obtain an element that is a child of the document element. With this information, you can obtain any elements in the document.

The following figure shows how the XPath processor interprets the /bookstore/book query:

When the XPath processor starts its search at the root node, there is only one element among the immediate children of the root node. This is the document element. In this example, bookstore is the document element.

The query in this figure returns the book elements that are children of bookstore. This query does not return the my:book element, which is also a child of bookstore.

Now you can define queries that obtain any elements you want. For example:

/bookstore/book/title
               

            

This query returns title elements contained in book elements that are contained in bookstore.

Obtaining All Elements of a Particular Name

Sometimes you want all like-named elements regardless of where they are in a document. In this case, you do not need to start at the root node and navigate to the elements you want.

For example, the following query returns all last-name elements in any XML document:

//last-name
       

    

The double forward slash (//) at the beginning of a query instructs the XPath processor to start at the root node and search the entire document. In other words, the XPath processor searches all descendants of the root node.

If you perform this query on bookstore.xml, it returns the last-name elements that are children of author elements, and it also returns the last-name element that is a child of a publication element.
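As an illustration, the sketch below runs the equivalent of //last-name with Python's xml.etree.ElementTree, which accepts only relative paths, so the query is written as .//last-name and evaluated at the document element. The sample XML and the names in it are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bookstore.xml, reduced to the relevant elements.
XML = """<bookstore>
  <book>
    <author>
      <last-name>Smith</last-name>
      <publication><last-name>Jones</last-name></publication>
    </author>
  </book>
  <book>
    <author><last-name>Lee</last-name></author>
  </book>
</bookstore>"""

bookstore = ET.fromstring(XML)

# .//last-name searches every descendant of the document element,
# which mirrors //last-name for this document.
last_names = bookstore.findall(".//last-name")

# Map each element to its parent so the location of every match is visible.
parent_of = {child: parent for parent in bookstore.iter() for child in parent}

found = [(parent_of[e].tag, e.text) for e in last_names]
print(found)
```

The result includes the last-name element under publication as well as those directly under author, because the search ignores where in the tree a match occurs.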

Obtaining All Elements of a Particular Name from a Particular Branch

Although sometimes you might want all like-named elements wherever they are in a document, other times you might want only those like-named elements from a particular part of the document (branch of the tree).

For example, you might want all price elements contained in book elements, but not price elements contained in magazine elements. The query that returns such a result is:

/bookstore/book//price
               

            

This query returns all price elements that are contained in book elements. Some of these price elements are immediate children of book elements. One returned price element is a great-grandchild of the second book element. The following figure shows how the XPath processor interprets this query:

Different Results from Similar Queries

Some queries can look very similar but return very different results. The following figure shows this.

Queries That Return More Than You Want

Suppose you want the titles of all the books. You might decide to define your query like this:

//title
               

            

This query does return all titles of books, but it also returns the title of a magazine. This query instructs the XPath processor to start at the root node, search all descendants, and return all title elements. In bookstore.xml, this means that the query returns the title of the magazine in addition to the titles of books. In some other document, if all titles are contained in book elements, this query returns exactly what you want.

To query and obtain only the titles of books, you can use either of the following queries. They obtain identical results. However, the first query runs faster.

/bookstore/book/title
//book/title

The first query runs faster because it uses the child axis, while the second query uses the descendant-or-self axis. In general, the simpler axes, such as child, self, parent, and ancestor, are faster than the more complicated axes, such as descendant, preceding, following, preceding-sibling, and following-sibling. This is especially true for large documents. Whenever possible, use a simpler axis.
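The difference in results between //title and the two book-specific queries can be checked with a short sketch using Python's xml.etree.ElementTree. The paths are written relative to the document element because ElementTree does not accept absolute paths, and the sample XML is a hypothetical stand-in for bookstore.xml. The sketch compares results only; it does not measure speed.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bookstore.xml (an assumption for this example).
XML = """<bookstore>
  <book><title>Seven Years in Trenton</title></book>
  <book><title>History of Trenton</title></book>
  <magazine><title>Tracking Trenton</title></magazine>
</bookstore>"""

bookstore = ET.fromstring(XML)

def titles(path):
    """Return the text of every element matched by path, relative to bookstore."""
    return [e.text for e in bookstore.findall(path)]

everywhere = titles(".//title")          # like //title
by_child_steps = titles("book/title")    # like /bookstore/book/title
by_descendant = titles(".//book/title")  # like //book/title

print(everywhere)
print(by_child_steps == by_descendant)
```

With this data, the first query also picks up the magazine title, while the two book-specific queries return the same two titles.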

Specifying Attributes in Queries

To specify an attribute name in a query, precede the attribute name with an at sign (@). The XPath processor treats elements and attributes in the same way wherever possible. For example:

//@style
               

            

This query returns the style attributes associated with the magazine, the three books, and the my:book element. That is, it returns all the style attributes in the document. It does not return the elements that contain the attributes.

Following is another query that includes an attribute:

/bookstore/book/@style
               

            

This query returns the three style attributes for the three book elements.
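Python's xml.etree.ElementTree cannot return attribute nodes from a path, so the sketch below only approximates //@style and /bookstore/book/@style: it selects the elements that carry the attribute and then reads the attribute values. The sample XML is an assumption, and the my:book element is left out to avoid namespace handling.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bookstore.xml (an assumption for this example).
XML = """<bookstore specialty="novel">
  <book style="autobiography"/>
  <book style="textbook"/>
  <book style="novel"/>
  <magazine style="glossy"/>
</bookstore>"""

bookstore = ET.fromstring(XML)

# Like //@style: the style attribute of every element that has one.
all_styles = [e.get("style") for e in bookstore.iter() if "style" in e.attrib]

# Like /bookstore/book/@style: only book children of the document element.
book_styles = [b.get("style") for b in bookstore.findall("book[@style]")]

print(all_styles)
print(book_styles)
```

In both cases the result holds attribute values only; the elements that contain the attributes are not part of the result.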

The following query returns the style attribute of the context node:

@style 
               

            

If the context node does not have a style attribute, the result set is empty.

The next query returns the exchange attribute on price elements in the current context:

price/@exchange 
               

            

Following is an example that is not valid because attributes cannot have subelements:

price/@exchange/total 
               

            

Following is a query that finds the style attribute for all book elements in the document:

//book/@style 
               

            

Restrictions

Attributes cannot contain subelements. Consequently, you cannot apply a path operator to an attribute. If you try to, you receive a syntax error.

Attributes are inherently unordered. Consequently, you cannot apply a position number to an attribute. If you try to, you receive a syntax error.

Attributes and Wildcards

You can use an at sign (@) and asterisk (*) together to retrieve a collection of attributes. For example, the following query finds all attributes in the current context:

@* 
               

            

Filtering Results of Queries

Sometimes you want to retrieve only those elements that meet a certain condition. For example, you might want information about a particular book. In this case, you can include a filter in your query. You enclose filters in brackets ( [ ] ).

The following figure shows how the XPath processor interprets a query with a filter:

This query checks each book element to determine whether it has a title child element whose value is "History of Trenton". If it does, the query returns the book element. Using the sample data, this query returns the second book element.

The following topics provide details about filters:

Quotation Marks in Filters

Suppose you define the following filter:

[title="History of Trenton"]
               

            

If you need to specify this filter as part of an attribute value, use single quotation marks instead of double quotation marks. This is because the attribute value itself is (usually) inside double quotation marks. For example:

<xsl:value-of select="/bookstore/book[title='History of Trenton']"/>
               

            

Strings within an expression may contain special characters such as [, {, &, `, /, and others, as long as the entire string is enclosed in double quotes ("). When the string itself contains double quotes, you may enclose it in single quotes ('). When a string contains both single and double quotes, you must handle these segments of the string as if they were individual phrases, and concatenate them.
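One common way to handle a string that contains both kinds of quotation marks is to build a concat() call from separately quoted segments. The helper below is a minimal sketch of that idea in Python; the function name xpath_literal is hypothetical and not part of any XPath processor.

```python
def xpath_literal(text):
    """Return an XPath 1.0 string expression whose value is text."""
    if '"' not in text:
        return f'"{text}"'          # no double quotes: wrap in double quotes
    if "'" not in text:
        return f"'{text}'"          # only double quotes: wrap in single quotes
    # Both quote types occur: split on double quotes, wrap each piece in
    # double quotes, and rejoin the pieces with a single-quoted '"' literal.
    pieces = [f'"{piece}"' for piece in text.split('"')]
    return "concat(" + ", '\"', ".join(pieces) + ")"


print(xpath_literal("History of Trenton"))
print(xpath_literal('Say "hi"'))
print(xpath_literal('It\'s "ok"'))
```

The third call produces a concat() expression whose arguments are individually quoted segments, which is the concatenation approach described above.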

More Filter Examples

Following is another example of a query with a filter clause. This query returns book elements if the price of the book is greater than 25 dollars:

/bookstore/book[price > 25]
               

            

The next query returns author elements if the author has a degree:

//author[degree]
               

            

The next query returns the date attributes that match "3/1/00":

//@date[.="3/1/00"]
               

            

The next query returns manufacturer elements in the current context for which the rwdrive attribute of the model is the same as the vendor attribute of the manufacturer:

manufacturer[model/@rwdrive = @vendor]
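
The comparison filters above go beyond what Python's xml.etree.ElementTree path syntax supports, so the following sketch spells out their meaning in plain Python instead. It assumes two small hypothetical documents and shows that a comparison in a filter succeeds when any matching node satisfies it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; neither document comes from the original text.
BOOKS = ET.fromstring("""<bookstore>
  <book><title>Seven Years in Trenton</title><price>12</price></book>
  <book><title>History of Trenton</title><price>55</price></book>
</bookstore>""")

DRIVES = ET.fromstring("""<catalog>
  <manufacturer vendor="acme"><model rwdrive="acme"/></manufacturer>
  <manufacturer vendor="zenith"><model rwdrive="acme"/></manufacturer>
</catalog>""")

def price_over(book, limit):
    # book[price > limit]: true when any price child exceeds the limit.
    # Assumes every price element holds a plain number.
    return any(float(p.text) > limit for p in book.findall("price"))

def drive_matches_vendor(manufacturer):
    # manufacturer[model/@rwdrive = @vendor]: true when any model's rwdrive
    # attribute equals the manufacturer's own vendor attribute.
    vendor = manufacturer.get("vendor")
    if vendor is None:
        return False
    return any(m.get("rwdrive") == vendor for m in manufacturer.findall("model"))

expensive = [b.findtext("title") for b in BOOKS.findall("book") if price_over(b, 25)]
matching = [m.get("vendor") for m in DRIVES.findall("manufacturer")
            if drive_matches_vendor(m)]

print(expensive, matching)
```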
               

            

How the XPath Processor Evaluates a Filter

You can apply constraints and branching to a query by specifying a filter clause. The filter contains a query, which is called the subquery. The subquery evaluates to a Boolean value, or to a numeric value. The XPath processor tests each element in the current context to see if it satisfies the subquery. The result includes only those elements that test true for the subquery.

The XPath processor always evaluates filters with respect to a context. For example, the expression book[author] means that, for every book element found in the current context, the processor determines whether the book element contains an author element. Similarly, the following query returns all books in the current context that contain at least one excerpt:

book[excerpt] 
               

            

The next query returns all titles of books in the current context that have at least one excerpt:

book[excerpt]/title 
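
Both queries can be tried with Python's xml.etree.ElementTree, whose path subset includes filters that test for a child element. The sketch assumes a hypothetical two-book document and uses the bookstore element as the current context.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data (an assumption for this example).
XML = """<bookstore>
  <book><title>Seven Years in Trenton</title></book>
  <book>
    <title>History of Trenton</title>
    <excerpt>It was a dark night.</excerpt>
  </book>
</bookstore>"""

bookstore = ET.fromstring(XML)  # acts as the current context

# book[excerpt]: the filter is tested once for each book in the context.
books_with_excerpt = bookstore.findall("book[excerpt]")

# book[excerpt]/title: the step after the filter applies only to the
# books that passed the filter.
titles = [t.text for t in bookstore.findall("book[excerpt]/title")]

print(len(books_with_excerpt), titles)
```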
               

            

Multiple Filters

You can specify any number of filters in any level of a query expression. Empty filters ( [ ] ) are not allowed.

A query that contains one or more filters returns the rightmost element that is not in a filter clause. For example:

book[excerpt]/author[degree] 
               

            

The previous query returns author elements. It does not return degree elements. To be exact, for all books in the current context that have at least one excerpt, this query returns every author who has at least one degree.

The following query finds each book child of the current context that has an author with at least one degree:

book[author/degree] 
               

            

The next query returns all books in the current context that have an excerpt and a title:

book[excerpt][title] 
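
The filter combinations book[excerpt]/author[degree] and book[excerpt][title] fall within the path subset of Python's xml.etree.ElementTree, so they can be sketched directly. The sample document and its titles and names are assumptions chosen so that each filter excludes at least one element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data (an assumption for this example).
XML = """<bookstore>
  <book>
    <title>A</title>
    <excerpt>First lines.</excerpt>
    <author><last-name>Smith</last-name><degree>Ph.D.</degree></author>
    <author><last-name>Jones</last-name></author>
  </book>
  <book>
    <title>B</title>
    <author><last-name>Lee</last-name><degree>B.A.</degree></author>
  </book>
  <book>
    <excerpt>Untitled draft.</excerpt>
  </book>
</bookstore>"""

bookstore = ET.fromstring(XML)  # acts as the current context

# Filters at two levels: the result is author elements, not degree elements.
authors = [a.findtext("last-name")
           for a in bookstore.findall("book[excerpt]/author[degree]")]

# Two filters on the same step: a book must satisfy both.
titled_with_excerpt = [b.findtext("title")
                       for b in bookstore.findall("book[excerpt][title]")]

print(authors, titled_with_excerpt)
```

Lee has a degree but is excluded because that book has no excerpt, and the third book is excluded from the second query because it has no title.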
               

            

Filters and Attributes

Following is a query that finds all child elements of the current context with specialty attributes:

*[@specialty] 
               

            

The following query returns all book children in the current context with style attributes:

book[@style] 
               

            

The next query finds all book child elements in the current context in which the value of the style attribute of the book is equal to the value of the specialty attribute of the bookstore element:

book[/bookstore/@specialty = @style] 
               

            

Wildcards in Queries

In a query, you can include an asterisk (*) to represent all elements. For example:

/bookstore/book/*
               

            

This query searches for all book elements in bookstore. For each book element, this query returns all child elements that the book element contains.

The * collection returns all elements that are children of the context node, regardless of their tag names.

The next query finds all last-name elements that are grandchildren of book elements in the current context:

book/*/last-name 
               

            

The following query returns the grandchild elements of the current context.

*/* 
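
The three wildcard queries can be sketched with Python's xml.etree.ElementTree, which supports * as an element wildcard. The sample XML is a hypothetical stand-in for bookstore.xml, and the bookstore element serves as the current context.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bookstore.xml (an assumption for this example).
XML = """<bookstore>
  <book>
    <title>Seven Years in Trenton</title>
    <author><last-name>Smith</last-name></author>
    <price>12</price>
  </book>
  <magazine><title>Tracking Trenton</title></magazine>
</bookstore>"""

bookstore = ET.fromstring(XML)  # acts as the current context

# book/*: every child element of each book, whatever its name.
book_children = [e.tag for e in bookstore.findall("book/*")]

# book/*/last-name: last-name grandchildren of book elements.
last_names = [e.text for e in bookstore.findall("book/*/last-name")]

# */*: all grandchild elements of the current context.
grandchildren = [e.tag for e in bookstore.findall("*/*")]

print(book_children, last_names, grandchildren)
```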
               

            

Restrictions

Usually, the asterisk (*) returns only elements. It does not return processing instructions, attributes, or comments, nor does it include attributes or comments when it maintains a count of nodes. For example, the following query returns title elements. It does not return style attributes.

/bookstore/book/*[1]
               

            

Wildcards in strings are not allowed. For example, you cannot define a query such as the following:

/bookstore/book[author=" A* "]
               

            

Attributes

To use a wildcard for attributes, you can specify @*. For example:

/bookstore/book/@*
               

            

For each book element, this query returns all attributes. It does not return any elements.

Calling Functions in Queries

The XPath processor provides many functions that you can call in a query. This section provides some examples to give you a sense of how functions in queries work. Many subsequent sections provide information about invoking functions in queries. For a complete list of the functions you can call in a query, see XPath Functions Quick Reference.

Following is a query that returns a number that indicates how many book elements are in the document:

count(//book)
               

            

In format descriptions, a question mark that follows an argument indicates that the argument is optional. For example:

string substring(string, number, number?) 
               

            

This function returns a string. The name of the function is substring. This function takes two required arguments (a string followed by a number) and one optional argument (a number).
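To make the optional third argument concrete, the sketch below approximates the XPath 1.0 substring function in Python. The name xpath_substring is hypothetical, and the sketch assumes finite numeric arguments; it does not handle NaN or infinite values.

```python
import math


def xpath_substring(text, start, length=None):
    """Approximate XPath 1.0 substring() for finite numeric arguments."""
    first = math.floor(start + 0.5)       # XPath rounding: halves round up
    if length is None:
        last = len(text) + 1              # no length given: run to the end
    else:
        last = first + math.floor(length + 0.5)
    # Positions are 1-based; keep each position p with first <= p < last.
    return "".join(ch for p, ch in enumerate(text, start=1) if first <= p < last)


print(xpath_substring("12345", 2, 3))  # three characters starting at position 2
print(xpath_substring("12345", 2))     # optional argument omitted
```

Omitting the third argument returns everything from the start position to the end of the string, which is what the question mark in the format description allows.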

Case Sensitivity and Blank Spaces in Queries

Queries are case sensitive. This applies to every part of the query, including operators, strings, element and attribute names, and function names.

For example, suppose you try this query:

/Bookstore
               

            

This query returns an empty set because the name of the document element is bookstore and not Bookstore.
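Case sensitivity is easy to confirm with a short sketch in Python's xml.etree.ElementTree. As an assumption for the example, a hypothetical wrapper element named root-node stands in for the root node, since ElementTree has none.

```python
import xml.etree.ElementTree as ET

# Minimal hypothetical document; only the document element name matters here.
bookstore = ET.fromstring("<bookstore><book/></bookstore>")

# Wrapper element that emulates the root node (an assumption for this sketch).
root_node = ET.Element("root-node")
root_node.append(bookstore)

exact = root_node.findall("bookstore")       # like /bookstore
wrong_case = root_node.findall("Bookstore")  # like /Bookstore

print(len(exact), len(wrong_case))
```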

Blank spaces in queries are not significant unless they appear within quotation marks.

Precedence of Query Operators

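The precedence of query operators differs between XPath 1.0 and XPath 2.0, as shown in the following tables. The XPath 1.0 table lists operators from highest precedence to lowest. The XPath 2.0 table begins with the lowest-precedence operator, the sequence separator (,), and ends with the highest-precedence operators, navigation and filter. In both tables, operators in a given row have the same precedence.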

Operation Type             XPath Operators
Grouping                   ( )
Filter                     [ ]
Unary minus                -
Multiplication             *, div, mod
Addition                   +, -
Relational (Comparison)    = != < <= > >=
Union                      |
Negation                   not
Conjunction                and
Disjunction                or
Table 59. Query Operator Precedence - XPath 1.0

Operation Type             XPath Operators
Sequence separator         ,
Conjunction                and
Type matching              instance of
Assertion                  treat
Conversion test            castable
Conversion                 cast
Relational (Comparison)    eq, ne, lt, le, gt, ge, =, !=, <, <=, >, >=, is, <<, >>
Range                      to
Addition                   +, -
Multiplication             *, div, idiv, mod
Unary                      unary -, unary +
Union                      union, |
Select set                 intersect, except
Navigation                 /, //
Filter                     [ ]
Table 60. Query Operator Precedence - XPath 2.0

 