.\" Automatically generated by Pod::Man 4.11 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "XML::XPath::XMLParser 3"
.TH XML::XPath::XMLParser 3 "2018-10-11" "perl v5.26.3" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
XML::XPath::XMLParser \- The default XML parsing class that produces a node tree
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 7
\&    my $parser = XML::XPath::XMLParser\->new(
\&                filename => $self\->get_filename,
\&                xml => $self\->get_xml,
\&                ioref => $self\->get_ioref,
\&                parser => $self\->get_parser,
\&            );
\&    my $root_node = $parser\->parse;
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This module generates a node tree for use as the context node for XPath processing.
It aims to be a quick parser, nothing fancy, and yet has to store more information
than most parsers. To achieve this I've used array refs everywhere \- no hashes.
I don't have any performance figures for the speedups achieved, and I make no
apologies to anyone not used to using arrays instead of hashes. I think they
make good sense here, where we know the attributes of each type of node.
.SH "Node Structure"
.IX Header "Node Structure"
All nodes have the same first 2 entries in the array: node_parent
and node_pos. The type of the node is determined using the \fBref()\fR function.
The node_parent always contains an entry for the parent of the current
node \- except for the root node which has undef in there. And node_pos is the
position of this node in the array that it is in (think:
\&\f(CW$node\fR == \f(CW$node\fR\->[node_parent]\->[node_children]\->[$node\->[node_pos]] )
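.PP
The invariant above can be sketched in plain Perl. This is a minimal
illustration, not the module's own code: the index constants are defined
locally to mirror the layout listed in this document, add_element is a
hypothetical helper, and the nodes are unblessed arrays, so ref() would not
tell node types apart here.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical local index constants that mirror the layout listed in this
# document; they are defined here and are not imported from the module.
use constant {
    node_parent   => 0,
    node_pos      => 1,
    node_prefix   => 2,
    node_children => 3,
    node_name     => 4,
};

# Root node: no parent, no position, no prefix, an empty child list.
my $root = [ undef, undef, undef, [] ];

# Hypothetical helper: append an element node to a parent.
# node_pos is the index the new node gets in the parent's child list.
sub add_element {
    my ($parent, $name) = @_;
    my $children = $parent->[node_children];
    my $node = [ $parent, scalar @$children, undef, [], $name ];
    push @$children, $node;
    return $node;
}

my $first  = add_element($root, 'first');
my $second = add_element($root, 'second');

# The invariant: a node is found at node_pos in its parent's children.
my $found = $second->[node_parent]->[node_children]->[ $second->[node_pos] ];
say 'invariant holds' if $found == $second;
```

Each child holds a reference to its parent and the parent holds references to
its children, so a tree built this way is circular and will not be freed by
reference counting alone.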
.PP
Nodes are structured as follows:
.SS "Root Node"
.IX Subsection "Root Node"
The root node is just an element node with no parent.
.PP
.Vb 6
\&    [
\&      undef, # node_parent \- check for undef to identify root node
\&      undef, # node_pos
\&      undef, # node_prefix
\&      [ ... ], # node_children (see below)
\&    ]
.Ve
.SS "Element Node"
.IX Subsection "Element Node"
.Vb 9
\&    [
\&      $parent, # node_parent
\&      <position in current array>, # node_pos
\&      \*(Aqxxx\*(Aq, # node_prefix \- namespace prefix on this element
\&      [ ... ], # node_children
\&      \*(Aqyyy\*(Aq, # node_name \- element tag name
\&      [ ... ], # node_attribs \- attributes on this element
\&      [ ... ], # node_namespaces \- namespaces currently in scope
\&    ]
.Ve
.SS "Attribute Node"
.IX Subsection "Attribute Node"
.Vb 7
\&    [
\&      $parent, # node_parent \- the element node
\&      <position in current array>, # node_pos
\&      \*(Aqxxx\*(Aq, # node_prefix \- namespace prefix on this attribute
\&      \*(Aqhref\*(Aq, # node_key \- attribute name
\&      \*(Aqftp://ftp.com/\*(Aq, # node_value \- value in the node
\&    ]
.Ve
.SS "Namespace Nodes"
.IX Subsection "Namespace Nodes"
Each element has an associated set of namespace nodes that are currently
in scope. Each namespace node stores a prefix and the expanded name (retrieved
from the xmlns:prefix=\*(L"...\*(R" attribute).
.PP
.Vb 6
\&    [
\&      $parent,
\&      <pos>,
\&      \*(Aqa\*(Aq, # node_prefix \- the namespace prefix as written in the document
\&      \*(Aqhttp://my.namespace.com\*(Aq, # node_expanded \- the expanded name.
\&    ]
.Ve
.SS "Text Nodes"
.IX Subsection "Text Nodes"
.Vb 5
\&    [
\&      $parent,
\&      <pos>,
\&      \*(AqThis is some text\*(Aq # node_text \- the text in the node
\&    ]
.Ve
.SS "Comment Nodes"
.IX Subsection "Comment Nodes"
.Vb 5
\&    [
\&      $parent,
\&      <pos>,
\&      \*(AqThis is a comment\*(Aq # node_comment
\&    ]
.Ve
.SS "Processing Instruction Nodes"
.IX Subsection "Processing Instruction Nodes"
.Vb 6
\&    [
\&      $parent,
\&      <pos>,
\&      \*(Aqtarget\*(Aq, # node_target
\&      \*(Aqdata\*(Aq, # node_data
\&    ]
.Ve
.SH "Usage"
.IX Header "Usage"
If you need to use this module outside of XML::XPath (for example,
to cache parsed trees), you can use the following \s-1API:\s0
.SS "new"
.IX Subsection "new"
The new method takes either no parameters, or any of the following parameters:
.PP
.Vb 4
\&        filename
\&        xml
\&        parser
\&        ioref
.Ve
.PP
This uses the familiar hash syntax, so an example might be:
.PP
.Vb 1
\&    use XML::XPath::XMLParser;
\&
\&    my $parser = XML::XPath::XMLParser\->new(filename => \*(Aqexample.xml\*(Aq);
.Ve
.PP
The parameters represent a filename, a string containing \s-1XML,\s0 an XML::Parser
instance and an open filehandle ref respectively. You can also set or get all
of these properties using the get_ and set_ functions that have the same
name as the property: e.g. get_filename, set_ioref, etc.
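.PP
As a brief sketch of the accessors, assuming the get_ and set_ methods behave
as described above (the sample \s-1XML\s0 strings are made up for illustration):

```perl
use strict;
use warnings;
use feature 'say';
use XML::XPath::XMLParser;

# Construct with one property set; the others are left unset.
my $parser = XML::XPath::XMLParser->new(xml => '<doc/>');
say 'xml: ', $parser->get_xml;

# Replace the property through its setter, then read it back.
$parser->set_xml('<other/>');
say 'xml now: ', $parser->get_xml;

# A property that was never supplied is expected to read back as undef.
say 'no filename set' unless defined $parser->get_filename;
```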
.SS "parse"
.IX Subsection "parse"
The parse method generally takes no parameters, but you may pass either an
open filehandle reference or an \s-1XML\s0 string if you need to.
The return value is a tree that XML::XPath can use. The parse method will
die if there is an error in your \s-1XML,\s0 so wrap the call in Perl's exception
handling mechanism (eval{};) if you want to avoid this.
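.PP
One way to guard a parse with eval is sketched below. The try_parse helper and
the sample documents are hypothetical; the sketch only assumes that parse
returns a tree on success and dies on malformed \s-1XML,\s0 as described above.

```perl
use strict;
use warnings;
use feature 'say';
use XML::XPath::XMLParser;

# Hypothetical helper: returns the parsed tree, or undef after
# reporting the parse error on STDERR.
sub try_parse {
    my ($xml) = @_;
    my $parser = XML::XPath::XMLParser->new(xml => $xml);
    my $tree = eval { $parser->parse };
    if (!defined $tree) {
        warn "XML parse failed: $@";
        return;
    }
    return $tree;
}

my $good = try_parse('<doc><item/></doc>');
my $bad  = try_parse('<doc><item></doc>');   # mismatched closing tag

say defined $good ? 'good: parsed' : 'good: failed';
say defined $bad  ? 'bad: parsed'  : 'bad: failed';
```

Testing the value returned from eval, rather than testing $@ afterwards, avoids
depending on $@ still being intact by the time it is examined.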
.SS "parsefile"
.IX Subsection "parsefile"
The parsefile method is identical to \fBparse()\fR except it expects a single
parameter that is a string naming a file to open and parse. Again it
returns a tree and also dies if there are \s-1XML\s0 errors.
.SH "NOTICES"
.IX Header "NOTICES"
This file is distributed as part of the XML::XPath module, and is copyright
2000 Fastnet Software Ltd. Please see the documentation for the module as a
whole for licensing information.