.\" Automatically generated by Pod::Man 4.11 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. \*(C+ will
.\" give a nicer C++. Capital omega is used to do unbreakable dashes and
.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
. ds -- \(*W-
. ds PI pi
. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
. ds L" ""
. ds R" ""
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds -- \|\(em\|
. ds PI \(*p
. ds L" ``
. ds R" ''
. ds C`
. ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
. if \nF \{\
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. if !\nF==2 \{\
. nr % 0
. nr F 2
. \}
. \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "XML::DOM::Parser 3"
.TH XML::DOM::Parser 3 "2002-07-31" "perl v5.26.3" "User Contributed Perl Documentation"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
XML::DOM::Parser \- An XML::Parser that builds XML::DOM document structures
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\& use XML::DOM;
\&
\& my $parser = XML::DOM::Parser\->new;
\& my $doc = $parser\->parsefile ("file.xml");
\& $doc\->dispose; # Avoid memory leaks \- cleanup circular references
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
XML::DOM::Parser extends XML::Parser.
.PP
The XML::Parser module was written by Clark Cooper and
is built on top of XML::Parser::Expat,
which is a lower level interface to James Clark's expat library.
.PP
XML::DOM::Parser parses \s-1XML\s0 strings or files
and builds a data structure that conforms to the \s-1API\s0 of the Document Object
Model as described at <http://www.w3.org/TR/REC\-DOM\-Level\-1>.
See the XML::Parser manpage for additional properties of the
XML::DOM::Parser class.
Note that the 'Style' property should not be used (it is set internally).
.PP
The XML::Parser \fBNoExpand\fR option is supported to the extent that
EntityReference objects are generated whenever an entity reference is encountered
in character data. I'm not sure how useful this is. Any comments are welcome.
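.PP
A minimal sketch of what \fBNoExpand\fR might look like in practice. The document
string and the entity name \fIfoo\fR are made up for illustration, and the sketch
assumes that the entity reference shows up as a child node of type
\&\s-1ENTITY_REFERENCE_NODE\s0 rather than as expanded text:
.PP
.Vb 4
\& use strict;
\& use warnings;
\& use XML::DOM;
\&
\& # NoExpand is passed through to XML::Parser.
\& my $parser = XML::DOM::Parser\->new (NoExpand => 1);
\&
\& # Hypothetical input: an internal entity "foo" referenced in character data.
\& my $doc = $parser\->parse (
\&     q{<!DOCTYPE doc [<!ENTITY foo "bar">]><doc>x &foo; y</doc>});
\&
\& # Report each child of the root element that is an entity reference.
\& for my $node ($doc\->getDocumentElement\->getChildNodes) {
\&     if ($node\->getNodeType == ENTITY_REFERENCE_NODE) {
\&         print "entity reference: ", $node\->getNodeName, "\en";
\&     }
\& }
\&
\& $doc\->dispose;
.Ve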
.PP
As shown in the synopsis, the parse and parsefile methods of an
XML::DOM::Parser object create an XML::DOM::Document object
from the specified input. This Document object can then be examined, modified and
written back out to a file or converted to a string.
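.PP
The examine, modify and write steps could look roughly like the following.
This is only a sketch: the \s-1XML\s0 content, the element and attribute names and
the output file name \fIout.xml\fR are invented for the example.
.PP
.Vb 4
\& use strict;
\& use warnings;
\& use XML::DOM;
\&
\& my $parser = XML::DOM::Parser\->new;
\& my $doc = $parser\->parse (
\&     q{<list><item id="a">one</item><item id="b">two</item></list>});
\&
\& # Examine: print the id attribute of every <item> element.
\& for my $item ($doc\->getElementsByTagName ("item")) {
\&     print $item\->getAttribute ("id"), "\en";
\& }
\&
\& # Modify: append a new <item> element to the root element.
\& my $new = $doc\->createElement ("item");
\& $new\->setAttribute ("id", "c");
\& $new\->appendChild ($doc\->createTextNode ("three"));
\& $doc\->getDocumentElement\->appendChild ($new);
\&
\& # Write back out: convert to a string, or print to a file.
\& print $doc\->toString;
\& $doc\->printToFile ("out.xml");    # hypothetical output file name
\&
\& $doc\->dispose;                    # break circular references
.Ve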
.PP
When using \s-1XML::DOM\s0 with XML::Parser version 2.19 and up, setting the
XML::DOM::Parser option \fBKeepCDATA\fR to 1 will store CDATASections in
CDATASection nodes, instead of converting them to Text nodes.
Consecutive CDATASection nodes will be merged into one. Let me know if this
is a problem.
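.PP
As an illustration, the following sketch parses a made-up document with
\&\fBKeepCDATA\fR enabled and checks whether the first child of the root element
is a CDATASection node. It assumes XML::Parser 2.19 or later is installed.
.PP
.Vb 4
\& use strict;
\& use warnings;
\& use XML::DOM;
\&
\& my $parser = XML::DOM::Parser\->new (KeepCDATA => 1);
\&
\& # Hypothetical input containing a single CDATA section.
\& my $doc = $parser\->parse ("<doc><![CDATA[a < b]]></doc>");
\&
\& my $child = $doc\->getDocumentElement\->getFirstChild;
\& if ($child\->getNodeType == CDATA_SECTION_NODE) {
\&     print "CDATASection: ", $child\->getData, "\en";
\& }
\& else {
\&     print "Text: ", $child\->getData, "\en";
\& }
\&
\& $doc\->dispose;
.Ve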
.SH "Using LWP to parse URLs"
.IX Header "Using LWP to parse URLs"
The \fBparsefile()\fR method now also supports URLs, e.g. \fIhttp://www.erols.com/enno/xsa.xml\fR.
It uses \s-1LWP\s0 to download the file and then calls \fBparse()\fR on the resulting string.
By default it will use an LWP::UserAgent that is created as follows:
.PP
.Vb 3
\& use LWP::UserAgent;
\& $LWP_USER_AGENT = LWP::UserAgent\->new;
\& $LWP_USER_AGENT\->env_proxy;
.Ve
.PP
Note that env_proxy reads proxy settings from environment variables, which is what I need to
do to get through our firewall. If you want to use a different LWP::UserAgent, you can either set
it globally with:
.PP
.Vb 1
\& XML::DOM::Parser::set_LWP_UserAgent ($my_agent);
.Ve
.PP
or specify it for a single XML::DOM::Parser by passing it to the constructor:
.PP
.Vb 1
\& my $parser = XML::DOM::Parser\->new (LWP_UserAgent => $my_agent);
.Ve
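.PP
Putting the pieces together, a custom agent might be set up and used as in
the sketch below. The timeout value and the \s-1URL\s0 are placeholders, and the
sketch assumes that a failed download or parse is reported by dying, so the
call is wrapped in eval:
.PP
.Vb 4
\& use strict;
\& use warnings;
\& use LWP::UserAgent;
\& use XML::DOM;
\&
\& # An agent with a 10 second timeout (arbitrary value) that still
\& # honours proxy settings from the environment.
\& my $my_agent = LWP::UserAgent\->new (timeout => 10);
\& $my_agent\->env_proxy;
\&
\& my $parser = XML::DOM::Parser\->new (LWP_UserAgent => $my_agent);
\&
\& # Hypothetical URL; parsefile hands it to LWP because of the http: scheme.
\& my $doc = eval { $parser\->parsefile ("http://www.example.com/file.xml") };
\& die "could not fetch or parse document: $@" unless $doc;
\&
\& print $doc\->toString;
\& $doc\->dispose;
.Ve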
.PP
Currently, \s-1LWP\s0 is used when the filename (passed to parsefile) starts with one of
the following \s-1URL\s0 schemes, each followed by a colon: http, https, ftp, wais, gopher, or file.
If I missed one, please let me know.
.PP
The \s-1LWP\s0 modules are part of libwww-perl which is available at \s-1CPAN.\s0