Initial Commit

2025-12-03 16:38:10 +01:00
parent c5e26bf594
commit b732d8d4b5
17680 changed files with 5977495 additions and 2 deletions
--- a/database/perl/vendor/lib/HTML/AsSubs.pm
+++ b/database/perl/vendor/lib/HTML/AsSubs.pm
@@ -0,0 +1,203 @@
+package HTML::AsSubs;
+
+# ABSTRACT: functions that construct a HTML syntax tree
+
+
+use warnings;
+use strict;
+use vars qw(@ISA @EXPORT);
+
+our $VERSION = '5.07'; # VERSION from OurPkgVersion
+
+require HTML::Element;
+require Exporter;
+@ISA = qw(Exporter);
+
+# Problem: exports so damned much.  Has no concept of "export only HTML4
+#  elements".  TODO:?? make something that make functions that just
+#  wrap XML::Generator calls?
+
+
+use vars qw(@TAGS);
+@TAGS = qw(html
+    head title base link meta isindex nextid script style
+    body h1 h2 h3 h4 h5 h6 p pre div blockquote
+    a img br hr
+    ol ul dir menu li
+    dl dt dd
+    dfn cite code em kbd samp strong var address span
+    b i u tt
+    center font big small strike
+    sub sup
+    table tr td th caption
+    form input select option textarea
+    object applet param
+    map area
+    frame frameset noframe
+);
+
+for (@TAGS) {
+    my $code;
+    $code = "sub $_ { _elem('$_', \@_); }\n";
+    push( @EXPORT, $_ );
+## no critic
+    eval $code;
+## use critic
+    if ($@) {
+        die $@;
+    }
+}
+
+
+sub _elem {
+    my $tag = shift;
+    my $attributes;
+    if ( @_ and defined $_[0] and ref( $_[0] ) eq "HASH" ) {
+        $attributes = shift;
+    }
+    my $elem = HTML::Element->new( $tag, %$attributes );
+    $elem->push_content(@_);
+    $elem;
+}
+
+1;
+
+__END__
+
+=pod
+
+=head1 NAME
+
+HTML::AsSubs - functions that construct a HTML syntax tree
+
+=head1 VERSION
+
+This document describes version 5.07 of
+HTML::AsSubs, released August 31, 2017
+as part of L<HTML-Tree|HTML::Tree>.
+
+=head1 SYNOPSIS
+
+ use HTML::AsSubs;
+ $h = body(
+	   h1("This is the heading"),
+	   p("This is the first paragraph which contains a ",
+	     a({href=>'link.html'}, "link"),
+	     " and an ",
+	     img({src=>'img.gif', alt=>'image'}),
+	     "."
+	    ),
+	  );
+ print $h->as_HTML;
+
+=head1 DESCRIPTION
+
+This module exports functions that can be used to construct various
+HTML elements. The functions are named after the tags of the
+corresponding HTML element and are all written in lower case. If the
+first argument is a hash reference then it will be used to initialize the
+attributes of this element. The remaining arguments are regarded as
+content.
+
+For a similar idea (i.e., it's another case where the syntax tree
+of the Perl source mirrors the syntax tree of the HTML produced),
+see HTML::Element's C<new_from_lol> method.
+
+For what I now think is a cleaner implementation of this same idea,
+see the excellent module C<XML::Generator>, which is what I suggest
+for actual real-life use.  (I suggest this over C<HTML::AsSubs> and
+over C<CGI.pm>'s HTML-making functions.)
+
+=head1 ACKNOWLEDGEMENT
+
+This module was inspired by the following message:
+
+ Date: Tue, 4 Oct 1994 16:11:30 +0100
+ Subject: Wow! I have a large lightbulb above my head!
+
+ Take a moment to consider these lines:
+
+ %OVERLOAD=( '""' => sub { join("", @{$_[0]}) } );
+
+ sub html { my($type)=shift; bless ["<$type>", @_, "</$type>"]; }
+
+ :-)  I *love* Perl 5!  Thankyou Larry and Ilya.
+
+ Regards,
+ Tim Bunce.
+
+ p.s. If you didn't get it, think about recursive data types: html(html())
+ p.p.s. I'll turn this into a much more practical example in a day or two.
+ p.p.p.s. It's a pity that overloads are not inherited. Is this a bug?
+
+=head1 BUGS
+
+The exported link() function overrides the builtin link() function.
+The exported tr() function must be called using &tr(...) syntax
+because it clashes with the builtin tr/../../ operator.
+
+=head1 SEE ALSO
+
+L<HTML::Element>, L<XML::Generator>
+
+=head2 html head title base link meta isindex nextid script style body h1 h2 h3 h4 h5 h6 p pre div blockquote a img br hr ol ul dir menu li dl dt dd dfn cite code em kbd samp strong var address span b i u tt center font big small strike sub sup table tr td th caption form input select option textarea object applet param map area frame frameset noframe
+
+A bunch of methods for creating tags.
+
+=head1 Private Functions
+
+=head2 _elem()
+
+The _elem() function is wrapped by all the html 'tag' functions. It
+takes a tag-name, optional hashref of attributes and a list of content
+as parameters.
+
+=head1 AUTHOR
+
+Current maintainers:
+
+=over
+
+=item * Christopher J. Madsen S<C<< <perl AT cjmweb.net> >>>
+
+=item * Jeff Fearn S<C<< <jfearn AT cpan.org> >>>
+
+=back
+
+Original HTML-Tree author:
+
+=over
+
+=item * Gisle Aas
+
+=back
+
+Former maintainers:
+
+=over
+
+=item * Sean M. Burke
+
+=item * Andy Lester
+
+=item * Pete Krawczyk S<C<< <petek AT cpan.org> >>>
+
+=back
+
+You can follow or contribute to HTML-Tree's development at
+L<< https://github.com/kentfredric/HTML-Tree >>.
+
+=head1 COPYRIGHT AND LICENSE
+
+Copyright 1995-1998 Gisle Aas, 1999-2004 Sean M. Burke,
+2005 Andy Lester, 2006 Pete Krawczyk, 2010 Jeff Fearn,
+2012 Christopher J. Madsen.
+
+This library is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+The programs in this library are distributed in the hope that they
+will be useful, but without any warranty; without even the implied
+warranty of merchantability or fitness for a particular purpose.
+
+=cut
--- a/database/perl/vendor/lib/HTML/Element.pm
+++ b/database/perl/vendor/lib/HTML/Element.pm
--- a/database/perl/vendor/lib/HTML/Element/traverse.pm
+++ b/database/perl/vendor/lib/HTML/Element/traverse.pm
@@ -0,0 +1,374 @@
+
+# This is a .pm just to (try to) make some CPAN document converters
+#  convert it happily as part of the dist's documentation tree.
+package HTML::Element::traverse;
+
+# ABSTRACT: discussion of HTML::Element's traverse method
+
+use warnings;
+use strict;
+
+our $VERSION = '5.07'; # VERSION from OurPkgVersion
+
+use HTML::Element ();
+1;
+
+__END__
+
+=pod
+
+=head1 NAME
+
+HTML::Element::traverse - discussion of HTML::Element's traverse method
+
+=head1 VERSION
+
+This document describes version 5.07 of
+HTML::Element::traverse, released August 31, 2017
+as part of L<HTML-Tree|HTML::Tree>.
+
+=head1 SYNOPSIS
+
+  # $element->traverse is unnecessary and obscure.
+  #   Don't use it in new code.
+
+=head1 DESCRIPTION
+
+C<HTML::Element> provides a method C<traverse> that traverses the tree
+and calls user-specified callbacks for each node, in pre- or
+post-order.  However, use of the method is quite superfluous: if you
+want to recursively visit every node in the tree, it's almost always
+simpler to write a subroutine does just that, than it is to bundle up
+the pre- and/or post-order code in callbacks for the C<traverse>
+method.
+
+=head1 EXAMPLES
+
+Suppose you want to traverse at/under a node $tree and give elements
+an 'id' attribute unless they already have one.
+
+You can use the C<traverse> method:
+
+  {
+    my $counter = 'x0000';
+    $start_node->traverse(
+      [ # Callbacks;
+        # pre-order callback:
+        sub {
+          my $x = $_[0];
+          $x->attr('id', $counter++) unless defined $x->attr('id');
+          return HTML::Element::OK; # keep traversing
+        },
+        # post-order callback:
+        undef
+      ],
+      1, # don't call the callbacks for text nodes
+    );
+  }
+
+or you can just be simple and clear (and not have to understand the
+calling format for C<traverse>) by writing a sub that traverses the
+tree by just calling itself:
+
+  {
+    my $counter = 'x0000';
+    sub give_id {
+      my $x = $_[0];
+      $x->attr('id', $counter++) unless defined $x->attr('id');
+      foreach my $c ($x->content_list) {
+        give_id($c) if ref $c; # ignore text nodes
+      }
+    };
+    give_id($start_node);
+  }
+
+See, isn't that nice and clear?
+
+But, if you really need to know:
+
+=head1 THE TRAVERSE METHOD
+
+The C<traverse()> method is a general object-method for traversing a
+tree or subtree and calling user-specified callbacks.  It accepts the
+following syntaxes:
+
+=over
+
+=item $h->traverse(\&callback)
+
+=item or $h->traverse(\&callback, $ignore_text)
+
+=item or $h->traverse( [\&pre_callback,\&post_callback] , $ignore_text)
+
+=back
+
+These all mean to traverse the element and all of its children.  That
+is, this method starts at node $h, "pre-order visits" $h, traverses its
+children, and then will "post-order visit" $h.  "Visiting" means that
+the callback routine is called, with these arguments:
+
+    $_[0] : the node (element or text segment),
+    $_[1] : a startflag, and
+    $_[2] : the depth
+
+If the $ignore_text parameter is given and true, then the pre-order
+call I<will not> be happen for text content.
+
+The startflag is 1 when we enter a node (i.e., in pre-order calls) and
+0 when we leave the node (in post-order calls).
+
+Note, however, that post-order calls don't happen for nodes that are
+text segments or are elements that are prototypically empty (like "br",
+"hr", etc.).
+
+If we visit text nodes (i.e., unless $ignore_text is given and true),
+then when text nodes are visited, we will also pass two extra
+arguments to the callback:
+
+    $_[3] : the element that's the parent
+             of this text node
+    $_[4] : the index of this text node
+             in its parent's content list
+
+Note that you can specify that the pre-order routine can
+be a different routine from the post-order one:
+
+    $h->traverse( [\&pre_callback,\&post_callback], ...);
+
+You can also specify that no post-order calls are to be made,
+by providing a false value as the post-order routine:
+
+    $h->traverse([ \&pre_callback,0 ], ...);
+
+And similarly for suppressing pre-order callbacks:
+
+    $h->traverse([ 0,\&post_callback ], ...);
+
+Note that these two syntaxes specify the same operation:
+
+    $h->traverse([\&foo,\&foo], ...);
+    $h->traverse( \&foo       , ...);
+
+The return values from calls to your pre- or post-order
+routines are significant, and are used to control recursion
+into the tree.
+
+These are the values you can return, listed in descending order
+of my estimation of their usefulness:
+
+=over
+
+=item HTML::Element::OK, 1, or any other true value
+
+...to keep on traversing.
+
+Note that C<HTML::Element::OK> et
+al are constants.  So if you're running under C<use strict>
+(as I hope you are), and you say:
+C<return HTML::Element::PRUEN>
+the compiler will flag this as an error (an unallowable
+bareword, specifically), whereas if you spell PRUNE correctly,
+the compiler will not complain.
+
+=item undef, 0, '0', '', or HTML::Element::PRUNE
+
+...to block traversing under the current element's content.
+(This is ignored if received from a post-order callback,
+since by then the recursion has already happened.)
+If this is returned by a pre-order callback, no
+post-order callback for the current node will happen.
+(Recall that if your callback exits with just C<return;>,
+it is returning undef -- at least in scalar context, and
+C<traverse> always calls your callbacks in scalar context.)
+
+=item HTML::Element::ABORT
+
+...to abort the whole traversal immediately.
+This is often useful when you're looking for just the first
+node in the tree that meets some criterion of yours.
+
+=item HTML::Element::PRUNE_UP
+
+...to abort continued traversal into this node and its parent
+node.  No post-order callback for the current or parent
+node will happen.
+
+=item HTML::Element::PRUNE_SOFTLY
+
+Like PRUNE, except that the post-order call for the current
+node is not blocked.
+
+=back
+
+Almost every task to do with extracting information from a tree can be
+expressed in terms of traverse operations (usually in only one pass,
+and usually paying attention to only pre-order, or to only
+post-order), or operations based on traversing. (In fact, many of the
+other methods in this class are basically calls to traverse() with
+particular arguments.)
+
+The source code for HTML::Element and HTML::TreeBuilder contain
+several examples of the use of the "traverse" method to gather
+information about the content of trees and subtrees.
+
+(Note: you should not change the structure of a tree I<while> you are
+traversing it.)
+
+[End of documentation for the C<traverse()> method]
+
+=head2 Traversing with Recursive Anonymous Routines
+
+Now, if you've been reading
+I<Structure and Interpretation of Computer Programs> too much, maybe
+you even want a recursive lambda.  Go ahead:
+
+  {
+    my $counter = 'x0000';
+    my $give_id;
+    $give_id = sub {
+      my $x = $_[0];
+      $x->attr('id', $counter++) unless defined $x->attr('id');
+      foreach my $c ($x->content_list) {
+        $give_id->($c) if ref $c; # ignore text nodes
+      }
+    };
+    $give_id->($start_node);
+    undef $give_id;
+  }
+
+It's a bit nutty, and it's I<still> more concise than a call to the
+C<traverse> method!
+
+It is left as an exercise to the reader to figure out how to do the
+same thing without using a C<$give_id> symbol at all.
+
+It is also left as an exercise to the reader to figure out why I
+undefine C<$give_id>, above; and why I could achieved the same effect
+with any of:
+
+    $give_id = 'I like pie!';
+   # or...
+    $give_id = [];
+   # or even;
+    $give_id = sub { print "Mmmm pie!\n" };
+
+But not:
+
+    $give_id = sub { print "I'm $give_id and I like pie!\n" };
+   # nor...
+    $give_id = \$give_id;
+   # nor...
+    $give_id = { 'pie' => \$give_id, 'mode' => 'a la' };
+
+=head2 Doing Recursive Things Iteratively
+
+Note that you may at times see an iterative implementation of
+pre-order traversal, like so:
+
+   {
+     my @to_do = ($tree); # start-node
+     while(@to_do) {
+       my $this = shift @to_do;
+
+       # "Visit" the node:
+       $this->attr('id', $counter++)
+        unless defined $this->attr('id');
+
+       unshift @to_do, grep ref $_, $this->content_list;
+        # Put children on the stack -- they'll be visited next
+     }
+   }
+
+This can I<under certain circumstances> be more efficient than just a
+normal recursive routine, but at the cost of being rather obscure.  It
+gains efficiency by avoiding the overhead of function-calling, but
+since there are several method dispatches however you do it (to
+C<attr> and C<content_list>), the overhead for a simple function call
+is insignificant.
+
+=head2 Pruning and Whatnot
+
+The C<traverse> method does have the fairly neat features of
+the C<ABORT>, C<PRUNE_UP> and C<PRUNE_SOFTLY> signals.  None of these
+can be implemented I<totally> straightforwardly with recursive
+routines, but it is quite possible.  C<ABORT>-like behavior can be
+implemented either with using non-local returning with C<eval>/C<die>:
+
+  my $died_on; # if you need to know where...
+  sub thing {
+    ... visits $_[0]...
+    ... maybe set $died_on to $_[0] and die "ABORT_TRAV" ...
+    ... else call thing($child) for each child...
+    ...any post-order visiting $_[0]...
+  }
+  eval { thing($node) };
+  if($@) {
+    if($@ =~ m<^ABORT_TRAV>) {
+      ...it died (aborted) on $died_on...
+    } else {
+      die $@; # some REAL error happened
+    }
+  }
+
+or you can just do it with flags:
+
+  my($abort_flag, $died_on);
+  sub thing {
+    ... visits $_[0]...
+    ... maybe set $abort_flag = 1; $died_on = $_[0]; return;
+    foreach my $c ($_[0]->content_list) {
+      thing($c);
+      return if $abort_flag;
+    }
+    ...any post-order visiting $_[0]...
+    return;
+  }
+
+  $abort_flag = $died_on = undef;
+  thing($node);
+  ...if defined $abort_flag, it died on $died_on
+
+=head1 SEE ALSO
+
+L<HTML::Element>
+
+=head1 AUTHOR
+
+Current maintainers:
+
+=over
+
+=item * Christopher J. Madsen S<C<< <perl AT cjmweb.net> >>>
+
+=item * Jeff Fearn S<C<< <jfearn AT cpan.org> >>>
+
+=back
+
+Original HTML-Tree author:
+
+=over
+
+=item * Gisle Aas
+
+=back
+
+Former maintainers:
+
+=over
+
+=item * Sean M. Burke
+
+=item * Andy Lester
+
+=item * Pete Krawczyk S<C<< <petek AT cpan.org> >>>
+
+=back
+
+You can follow or contribute to HTML-Tree's development at
+L<< https://github.com/kentfredric/HTML-Tree >>.
+
+=head1 COPYRIGHT
+
+Copyright 2000,2001 Sean M. Burke
+
+=cut
--- a/database/perl/vendor/lib/HTML/Entities.pm
+++ b/database/perl/vendor/lib/HTML/Entities.pm
@@ -0,0 +1,482 @@
+package HTML::Entities;
+
+=encoding utf8
+
+=head1 NAME
+
+HTML::Entities - Encode or decode strings with HTML entities
+
+=head1 SYNOPSIS
+
+ use HTML::Entities;
+
+ $a = "V&aring;re norske tegn b&oslash;r &#230res";
+ decode_entities($a);
+ encode_entities($a, "\200-\377");
+
+For example, this:
+
+ $input = "vis-à-vis Beyoncé's naïve\npapier-mâché résumé";
+ print encode_entities($input), "\n"
+
+Prints this out:
+
+ vis-&agrave;-vis Beyonc&eacute;'s na&iuml;ve
+ papier-m&acirc;ch&eacute; r&eacute;sum&eacute;
+
+=head1 DESCRIPTION
+
+This module deals with encoding and decoding of strings with HTML
+character entities.  The module provides the following functions:
+
+=over 4
+
+=item decode_entities( $string, ... )
+
+This routine replaces HTML entities found in the $string with the
+corresponding Unicode character.  Unrecognized entities are left alone.
+
+If multiple strings are provided as argument they are each decoded
+separately and the same number of strings are returned.
+
+If called in void context the arguments are decoded in-place.
+
+This routine is exported by default.
+
+=item _decode_entities( $string, \%entity2char )
+
+=item _decode_entities( $string, \%entity2char, $expand_prefix )
+
+This will in-place replace HTML entities in $string.  The %entity2char
+hash must be provided.  Named entities not found in the %entity2char
+hash are left alone.  Numeric entities are expanded unless their value
+overflow.
+
+The keys in %entity2char are the entity names to be expanded and their
+values are what they should expand into.  The values do not have to be
+single character strings.  If a key has ";" as suffix,
+then occurrences in $string are only expanded if properly terminated
+with ";".  Entities without ";" will be expanded regardless of how
+they are terminated for compatibility with how common browsers treat
+entities in the Latin-1 range.
+
+If $expand_prefix is TRUE then entities without trailing ";" in
+%entity2char will even be expanded as a prefix of a longer
+unrecognized name.  The longest matching name in %entity2char will be
+used. This is mainly present for compatibility with an MSIE
+misfeature.
+
+   $string = "foo&nbspbar";
+   _decode_entities($string, { nb => "@", nbsp => "\xA0" }, 1);
+   print $string;  # will print "foo bar"
+
+This routine is exported by default.
+
+=item encode_entities( $string )
+
+=item encode_entities( $string, $unsafe_chars )
+
+This routine replaces unsafe characters in $string with their entity
+representation. A second argument can be given to specify which characters to
+consider unsafe.  The unsafe characters is specified using the regular
+expression character class syntax (what you find within brackets in regular
+expressions).
+
+The default set of characters to encode are control chars, high-bit chars, and
+the C<< < >>, C<< & >>, C<< > >>, C<< ' >> and C<< " >> characters.  But this,
+for example, would encode I<just> the C<< < >>, C<< & >>, C<< > >>, and C<< "
+>> characters:
+
+  $encoded = encode_entities($input, '<>&"');
+
+and this would only encode non-plain ASCII:
+
+  $encoded = encode_entities($input, '^\n\x20-\x25\x27-\x7e');
+
+This routine is exported by default.
+
+=item encode_entities_numeric( $string )
+
+=item encode_entities_numeric( $string, $unsafe_chars )
+
+This routine works just like encode_entities, except that the replacement
+entities are always C<&#xI<hexnum>;> and never C<&I<entname>;>.  For
+example, C<encode_entities("r\xF4le")> returns "r&ocirc;le", but
+C<encode_entities_numeric("r\xF4le")> returns "r&#xF4;le".
+
+This routine is I<not> exported by default.  But you can always
+export it with C<use HTML::Entities qw(encode_entities_numeric);>
+or even C<use HTML::Entities qw(:DEFAULT encode_entities_numeric);>
+
+=back
+
+All these routines modify the string passed as the first argument, if
+called in a void context.  In scalar and array contexts, the encoded or
+decoded string is returned (without changing the input string).
+
+If you prefer not to import these routines into your namespace, you can
+call them as:
+
+  use HTML::Entities ();
+  $decoded = HTML::Entities::decode($a);
+  $encoded = HTML::Entities::encode($a);
+  $encoded = HTML::Entities::encode_numeric($a);
+
+The module can also export the %char2entity and the %entity2char
+hashes, which contain the mapping from all characters to the
+corresponding entities (and vice versa, respectively).
+
+=head1 COPYRIGHT
+
+Copyright 1995-2006 Gisle Aas. All rights reserved.
+
+This library is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+=cut
+
+use strict;
+our $VERSION = '3.75';
+use vars qw(%entity2char %char2entity);
+
+require 5.004;
+require Exporter;
+our @ISA = qw(Exporter);
+
+our @EXPORT = qw(encode_entities decode_entities _decode_entities);
+our @EXPORT_OK = qw(%entity2char %char2entity encode_entities_numeric);
+
+sub Version { $VERSION; }
+
+require HTML::Parser;  # for fast XS implemented decode_entities
+
+
+%entity2char = (
+ # Some normal chars that have special meaning in SGML context
+ amp    => '&',  # ampersand
+'gt'    => '>',  # greater than
+'lt'    => '<',  # less than
+ quot   => '"',  # double quote
+ apos   => "'",  # single quote
+
+ # PUBLIC ISO 8879-1986//ENTITIES Added Latin 1//EN//HTML
+ AElig	=> chr(198),  # capital AE diphthong (ligature)
+ Aacute	=> chr(193),  # capital A, acute accent
+ Acirc	=> chr(194),  # capital A, circumflex accent
+ Agrave	=> chr(192),  # capital A, grave accent
+ Aring	=> chr(197),  # capital A, ring
+ Atilde	=> chr(195),  # capital A, tilde
+ Auml	=> chr(196),  # capital A, dieresis or umlaut mark
+ Ccedil	=> chr(199),  # capital C, cedilla
+ ETH	=> chr(208),  # capital Eth, Icelandic
+ Eacute	=> chr(201),  # capital E, acute accent
+ Ecirc	=> chr(202),  # capital E, circumflex accent
+ Egrave	=> chr(200),  # capital E, grave accent
+ Euml	=> chr(203),  # capital E, dieresis or umlaut mark
+ Iacute	=> chr(205),  # capital I, acute accent
+ Icirc	=> chr(206),  # capital I, circumflex accent
+ Igrave	=> chr(204),  # capital I, grave accent
+ Iuml	=> chr(207),  # capital I, dieresis or umlaut mark
+ Ntilde	=> chr(209),  # capital N, tilde
+ Oacute	=> chr(211),  # capital O, acute accent
+ Ocirc	=> chr(212),  # capital O, circumflex accent
+ Ograve	=> chr(210),  # capital O, grave accent
+ Oslash	=> chr(216),  # capital O, slash
+ Otilde	=> chr(213),  # capital O, tilde
+ Ouml	=> chr(214),  # capital O, dieresis or umlaut mark
+ THORN	=> chr(222),  # capital THORN, Icelandic
+ Uacute	=> chr(218),  # capital U, acute accent
+ Ucirc	=> chr(219),  # capital U, circumflex accent
+ Ugrave	=> chr(217),  # capital U, grave accent
+ Uuml	=> chr(220),  # capital U, dieresis or umlaut mark
+ Yacute	=> chr(221),  # capital Y, acute accent
+ aacute	=> chr(225),  # small a, acute accent
+ acirc	=> chr(226),  # small a, circumflex accent
+ aelig	=> chr(230),  # small ae diphthong (ligature)
+ agrave	=> chr(224),  # small a, grave accent
+ aring	=> chr(229),  # small a, ring
+ atilde	=> chr(227),  # small a, tilde
+ auml	=> chr(228),  # small a, dieresis or umlaut mark
+ ccedil	=> chr(231),  # small c, cedilla
+ eacute	=> chr(233),  # small e, acute accent
+ ecirc	=> chr(234),  # small e, circumflex accent
+ egrave	=> chr(232),  # small e, grave accent
+ eth	=> chr(240),  # small eth, Icelandic
+ euml	=> chr(235),  # small e, dieresis or umlaut mark
+ iacute	=> chr(237),  # small i, acute accent
+ icirc	=> chr(238),  # small i, circumflex accent
+ igrave	=> chr(236),  # small i, grave accent
+ iuml	=> chr(239),  # small i, dieresis or umlaut mark
+ ntilde	=> chr(241),  # small n, tilde
+ oacute	=> chr(243),  # small o, acute accent
+ ocirc	=> chr(244),  # small o, circumflex accent
+ ograve	=> chr(242),  # small o, grave accent
+ oslash	=> chr(248),  # small o, slash
+ otilde	=> chr(245),  # small o, tilde
+ ouml	=> chr(246),  # small o, dieresis or umlaut mark
+ szlig	=> chr(223),  # small sharp s, German (sz ligature)
+ thorn	=> chr(254),  # small thorn, Icelandic
+ uacute	=> chr(250),  # small u, acute accent
+ ucirc	=> chr(251),  # small u, circumflex accent
+ ugrave	=> chr(249),  # small u, grave accent
+ uuml	=> chr(252),  # small u, dieresis or umlaut mark
+ yacute	=> chr(253),  # small y, acute accent
+ yuml	=> chr(255),  # small y, dieresis or umlaut mark
+
+ # Some extra Latin 1 chars that are listed in the HTML3.2 draft (21-May-96)
+ copy   => chr(169),  # copyright sign
+ reg    => chr(174),  # registered sign
+ nbsp   => chr(160),  # non breaking space
+
+ # Additional ISO-8859/1 entities listed in rfc1866 (section 14)
+ iexcl  => chr(161),
+ cent   => chr(162),
+ pound  => chr(163),
+ curren => chr(164),
+ yen    => chr(165),
+ brvbar => chr(166),
+ sect   => chr(167),
+ uml    => chr(168),
+ ordf   => chr(170),
+ laquo  => chr(171),
+'not'   => chr(172),    # not is a keyword in perl
+ shy    => chr(173),
+ macr   => chr(175),
+ deg    => chr(176),
+ plusmn => chr(177),
+ sup1   => chr(185),
+ sup2   => chr(178),
+ sup3   => chr(179),
+ acute  => chr(180),
+ micro  => chr(181),
+ para   => chr(182),
+ middot => chr(183),
+ cedil  => chr(184),
+ ordm   => chr(186),
+ raquo  => chr(187),
+ frac14 => chr(188),
+ frac12 => chr(189),
+ frac34 => chr(190),
+ iquest => chr(191),
+'times' => chr(215),    # times is a keyword in perl
+ divide => chr(247),
+
+ ( $] > 5.007 ? (
+  'OElig;'    => chr(338),
+  'oelig;'    => chr(339),
+  'Scaron;'   => chr(352),
+  'scaron;'   => chr(353),
+  'Yuml;'     => chr(376),
+  'fnof;'     => chr(402),
+  'circ;'     => chr(710),
+  'tilde;'    => chr(732),
+  'Alpha;'    => chr(913),
+  'Beta;'     => chr(914),
+  'Gamma;'    => chr(915),
+  'Delta;'    => chr(916),
+  'Epsilon;'  => chr(917),
+  'Zeta;'     => chr(918),
+  'Eta;'      => chr(919),
+  'Theta;'    => chr(920),
+  'Iota;'     => chr(921),
+  'Kappa;'    => chr(922),
+  'Lambda;'   => chr(923),
+  'Mu;'       => chr(924),
+  'Nu;'       => chr(925),
+  'Xi;'       => chr(926),
+  'Omicron;'  => chr(927),
+  'Pi;'       => chr(928),
+  'Rho;'      => chr(929),
+  'Sigma;'    => chr(931),
+  'Tau;'      => chr(932),
+  'Upsilon;'  => chr(933),
+  'Phi;'      => chr(934),
+  'Chi;'      => chr(935),
+  'Psi;'      => chr(936),
+  'Omega;'    => chr(937),
+  'alpha;'    => chr(945),
+  'beta;'     => chr(946),
+  'gamma;'    => chr(947),
+  'delta;'    => chr(948),
+  'epsilon;'  => chr(949),
+  'zeta;'     => chr(950),
+  'eta;'      => chr(951),
+  'theta;'    => chr(952),
+  'iota;'     => chr(953),
+  'kappa;'    => chr(954),
+  'lambda;'   => chr(955),
+  'mu;'       => chr(956),
+  'nu;'       => chr(957),
+  'xi;'       => chr(958),
+  'omicron;'  => chr(959),
+  'pi;'       => chr(960),
+  'rho;'      => chr(961),
+  'sigmaf;'   => chr(962),
+  'sigma;'    => chr(963),
+  'tau;'      => chr(964),
+  'upsilon;'  => chr(965),
+  'phi;'      => chr(966),
+  'chi;'      => chr(967),
+  'psi;'      => chr(968),
+  'omega;'    => chr(969),
+  'thetasym;' => chr(977),
+  'upsih;'    => chr(978),
+  'piv;'      => chr(982),
+  'ensp;'     => chr(8194),
+  'emsp;'     => chr(8195),
+  'thinsp;'   => chr(8201),
+  'zwnj;'     => chr(8204),
+  'zwj;'      => chr(8205),
+  'lrm;'      => chr(8206),
+  'rlm;'      => chr(8207),
+  'ndash;'    => chr(8211),
+  'mdash;'    => chr(8212),
+  'lsquo;'    => chr(8216),
+  'rsquo;'    => chr(8217),
+  'sbquo;'    => chr(8218),
+  'ldquo;'    => chr(8220),
+  'rdquo;'    => chr(8221),
+  'bdquo;'    => chr(8222),
+  'dagger;'   => chr(8224),
+  'Dagger;'   => chr(8225),
+  'bull;'     => chr(8226),
+  'hellip;'   => chr(8230),
+  'permil;'   => chr(8240),
+  'prime;'    => chr(8242),
+  'Prime;'    => chr(8243),
+  'lsaquo;'   => chr(8249),
+  'rsaquo;'   => chr(8250),
+  'oline;'    => chr(8254),
+  'frasl;'    => chr(8260),
+  'euro;'     => chr(8364),
+  'image;'    => chr(8465),
+  'weierp;'   => chr(8472),
+  'real;'     => chr(8476),
+  'trade;'    => chr(8482),
+  'alefsym;'  => chr(8501),
+  'larr;'     => chr(8592),
+  'uarr;'     => chr(8593),
+  'rarr;'     => chr(8594),
+  'darr;'     => chr(8595),
+  'harr;'     => chr(8596),
+  'crarr;'    => chr(8629),
+  'lArr;'     => chr(8656),
+  'uArr;'     => chr(8657),
+  'rArr;'     => chr(8658),
+  'dArr;'     => chr(8659),
+  'hArr;'     => chr(8660),
+  'forall;'   => chr(8704),
+  'part;'     => chr(8706),
+  'exist;'    => chr(8707),
+  'empty;'    => chr(8709),
+  'nabla;'    => chr(8711),
+  'isin;'     => chr(8712),
+  'notin;'    => chr(8713),
+  'ni;'       => chr(8715),
+  'prod;'     => chr(8719),
+  'sum;'      => chr(8721),
+  'minus;'    => chr(8722),
+  'lowast;'   => chr(8727),
+  'radic;'    => chr(8730),
+  'prop;'     => chr(8733),
+  'infin;'    => chr(8734),
+  'ang;'      => chr(8736),
+  'and;'      => chr(8743),
+  'or;'       => chr(8744),
+  'cap;'      => chr(8745),
+  'cup;'      => chr(8746),
+  'int;'      => chr(8747),
+  'there4;'   => chr(8756),
+  'sim;'      => chr(8764),
+  'cong;'     => chr(8773),
+  'asymp;'    => chr(8776),
+  'ne;'       => chr(8800),
+  'equiv;'    => chr(8801),
+  'le;'       => chr(8804),
+  'ge;'       => chr(8805),
+  'sub;'      => chr(8834),
+  'sup;'      => chr(8835),
+  'nsub;'     => chr(8836),
+  'sube;'     => chr(8838),
+  'supe;'     => chr(8839),
+  'oplus;'    => chr(8853),
+  'otimes;'   => chr(8855),
+  'perp;'     => chr(8869),
+  'sdot;'     => chr(8901),
+  'lceil;'    => chr(8968),
+  'rceil;'    => chr(8969),
+  'lfloor;'   => chr(8970),
+  'rfloor;'   => chr(8971),
+  'lang;'     => chr(9001),
+  'rang;'     => chr(9002),
+  'loz;'      => chr(9674),
+  'spades;'   => chr(9824),
+  'clubs;'    => chr(9827),
+  'hearts;'   => chr(9829),
+  'diams;'    => chr(9830),
+ ) : ())
+);
+
+
+# Make the opposite mapping
+while (my($entity, $char) = each(%entity2char)) {
+    $entity =~ s/;\z//;
+    $char2entity{$char} = "&$entity;";
+}
+delete $char2entity{"'"};  # only one-way decoding
+
+# Fill in missing entities
+for (0 .. 255) {
+    next if exists $char2entity{chr($_)};
+    $char2entity{chr($_)} = "&#$_;";
+}
+
+my %subst;  # compiled encoding regexps
+
+sub encode_entities
+{
+    return undef unless defined $_[0];
+    my $ref;
+    if (defined wantarray) {
+	my $x = $_[0];
+	$ref = \$x;     # copy
+    } else {
+	$ref = \$_[0];  # modify in-place
+    }
+    if (defined $_[1] and length $_[1]) {
+	unless (exists $subst{$_[1]}) {
+	    # Because we can't compile regex we fake it with a cached sub
+	    my $chars = $_[1];
+	    $chars =~ s,(?<!\\)([]/]),\\$1,g;
+	    $chars =~ s,(?<!\\)\\\z,\\\\,;
+	    my $code = "sub {\$_[0] =~ s/([$chars])/\$char2entity{\$1} || num_entity(\$1)/ge; }";
+	    $subst{$_[1]} = eval $code;
+	    die( $@ . " while trying to turn range: \"$_[1]\"\n "
+	      . "into code: $code\n "
+	    ) if $@;
+	}
+	&{$subst{$_[1]}}($$ref);
+    } else {
+	# Encode control chars, high bit chars and '<', '&', '>', ''' and '"'
+	$$ref =~ s/([^\n\r\t !\#\$%\(-;=?-~])/$char2entity{$1} || num_entity($1)/ge;
+    }
+    $$ref;
+}
+
+sub encode_entities_numeric {
+    local %char2entity;
+    return &encode_entities;   # a goto &encode_entities wouldn't work
+}
+
+
+sub num_entity {
+    sprintf "&#x%X;", ord($_[0]);
+}
+
+# Set up aliases
+*encode = \&encode_entities;
+*encode_numeric = \&encode_entities_numeric;
+*encode_numerically = \&encode_entities_numeric;
+*decode = \&decode_entities;
+
+1;
--- a/database/perl/vendor/lib/HTML/Filter.pm
+++ b/database/perl/vendor/lib/HTML/Filter.pm
@@ -0,0 +1,110 @@
+package HTML::Filter;
+
+use strict;
+
+require HTML::Parser;
+our @ISA = qw(HTML::Parser);
+our $VERSION = '3.75';
+
+sub declaration { $_[0]->output("<!$_[1]>")     }
+sub process     { $_[0]->output($_[2])          }
+sub comment     { $_[0]->output("<!--$_[1]-->") }
+sub start       { $_[0]->output($_[4])          }
+sub end         { $_[0]->output($_[2])          }
+sub text        { $_[0]->output($_[1])          }
+
+sub output      { print $_[1] }
+
+1;
+
+__END__
+
+=head1 NAME
+
+HTML::Filter - Filter HTML text through the parser
+
+=head1 NOTE
+
+B<This module is deprecated.> The C<HTML::Parser> now provides the
+functionally of C<HTML::Filter> much more efficiently with the
+C<default> handler.
+
+=head1 SYNOPSIS
+
+ require HTML::Filter;
+ $p = HTML::Filter->new->parse_file("index.html");
+
+=head1 DESCRIPTION
+
+C<HTML::Filter> is an HTML parser that by default prints the
+original text of each HTML element (a slow version of cat(1) basically).
+The callback methods may be overridden to modify the filtering for some
+HTML elements and you can override output() method which is called to
+print the HTML text.
+
+C<HTML::Filter> is a subclass of C<HTML::Parser>. This means that
+the document should be given to the parser by calling the $p->parse()
+or $p->parse_file() methods.
+
+=head1 EXAMPLES
+
+The first example is a filter that will remove all comments from an
+HTML file.  This is achieved by simply overriding the comment method
+to do nothing.
+
+  package CommentStripper;
+  require HTML::Filter;
+  @ISA=qw(HTML::Filter);
+  sub comment { }  # ignore comments
+
+The second example shows a filter that will remove any E<lt>TABLE>s
+found in the HTML file.  We specialize the start() and end() methods
+to count table tags and then make output not happen when inside a
+table.
+
+  package TableStripper;
+  require HTML::Filter;
+  @ISA=qw(HTML::Filter);
+  sub start
+  {
+     my $self = shift;
+     $self->{table_seen}++ if $_[0] eq "table";
+     $self->SUPER::start(@_);
+  }
+
+  sub end
+  {
+     my $self = shift;
+     $self->SUPER::end(@_);
+     $self->{table_seen}-- if $_[0] eq "table";
+  }
+
+  sub output
+  {
+      my $self = shift;
+      unless ($self->{table_seen}) {
+	  $self->SUPER::output(@_);
+      }
+  }
+
+If you want to collect the parsed text internally you might want to do
+something like this:
+
+  package FilterIntoString;
+  require HTML::Filter;
+  @ISA=qw(HTML::Filter);
+  sub output { push(@{$_[0]->{fhtml}}, $_[1]) }
+  sub filtered_html { join("", @{$_[0]->{fhtml}}) }
+
+=head1 SEE ALSO
+
+L<HTML::Parser>
+
+=head1 COPYRIGHT
+
+Copyright 1997-1999 Gisle Aas.
+
+This library is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+=cut
--- a/database/perl/vendor/lib/HTML/Form.pm
+++ b/database/perl/vendor/lib/HTML/Form.pm
--- a/database/perl/vendor/lib/HTML/HeadParser.pm
+++ b/database/perl/vendor/lib/HTML/HeadParser.pm
@@ -0,0 +1,314 @@
+package HTML::HeadParser;
+
+=head1 NAME
+
+HTML::HeadParser - Parse <HEAD> section of a HTML document
+
+=head1 SYNOPSIS
+
+ require HTML::HeadParser;
+ $p = HTML::HeadParser->new;
+ $p->parse($text) and  print "not finished";
+
+ $p->header('Title')          # to access <title>....</title>
+ $p->header('Content-Base')   # to access <base href="http://...">
+ $p->header('Foo')            # to access <meta http-equiv="Foo" content="...">
+ $p->header('X-Meta-Author')  # to access <meta name="author" content="...">
+ $p->header('X-Meta-Charset') # to access <meta charset="...">
+
+=head1 DESCRIPTION
+
+The C<HTML::HeadParser> is a specialized (and lightweight)
+C<HTML::Parser> that will only parse the E<lt>HEAD>...E<lt>/HEAD>
+section of an HTML document.  The parse() method
+will return a FALSE value as soon as some E<lt>BODY> element or body
+text are found, and should not be called again after this.
+
+Note that the C<HTML::HeadParser> might get confused if raw undecoded
+UTF-8 is passed to the parse() method.  Make sure the strings are
+properly decoded before passing them on.
+
+The C<HTML::HeadParser> keeps a reference to a header object, and the
+parser will update this header object as the various elements of the
+E<lt>HEAD> section of the HTML document are recognized.  The following
+header fields are affected:
+
+=over 4
+
+=item Content-Base:
+
+The I<Content-Base> header is initialized from the E<lt>base
+href="..."> element.
+
+=item Title:
+
+The I<Title> header is initialized from the E<lt>title>...E<lt>/title>
+element.
+
+=item Isindex:
+
+The I<Isindex> header will be added if there is a E<lt>isindex>
+element in the E<lt>head>.  The header value is initialized from the
+I<prompt> attribute if it is present.  If no I<prompt> attribute is
+given it will have '?' as the value.
+
+=item X-Meta-Foo:
+
+All E<lt>meta> elements containing a C<name> attribute will result in
+headers using the prefix C<X-Meta-> appended with the value of the
+C<name> attribute as the name of the header, and the value of the
+C<content> attribute as the pushed header value.
+
+E<lt>meta> elements containing a C<http-equiv> attribute will result
+in headers as in above, but without the C<X-Meta-> prefix in the
+header name.
+
+E<lt>meta> elements containing a C<charset> attribute will result in
+an C<X-Meta-Charset> header, using the value of the C<charset>
+attribute as the pushed header value.
+
+The ':' character can't be represented in header field names, so
+if the meta element contains this char it's substituted with '-'
+before forming the field name.
+
+=back
+
+=head1 METHODS
+
+The following methods (in addition to those provided by the
+superclass) are available:
+
+=over 4
+
+=cut
+
+
+require HTML::Parser;
+our @ISA = qw(HTML::Parser);
+
+use HTML::Entities ();
+
+use strict;
+use vars qw($DEBUG);
+#$DEBUG = 1;
+our $VERSION = '3.75';
+
+=item $hp = HTML::HeadParser->new
+
+=item $hp = HTML::HeadParser->new( $header )
+
+The object constructor.  The optional $header argument should be a
+reference to an object that implement the header() and push_header()
+methods as defined by the C<HTTP::Headers> class.  Normally it will be
+of some class that is a or delegates to the C<HTTP::Headers> class.
+
+If no $header is given C<HTML::HeadParser> will create an
+C<HTTP::Headers> object by itself (initially empty).
+
+=cut
+
+sub new
+{
+    my($class, $header) = @_;
+    unless ($header) {
+	require HTTP::Headers;
+	$header = HTTP::Headers->new;
+    }
+
+    my $self = $class->SUPER::new(api_version => 3,
+				  start_h => ["start", "self,tagname,attr"],
+				  end_h   => ["end",   "self,tagname"],
+				  text_h  => ["text",  "self,text"],
+				  ignore_elements => [qw(script style)],
+				 );
+    $self->{'header'} = $header;
+    $self->{'tag'} = '';   # name of active element that takes textual content
+    $self->{'text'} = '';  # the accumulated text associated with the element
+    $self;
+}
+
+=item $hp->header;
+
+Returns a reference to the header object.
+
+=item $hp->header( $key )
+
+Returns a header value.  It is just a shorter way to write
+C<$hp-E<gt>header-E<gt>header($key)>.
+
+=cut
+
+sub header
+{
+    my $self = shift;
+    return $self->{'header'} unless @_;
+    $self->{'header'}->header(@_);
+}
+
+sub as_string    # legacy
+{
+    my $self = shift;
+    $self->{'header'}->as_string;
+}
+
+sub flush_text   # internal
+{
+    my $self = shift;
+    my $tag  = $self->{'tag'};
+    my $text = $self->{'text'};
+    $text =~ s/^\s+//;
+    $text =~ s/\s+$//;
+    $text =~ s/\s+/ /g;
+    print "FLUSH $tag => '$text'\n"  if $DEBUG;
+    if ($tag eq 'title') {
+	my $decoded;
+	$decoded = utf8::decode($text) if $self->utf8_mode && defined &utf8::decode;
+	HTML::Entities::decode($text);
+	utf8::encode($text) if $decoded;
+	$self->{'header'}->push_header(Title => $text);
+    }
+    $self->{'tag'} = $self->{'text'} = '';
+}
+
+# This is an quote from the HTML3.2 DTD which shows which elements
+# that might be present in a <HEAD>...</HEAD>.  Also note that the
+# <HEAD> tags themselves might be missing:
+#
+# <!ENTITY % head.content "TITLE & ISINDEX? & BASE? & STYLE? &
+#                            SCRIPT* & META* & LINK*">
+#
+# <!ELEMENT HEAD O O  (%head.content)>
+#
+# From HTML 4.01:
+#
+# <!ENTITY % head.misc "SCRIPT|STYLE|META|LINK|OBJECT">
+# <!ENTITY % head.content "TITLE & BASE?">
+# <!ELEMENT HEAD O O (%head.content;) +(%head.misc;)>
+#
+# From HTML 5 as of WD-html5-20090825:
+#
+# One or more elements of metadata content, [...]
+# => base, command, link, meta, noscript, script, style, title
+
+sub start
+{
+    my($self, $tag, $attr) = @_;  # $attr is reference to a HASH
+    print "START[$tag]\n" if $DEBUG;
+    $self->flush_text if $self->{'tag'};
+    if ($tag eq 'meta') {
+	my $key = $attr->{'http-equiv'};
+	if (!defined($key) || !length($key)) {
+	    if ($attr->{name}) {
+		$key = "X-Meta-\u$attr->{name}";
+	    } elsif ($attr->{charset}) { # HTML 5 <meta charset="...">
+		$key = "X-Meta-Charset";
+		$self->{header}->push_header($key => $attr->{charset});
+		return;
+	    } else {
+		return;
+	    }
+	}
+	$key =~ s/:/-/g;
+	$self->{'header'}->push_header($key => $attr->{content});
+    } elsif ($tag eq 'base') {
+	return unless exists $attr->{href};
+	(my $base = $attr->{href}) =~ s/^\s+//; $base =~ s/\s+$//; # HTML5
+	$self->{'header'}->push_header('Content-Base' => $base);
+    } elsif ($tag eq 'isindex') {
+	# This is a non-standard header.  Perhaps we should just ignore
+	# this element
+	$self->{'header'}->push_header(Isindex => $attr->{prompt} || '?');
+    } elsif ($tag =~ /^(?:title|noscript|object|command)$/) {
+	# Just remember tag.  Initialize header when we see the end tag.
+	$self->{'tag'} = $tag;
+    } elsif ($tag eq 'link') {
+	return unless exists $attr->{href};
+	# <link href="http:..." rel="xxx" rev="xxx" title="xxx">
+	my $href = delete($attr->{href});
+	$href =~ s/^\s+//; $href =~ s/\s+$//; # HTML5
+	my $h_val = "<$href>";
+	for (sort keys %{$attr}) {
+	    next if $_ eq "/";  # XHTML junk
+	    $h_val .= qq(; $_="$attr->{$_}");
+	}
+	$self->{'header'}->push_header(Link => $h_val);
+    } elsif ($tag eq 'head' || $tag eq 'html') {
+	# ignore
+    } else {
+	 # stop parsing
+	$self->eof;
+    }
+}
+
+sub end
+{
+    my($self, $tag) = @_;
+    print "END[$tag]\n" if $DEBUG;
+    $self->flush_text if $self->{'tag'};
+    $self->eof if $tag eq 'head';
+}
+
+sub text
+{
+    my($self, $text) = @_;
+    print "TEXT[$text]\n" if $DEBUG;
+    unless ($self->{first_chunk}) {
+	# drop Unicode BOM if found
+	if ($self->utf8_mode) {
+	    $text =~ s/^\xEF\xBB\xBF//;
+	}
+	else {
+	    $text =~ s/^\x{FEFF}//;
+	}
+	$self->{first_chunk}++;
+    }
+    my $tag = $self->{tag};
+    if (!$tag && $text =~ /\S/) {
+	# Normal text means start of body
+        $self->eof;
+	return;
+    }
+    return if $tag ne 'title';
+    $self->{'text'} .= $text;
+}
+
+BEGIN {
+    *utf8_mode = sub { 1 } unless HTML::Entities::UNICODE_SUPPORT;
+}
+
+1;
+
+__END__
+
+=back
+
+=head1 EXAMPLE
+
+ $h = HTTP::Headers->new;
+ $p = HTML::HeadParser->new($h);
+ $p->parse(<<EOT);
+ <title>Stupid example</title>
+ <base href="http://www.linpro.no/lwp/">
+ Normal text starts here.
+ EOT
+ undef $p;
+ print $h->title;   # should print "Stupid example"
+
+=head1 SEE ALSO
+
+L<HTML::Parser>, L<HTTP::Headers>
+
+The C<HTTP::Headers> class is distributed as part of the
+I<libwww-perl> package.  If you don't have that distribution installed
+you need to provide the $header argument to the C<HTML::HeadParser>
+constructor with your own object that implements the documented
+protocol.
+
+=head1 COPYRIGHT
+
+Copyright 1996-2001 Gisle Aas. All rights reserved.
+
+This library is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+=cut
--- a/database/perl/vendor/lib/HTML/LinkExtor.pm
+++ b/database/perl/vendor/lib/HTML/LinkExtor.pm
@@ -0,0 +1,185 @@
+package HTML::LinkExtor;
+
+require HTML::Parser;
+our @ISA = qw(HTML::Parser);
+our $VERSION = '3.75';
+
+=head1 NAME
+
+HTML::LinkExtor - Extract links from an HTML document
+
+=head1 SYNOPSIS
+
+ require HTML::LinkExtor;
+ $p = HTML::LinkExtor->new(\&cb, "http://www.perl.org/");
+ sub cb {
+     my($tag, %links) = @_;
+     print "$tag @{[%links]}\n";
+ }
+ $p->parse_file("index.html");
+
+=head1 DESCRIPTION
+
+I<HTML::LinkExtor> is an HTML parser that extracts links from an
+HTML document.  The I<HTML::LinkExtor> is a subclass of
+I<HTML::Parser>. This means that the document should be given to the
+parser by calling the $p->parse() or $p->parse_file() methods.
+
+=cut
+
+use strict;
+use HTML::Tagset ();
+
+# legacy (some applications grabs this hash directly)
+use vars qw(%LINK_ELEMENT);
+*LINK_ELEMENT = \%HTML::Tagset::linkElements;
+
+=over 4
+
+=item $p = HTML::LinkExtor->new
+
+=item $p = HTML::LinkExtor->new( $callback )
+
+=item $p = HTML::LinkExtor->new( $callback, $base )
+
+The constructor takes two optional arguments. The first is a reference
+to a callback routine. It will be called as links are found. If a
+callback is not provided, then links are just accumulated internally
+and can be retrieved by calling the $p->links() method.
+
+The $base argument is an optional base URL used to absolutize all URLs found.
+You need to have the I<URI> module installed if you provide $base.
+
+The callback is called with the lowercase tag name as first argument,
+and then all link attributes as separate key/value pairs.  All
+non-link attributes are removed.
+
+=cut
+
+sub new
+{
+    my($class, $cb, $base) = @_;
+    my $self = $class->SUPER::new(
+                    start_h => ["_start_tag", "self,tagname,attr"],
+		    report_tags => [keys %HTML::Tagset::linkElements],
+	       );
+    $self->{extractlink_cb} = $cb;
+    if ($base) {
+	require URI;
+	$self->{extractlink_base} = URI->new($base);
+    }
+    $self;
+}
+
+sub _start_tag
+{
+    my($self, $tag, $attr) = @_;
+
+    my $base = $self->{extractlink_base};
+    my $links = $HTML::Tagset::linkElements{$tag};
+    $links = [$links] unless ref $links;
+
+    my @links;
+    my $a;
+    for $a (@$links) {
+	next unless exists $attr->{$a};
+	(my $link = $attr->{$a}) =~ s/^\s+//; $link =~ s/\s+$//; # HTML5
+	push(@links, $a, $base ? URI->new($link, $base)->abs($base) : $link);
+    }
+    return unless @links;
+    $self->_found_link($tag, @links);
+}
+
+sub _found_link
+{
+    my $self = shift;
+    my $cb = $self->{extractlink_cb};
+    if ($cb) {
+	&$cb(@_);
+    } else {
+	push(@{$self->{'links'}}, [@_]);
+    }
+}
+
+=item $p->links
+
+Returns a list of all links found in the document.  The returned
+values will be anonymous arrays with the following elements:
+
+  [$tag, $attr => $url1, $attr2 => $url2,...]
+
+The $p->links method will also truncate the internal link list.  This
+means that if the method is called twice without any parsing
+between them the second call will return an empty list.
+
+Also note that $p->links will always be empty if a callback routine
+was provided when the I<HTML::LinkExtor> was created.
+
+=cut
+
+sub links
+{
+    my $self = shift;
+    exists($self->{'links'}) ? @{delete $self->{'links'}} : ();
+}
+
+# We override the parse_file() method so that we can clear the links
+# before we start a new file.
+sub parse_file
+{
+    my $self = shift;
+    delete $self->{'links'};
+    $self->SUPER::parse_file(@_);
+}
+
+=back
+
+=head1 EXAMPLE
+
+This is an example showing how you can extract links from a document
+received using LWP:
+
+  use LWP::UserAgent;
+  use HTML::LinkExtor;
+  use URI::URL;
+
+  $url = "http://www.perl.org/";  # for instance
+  $ua = LWP::UserAgent->new;
+
+  # Set up a callback that collect image links
+  my @imgs = ();
+  sub callback {
+     my($tag, %attr) = @_;
+     return if $tag ne 'img';  # we only look closer at <img ...>
+     push(@imgs, values %attr);
+  }
+
+  # Make the parser.  Unfortunately, we don't know the base yet
+  # (it might be different from $url)
+  $p = HTML::LinkExtor->new(\&callback);
+
+  # Request document and parse it as it arrives
+  $res = $ua->request(HTTP::Request->new(GET => $url),
+                      sub {$p->parse($_[0])});
+
+  # Expand all image URLs to absolute ones
+  my $base = $res->base;
+  @imgs = map { $_ = url($_, $base)->abs; } @imgs;
+
+  # Print them out
+  print join("\n", @imgs), "\n";
+
+=head1 SEE ALSO
+
+L<HTML::Parser>, L<HTML::Tagset>, L<LWP>, L<URI::URL>
+
+=head1 COPYRIGHT
+
+Copyright 1996-2001 Gisle Aas.
+
+This library is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+=cut
+
+1;
--- a/database/perl/vendor/lib/HTML/Parse.pm
+++ b/database/perl/vendor/lib/HTML/Parse.pm
@@ -0,0 +1,196 @@
+package HTML::Parse;
+
+use 5.008;
+
+#ABSTRACT: Deprecated, a wrapper around HTML::TreeBuilder
+
+
+use warnings;
+use strict;
+
+our $VERSION = '5.07'; # VERSION from OurPkgVersion
+
+use vars qw(@ISA @EXPORT
+    $IMPLICIT_TAGS $IGNORE_UNKNOWN $IGNORE_TEXT $WARN
+);
+
+require Exporter;
+@ISA    = qw(Exporter);
+@EXPORT = qw(parse_html parse_htmlfile);
+
+# Backwards compatability
+$IMPLICIT_TAGS  = 1;
+$IGNORE_UNKNOWN = 1;
+$IGNORE_TEXT    = 0;
+$WARN           = 0;
+
+require HTML::TreeBuilder;
+
+sub parse_html {
+    my $p = $_[1];
+    $p = _new_tree_maker() unless $p;
+    $p->parse( $_[0] );
+}
+
+sub parse_htmlfile {
+    my ( $file, $p ) = @_;
+    my ($HTML);
+    open( $HTML, "<", $file ) or return;
+    $p = _new_tree_maker() unless $p;
+    $p->parse_file($HTML);
+}
+
+sub _new_tree_maker {
+    my $p = HTML::TreeBuilder->new(
+        implicit_tags  => $IMPLICIT_TAGS,
+        ignore_unknown => $IGNORE_UNKNOWN,
+        ignore_text    => $IGNORE_TEXT,
+        'warn'         => $WARN,
+    );
+    $p->strict_comment(1);
+    $p;
+}
+
+1;
+
+__END__
+
+=pod
+
+=head1 NAME
+
+HTML::Parse - Deprecated, a wrapper around HTML::TreeBuilder
+
+=head1 VERSION
+
+This document describes version 5.07 of
+HTML::Parse, released August 31, 2017
+as part of L<HTML-Tree|HTML::Tree>.
+
+=head1 SYNOPSIS
+
+  See the documentation for HTML::TreeBuilder
+
+=head1 DESCRIPTION
+
+Disclaimer: This module is provided only for backwards compatibility
+with earlier versions of this library.  New code should I<not> use
+this module, and should really use the HTML::Parser and
+HTML::TreeBuilder modules directly, instead.
+
+The C<HTML::Parse> module provides functions to parse HTML documents.
+There are two functions exported by this module:
+
+=over 4
+
+=item parse_html($html) or parse_html($html, $obj)
+
+This function is really just a synonym for $obj->parse($html) and $obj
+is assumed to be a subclass of C<HTML::Parser>.  Refer to
+L<HTML::Parser> for more documentation.
+
+If $obj is not specified, the $obj will default to an internally
+created new C<HTML::TreeBuilder> object configured with strict_comment()
+turned on.  That class implements a parser that builds (and is) a HTML
+syntax tree with HTML::Element objects as nodes.
+
+The return value from parse_html() is $obj.
+
+=item parse_htmlfile($file, [$obj])
+
+Same as parse_html(), but pulls the HTML to parse, from the named file.
+
+Returns C<undef> if the file could not be opened, or $obj otherwise.
+
+=back
+
+When a C<HTML::TreeBuilder> object is created, the following variables
+control how parsing takes place:
+
+=over 4
+
+=item $HTML::Parse::IMPLICIT_TAGS
+
+Setting this variable to true will instruct the parser to try to
+deduce implicit elements and implicit end tags.  If this variable is
+false you get a parse tree that just reflects the text as it stands.
+Might be useful for quick & dirty parsing.  Default is true.
+
+Implicit elements have the implicit() attribute set.
+
+=item $HTML::Parse::IGNORE_UNKNOWN
+
+This variable contols whether unknow tags should be represented as
+elements in the parse tree.  Default is true.
+
+=item $HTML::Parse::IGNORE_TEXT
+
+Do not represent the text content of elements.  This saves space if
+all you want is to examine the structure of the document.  Default is
+false.
+
+=item $HTML::Parse::WARN
+
+Call warn() with an appropriate message for syntax errors.  Default is
+false.
+
+=back
+
+=head1 REMEMBER!
+
+HTML::TreeBuilder objects should be explicitly destroyed when you're
+finished with them.  See L<HTML::TreeBuilder>.
+
+=head1 SEE ALSO
+
+L<HTML::Parser>, L<HTML::TreeBuilder>, L<HTML::Element>
+
+=head1 AUTHOR
+
+Current maintainers:
+
+=over
+
+=item * Christopher J. Madsen S<C<< <perl AT cjmweb.net> >>>
+
+=item * Jeff Fearn S<C<< <jfearn AT cpan.org> >>>
+
+=back
+
+Original HTML-Tree author:
+
+=over
+
+=item * Gisle Aas
+
+=back
+
+Former maintainers:
+
+=over
+
+=item * Sean M. Burke
+
+=item * Andy Lester
+
+=item * Pete Krawczyk S<C<< <petek AT cpan.org> >>>
+
+=back
+
+You can follow or contribute to HTML-Tree's development at
+L<< https://github.com/kentfredric/HTML-Tree >>.
+
+=head1 COPYRIGHT AND LICENSE
+
+Copyright 1995-1998 Gisle Aas, 1999-2004 Sean M. Burke,
+2005 Andy Lester, 2006 Pete Krawczyk, 2010 Jeff Fearn,
+2012 Christopher J. Madsen.
+
+This library is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+The programs in this library are distributed in the hope that they
+will be useful, but without any warranty; without even the implied
+warranty of merchantability or fitness for a particular purpose.
+
+=cut
--- a/database/perl/vendor/lib/HTML/Parser.pm
+++ b/database/perl/vendor/lib/HTML/Parser.pm
--- a/database/perl/vendor/lib/HTML/PullParser.pm
+++ b/database/perl/vendor/lib/HTML/PullParser.pm
@@ -0,0 +1,210 @@
+package HTML::PullParser;
+
+use strict;
+
+require HTML::Parser;
+our @ISA = qw(HTML::Parser);
+our $VERSION = '3.75';
+
+use Carp ();
+
+sub new
+{
+    my($class, %cnf) = @_;
+
+    # Construct argspecs for the various events
+    my %argspec;
+    for (qw(start end text declaration comment process default)) {
+	my $tmp = delete $cnf{$_};
+	next unless defined $tmp;
+	$argspec{$_} = $tmp;
+    }
+    Carp::croak("Info not collected for any events")
+	  unless %argspec;
+
+    my $file = delete $cnf{file};
+    my $doc  = delete $cnf{doc};
+    Carp::croak("Can't parse from both 'doc' and 'file' at the same time")
+	  if defined($file) && defined($doc);
+    Carp::croak("No 'doc' or 'file' given to parse from")
+	  unless defined($file) || defined($doc);
+
+    # Create object
+    $cnf{api_version} = 3;
+    my $self = $class->SUPER::new(%cnf);
+
+    my $accum = $self->{pullparser_accum} = [];
+    while (my($event, $argspec) = each %argspec) {
+	$self->SUPER::handler($event => $accum, $argspec);
+    }
+
+    if (defined $doc) {
+	$self->{pullparser_str_ref} = ref($doc) ? $doc : \$doc;
+	$self->{pullparser_str_pos} = 0;
+    }
+    else {
+	if (!ref($file) && ref(\$file) ne "GLOB") {
+	    require IO::File;
+	    $file = IO::File->new($file, "r") || return;
+	}
+
+	$self->{pullparser_file} = $file;
+    }
+    $self;
+}
+
+
+sub handler
+{
+    Carp::croak("Can't set handlers for HTML::PullParser");
+}
+
+
+sub get_token
+{
+    my $self = shift;
+    while (!@{$self->{pullparser_accum}} && !$self->{pullparser_eof}) {
+	if (my $f = $self->{pullparser_file}) {
+	    # must try to parse more from the file
+	    my $buf;
+	    if (read($f, $buf, 512)) {
+		$self->parse($buf);
+	    } else {
+		$self->eof;
+		$self->{pullparser_eof}++;
+		delete $self->{pullparser_file};
+	    }
+	}
+	elsif (my $sref = $self->{pullparser_str_ref}) {
+	    # must try to parse more from the scalar
+	    my $pos = $self->{pullparser_str_pos};
+	    my $chunk = substr($$sref, $pos, 512);
+	    $self->parse($chunk);
+	    $pos += length($chunk);
+	    if ($pos < length($$sref)) {
+		$self->{pullparser_str_pos} = $pos;
+	    }
+	    else {
+		$self->eof;
+		$self->{pullparser_eof}++;
+		delete $self->{pullparser_str_ref};
+		delete $self->{pullparser_str_pos};
+	    }
+	}
+	else {
+	    die;
+	}
+    }
+    shift @{$self->{pullparser_accum}};
+}
+
+
+sub unget_token
+{
+    my $self = shift;
+    unshift @{$self->{pullparser_accum}}, @_;
+    $self;
+}
+
+1;
+
+
+__END__
+
+=head1 NAME
+
+HTML::PullParser - Alternative HTML::Parser interface
+
+=head1 SYNOPSIS
+
+ use HTML::PullParser;
+
+ $p = HTML::PullParser->new(file => "index.html",
+                            start => 'event, tagname, @attr',
+                            end   => 'event, tagname',
+                            ignore_elements => [qw(script style)],
+                           ) || die "Can't open: $!";
+ while (my $token = $p->get_token) {
+     #...do something with $token
+ }
+
+=head1 DESCRIPTION
+
+The HTML::PullParser is an alternative interface to the HTML::Parser class.
+It basically turns the HTML::Parser inside out.  You associate a file
+(or any IO::Handle object or string) with the parser at construction time and
+then repeatedly call $parser->get_token to obtain the tags and text
+found in the parsed document.
+
+The following methods are provided:
+
+=over 4
+
+=item $p = HTML::PullParser->new( file => $file, %options )
+
+=item $p = HTML::PullParser->new( doc => \$doc, %options )
+
+A C<HTML::PullParser> can be made to parse from either a file or a
+literal document based on whether the C<file> or C<doc> option is
+passed to the parser's constructor.
+
+The C<file> passed in can either be a file name or a file handle
+object.  If a file name is passed, and it can't be opened for reading,
+then the constructor will return an undefined value and $!  will tell
+you why it failed.  Otherwise the argument is taken to be some object
+that the C<HTML::PullParser> can read() from when it needs more data.
+The stream will be read() until EOF, but not closed.
+
+A C<doc> can be passed plain or as a reference
+to a scalar.  If a reference is passed then the value of this scalar
+should not be changed before all tokens have been extracted.
+
+Next the information to be returned for the different token types must
+be set up.  This is done by simply associating an argspec (as defined
+in L<HTML::Parser>) with the events you have an interest in.  For
+instance, if you want C<start> tokens to be reported as the string
+C<'S'> followed by the tagname and the attributes you might pass an
+C<start>-option like this:
+
+   $p = HTML::PullParser->new(
+          doc   => $document_to_parse,
+          start => '"S", tagname, @attr',
+          end   => '"E", tagname',
+        );
+
+At last other C<HTML::Parser> options, like C<ignore_tags>, and
+C<unbroken_text>, can be passed in.  Note that you should not use the
+I<event>_h options to set up parser handlers.  That would confuse the
+inner logic of C<HTML::PullParser>.
+
+=item $token = $p->get_token
+
+This method will return the next I<token> found in the HTML document,
+or C<undef> at the end of the document.  The token is returned as an
+array reference.  The content of this array match the argspec set up
+during C<HTML::PullParser> construction.
+
+=item $p->unget_token( @tokens )
+
+If you find out you have read too many tokens you can push them back,
+so that they are returned again the next time $p->get_token is called.
+
+=back
+
+=head1 EXAMPLES
+
+The 'eg/hform' script shows how we might parse the form section of
+HTML::Documents using HTML::PullParser.
+
+=head1 SEE ALSO
+
+L<HTML::Parser>, L<HTML::TokeParser>
+
+=head1 COPYRIGHT
+
+Copyright 1998-2001 Gisle Aas.
+
+This library is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+=cut
--- a/database/perl/vendor/lib/HTML/Tagset.pm
+++ b/database/perl/vendor/lib/HTML/Tagset.pm
@@ -0,0 +1,471 @@
+package HTML::Tagset;
+
+use strict;
+
+=head1 NAME
+
+HTML::Tagset - data tables useful in parsing HTML
+
+=head1 VERSION
+
+Version 3.20
+
+=cut
+
+use vars qw( $VERSION );
+
+$VERSION = '3.20';
+
+=head1 SYNOPSIS
+
+  use HTML::Tagset;
+  # Then use any of the items in the HTML::Tagset package
+  #  as need arises
+
+=head1 DESCRIPTION
+
+This module contains several data tables useful in various kinds of
+HTML parsing operations.
+
+Note that all tag names used are lowercase.
+
+In the following documentation, a "hashset" is a hash being used as a
+set -- the hash conveys that its keys are there, and the actual values
+associated with the keys are not significant.  (But what values are
+there, are always true.)
+
+=cut
+
+use vars qw(
+    $VERSION
+    %emptyElement %optionalEndTag %linkElements %boolean_attr
+    %isHeadElement %isBodyElement %isPhraseMarkup
+    %is_Possible_Strict_P_Content
+    %isHeadOrBodyElement
+    %isList %isTableElement %isFormElement
+    %isKnown %canTighten
+    @p_closure_barriers
+    %isCDATA_Parent
+);
+
+=head1 VARIABLES
+
+Note that none of these variables are exported.
+
+=head2 hashset %HTML::Tagset::emptyElement
+
+This hashset has as values the tag-names (GIs) of elements that cannot
+have content.  (For example, "base", "br", "hr".)  So
+C<$HTML::Tagset::emptyElement{'hr'}> exists and is true.
+C<$HTML::Tagset::emptyElement{'dl'}> does not exist, and so is not true.
+
+=cut
+
+%emptyElement   = map {; $_ => 1 } qw(base link meta isindex
+                                     img br hr wbr
+                                     input area param
+                                     embed bgsound spacer
+                                     basefont col frame
+                                     ~comment ~literal
+                                     ~declaration ~pi
+                                    );
+# The "~"-initial names are for pseudo-elements used by HTML::Entities
+#  and TreeBuilder
+
+=head2 hashset %HTML::Tagset::optionalEndTag
+
+This hashset lists tag-names for elements that can have content, but whose
+end-tags are generally, "safely", omissible.  Example:
+C<$HTML::Tagset::emptyElement{'li'}> exists and is true.
+
+=cut
+
+%optionalEndTag = map {; $_ => 1 } qw(p li dt dd); # option th tr td);
+
+=head2 hash %HTML::Tagset::linkElements
+
+Values in this hash are tagnames for elements that might contain
+links, and the value for each is a reference to an array of the names
+of attributes whose values can be links.
+
+=cut
+
+%linkElements =
+(
+ 'a'       => ['href'],
+ 'applet'  => ['archive', 'codebase', 'code'],
+ 'area'    => ['href'],
+ 'base'    => ['href'],
+ 'bgsound' => ['src'],
+ 'blockquote' => ['cite'],
+ 'body'    => ['background'],
+ 'del'     => ['cite'],
+ 'embed'   => ['pluginspage', 'src'],
+ 'form'    => ['action'],
+ 'frame'   => ['src', 'longdesc'],
+ 'iframe'  => ['src', 'longdesc'],
+ 'ilayer'  => ['background'],
+ 'img'     => ['src', 'lowsrc', 'longdesc', 'usemap'],
+ 'input'   => ['src', 'usemap'],
+ 'ins'     => ['cite'],
+ 'isindex' => ['action'],
+ 'head'    => ['profile'],
+ 'layer'   => ['background', 'src'],
+ 'link'    => ['href'],
+ 'object'  => ['classid', 'codebase', 'data', 'archive', 'usemap'],
+ 'q'       => ['cite'],
+ 'script'  => ['src', 'for'],
+ 'table'   => ['background'],
+ 'td'      => ['background'],
+ 'th'      => ['background'],
+ 'tr'      => ['background'],
+ 'xmp'     => ['href'],
+);
+
+=head2 hash %HTML::Tagset::boolean_attr
+
+This hash (not hashset) lists what attributes of what elements can be
+printed without showing the value (for example, the "noshade" attribute
+of "hr" elements).  For elements with only one such attribute, its value
+is simply that attribute name.  For elements with many such attributes,
+the value is a reference to a hashset containing all such attributes.
+
+=cut
+
+%boolean_attr = (
+# TODO: make these all hashes
+  'area'   => 'nohref',
+  'dir'    => 'compact',
+  'dl'     => 'compact',
+  'hr'     => 'noshade',
+  'img'    => 'ismap',
+  'input'  => { 'checked' => 1, 'readonly' => 1, 'disabled' => 1 },
+  'menu'   => 'compact',
+  'ol'     => 'compact',
+  'option' => 'selected',
+  'select' => 'multiple',
+  'td'     => 'nowrap',
+  'th'     => 'nowrap',
+  'ul'     => 'compact',
+);
+
+#==========================================================================
+# List of all elements from Extensible HTML version 1.0 Transitional DTD:
+#
+#   a abbr acronym address applet area b base basefont bdo big
+#   blockquote body br button caption center cite code col colgroup
+#   dd del dfn dir div dl dt em fieldset font form h1 h2 h3 h4 h5 h6
+#   head hr html i iframe img input ins isindex kbd label legend li
+#   link map menu meta noframes noscript object ol optgroup option p
+#   param pre q s samp script select small span strike strong style
+#   sub sup table tbody td textarea tfoot th thead title tr tt u ul
+#   var
+#
+# Varia from Mozilla source internal table of tags:
+#   Implemented:
+#     xmp listing wbr nobr frame frameset noframes ilayer
+#     layer nolayer spacer embed multicol
+#   But these are unimplemented:
+#     sound??  keygen??  server??
+# Also seen here and there:
+#     marquee??  app??  (both unimplemented)
+#==========================================================================
+
+=head2 hashset %HTML::Tagset::isPhraseMarkup
+
+This hashset contains all phrasal-level elements.
+
+=cut
+
+%isPhraseMarkup = map {; $_ => 1 } qw(
+  span abbr acronym q sub sup
+  cite code em kbd samp strong var dfn strike
+  b i u s tt small big 
+  a img br
+  wbr nobr blink
+  font basefont bdo
+  spacer embed noembed
+);  # had: center, hr, table
+
+
+=head2 hashset %HTML::Tagset::is_Possible_Strict_P_Content
+
+This hashset contains all phrasal-level elements that be content of a
+P element, for a strict model of HTML.
+
+=cut
+
+%is_Possible_Strict_P_Content = (
+ %isPhraseMarkup,
+ %isFormElement,
+ map {; $_ => 1} qw( object script map )
+  # I've no idea why there's these latter exceptions.
+  # I'm just following the HTML4.01 DTD.
+);
+
+#from html4 strict:
+#<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">
+#
+#<!ENTITY % phrase "EM | STRONG | DFN | CODE |
+#                   SAMP | KBD | VAR | CITE | ABBR | ACRONYM" >
+#
+#<!ENTITY % special
+#   "A | IMG | OBJECT | BR | SCRIPT | MAP | Q | SUB | SUP | SPAN | BDO">
+#
+#<!ENTITY % formctrl "INPUT | SELECT | TEXTAREA | LABEL | BUTTON">
+#
+#<!-- %inline; covers inline or "text-level" elements -->
+#<!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;">
+
+=head2 hashset %HTML::Tagset::isHeadElement
+
+This hashset contains all elements that elements that should be
+present only in the 'head' element of an HTML document.
+
+=cut
+
+%isHeadElement = map {; $_ => 1 }
+ qw(title base link meta isindex script style object bgsound);
+
+=head2 hashset %HTML::Tagset::isList
+
+This hashset contains all elements that can contain "li" elements.
+
+=cut
+
+%isList         = map {; $_ => 1 } qw(ul ol dir menu);
+
+=head2 hashset %HTML::Tagset::isTableElement
+
+This hashset contains all elements that are to be found only in/under
+a "table" element.
+
+=cut
+
+%isTableElement = map {; $_ => 1 }
+ qw(tr td th thead tbody tfoot caption col colgroup);
+
+=head2 hashset %HTML::Tagset::isFormElement
+
+This hashset contains all elements that are to be found only in/under
+a "form" element.
+
+=cut
+
+%isFormElement  = map {; $_ => 1 }
+ qw(input select option optgroup textarea button label);
+
+=head2 hashset %HTML::Tagset::isBodyMarkup
+
+This hashset contains all elements that are to be found only in/under
+the "body" element of an HTML document.
+
+=cut
+
+%isBodyElement = map {; $_ => 1 } qw(
+  h1 h2 h3 h4 h5 h6
+  p div pre plaintext address blockquote
+  xmp listing
+  center
+
+  multicol
+  iframe ilayer nolayer
+  bgsound
+
+  hr
+  ol ul dir menu li
+  dl dt dd
+  ins del
+  
+  fieldset legend
+  
+  map area
+  applet param object
+  isindex script noscript
+  table
+  center
+  form
+ ),
+ keys %isFormElement,
+ keys %isPhraseMarkup,   # And everything phrasal
+ keys %isTableElement,
+;
+
+
+=head2 hashset %HTML::Tagset::isHeadOrBodyElement
+
+This hashset includes all elements that I notice can fall either in
+the head or in the body.
+
+=cut
+
+%isHeadOrBodyElement = map {; $_ => 1 }
+  qw(script isindex style object map area param noscript bgsound);
+  # i.e., if we find 'script' in the 'body' or the 'head', don't freak out.
+
+
+=head2 hashset %HTML::Tagset::isKnown
+
+This hashset lists all known HTML elements.
+
+=cut
+
+%isKnown = (%isHeadElement, %isBodyElement,
+  map{; $_=>1 }
+   qw( head body html
+       frame frameset noframes
+       ~comment ~pi ~directive ~literal
+));
+ # that should be all known tags ever ever
+
+
+=head2 hashset %HTML::Tagset::canTighten
+
+This hashset lists elements that might have ignorable whitespace as
+children or siblings.
+
+=cut
+
+%canTighten = %isKnown;
+delete @canTighten{
+  keys(%isPhraseMarkup), 'input', 'select',
+  'xmp', 'listing', 'plaintext', 'pre',
+};
+  # xmp, listing, plaintext, and pre  are untightenable, and
+  #   in a really special way.
+@canTighten{'hr','br'} = (1,1);
+ # exceptional 'phrasal' things that ARE subject to tightening.
+
+# The one case where I can think of my tightening rules failing is:
+#  <p>foo bar<center> <em>baz quux</em> ...
+#                    ^-- that would get deleted.
+# But that's pretty gruesome code anyhow.  You gets what you pays for.
+
+#==========================================================================
+
+=head2 array @HTML::Tagset::p_closure_barriers
+
+This array has a meaning that I have only seen a need for in
+C<HTML::TreeBuilder>, but I include it here on the off chance that someone
+might find it of use:
+
+When we see a "E<lt>pE<gt>" token, we go lookup up the lineage for a p
+element we might have to minimize.  At first sight, we might say that
+if there's a p anywhere in the lineage of this new p, it should be
+closed.  But that's wrong.  Consider this document:
+
+  <html>
+    <head>
+      <title>foo</title>
+    </head>
+    <body>
+      <p>foo
+        <table>
+          <tr>
+            <td>
+               foo
+               <p>bar
+            </td>
+          </tr>
+        </table>
+      </p>
+    </body>
+  </html>
+
+The second p is quite legally inside a much higher p.
+
+My formalization of the reason why this is legal, but this:
+
+  <p>foo<p>bar</p></p>
+
+isn't, is that something about the table constitutes a "barrier" to
+the application of the rule about what p must minimize.
+
+So C<@HTML::Tagset::p_closure_barriers> is the list of all such
+barrier-tags.
+
+=cut
+
+@p_closure_barriers = qw(
+  li blockquote
+  ul ol menu dir
+  dl dt dd
+  td th tr table caption
+  div
+ );
+
+# In an ideal world (i.e., XHTML) we wouldn't have to bother with any of this
+# monkey business of barriers to minimization!
+
+=head2 hashset %isCDATA_Parent
+
+This hashset includes all elements whose content is CDATA.
+
+=cut
+
+%isCDATA_Parent = map {; $_ => 1 }
+  qw(script style  xmp listing plaintext);
+
+# TODO: there's nothing else that takes CDATA children, right?
+
+# As the HTML3 DTD (Raggett 1995-04-24) noted:
+#   The XMP, LISTING and PLAINTEXT tags are incompatible with SGML
+#   and derive from very early versions of HTML. They require non-
+#   standard parsers and will cause problems for processing
+#   documents with standard SGML tools.
+
+
+=head1 CAVEATS
+
+You may find it useful to alter the behavior of modules (like
+C<HTML::Element> or C<HTML::TreeBuilder>) that use C<HTML::Tagset>'s
+data tables by altering the data tables themselves.  You are welcome
+to try, but be careful; and be aware that different modules may or may
+react differently to the data tables being changed.
+
+Note that it may be inappropriate to use these tables for I<producing>
+HTML -- for example, C<%isHeadOrBodyElement> lists the tagnames
+for all elements that can appear either in the head or in the body,
+such as "script".  That doesn't mean that I am saying your code that
+produces HTML should feel free to put script elements in either place!
+If you are producing programs that spit out HTML, you should be
+I<intimately> familiar with the DTDs for HTML or XHTML (available at
+C<http://www.w3.org/>), and you should slavishly obey them, not
+the data tables in this document.
+
+=head1 SEE ALSO
+
+L<HTML::Element>, L<HTML::TreeBuilder>, L<HTML::LinkExtor>
+
+=head1 COPYRIGHT & LICENSE
+
+Copyright 1995-2000 Gisle Aas.
+
+Copyright 2000-2005 Sean M. Burke.
+
+Copyright 2005-2008 Andy Lester.
+
+This program is free software; you can redistribute it and/or modify it
+under the same terms as Perl itself.
+
+=head1 ACKNOWLEDGEMENTS
+
+Most of the code/data in this module was adapted from code written
+by Gisle Aas for C<HTML::Element>, C<HTML::TreeBuilder>, and
+C<HTML::LinkExtor>.  Then it was maintained by Sean M. Burke.
+
+=head1 AUTHOR
+
+Current maintainer: Andy Lester, C<< <andy at petdance.com> >>
+
+=head1 BUGS
+
+Please report any bugs or feature requests to
+C<bug-html-tagset at rt.cpan.org>, or through the web interface at
+L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=HTML-Tagset>.  I will
+be notified, and then you'll automatically be notified of progress on
+your bug as I make changes.
+
+=cut
+
+1;
--- a/database/perl/vendor/lib/HTML/TokeParser.pm
+++ b/database/perl/vendor/lib/HTML/TokeParser.pm
@@ -0,0 +1,372 @@
+package HTML::TokeParser;
+
+use strict;
+
+require HTML::PullParser;
+our @ISA = qw(HTML::PullParser);
+our $VERSION = '3.75';
+
+use Carp ();
+use HTML::Entities qw(decode_entities);
+use HTML::Tagset ();
+
+my %ARGS =
+(
+ start       => "'S',tagname,attr,attrseq,text",
+ end         => "'E',tagname,text",
+ text        => "'T',text,is_cdata",
+ process     => "'PI',token0,text",
+ comment     => "'C',text",
+ declaration => "'D',text",
+
+ # options that default on
+ unbroken_text => 1,
+);
+
+
+sub new
+{
+    my $class = shift;
+    my %cnf;
+
+    if (@_ == 1) {
+	my $type = (ref($_[0]) eq "SCALAR") ? "doc" : "file";
+	%cnf = ($type => $_[0]);
+    }
+    else {
+	unshift @_, (ref($_[0]) eq "SCALAR") ? "doc" : "file" if(scalar(@_) % 2 == 1);
+	%cnf = @_;
+    }
+
+    my $textify = delete $cnf{textify} || {img => "alt", applet => "alt"};
+
+    my $self = $class->SUPER::new(%ARGS, %cnf) || return undef;
+
+    $self->{textify} = $textify;
+    $self;
+}
+
+
+sub get_tag
+{
+    my $self = shift;
+    my $token;
+    while (1) {
+	$token = $self->get_token || return undef;
+	my $type = shift @$token;
+	next unless $type eq "S" || $type eq "E";
+	substr($token->[0], 0, 0) = "/" if $type eq "E";
+	return $token unless @_;
+	for (@_) {
+	    return $token if $token->[0] eq $_;
+	}
+    }
+}
+
+
+sub _textify {
+    my($self, $token) = @_;
+    my $tag = $token->[1];
+    return undef unless exists $self->{textify}{$tag};
+
+    my $alt = $self->{textify}{$tag};
+    my $text;
+    if (ref($alt)) {
+	$text = &$alt(@$token);
+    } else {
+	$text = $token->[2]{$alt || "alt"};
+	$text = "[\U$tag]" unless defined $text;
+    }
+    return $text;
+}
+
+
+sub get_text
+{
+    my $self = shift;
+    my @text;
+    while (my $token = $self->get_token) {
+	my $type = $token->[0];
+	if ($type eq "T") {
+	    my $text = $token->[1];
+	    decode_entities($text) unless $token->[2];
+	    push(@text, $text);
+	} elsif ($type =~ /^[SE]$/) {
+	    my $tag = $token->[1];
+	    if ($type eq "S") {
+		if (defined(my $text = _textify($self, $token))) {
+		    push(@text, $text);
+		    next;
+		}
+	    } else {
+		$tag = "/$tag";
+	    }
+	    if (!@_ || grep $_ eq $tag, @_) {
+		 $self->unget_token($token);
+		 last;
+	    }
+	    push(@text, " ")
+		if $tag eq "br" || !$HTML::Tagset::isPhraseMarkup{$token->[1]};
+	}
+    }
+    join("", @text);
+}
+
+
+sub get_trimmed_text
+{
+    my $self = shift;
+    my $text = $self->get_text(@_);
+    $text =~ s/^\s+//; $text =~ s/\s+$//; $text =~ s/\s+/ /g;
+    $text;
+}
+
+sub get_phrase {
+    my $self = shift;
+    my @text;
+    while (my $token = $self->get_token) {
+	my $type = $token->[0];
+	if ($type eq "T") {
+	    my $text = $token->[1];
+	    decode_entities($text) unless $token->[2];
+	    push(@text, $text);
+	} elsif ($type =~ /^[SE]$/) {
+	    my $tag = $token->[1];
+	    if ($type eq "S") {
+		if (defined(my $text = _textify($self, $token))) {
+		    push(@text, $text);
+		    next;
+		}
+	    }
+	    if (!$HTML::Tagset::isPhraseMarkup{$tag}) {
+		$self->unget_token($token);
+		last;
+	    }
+	    push(@text, " ") if $tag eq "br";
+	}
+    }
+    my $text = join("", @text);
+    $text =~ s/^\s+//; $text =~ s/\s+$//; $text =~ s/\s+/ /g;
+    $text;
+}
+
+1;
+
+
+__END__
+
+=head1 NAME
+
+HTML::TokeParser - Alternative HTML::Parser interface
+
+=head1 SYNOPSIS
+
+ require HTML::TokeParser;
+ $p = HTML::TokeParser->new("index.html") ||
+      die "Can't open: $!";
+ $p->empty_element_tags(1);  # configure its behaviour
+
+ while (my $token = $p->get_token) {
+     #...
+ }
+
+=head1 DESCRIPTION
+
+The C<HTML::TokeParser> is an alternative interface to the
+C<HTML::Parser> class.  It is an C<HTML::PullParser> subclass with a
+predeclared set of token types.  If you wish the tokens to be reported
+differently you probably want to use the C<HTML::PullParser> directly.
+
+The following methods are available:
+
+=over 4
+
+=item $p = HTML::TokeParser->new( $filename, %opt );
+
+=item $p = HTML::TokeParser->new( $filehandle, %opt );
+
+=item $p = HTML::TokeParser->new( \$document, %opt );
+
+The object constructor argument is either a file name, a file handle
+object, or the complete document to be parsed.  Extra options can be
+provided as key/value pairs and are processed as documented by the base
+classes.
+
+If the argument is a plain scalar, then it is taken as the name of a
+file to be opened and parsed.  If the file can't be opened for
+reading, then the constructor will return C<undef> and $! will tell
+you why it failed.
+
+If the argument is a reference to a plain scalar, then this scalar is
+taken to be the literal document to parse.  The value of this
+scalar should not be changed before all tokens have been extracted.
+
+Otherwise the argument is taken to be some object that the
+C<HTML::TokeParser> can read() from when it needs more data.  Typically
+it will be a filehandle of some kind.  The stream will be read() until
+EOF, but not closed.
+
+A newly constructed C<HTML::TokeParser> differ from its base classes
+by having the C<unbroken_text> attribute enabled by default. See
+L<HTML::Parser> for a description of this and other attributes that
+influence how the document is parsed. It is often a good idea to enable
+C<empty_element_tags> behaviour.
+
+Note that the parsing result will likely not be valid if raw undecoded
+UTF-8 is used as a source.  When parsing UTF-8 encoded files turn
+on UTF-8 decoding:
+
+   open(my $fh, "<:utf8", "index.html") || die "Can't open 'index.html': $!";
+   my $p = HTML::TokeParser->new( $fh );
+   # ...
+
+If a $filename is passed to the constructor the file will be opened in
+raw mode and the parsing result will only be valid if its content is
+Latin-1 or pure ASCII.
+
+If parsing from an UTF-8 encoded string buffer decode it first:
+
+   utf8::decode($document);
+   my $p = HTML::TokeParser->new( \$document );
+   # ...
+
+=item $p->get_token
+
+This method will return the next I<token> found in the HTML document,
+or C<undef> at the end of the document.  The token is returned as an
+array reference.  The first element of the array will be a string
+denoting the type of this token: "S" for start tag, "E" for end tag,
+"T" for text, "C" for comment, "D" for declaration, and "PI" for
+process instructions.  The rest of the token array depend on the type
+like this:
+
+  ["S",  $tag, $attr, $attrseq, $text]
+  ["E",  $tag, $text]
+  ["T",  $text, $is_data]
+  ["C",  $text]
+  ["D",  $text]
+  ["PI", $token0, $text]
+
+where $attr is a hash reference, $attrseq is an array reference and
+the rest are plain scalars.  The L<HTML::Parser/Argspec> explains the
+details.
+
+=item $p->unget_token( @tokens )
+
+If you find you have read too many tokens you can push them back,
+so that they are returned the next time $p->get_token is called.
+
+=item $p->get_tag
+
+=item $p->get_tag( @tags )
+
+This method returns the next start or end tag (skipping any other
+tokens), or C<undef> if there are no more tags in the document.  If
+one or more arguments are given, then we skip tokens until one of the
+specified tag types is found.  For example:
+
+   $p->get_tag("font", "/font");
+
+will find the next start or end tag for a font-element.
+
+The tag information is returned as an array reference in the same form
+as for $p->get_token above, but the type code (first element) is
+missing. A start tag will be returned like this:
+
+  [$tag, $attr, $attrseq, $text]
+
+The tagname of end tags are prefixed with "/", i.e. end tag is
+returned like this:
+
+  ["/$tag", $text]
+
+=item $p->get_text
+
+=item $p->get_text( @endtags )
+
+This method returns all text found at the current position. It will
+return a zero length string if the next token is not text. Any
+entities will be converted to their corresponding character.
+
+If one or more arguments are given, then we return all text occurring
+before the first of the specified tags found. For example:
+
+   $p->get_text("p", "br");
+
+will return the text up to either a paragraph of line break element.
+
+The text might span tags that should be I<textified>.  This is
+controlled by the $p->{textify} attribute, which is a hash that
+defines how certain tags can be treated as text.  If the name of a
+start tag matches a key in this hash then this tag is converted to
+text.  The hash value is used to specify which tag attribute to obtain
+the text from.  If this tag attribute is missing, then the upper case
+name of the tag enclosed in brackets is returned, e.g. "[IMG]".  The
+hash value can also be a subroutine reference.  In this case the
+routine is called with the start tag token content as its argument and
+the return value is treated as the text.
+
+The default $p->{textify} value is:
+
+  {img => "alt", applet => "alt"}
+
+This means that <IMG> and <APPLET> tags are treated as text, and that
+the text to substitute can be found in the ALT attribute.
+
+=item $p->get_trimmed_text
+
+=item $p->get_trimmed_text( @endtags )
+
+Same as $p->get_text above, but will collapse any sequences of white
+space to a single space character.  Leading and trailing white space is
+removed.
+
+=item $p->get_phrase
+
+This will return all text found at the current position ignoring any
+phrasal-level tags.  Text is extracted until the first non
+phrasal-level tag.  Textification of tags is the same as for
+get_text().  This method will collapse white space in the same way as
+get_trimmed_text() does.
+
+The definition of <i>phrasal-level tags</i> is obtained from the
+HTML::Tagset module.
+
+=back
+
+=head1 EXAMPLES
+
+This example extracts all links from a document.  It will print one
+line for each link, containing the URL and the textual description
+between the <A>...</A> tags:
+
+  use HTML::TokeParser;
+  $p = HTML::TokeParser->new(shift||"index.html");
+
+  while (my $token = $p->get_tag("a")) {
+      my $url = $token->[1]{href} || "-";
+      my $text = $p->get_trimmed_text("/a");
+      print "$url\t$text\n";
+  }
+
+This example extract the <TITLE> from the document:
+
+  use HTML::TokeParser;
+  $p = HTML::TokeParser->new(shift||"index.html");
+  if ($p->get_tag("title")) {
+      my $title = $p->get_trimmed_text;
+      print "Title: $title\n";
+  }
+
+=head1 SEE ALSO
+
+L<HTML::PullParser>, L<HTML::Parser>
+
+=head1 COPYRIGHT
+
+Copyright 1998-2005 Gisle Aas.
+
+This library is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+=cut
--- a/database/perl/vendor/lib/HTML/Tree.pm
+++ b/database/perl/vendor/lib/HTML/Tree.pm
@@ -0,0 +1,256 @@
+package HTML::Tree;
+
+# ABSTRACT: build and scan parse-trees of HTML
+
+# HTML::Tree is basically just a happy alias to HTML::TreeBuilder.
+
+use warnings;
+use strict;
+
+our $VERSION = '5.07'; # VERSION from OurPkgVersion
+
+use HTML::TreeBuilder ();
+
+
+sub new {
+    shift;
+    unshift @_, 'HTML::TreeBuilder';
+    goto &HTML::TreeBuilder::new;
+}
+
+sub new_from_file {
+    shift;
+    unshift @_, 'HTML::TreeBuilder';
+    goto &HTML::TreeBuilder::new_from_file;
+}
+
+sub new_from_content {
+    shift;
+    unshift @_, 'HTML::TreeBuilder';
+    goto &HTML::TreeBuilder::new_from_content;
+}
+
+sub new_from_url {
+    shift;
+    unshift @_, 'HTML::TreeBuilder';
+    goto &HTML::TreeBuilder::new_from_url;
+}
+
+1;
+
+__END__
+
+=pod
+
+=head1 NAME
+
+HTML::Tree - build and scan parse-trees of HTML
+
+=head1 VERSION
+
+This document describes version 5.07 of
+HTML::Tree, released August 31, 2017
+as part of L<HTML-Tree|HTML::Tree>.
+
+=head1 SYNOPSIS
+
+    use HTML::TreeBuilder;
+    my $tree = HTML::TreeBuilder->new();
+    $tree->parse_file($filename);
+
+        # Then do something with the tree, using HTML::Element
+        # methods -- for example:
+
+    $tree->dump
+
+        # Finally:
+
+    $tree->delete;
+
+=head1 DESCRIPTION
+
+HTML-Tree is a suite of Perl modules for making parse trees out of
+HTML source.  It consists of mainly two modules, whose documentation
+you should refer to: L<HTML::TreeBuilder|HTML::TreeBuilder>
+and L<HTML::Element|HTML::Element>.
+
+HTML::TreeBuilder is the module that builds the parse trees.  (It uses
+HTML::Parser to do the work of breaking the HTML up into tokens.)
+
+The tree that TreeBuilder builds for you is made up of objects of the
+class HTML::Element.
+
+If you find that you do not properly understand the documentation
+for HTML::TreeBuilder and HTML::Element, it may be because you are
+unfamiliar with tree-shaped data structures, or with object-oriented
+modules in general. Sean Burke has written some articles for
+I<The Perl Journal> (C<www.tpj.com>) that seek to provide that background.
+The full text of those articles is contained in this distribution, as:
+
+=over 4
+
+=item L<HTML::Tree::AboutObjects|HTML::Tree::AboutObjects>
+
+"User's View of Object-Oriented Modules" from TPJ17.
+
+=item L<HTML::Tree::AboutTrees|HTML::Tree::AboutTrees>
+
+"Trees" from TPJ18
+
+=item L<HTML::Tree::Scanning|HTML::Tree::Scanning>
+
+"Scanning HTML" from TPJ19
+
+=back
+
+Readers already familiar with object-oriented modules and tree-shaped
+data structures should read just the last article.  Readers without
+that background should read the first, then the second, and then the
+third.
+
+=head1 METHODS
+
+All these methods simply redirect to the corresponding method in
+HTML::TreeBuilder.  It's more efficient to use HTML::TreeBuilder
+directly, and skip loading HTML::Tree at all.
+
+=head2 new
+
+Redirects to L<HTML::TreeBuilder/new>.
+
+=head2 new_from_file
+
+Redirects to L<HTML::TreeBuilder/new_from_file>.
+
+=head2 new_from_content
+
+Redirects to L<HTML::TreeBuilder/new_from_content>.
+
+=head2 new_from_url
+
+Redirects to L<HTML::TreeBuilder/new_from_url>.
+
+=head1 SUPPORT
+
+You can find documentation for this module with the perldoc command.
+
+    perldoc HTML::Tree
+
+    You can also look for information at:
+
+=over 4
+
+=item * AnnoCPAN: Annotated CPAN documentation
+
+L<http://annocpan.org/dist/HTML-Tree>
+
+=item * CPAN Ratings
+
+L<http://cpanratings.perl.org/d/HTML-Tree>
+
+=item * RT: CPAN's request tracker
+
+L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=HTML-Tree>
+
+=item * Search CPAN
+
+L<http://search.cpan.org/dist/HTML-Tree>
+
+=item * Stack Overflow
+
+L<http://stackoverflow.com/questions/tagged/html-tree>
+
+If you have a question about how to use HTML-Tree, Stack Overflow is
+the place to ask it.  Make sure you tag it both C<perl> and C<html-tree>.
+
+=back
+
+=head1 SEE ALSO
+
+L<HTML::TreeBuilder>, L<HTML::Element>, L<HTML::Tagset>,
+L<HTML::Parser>, L<HTML::DOMbo>
+
+The book I<Perl & LWP> by Sean M. Burke published by
+O'Reilly and Associates, 2002.  ISBN: 0-596-00178-9
+
+It has several chapters to do with HTML processing in general,
+and HTML-Tree specifically.  There's more info at:
+
+    http://www.oreilly.com/catalog/perllwp/
+
+    http://www.amazon.com/exec/obidos/ASIN/0596001789
+
+=head1 SOURCE REPOSITORY
+
+HTML-Tree is now maintained using Git.  The main public repository is
+L<< https://github.com/kentfredric/HTML-Tree >>.
+
+The best way to send a patch is to make a pull request there.
+
+=head1 ACKNOWLEDGEMENTS
+
+Thanks to Gisle Aas, Sean Burke and Andy Lester for their original work.
+
+Thanks to Chicago Perl Mongers (http://chicago.pm.org) for their
+patches submitted to HTML::Tree as part of the Phalanx project
+(http://qa.perl.org/phalanx).
+
+Thanks to the following people for additional patches and documentation:
+Terrence Brannon, Gordon Lack, Chris Madsen and Ricardo Signes.
+
+=head1 AUTHOR
+
+Current maintainers:
+
+=over
+
+=item * Christopher J. Madsen S<C<< <perl AT cjmweb.net> >>>
+
+=item * Jeff Fearn S<C<< <jfearn AT cpan.org> >>>
+
+=back
+
+Original HTML-Tree author:
+
+=over
+
+=item * Gisle Aas
+
+=back
+
+Former maintainers:
+
+=over
+
+=item * Sean M. Burke
+
+=item * Andy Lester
+
+=item * Pete Krawczyk S<C<< <petek AT cpan.org> >>>
+
+=back
+
+You can follow or contribute to HTML-Tree's development at
+L<< https://github.com/kentfredric/HTML-Tree >>.
+
+=head1 COPYRIGHT AND LICENSE
+
+Copyright 1995-1998 Gisle Aas, 1999-2004 Sean M. Burke,
+2005 Andy Lester, 2006 Pete Krawczyk, 2010 Jeff Fearn,
+2012 Christopher J. Madsen.
+(Except the articles contained in HTML::Tree::AboutObjects,
+HTML::Tree::AboutTrees, and HTML::Tree::Scanning, which are all
+copyright 2000 The Perl Journal.)
+
+Except for those three TPJ articles, the whole HTML-Tree distribution,
+of which this file is a part, is free software; you can redistribute
+it and/or modify it under the same terms as Perl itself.
+
+Those three TPJ articles may be distributed under the same terms as
+Perl itself.
+
+The programs in this library are distributed in the hope that they
+will be useful, but without any warranty; without even the implied
+warranty of merchantability or fitness for a particular purpose.
+
+=cut
--- a/database/perl/vendor/lib/HTML/Tree/AboutObjects.pod
+++ b/database/perl/vendor/lib/HTML/Tree/AboutObjects.pod
@@ -0,0 +1,686 @@
+
+#Time-stamp: "2001-02-23 20:07:25 MST" -*-Text-*-
+# This document contains text in Perl "POD" format.
+# Use a POD viewer like perldoc or perlman to render it.
+
+=head1 NAME
+
+HTML::Tree::AboutObjects -- article: "User's View of Object-Oriented Modules"
+
+=head1 SYNOPSIS
+
+  # This an article, not a module.
+
+=head1 DESCRIPTION
+
+The following article by Sean M. Burke first appeared in I<The Perl
+Journal> #17 and is copyright 2000 The Perl Journal. It appears
+courtesy of Jon Orwant and The Perl Journal.  This document may be
+distributed under the same terms as Perl itself.
+
+=head1 A User's View of Object-Oriented Modules
+
+-- Sean M. Burke
+
+The first time that most Perl programmers run into object-oriented
+programming when they need to use a module whose interface is
+object-oriented.  This is often a mystifying experience, since talk of
+"methods" and "constructors" is unintelligible to programmers who
+thought that functions and variables was all there was to worry about.
+
+Articles and books that explain object-oriented programming (OOP), do so
+in terms of how to program that way.  That's understandable, and if you
+learn to write object-oriented code of your own, you'd find it easy to
+use object-oriented code that others write.  But this approach is the
+I<long> way around for people whose immediate goal is just to use
+existing object-oriented modules, but who don't yet want to know all the
+gory details of having to write such modules for themselves.
+
+This article is for those programmers -- programmers who want to know
+about objects from the perspective of using object-oriented modules. 
+
+=head2 Modules and Their Functional Interfaces
+
+Modules are the main way that Perl provides for bundling up code for
+later use by yourself or others.  As I'm sure you can't help noticing
+from reading
+I<The Perl Journal>, CPAN (the Comprehensive Perl Archive
+Network) is the repository for modules (or groups of modules) that
+others have written, to do anything from composing music to accessing
+Web pages.  A good deal of those modules even come with every
+installation of Perl.
+
+One module that you may have used before, and which is fairly typical in
+its interface, is Text::Wrap.  It comes with Perl, so you don't even
+need to install it from CPAN.  You use it in a program of yours, by
+having your program code say early on:
+
+  use Text::Wrap;
+
+and after that, you can access a function called C<wrap>, which inserts
+line-breaks in text that you feed it, so that the text will be wrapped to
+seventy-two (or however many) columns.
+
+The way this C<use Text::Wrap> business works is that the module
+Text::Wrap exists as a file "Text/Wrap.pm" somewhere in one of your
+library directories.  That file contains Perl code...
+
+=over
+
+Footnote: And mixed in with the Perl code, there's documentation, which
+is what you read with "perldoc Text::Wrap".  The perldoc program simply
+ignores the code and formats the documentation text, whereas "use
+Text::Wrap" loads and runs the code while ignoring the documentation.
+
+=back
+
+...which, among other things, defines a function called C<Text::Wrap::wrap>,
+and then C<exports> that function, which means that when you say C<wrap>
+after having said "use Text::Wrap", you'll be actually calling the
+C<Text::Wrap::wrap> function.  Some modules don't export their
+functions, so you have to call them by their full name, like
+C<Text::Wrap::wrap(...parameters...)>.
+
+Regardless of whether the typical module exports the functions it
+provides, a module is basically just a container for chunks of code that
+do useful things.  The way the module allows for you to interact with
+it, is its I<interface>.  And when, like with Text::Wrap, its interface
+consists of functions, the module is said to have a B<functional
+interface>.
+
+=over
+
+Footnote: the term "function" (and therefore "functionI<al>") has
+various senses.  I'm using the term here in its broadest sense, to
+refer to routines -- bits of code that are called by some name and
+which take parameters and return some value.
+
+=back
+
+Using modules with functional interfaces is straightforward -- instead
+of defining your own "wrap" function with C<sub wrap { ... }>, you
+entrust "use Text::Wrap" to do that for you, along with whatever other
+functions its defines and exports, according to the module's
+documentation.  Without too much bother, you can even write your own
+modules to contain your frequently used functions; I suggest having a look at
+the C<perlmod> man page for more leads on doing this.
+
+=head2 Modules with Object-Oriented Interfaces
+
+So suppose that one day you want to write a program that will automate
+the process of C<ftp>ing a bunch of files from one server down to your
+local machine, and then off to another server.
+
+A quick browse through search.cpan.org turns up the module "Net::FTP",
+which you can download and install it using normal installation
+instructions (unless your sysadmin has already installed it, as many
+have).
+
+Like Text::Wrap or any other module with a familiarly functional
+interface, you start off using Net::FTP in your program by saying:
+
+  use Net::FTP;
+
+However, that's where the similarity ends.  The first hint of
+difference is that the documentation for Net::FTP refers to it as a
+B<class>.  A class is a kind of module, but one that has an
+object-oriented interface.
+
+Whereas modules like Text::Wrap
+provide bits of useful code as I<functions>, to be called like
+C<function(...parameters...)> or like
+C<PackageName::function(...parameters...)>, Net::FTP and other modules
+with object-oriented interfaces provide B<methods>.  Methods are sort of
+like functions in that they have a name and parameters; but methods
+look different, and are different, because you have to call them with a
+syntax that has a class name or an object as a special argument.  I'll
+explain the syntax for method calls, and then later explain what they
+all mean.
+
+Some methods are meant to be called as B<class methods>, with the class
+name (same as the module name) as a special argument.  Class methods
+look like this:
+
+  ClassName->methodname(parameter1, parameter2, ...)
+  ClassName->methodname()   # if no parameters
+  ClassName->methodname     # same as above
+
+which you will sometimes see written:
+
+  methodname ClassName (parameter1, parameter2, ...)
+  methodname ClassName      # if no parameters
+
+Basically all class methods are for making new objects, and methods that
+make objects are called "B<constructors>" (and the process of making them
+is called "constructing" or "instantiating").  Constructor methods
+typically have the name "new", or something including "new"
+("new_from_file", etc.); but they can conceivably be named
+anything -- DBI's constructor method is named "connect", for example.
+
+The object that a constructor method returns is
+typically captured in a scalar variable:
+
+  $object = ClassName->new(param1, param2...);
+
+Once you have an object (more later on exactly what that is), you can
+use the other kind of method call syntax, the syntax for B<object method>
+calls.  Calling object methods is just like class methods, except
+that instead of the ClassName as the special argument,
+you use an expression that yeilds an "object".  Usually this is
+just a scalar variable that you earlier captured the
+output of the constructor in.  Object method calls look like this:
+
+  $object->methodname(parameter1, parameter2, ...);
+  $object->methodname()   # if no parameters
+  $object->methodname     # same as above
+  
+which is occasionally written as:
+
+  methodname $object (parameter1, parameter2, ...)
+  methodname $object      # if no parameters
+
+Examples of method calls are:
+
+  my $session1 = Net::FTP->new("ftp.myhost.com");
+    # Calls a class method "new", from class Net::FTP,
+    #  with the single parameter "ftp.myhost.com",
+    #  and saves the return value (which is, as usual,
+    #  an object), in $session1.
+    # Could also be written:
+    #  new Net::FTP('ftp.myhost.com')
+  $session1->login("sburke","aoeuaoeu")
+    || die "failed to login!\n";
+     # calling the object method "login"
+  print "Dir:\n", $session1->dir(), "\n";
+  $session1->quit;
+    # same as $session1->quit()
+  print "Done\n";
+  exit;
+
+Incidentally, I suggest always using the syntaxes with parentheses and
+"->" in them,
+
+=over
+
+Footnote: the character-pair "->" is supposed to look like an
+arrow, not "negative greater-than"!
+
+=back
+
+and avoiding the syntaxes that start out "methodname $object" or
+"methodname ModuleName".  When everything's going right, they all mean
+the same thing as the "->" variants, but the syntax with "->" is more
+visually distinct from function calls, as well as being immune to some
+kinds of rare but puzzling ambiguities that can arise when you're trying
+to call methods that have the same name as subroutines you've defined.
+
+But, syntactic alternatives aside, all this talk of constructing objects
+and object methods begs the question -- what I<is> an object?  There are
+several angles to this question that the rest of this article will
+answer in turn: what can you do with objects?  what's in an object?
+what's an object value?  and why do some modules use objects at all? 
+
+=head2 What Can You Do with Objects?
+
+You've seen that you can make objects, and call object methods with
+them.  But what are object methods for?  The answer depends on the class:
+
+A Net::FTP object represents a session between your computer and an FTP
+server.  So the methods you call on a Net::FTP object are for doing
+whatever you'd need to do across an FTP connection.  You make the
+session and log in:
+
+  my $session = Net::FTP->new('ftp.aol.com');
+  die "Couldn't connect!" unless defined $session;
+    # The class method call to "new" will return
+    # the new object if it goes OK, otherwise it
+    # will return undef.
+    
+  $session->login('sburke', 'p@ssw3rD')
+   || die "Did I change my password again?";
+    # The object method "login" will give a true
+    # return value if actually logs in, otherwise
+    # it'll return false.
+    
+You can use the session object to change directory on that session:
+
+  $session->cwd("/home/sburke/public_html")
+     || die "Hey, that was REALLY supposed to work!";
+   # if the cwd fails, it'll return false
+
+...get files from the machine at the other end of the session...
+
+  foreach my $f ('log_report_ua.txt', 'log_report_dom.txt',
+                 'log_report_browsers.txt')
+  {
+    $session->get($f) || warn "Getting $f failed!"
+  };
+
+...and plenty else, ending finally with closing the connection:
+
+  $session->quit();
+
+In short, object methods are for doing things related to (or with)
+whatever the object represents.  For FTP sessions, it's about sending
+commands to the server at the other end of the connection, and that's
+about it -- there, methods are for doing something to the world outside
+the object, and the objects is just something that specifies what bit
+of the world (well, what FTP session) to act upon.
+
+With most other classes, however, the object itself stores some kind of
+information, and it typically makes no sense to do things with such an
+object without considering the data that's in the object.
+
+=head2 What's I<in> an Object?
+
+An object is (with rare exceptions) a data structure containing a
+bunch of attributes, each of which has a value, as well as a name
+that you use when you
+read or set the attribute's value.  Some of the object's attributes are
+private, meaning you'll never see them documented because they're not
+for you to read or write; but most of the object's documented attributes
+are at least readable, and usually writeable, by you.  Net::FTP objects
+are a bit thin on attributes, so we'll use objects from the class
+Business::US_Amort for this example.  Business::US_Amort is a very
+simple class (available from CPAN) that I wrote for making calculations
+to do with loans (specifically, amortization, using US-style
+algorithms).
+
+An object of the class Business::US_Amort represents a loan with
+particular parameters, i.e., attributes.  The most basic attributes of a
+"loan object" are its interest rate, its principal (how much money it's
+for), and it's term (how long it'll take to repay).  You need to set
+these attributes before anything else can be done with the object.  The
+way to get at those attributes for loan objects is just like the
+way to get at attributes for any class's objects: through accessors.
+An B<accessor> is simply any method that accesses (whether reading or
+writing, AKA getting or putting) some attribute in the given object.
+Moreover, accessors are the B<only> way that you can change
+an object's attributes.  (If a module's documentation wants you to
+know about any other way, it'll tell you.)
+
+Usually, for simplicity's sake, an accessor is named after the attribute
+it reads or writes.  With Business::US_Amort objects, the accessors you
+need to use first are C<principal>, C<interest_rate>, and C<term>.
+Then, with at least those attributes set, you can call the C<run> method
+to figure out several things about the loan.  Then you can call various
+accessors, like C<total_paid_toward_interest>, to read the results:
+
+  use Business::US_Amort;
+  my $loan = Business::US_Amort->new;
+  # Set the necessary attributes:
+  $loan->principal(123654);
+  $loan->interest_rate(9.25);
+  $loan->term(20); # twenty years
+
+  # NOW we know enough to calculate:
+  $loan->run;
+  
+  # And see what came of that:
+  print
+    "Total paid toward interest: A WHOPPING ",
+    $loan->total_paid_interest, "!!\n";
+
+This illustrates a convention that's common with accessors: calling the
+accessor with no arguments (as with $loan->total_paid_interest) usually
+means to read the value of that attribute, but providing a value (as
+with $loan->term(20)) means you want that attribute to be set to that
+value.  This stands to reason: why would you be providing a value, if
+not to set the attribute to that value?
+
+Although a loan's term, principal, and interest rates are all single
+numeric values, an objects values can any kind of scalar, or an array,
+or even a hash.  Moreover, an attribute's value(s) can be objects
+themselves.  For example, consider MIDI files (as I wrote about in
+TPJ#13): a MIDI file usually consists of several tracks.  A MIDI file is
+complex enough to merit being an object with attributes like its overall
+tempo, the file-format variant it's in, and the list of instrument
+tracks in the file.  But tracks themselves are complex enough to be
+objects too, with attributes like their track-type, a list of MIDI
+commands if they're a MIDI track, or raw data if they're not.  So I
+ended up writing the MIDI modules so that the "tracks" attribute of a
+MIDI::Opus object is an array of objects from the class MIDI::Track.
+This may seem like a runaround -- you ask what's in one object, and get
+I<another> object, or several!  But in this case, it exactly reflects
+what the module is for -- MIDI files contain MIDI tracks, which then
+contain data.
+
+=head2 What is an Object Value?
+
+When you call a constructor like Net::FTP->new(I<hostname>), you get
+back an object value, a value you can later use, in combination with a
+method name, to call object methods. 
+
+Now, so far we've been pretending, in the above examples, that the
+variables $session or $loan I<are> the objects you're dealing with.
+This idea is innocuous up to a point, but it's really a misconception
+that will, at best, limit you in what you know how to do.  The reality
+is not that the variables $session or $query are objects; it's a little
+more indirect -- they I<hold> values that symbolize objects.  The kind of
+value that $session or $query hold is what I'm calling an object value. 
+
+To understand what kind of value this is, first think about the other
+kinds of scalar values you know about: The first two scalar values you
+probably ever ran into in Perl are B<numbers> and B<strings>, which you
+learned (or just assumed) will usually turn into each other on demand;
+that is, the three-character string "2.5" can become the quantity two
+and a half, and vice versa.  Then, especially if you started using
+C<perl -w> early on, you learned about the B<undefined value>, which can
+turn into 0 if you treat it as a number, or the empty-string if you
+treat it as a string.
+
+=over
+
+Footnote: You may I<also> have been learning about references, in which
+case you're ready to hear that object values are just a kind of
+reference, except that they reflect the class that created thing they point
+to, instead of merely being a plain old array reference, hash reference,
+etc.  I<If> this makes makes sense to you, and you want to know more
+about how objects are implemented in Perl, have a look at the
+C<perltoot> man page. 
+
+=back
+
+And now you're learning about B<object values>.  An object value is a
+value that points to a data structure somewhere in memory, which is
+where all the attributes for this object are stored.  That data
+structure as a whole belongs to a class (probably the one you named in
+the constructor method, like ClassName->new), so that the object value
+can be used as part of object method calls. 
+
+If you want to actually I<see> what an object value is, you might try
+just saying "print $object".  That'll get you something like this:
+
+  Net::FTP=GLOB(0x20154240)
+
+or
+
+  Business::US_Amort=HASH(0x15424020)
+
+That's not very helpful if you wanted to really get at the object's
+insides, but that's because the object value is only a symbol for the
+object.  This may all sound very abstruse and metaphysical, so a
+real-world allegory might be very helpful:
+
+=over
+
+You get an advertisement in the mail saying that you have been
+(im)personally selected to have the rare privilege of applying for a
+credit card.  For whatever reason, I<this> offer sounds good to you, so you
+fill out the form and mail it back to the credit card company.  They
+gleefully approve the application and create your account, and send you
+a card with a number on it.
+
+Now, you can do things with the number on that card -- clerks at stores
+can ring up things you want to buy, and charge your account by keying in
+the number on the card.  You can pay for things you order online by
+punching in the card number as part of your online order.  You can pay
+off part of the account by sending the credit card people some of your
+money (well, a check) with some note (usually the pre-printed slip)
+that has the card number for the account you want to pay toward.  And you
+should be able to call the credit card company's computer and ask it
+things about the card, like its balance, its credit limit, its APR, and
+maybe an itemization of recent purchases ad payments.
+
+Now, what you're I<really> doing is manipulating a credit card
+I<account>, a completely abstract entity with some data attached to it
+(balance, APR, etc).  But for ease of access, you have a credit card
+I<number> that is a symbol for that account.  Now, that symbol is just a
+bunch of digits, and the number is effectively meaningless and useless
+in and of itself -- but in the appropriate context, it's understood to
+I<mean> the credit card account you're accessing.
+
+=back
+
+This is exactly the relationship between objects and object values, and
+from this analogy, several facts about object values are a bit more
+explicable:
+
+* An object value does nothing in and of itself, but it's useful when
+you use it in the context of an $object->method call, the same way that
+a card number is useful in the context of some operation dealing with a
+card account.
+
+Moreover, several copies of the same object value all refer to the same
+object, the same way that making several copies of your card number
+won't change the fact that they all still refer to the same single
+account (this is true whether you're "copying" the number by just
+writing it down on different slips of paper, or whether you go to the
+trouble of forging exact replicas of your own plastic credit card).  That's
+why this:
+
+  $x = Net::FTP->new("ftp.aol.com");
+  $x->login("sburke", "aoeuaoeu");
+
+does the same thing as this:
+
+  $x = Net::FTP->new("ftp.aol.com");
+  $y = $x;
+  $z = $y;
+  $z->login("sburke", "aoeuaoeu");
+
+That is, $z and $y and $x are three different I<slots> for values,
+but what's in those slots are all object values pointing to the same
+object -- you don't have three different FTP connections, just three
+variables with values pointing to the some single FTP connection.
+
+* You can't tell much of anything about the object just by looking at
+the object value, any more than you can see your credit account balance
+by holding the plastic card up to the light, or by adding up the digits
+in your credit card number.
+
+* You can't just make up your own object values and have them work --
+they can come only from constructor methods of the appropriate class.
+Similarly, you get a credit card number I<only> by having a bank approve
+your application for a credit card account -- at which point I<they>
+let I<you> know what the number of your new card is.
+
+Now, there's even more to the fact that you can't just make up your own
+object value: even though you can print an object value and get a string
+like "Net::FTP=GLOB(0x20154240)", that string is just a
+I<representation> of an object value.
+
+Internally, an object value has a basically different type from a
+string, or a number, or the undefined value -- if $x holds a real
+string, then that value's slot in memory says "this is a value of type
+I<string>, and its characters are...", whereas if it's an object value,
+the value's slot in memory says, "this is a value of type I<reference>,
+and the location in memory that it points to is..." (and by looking at
+what's at that location, Perl can tell the class of what's there). 
+
+Perl programmers typically don't have to think about all these details
+of Perl's internals.  Many other languages force you to be more
+conscious of the differences between all of these (and also between
+types of numbers, which are stored differently depending on their size
+and whether they have fractional parts).  But Perl does its best to
+hide the different types of scalars from you -- it turns numbers into
+strings and back as needed, and takes the string or number
+representation of undef or of object values as needed.  However, you
+can't go from a string representation of an object value, back to an
+object value.  And that's why this doesn't work:
+
+   $x = Net::FTP->new('ftp.aol.com');
+   $y = Net::FTP->new('ftp.netcom.com');
+   $z = Net::FTP->new('ftp.qualcomm.com');
+   $all = join(' ', $x,$y,$z);           # !!!
+  ...later...
+   ($aol, $netcom, $qualcomm) = split(' ', $all);  # !!!
+   $aol->login("sburke", "aoeuaoeu");
+   $netcom->login("sburke", "qjkxqjkx");
+   $qualcomm->login("smb", "dhtndhtn");
+
+This fails because $aol ends up holding merely the B<string representation>
+of the object value from $x, not the object value itself -- when
+C<join> tried to join the characters of the "strings" $x, $y, and $z,
+Perl saw that they weren't strings at all, so it gave C<join> their
+string representations.
+
+Unfortunately, this distinction between object values and their string
+representations doesn't really fit into the analogy of credit card
+numbers, because credit card numbers really I<are> numbers -- even
+thought they don't express any meaningful quantity, if you stored them
+in a database as a quantity (as opposed to just an ASCII string),
+that wouldn't stop them from being valid as credit card numbers.
+
+This may seem rather academic, but there's there's two common mistakes
+programmers new to objects often make, which make sense only in terms of
+the distinction between object values and their string representations:
+
+The first common error involves forgetting (or never having known in the
+first place) that when you go to use a value as a hash key, Perl uses
+the string representation of that value.  When you want to use the
+numeric value two and a half as a key, Perl turns it into the
+three-character string "2.5".  But if you then want to use that string
+as a number, Perl will treat it as meaning two and a half, so you're
+usually none the wiser that Perl converted the number to a string and
+back.  But recall that Perl can't turn strings back into objects -- so
+if you tried to use a Net::FTP object value as a hash key, Perl actually
+used its string representation, like "Net::FTP=GLOB(0x20154240)", but
+that string is unusable as an object value.  (Incidentally, there's
+a module Tie::RefHash that implements hashes that I<do> let you use
+real object-values as keys.)
+
+The second common error with object values is in
+trying to save an object value to disk (whether printing it to a
+file, or storing it in a conventional database file).  All you'll get is the
+string, which will be useless.
+
+When you want to save an object and restore it later, you may find that
+the object's class already provides a method specifically for this.  For
+example, MIDI::Opus provides methods for writing an object to disk as a
+standard MIDI file.  The file can later be read back into memory by
+a MIDI::Opus constructor method, which will return a new MIDI::Opus
+object representing whatever file you tell it to read into memory.
+Similar methods are available with, for example, classes that
+manipulate graphic images and can save them to files, which can be read
+back later.
+
+But some classes, like Business::US_Amort, provide no such methods for
+storing an object in a file.  When this is the case, you can try
+using any of the Data::Dumper, Storable, or FreezeThaw modules.  Using
+these will be unproblematic for objects of most classes, but it may run
+into limitations with others.  For example, a Business::US_Amort
+object can be turned into a string with Data::Dumper, and that string
+written to a file.  When it's restored later, its attributes will be
+accessible as normal.  But in the unlikely case that the loan object was
+saved in mid-calculation, the calculation may not be resumable.  This is
+because of the way that that I<particular> class does its calculations,
+but similar limitations may occur with objects from other classses.
+
+But often, even I<wanting> to save an object is basically wrong -- what would
+saving an ftp I<session> even mean?  Saving the hostname, username, and
+password?  current directory on both machines?  the local TCP/IP port
+number?  In the case of "saving" a Net::FTP object, you're better off
+just saving whatever details you actually need for your own purposes,
+so that you can make a new object later and just set those values for it.
+
+=head2 So Why Do Some Modules Use Objects?
+
+All these details of using objects are definitely enough to make you
+wonder -- is it worth the bother?  If you're a module author, writing
+your module with an object-oriented interface restricts the audience of
+potential users to those who understand the basic concepts of objects
+and object values, as well as Perl's syntax for calling methods.  Why
+complicate things by having an object-oriented interface?
+
+A somewhat esoteric answer is that a module has an object-oriented
+interface because the module's insides are written in an
+object-oriented style.  This article is about the basics of
+object-oriented I<interfaces>, and it'd be going far afield to explain
+what object-oriented I<design> is.  But the short story is that
+object-oriented design is just one way of attacking messy problems.
+It's a way that many programmers find very helpful (and which others
+happen to find to be far more of a hassle than it's worth,
+incidentally), and it just happens to show up for you, the module user,
+as merely the style of interface. 
+
+But a simpler answer is that a functional interface is sometimes a
+hindrance, because it limits the number of things you can do at once --
+limiting it, in fact, to one.  For many problems that some modules are
+meant to solve, doing without an object-oriented interface would be like
+wishing that Perl didn't use filehandles.  The ideas are rather simpler
+-- just imagine that Perl let you access files, but I<only> one at a
+time, with code like:
+
+  open("foo.txt") || die "Can't open foo.txt: $!";
+  while(readline) {
+    print $_ if /bar/;
+  }
+  close;
+
+That hypothetical kind of Perl would be simpler, by doing without
+filehandles.  But you'd be out of luck if you wanted to read from
+one file while reading from another, or read from two and print to a
+third.
+
+In the same way, a functional FTP module would be fine for just
+uploading files to one server at a time, but it wouldn't allow you to
+easily write programs that make need to use I<several> simultaneous
+sessions (like "look at server A and server B, and if A has a file
+called X.dat, then download it locally and then upload it to server B --
+except if B has a file called Y.dat, in which case do it the other way
+around").
+
+Some kinds of problems that modules solve just lend themselves to an
+object-oriented interface.  For those kinds of tasks, a functional
+interface would be more familiar, but less powerful.  Learning to use
+object-oriented modules' interfaces does require becoming comfortable
+with the concepts from this article.  But in the end it will allow you
+to use a broader range of modules and, with them, to write programs
+that can do more.
+
+B<[end body of article]>
+
+=head2 [Author Credit]
+
+Sean M. Burke has contributed several modules to CPAN, about half of
+them object-oriented.
+
+[The next section should be in a greybox:]
+
+=head2 The Gory Details
+
+For sake of clarity of explanation, I had to oversimplify some of the
+facts about objects.  Here's a few of the gorier details:
+
+* Every example I gave of a constructor was a class method.  But object
+methods can be constructors, too, if the class was written to work that
+way: $new = $old->copy, $node_y = $node_x->new_subnode, or the like.
+
+* I've given the impression that there's two kinds of methods: object
+methods and class methods.  In fact, the same method can be both,
+because it's not the kind of method it is, but the kind of calls it's
+written to accept -- calls that pass an object, or calls that pass a
+class-name.
+
+* The term "object value" isn't something you'll find used much anywhere
+else.  It's just my shorthand for what would properly be called an
+"object reference" or "reference to a blessed item".  In fact, people
+usually say "object" when they properly mean a reference to that object.
+
+* I mentioned creating objects with I<con>structors, but I didn't
+mention destroying them with I<de>structor -- a destructor is a kind of
+method that you call to tidy up the object once you're done with it, and
+want it to neatly go away (close connections, delete temporary files,
+free up memory, etc).  But because of the way Perl handles memory,
+most modules won't require the user to know about destructors.
+
+* I said that class method syntax has to have the class name, as in
+$session = B<Net::FTP>->new($host).  Actually, you can instead use any
+expression that returns a class name: $ftp_class = 'Net::FTP'; $session
+= B<$ftp_class>->new($host).  Moreover, instead of the method name for
+object- or class-method calls, you can use a scalar holding the method
+name: $foo->B<$method>($host).  But, in practice, these syntaxes are
+rarely useful.
+
+And finally, to learn about objects from the perspective of writing
+your own classes, see the C<perltoot> documentation,
+or Damian Conway's exhaustive and clear book I<Object Oriented Perl>
+(Manning Publications 1999, ISBN 1-884777-79-1).
+
+=head1 BACK
+
+Return to the L<HTML::Tree|HTML::Tree> docs.
+
+=cut
+
--- a/database/perl/vendor/lib/HTML/Tree/AboutTrees.pod
+++ b/database/perl/vendor/lib/HTML/Tree/AboutTrees.pod
--- a/database/perl/vendor/lib/HTML/Tree/Scanning.pod
+++ b/database/perl/vendor/lib/HTML/Tree/Scanning.pod
@@ -0,0 +1,714 @@
+
+#Time-stamp: "2001-03-10 23:19:11 MST" -*-Text-*-
+# This document contains text in Perl "POD" format.
+# Use a POD viewer like perldoc or perlman to render it.
+
+=head1 NAME
+
+HTML::Tree::Scanning -- article: "Scanning HTML"
+
+=head1 SYNOPSIS
+
+  # This an article, not a module.
+
+=head1 DESCRIPTION
+
+The following article by Sean M. Burke first appeared in I<The Perl
+Journal> #19 and is copyright 2000 The Perl Journal. It appears
+courtesy of Jon Orwant and The Perl Journal.  This document may be
+distributed under the same terms as Perl itself.
+
+(Note that this is discussed in chapters 6 through 10 of the
+book I<Perl and LWP> L<http://lwp.interglacial.com/> which
+was written after the following documentation, and which is
+available free online.)
+
+=head1 Scanning HTML
+
+-- Sean M. Burke
+
+In I<The Perl Journal> issue 17, Ken MacFarlane's article "Parsing
+HTML with HTML::Parser" describes how the HTML::Parser module scans
+HTML source as a stream of start-tags, end-tags, text, comments, etc.
+In TPJ #18, my "Trees" article kicked around the idea of tree-shaped
+data structures.  Now I'll try to tie it together, in a discussion of
+HTML trees.
+
+The CPAN module HTML::TreeBuilder takes the
+tags that HTML::Parser picks out, and builds a parse tree -- a
+tree-shaped network of objects...
+
+=over
+
+Footnote:
+And if you need a quick explanation of objects, see my TPJ17 article "A
+User's View of Object-Oriented Modules"; or go whole hog and get Damian
+Conway's excellent book I<Object-Oriented Perl>, from Manning
+Publications.
+
+=back
+
+...representing the structured content of the HTML document.  And once
+the document is parsed as a tree, you'll find the common tasks
+of extracting data from that HTML document/tree to be quite
+straightforward.
+
+=head2 HTML::Parser, HTML::TreeBuilder, and HTML::Element
+
+You use HTML::TreeBuilder to make a parse tree out of an HTML source
+file, by simply saying:
+
+  use HTML::TreeBuilder;
+  my $tree = HTML::TreeBuilder->new();
+  $tree->parse_file('foo.html');
+
+and then C<$tree> contains a parse tree built from the HTML source from
+the file "foo.html".  The way this parse tree is represented is with a
+network of objects -- C<$tree> is the root, an element with tag-name
+"html", and its children typically include a "head" and "body" element,
+and so on.  Elements in the tree are objects of the class
+HTML::Element.
+
+So, if you take this source:
+
+  <html><head><title>Doc 1</title></head>
+  <body>
+  Stuff <hr> 2000-08-17
+  </body></html>
+ 
+and feed it to HTML::TreeBuilder, it'll return a tree of objects that
+looks like this:
+
+               html
+             /      \
+         head        body
+        /          /   |  \
+     title    "Stuff"  hr  "2000-08-17"
+       |
+    "Doc 1"
+
+This is a pretty simple document, but if it were any more complex,
+it'd be a bit hard to draw in that style, since it's sprawl left and
+right.  The same tree can be represented a bit more easily sideways,
+with indenting:
+
+  . html
+     . head
+        . title
+           . "Doc 1"
+     . body
+        . "Stuff"
+        . hr
+        . "2000-08-17"
+
+Either way expresses the same structure.  In that structure, the root
+node is an object of the class HTML::Element
+
+=over
+
+Footnote:
+Well actually, the root is of the class HTML::TreeBuilder, but that's
+just a subclass of HTML::Element, plus the few extra methods like
+C<parse_file> that elaborate the tree
+
+=back
+
+, with the tag name "html", and with two children: an HTML::Element
+object whose tag names are "head" and "body".  And each of those
+elements have children, and so on down.  Not all elements (as we'll
+call the objects of class HTML::Element) have children -- the "hr"
+element doesn't.  And note all nodes in the tree are elements -- the
+text nodes ("Doc 1", "Stuff", and "2000-08-17") are just strings.
+
+Objects of the class HTML::Element each have three noteworthy attributes:
+
+=over
+
+=item C<_tag> -- (best accessed as C<$e-E<gt>tag>)
+this element's tag-name, lowercased (e.g., "em" for an "em" element).
+
+=over
+
+Footnote: Yes, this is misnamed.  In proper SGML terminology, this is
+instead called a "GI", short for "generic identifier"; and the term
+"tag" is used for a token of SGML source that represents either
+the start of an element (a start-tag like "<em lang='fr'>") or the end
+of an element (an end-tag like "</em>".  However, since more people
+claim to have been abducted by aliens than to have ever seen the
+SGML standard, and since both encounters typically involve a feeling of
+"missing time", it's not surprising that the terminology of the SGML
+standard is not closely followed.
+
+=back
+
+=item C<_parent> -- (best accessed as C<$e-E<gt>parent>)
+the element that is C<$obj>'s parent, or undef if this element is the
+root of its tree.
+
+=item C<_content> -- (best accessed as C<$e-E<gt>content_list>)
+the list of nodes (i.e., elements or text segments) that are C<$e>'s
+children.
+
+=back
+
+Moreover, if an element object has any attributes in the SGML sense of
+the word, then those are readable as C<$e-E<gt>attr('name')> -- for
+example, with the object built from having parsed "E<lt>a
+B<id='foo'>E<gt>barE<lt>/aE<gt>", C<$e-E<gt>attr('id')> will return
+the string "foo".  Moreover, C<$e-E<gt>tag> on that object returns the
+string "a", C<$e-E<gt>content_list> returns a list consisting of just
+the single scalar "bar", and C<$e-E<gt>parent> returns the object
+that's this node's parent -- which may be, for example, a "p" element.
+
+And that's all that there is to it -- you throw HTML
+source at TreeBuilder, and it returns a tree built of HTML::Element
+objects and some text strings.
+
+However, what do you I<do> with a tree of objects?  People code
+information into HTML trees not for the fun of arranging elements, but
+to represent the structure of specific text and images -- some text is
+in this "li" element, some other text is in that heading, some
+images are in that other table cell that has those attributes, and so on.
+
+Now, it may happen that you're rendering that whole HTML tree into some
+layout format.  Or you could be trying to make some systematic change to
+the HTML tree before dumping it out as HTML source again.  But, in my
+experience, by far the most common programming task that Perl
+programmers face with HTML is in trying to extract some piece
+of information from a larger document.  Since that's so common (and
+also since it involves concepts that are basic to more complex tasks),
+that is what the rest of this article will be about.
+
+=head2 Scanning HTML trees
+
+Suppose you have a thousand HTML documents, each of them a press
+release.  They all start out:
+
+  [...lots of leading images and junk...]
+  <h1>ConGlomCo to Open New Corporate Office in Ougadougou</h1>
+  BAKERSFIELD, CA, 2000-04-24 -- ConGlomCo's vice president in charge
+  of world conquest, Rock Feldspar, announced today the opening of a
+  new office in Ougadougou, the capital city of Burkino Faso, gateway
+  to the bustling "Silicon Sahara" of Africa...
+  [...etc...]
+
+...and what you've got to do is, for each document, copy whatever text
+is in the "h1" element, so that you can, for example, make a table of
+contents of it.  Now, there are three ways to do this:
+
+=over
+
+=item * You can just use a regexp to scan the file for a text pattern.
+
+For many very simple tasks, this will do fine.  Many HTML documents are,
+in practice, very consistently formatted as far as placement of
+linebreaks and whitespace, so you could just get away with scanning the
+file like so:
+
+  sub get_heading {
+    my $filename = $_[0];
+    local *HTML;
+    open(HTML, $filename)
+      or die "Couldn't open $filename);
+    my $heading;
+   Line:
+    while(<HTML>) {
+      if( m{<h1>(.*?)</h1>}i ) {  # match it!
+        $heading = $1;
+        last Line;
+      }
+    }
+    close(HTML);
+    warn "No heading in $filename?"
+     unless defined $heading;
+    return $heading;
+  }
+
+This is quick and fast, but awfully fragile -- if there's a newline in
+the middle of a heading's text, it won't match the above regexp, and
+you'll get an error.  The regexp will also fail if the "h1" element's
+start-tag has any attributes.  If you have to adapt your code to fit
+more kinds of start-tags, you'll end up basically reinventing part of
+HTML::Parser, at which point you should probably just stop, and use
+HTML::Parser itself:
+
+=item * You can use HTML::Parser to scan the file for an "h1" start-tag
+token, then capture all the text tokens until the "h1" close-tag.  This
+approach is extensively covered in the Ken MacFarlane's TPJ17 article
+"Parsing HTML with HTML::Parser".  (A variant of this approach is to use
+HTML::TokeParser, which presents a different and rather handier
+interface to the tokens that HTML::Parser picks out.)
+
+Using HTML::Parser is less fragile than our first approach, since it's
+not sensitive to the exact internal formatting of the start-tag (much
+less whether it's split across two lines).  However, when you need more
+information about the context of the "h1" element, or if you're having
+to deal with any of the tricky bits of HTML, such as parsing of tables,
+you'll find out the flat list of tokens that HTML::Parser returns
+isn't immediately useful.  To get something useful out of those tokens,
+you'll need to write code that knows some things about what elements
+take no content (as with "hr" elements), and that a "</p>" end-tags
+are omissible, so a "<p>" will end any currently
+open paragraph -- and you're well on your way to pointlessly
+reinventing much of the code in HTML::TreeBuilder
+
+=over
+
+Footnote:
+And, as the person who last rewrote that module, I can attest that it
+wasn't terribly easy to get right!  Never underestimate the perversity
+of people coding HTML.
+
+=back
+
+, at which point you should probably just stop, and use
+HTML::TreeBuilder itself:
+
+=item * You can use HTML::Treebuilder, and scan the tree of element
+objects that you get back.
+
+=back
+
+The last approach, using HTML::TreeBuilder, is the diametric opposite of
+first approach:  The first approach involves just elementary Perl and one
+regexp, whereas the TreeBuilder approach involves being at home with
+the concept of tree-shaped data structures and modules with
+object-oriented interfaces, as well as with the particular interfaces
+that HTML::TreeBuilder and HTML::Element provide.
+
+However, what the TreeBuilder approach has going for it is that it's
+the most robust, because it involves dealing with HTML in its "native"
+format -- it deals with the tree structure that HTML code represents,
+without any consideration of how the source is coded and with what
+tags omitted.
+
+So, to extract the text from the "h1" elements of an HTML document:
+
+  sub get_heading {
+    my $tree = HTML::TreeBuilder->new;
+    $tree->parse_file($_[0]);   # !
+    my $heading;
+    my $h1 = $tree->look_down('_tag', 'h1');  # !
+    if($h1) {
+      $heading = $h1->as_text;   # !
+    } else {
+      warn "No heading in $_[0]?";
+    }
+    $tree->delete; # clear memory!
+    return $heading;
+  }
+
+This uses some unfamiliar methods that need explaining.  The
+C<parse_file> method that we've seen before, builds a tree based on
+source from the file given.  The C<delete> method is for marking a
+tree's contents as available for garbage collection, when you're done
+with the tree.  The C<as_text> method returns a string that contains
+all the text bits that are children (or otherwise descendants) of the
+given node -- to get the text content of the C<$h1> object, we could
+just say:
+
+  $heading = join '', $h1->content_list;
+
+but that will work only if we're sure that the "h1" element's children
+will be only text bits -- if the document contained:
+
+  <h1>Local Man Sees <cite>Blade</cite> Again</h1>
+
+then the sub-tree would be:
+
+  . h1
+    . "Local Man Sees "
+    . cite
+      . "Blade"
+    . " Again'
+
+so C<join '', $h1-E<gt>content_list> will be something like:
+
+  Local Man Sees HTML::Element=HASH(0x15424040) Again
+
+whereas C<$h1-E<gt>as_text> would yield:
+
+  Local Man Sees Blade Again
+
+and depending on what you're doing with the heading text, you might
+want the C<as_HTML> method instead.  It returns the (sub)tree
+represented as HTML source.  C<$h1-E<gt>as_HTML> would yield:
+
+  <h1>Local Man Sees <cite>Blade</cite> Again</h1>
+
+However, if you wanted the contents of C<$h1> as HTML, but not the
+C<$h1> itself, you could say:
+
+  join '',
+    map(
+      ref($_) ? $_->as_HTML : $_,
+      $h1->content_list
+    )
+
+This C<map> iterates over the nodes in C<$h1>'s list of children; and
+for each node that's just a text bit (as "Local Man Sees " is), it just
+passes through that string value, and for each node that's an actual
+object (causing C<ref> to be true), C<as_HTML> will used instead of the
+string value of the object itself (which would be something quite
+useless, as most object values are).  So that C<as_HTML> for the "cite"
+element will be the string "<cite>BladeE<lt>/cite>".  And then,
+finally, C<join> just puts into one string all the strings that the
+C<map> returns.
+
+Last but not least, the most important method in our C<get_heading> sub
+is the C<look_down> method.  This method looks down at the subtree
+starting at the given object (C<$h1>), looking for elements that meet
+criteria you provide.
+
+The criteria are specified in the method's argument list.  Each
+criterion can consist of two scalars, a key and a value, which express
+that you want elements that have that attribute (like "_tag", or
+"src") with the given value ("h1"); or the criterion can be a
+reference to a subroutine that, when called on the given element,
+returns true if that is a node you're looking for.  If you specify
+several criteria, then that's taken to mean that you want all the
+elements that each satisfy I<all> the criteria.  (In other words,
+there's an "implicit AND".)
+
+And finally, there's a bit of an optimization -- if you call the
+C<look_down> method in a scalar context, you get just the I<first> node
+(or undef if none) -- and, in fact, once C<look_down> finds that first
+matching element, it doesn't bother looking any further.
+
+So the example:
+
+  $h1 = $tree->look_down('_tag', 'h1');
+
+returns the first element at-or-under C<$tree> whose C<"_tag">
+attribute has the value C<"h1">.
+
+=head2 Complex Criteria in Tree Scanning
+
+Now, the above C<look_down> code looks like a lot of bother, with
+barely more benefit than just grepping the file!  But consider if your
+criteria were more complicated -- suppose you found that some of the
+press releases that you were scanning had several "h1" elements,
+possibly before or after the one you actually want.  For example:
+
+  <h1><center>Visit Our Corporate Partner
+   <br><a href="/dyna/clickthru"
+     ><img src="/dyna/vend_ad"></a>
+  </center></h1>
+  <h1><center>ConGlomCo President Schreck to Visit Regional HQ
+   <br><a href="/photos/Schreck_visit_large.jpg"
+     ><img src="/photos/Schreck_visit.jpg"></a>
+  </center></h1>
+
+Here, you want to ignore the first "h1" element because it contains an
+ad, and you want the text from the second "h1".  The problem is in
+formalizing the way you know that it's an ad.  Since ad banners are
+always entreating you to "visit" the sponsoring site, you could exclude
+"h1" elements that contain the word "visit" under them:
+
+  my $real_h1 = $tree->look_down(
+    '_tag', 'h1',
+    sub {
+      $_[0]->as_text !~ m/\bvisit/i
+    }
+  );
+
+The first criterion looks for "h1" elements, and the second criterion
+limits those to only the ones whose text content doesn't match
+C<m/\bvisit/>.  But unfortunately, that won't work for our example,
+since the second "h1" mentions "ConGlomCo President Schreck to
+I<Visit> Regional HQ".
+
+Instead you could try looking for the first "h1" element that
+doesn't contain an image:
+
+  my $real_h1 = $tree->look_down(
+    '_tag', 'h1',
+    sub {
+      not $_[0]->look_down('_tag', 'img')
+    }
+  );
+
+This criterion sub might seem a bit odd, since it calls C<look_down>
+as part of a larger C<look_down> operation, but that's fine.  Note that
+when considered as a boolean value, a C<look_down> in a scalar context
+value returns false (specifically, undef) if there's no matching element
+at or under the given element; and it returns the first matching
+element (which, being a reference and object, is always a true value),
+if any matches.  So, here, 
+
+  sub {
+    not $_[0]->look_down('_tag', 'img')
+  }
+
+means "return true only if this element has no 'img' element as
+descendants (and isn't an 'img' element itself)."
+
+This correctly filters out the first "h1" that contains the ad, but it
+also incorrectly filters out the second "h1" that contains a
+non-advertisement photo besides the headline text you want.
+
+There clearly are detectable differences between the first and second
+"h1" elements -- the only second one contains the string "Schreck", and
+we could just test for that:
+
+  my $real_h1 = $tree->look_down(
+    '_tag', 'h1',
+    sub {
+      $_[0]->as_text =~ m{Schreck}
+    }
+  );
+
+And that works fine for this one example, but unless all thousand of
+your press releases have "Schreck" in the headline, that's just not a
+general solution.  However, if all the ads-in-"h1"s that you want to
+exclude involve a link whose URL involves "/dyna/", then you can use
+that:
+
+  my $real_h1 = $tree->look_down(
+    '_tag', 'h1',
+    sub {
+      my $link = $_[0]->look_down('_tag','a');
+      return 1 unless $link;
+        # no link means it's fine
+      return 0 if $link->attr('href') =~ m{/dyna/};
+        # a link to there is bad
+      return 1; # otherwise okay
+    }
+  );
+
+Or you can look at it another way and say that you want the first "h1"
+element that either contains no images, or else whose image has a "src"
+attribute whose value contains "/photos/":
+
+  my $real_h1 = $tree->look_down(
+    '_tag', 'h1',
+    sub {
+      my $img = $_[0]->look_down('_tag','img');
+      return 1 unless $img;
+        # no image means it's fine
+      return 1 if $img->attr('src') =~ m{/photos/};
+        # good if a photo
+      return 0; # otherwise bad
+    }
+  );
+
+Recall that this use of C<look_down> in a scalar context means to return
+the first element at or under C<$tree> that matches all the criteria.
+But if you notice that you can formulate criteria that'll match several
+possible "h1" elements, some of which may be bogus but the I<last> one
+of which is always the one you want, then you can use C<look_down> in a
+list context, and just use the last element of that list:
+
+  my @h1s = $tree->look_down(
+    '_tag', 'h1',
+    ...maybe more criteria...
+  );
+  die "What, no h1s here?" unless @h1s;
+  my $real_h1 = $h1s[-1]; # last or only
+
+=head2 A Case Study: Scanning Yahoo News's HTML
+
+The above (somewhat contrived) case involves extracting data from a
+bunch of pre-existing HTML files.  In that sort of situation, if your
+code works for all the files, then you know that the code I<works> --
+since the data it's meant to handle won't go changing or growing; and,
+typically, once you've used the program, you'll never need to use it
+again.
+
+The other kind of situation faced in many data extraction tasks is
+where the program is used recurringly to handle new data -- such as
+from ever-changing Web pages.  As a real-world example of this,
+consider a program that you could use (suppose it's crontabbed) to
+extract headline-links from subsections of Yahoo News
+(C<http://dailynews.yahoo.com/>).
+
+Yahoo News has several subsections:
+
+=over
+
+=item http://dailynews.yahoo.com/h/tc/ for technology news
+
+=item http://dailynews.yahoo.com/h/sc/ for science news
+
+=item http://dailynews.yahoo.com/h/hl/ for health news
+
+=item http://dailynews.yahoo.com/h/wl/ for world news
+
+=item http://dailynews.yahoo.com/h/en/ for entertainment news
+
+=back
+
+and others.  All of them are built on the same basic HTML template --
+and a scarily complicated template it is, especially when you look at
+it with an eye toward making up rules that will select where the real
+headline-links are, while screening out all the links to other parts of
+Yahoo, other news services, etc.  You will need to puzzle
+over the HTML source, and scrutinize the output of
+C<$tree-E<gt>dump> on the parse tree of that HTML.
+
+Sometimes the only way to pin down what you're after is by position in
+the tree. For example, headlines of interest may be in the third
+column of the second row of the second table element in a page:
+
+  my $table = ( $tree->look_down('_tag','table') )[1];
+  my $row2  = ( $table->look_down('_tag', 'tr' ) )[1];
+  my $col3  = ( $row2->look-down('_tag', 'td')   )[2];
+  ...then do things with $col3...
+
+Or they may be all the links in a "p" element that has at least three
+"br" elements as children:
+
+  my $p = $tree->look_down(
+    '_tag', 'p',
+    sub {
+      2 < grep { ref($_) and $_->tag eq 'br' }
+               $_[0]->content_list
+    }
+  );
+  @links = $p->look_down('_tag', 'a');
+
+But almost always, you can get away with looking for properties of the
+of the thing itself, rather than just looking for contexts.  Now, if
+you're lucky, the document you're looking through has clear semantic
+tagging, such is as useful in CSS -- note the
+class="headlinelink" bit here:
+
+  <a href="...long_news_url..." class="headlinelink">Elvis
+  seen in tortilla</a>
+
+If you find anything like that, you could leap right in and select
+links with:
+
+  @links = $tree->look_down('class','headlinelink');
+
+Regrettably, your chances of seeing any sort of semantic markup
+principles really being followed with actual HTML are pretty thin.
+
+=over
+
+Footnote:
+In fact, your chances of finding a page that is simply free of HTML
+errors are even thinner.  And surprisingly, sites like Amazon or Yahoo
+are typically worse as far as quality of code than personal sites
+whose entire production cycle involves simply being saved and uploaded
+from Netscape Composer.
+
+=back
+
+The code may be sort of "accidentally semantic", however -- for example,
+in a set of pages I was scanning recently, I found that looking for
+"td" elements with a "width" attribute value of "375" got me exactly
+what I wanted.  No-one designing that page ever conceived of
+"width=375" as I<meaning> "this is a headline", but if you impute it
+to mean that, it works.
+
+An approach like this happens to work for the Yahoo News code, because
+the headline-links are distinguished by the fact that they (and they
+alone) contain a "b" element:
+
+  <a href="...long_news_url..."><b>Elvis seen in tortilla</b></a>
+
+or, diagrammed as a part of the parse tree:
+
+  . a  [href="...long_news_url..."]
+    . b
+      . "Elvis seen in tortilla"
+
+A rule that matches these can be formalized as "look for any 'a'
+element that has only one daughter node, which must be a 'b' element".
+And this is what it looks like when cooked up as a C<look_down>
+expression and prefaced with a bit of code that retrieves the text of
+the given Yahoo News page and feeds it to TreeBuilder:
+
+  use strict;
+  use HTML::TreeBuilder 2.97;
+  use LWP::UserAgent;
+  sub get_headlines {
+    my $url = $_[0] || die "What URL?";
+    
+    my $response = LWP::UserAgent->new->request(
+      HTTP::Request->new( GET => $url )
+    );
+    unless($response->is_success) {
+      warn "Couldn't get $url: ", $response->status_line, "\n";
+      return;
+    }
+    
+    my $tree = HTML::TreeBuilder->new();
+    $tree->parse($response->content);
+    $tree->eof;
+    
+    my @out;
+    foreach my $link (
+      $tree->look_down(   # !
+        '_tag', 'a',
+        sub {
+          return unless $_[0]->attr('href');
+          my @c = $_[0]->content_list;
+          @c == 1 and ref $c[0] and $c[0]->tag eq 'b';
+        }
+      )
+    ) {
+      push @out, [ $link->attr('href'), $link->as_text ];
+    }
+    
+    warn "Odd, fewer than 6 stories in $url!" if @out < 6;
+    $tree->delete;
+    return @out;
+  }
+
+...and add a bit of code to actually call that routine and display the
+results...
+
+  foreach my $section (qw[tc sc hl wl en]) {
+    my @links = get_headlines(
+      "http://dailynews.yahoo.com/h/$section/"
+    );
+    print
+      $section, ": ", scalar(@links), " stories\n",
+      map(("  ", $_->[0], " : ", $_->[1], "\n"), @links),
+      "\n";
+  }
+
+And we've got our own headline-extractor service!  This in and of
+itself isn't no amazingly useful (since if you want to see the
+headlines, you I<can> just look at the Yahoo News pages), but it could
+easily be the basis for quite useful features like filtering the
+headlines for matching certain keywords of interest to you.
+
+Now, one of these days, Yahoo News will decide to change its HTML
+template.  When this happens, this will appear to the above program as
+there being no links that meet the given criteria; or, less likely,
+dozens of erroneous links will meet the criteria.  In either case, the
+criteria will have to be changed for the new template; they may just
+need adjustment, or you may need to scrap them and start over.
+
+=head2 I<Regardez, duvet!>
+
+It's often quite a challenge to write criteria to match the desired
+parts of an HTML parse tree.  Very often you I<can> pull it off with a
+simple C<$tree-E<gt>look_down('_tag', 'h1')>, but sometimes you do
+have to keep adding and refining criteria, until you might end up with
+complex filters like what I've shown in this article.  The
+benefit to learning how to deal with HTML parse trees is that one main
+search tool, the C<look_down> method, can do most of the work, making
+simple things easy, while still making hard things possible.
+
+B<[end body of article]>
+
+=head2 [Author Credit]
+
+Sean M. Burke (C<sburke@cpan.org>) is the current maintainer of
+C<HTML::TreeBuilder> and C<HTML::Element>, both originally by
+Gisle Aas.
+
+Sean adds: "I'd like to thank the folks who listened to me ramble
+incessantly about HTML::TreeBuilder and HTML::Element at this year's Yet
+Another Perl Conference and O'Reilly Open Source Software Convention."
+
+=head1 BACK
+
+Return to the L<HTML::Tree|HTML::Tree> docs.
+
+=cut
+
--- a/database/perl/vendor/lib/HTML/TreeBuilder.pm
+++ b/database/perl/vendor/lib/HTML/TreeBuilder.pm