Initial Commit

Riley Schneider
2025-12-03 16:38:10 +01:00
parent c5e26bf594
commit b732d8d4b5
17680 changed files with 5977495 additions and 2 deletions

586
database/webalizer/CHANGES Normal file

@@ -0,0 +1,586 @@
--------------------------------------------------------------------
2.23-xx changes from 2.21-xx
--------------------------------------------------------------------
Fixes:
o Fix sporadic eol problem with some IIS/W3C logs
o Fix compiler directive syntax error (broke some 64 bit systems)
Changes/Additions:
o Modest speed improvements in hash table code
--------------------------------------------------------------------
2.21-xx changes from 2.20-xx
--------------------------------------------------------------------
Fixes:
o Added missing memory deallocation call in DNS lookup code.
o Minor fixes to configure script
Changes/Additions:
o Added "YearTotals" config option for main index page totals
o Rename local stricmp() function to ouricmp() to prevent name
conflict on systems that happen to provide it already.
--------------------------------------------------------------------
2.20-xx changes from 2.01-xx
--------------------------------------------------------------------
Fixes:
o Fixed problem with timing totals.
o Fixed referrer linking to avoid possible xss injection.
o Fixed month change detection error that caused incorrect report
dates when logs had a 'gap' longer than a year.
o Fixed buffer overrun possibility in parsing code and user agent
mangle logic.
o Added symbolic link checks for file I/O to prevent possible
privilege escalation exploits. Disallows reading from or writing
to any file that is a symlink. Thanks to Julien Danjou.
o Added code to preserve the history and incremental data files in
the event of a crash before writing to them completely. Thanks
to Robert Millan for the idea and initial code.
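The symlink check described in the Fixes above can be sketched as follows. This is a minimal illustration in Python, not Webalizer's own C code; the helper names `is_symlink` and `open_refusing_symlinks` are hypothetical. A check followed by a separate open is still open to a race, so a real implementation would more likely pass a flag such as `O_NOFOLLOW` to the open call itself.

```python
import os
import stat


def is_symlink(path):
    """Return True if path itself is a symbolic link (the link is not followed)."""
    try:
        return stat.S_ISLNK(os.lstat(path).st_mode)
    except FileNotFoundError:
        # A file that does not exist yet cannot be a symlink.
        return False


def open_refusing_symlinks(path, mode="r"):
    """Open path for reading or writing, but refuse if it is a symlink."""
    if is_symlink(path):
        raise PermissionError(f"refusing to open symlink: {path}")
    return open(path, mode)
```

Both reads and writes go through the same helper, which matches the changelog's statement that reading from or writing to any symlinked file is disallowed.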
Changes/Additions:
o Added native geolocation services, which fully supports both IPv4
and IPv6 lookups. Adds the configuration keywords 'GeoDB' and
'GeoDBDatabase' along with the '-j' and '-J' command line options.
o Added 'wcmgr', "The Webalizer (DNS) Cache file Manager" to the
distribution to provide cache file maintenance. See the supplied
man page for a description and usage information.
o Changed history code and main index page to allow for more than
12 months of reports to be displayed. Added the config keywords
'IndexMonths' (-K command line option), 'GraphMonths' (-k command
line option) and 'YearHeaders' to control how index is displayed.
o Changed Berkeley DB code to use current 4.x APIs.
o Added support for bzip2 compressed log files (.bz2) as a compile
time option (--enable-bz2). If enabled, bzipped files will be
decompressed automatically during processing.
o Added support for W3C formatted logs. Based on code submitted
by Klaus Reimer.
o Added GeoIP support as compile time option (--enable-geoip). Adds
'GeoIP' and 'GeoIPDatabase' config keywords, '-w' and '-W'
command line options. (http://www.maxmind.com/)
o Added IPv6 support. Based on initial code by Jose Carlos Meneiros
and modified to support Solaris and other problematic platforms.
o Added 'CacheIPs' config option to allow saving unresolved addresses
in the DNS cache.
o Added 'CacheTTL' config option which allows the DNS cache time to
live (TTL) value to be specified at run-time.
o Added 'SearchCaseI' config option to specify if search strings
should be treated as case insensitive or not. The default value,
'yes', causes search strings to be treated as case insensitive.
o Added 'HTAccess' config option. Allows writing a default .htaccess
file to the output directory.
o Added ability to display flags in the top country table. Adds the
config keywords 'CountryFlags' and 'FlagDir', and -z command line
option.
o Added 'StripCGI' config option to configure how CGI variables on
the end of URLs are treated (can now be stripped or left in place).
o Added 'DefaultIndex' config option to enable/disable the use of
"index." as a default index name to be stripped from the end of URLs.
o Added 'TrimSquidURL' config option to allow squid log URLs to be
reduced in granularity by a user definable amount. Thanks to code
submitted by Stuart Gall.
o Added 'OmitPage' config option (and the '-O' command line switch)
to prevent specified URLs from being counted as pages even if they
otherwise would be. Thanks to code submitted by Adam Morton.
o Added 'IgnoreState' config option (and the -b command line switch)
to allow ignoring any existing incremental data file (similar to
the IgnoreHist/-i option).
o Changed logic to always generate summary report (index.html),
even if no records were processed.
o Added color support to allow changing graph colors. Based on the
Webalizer-usecolor code submitted by Benoit Rouits. Adds 11 new
config options, see the README file for complete descriptions.
o Added language 'lang=' specification in generated HTML files.
o Added 'LinkReferrer' config option to allow/disallow links in the
top referrers table.
o Added 'PagePrefix' config option to allow URL prefix matches to
be counted as pages, regardless of file extension or type. Thanks
to code submitted by Remco Van de Meent.
o Enabled large file support (LFS) to support logs greater than 2 GB
in size on systems that support LFS. Also increased the size of
most internal counters to handle larger sites.
o Minor changes to generated HTML output
o Updated language files country codes for current IANA TLDs
o Changed the meaning of the -v command line switch. It now
causes verbose information to be displayed at run-time
(Informational and Debug messages).
o Changed Group* config options to allow a quoted string for
the match string. This allows spaces to be embedded in the
string.
o Changed log record parsing logic to allow spaces in URLs.
o Made configuration keywords, boolean configuration values
(yes/no), and log file types case insensitive. Also fixed
defaults for invalid values to reflect documented defaults.
o Changed configure script to use --sysconfdir to specify the
location of the default webalizer.conf configuration file.
Also added support for DESTDIR during install to aid binary
package builds.
--------------------------------------------------------------------
2.01-xx changes from 1.30-04
--------------------------------------------------------------------
Fixes:
o Fix possible obscure buffer overflow bug in DNS resolver code
o Added additional extended character fixes
o Let code accept partial content response codes along with 200's
o Added code to catch blank hostnames (yes, they have been found!)
Will convert them into 'Unknown'
o Security fix for cross-site scripting vulnerability found by
Flavio Veloso (www.magnux.com).
o Fixed a TOTAL_RC off by one error, which would prevent the last
response code from being saved when using incremental mode.
o Fixed possible segfault condition in MangleAgent code on
some malformed user agent names.
o Fixed DNS to prevent hangs on blank and malformed hostnames.
o Fixed problem calculating visits. Changed timestamps to use
seconds since epoch (1/1/1970) which results in more accurate
analysis. Also changed normal out of sequence code to handle
up to 1 hour of 'slop' in the timestamps. This changed the
semantics of the VisitTimeout and -m configuration options, as
the values are now specified in number of seconds.
o Fixed hostname lowercase problem (wasn't) when using DNS lookups.
o Fixed problem with incremental datafile which could cause a read
error under certain circumstances (removes control characters).
Also changed code to now abort on a read error.
o Fixed problem with hash table node creation where objects that
were exactly the maximum length would wind up leaving a garbage
byte at the end of the memory space allocated. This was causing
some very infrequent and widely different problems.
o Fixed problem where country graph could be produced incorrectly
if using a non-english language and the country name overlapped
the pie chart.
o Found and fixed a possible 32-bit wraparound problem when using
incremental mode on large sites. The problem would cause the
KBytes data on large groups to become inaccurate.
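The 32-bit wraparound mentioned in the last fix can be illustrated with a little arithmetic. The sketch below assumes an unsigned 32-bit byte counter, which is an assumption about the old code rather than something the changelog states; the function names are hypothetical.

```python
UINT32_MODULUS = 2 ** 32  # 4,294,967,296 bytes, about 4.29 GB


def add_u32(total, nbytes):
    """Add to a counter that behaves like an unsigned 32-bit integer."""
    return (total + nbytes) % UINT32_MODULUS


def add_wide(total, nbytes):
    """Add to a counter that is wide enough never to wrap in practice."""
    return total + nbytes
```

With the narrow counter, one more byte on top of 4,294,967,295 silently yields 0, so a large group's transfer total appears to shrink. The wide counter keeps the true sum.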
Changes/Additions:
o Modified configure to allow specification of the default config
directory. If not given, will use /etc (/etc/webalizer.conf).
o Added DailyGraph and DailyStats configuration options to enable
or disable the Daily usage graph and stats table from output.
o Improved visit calculation logic to reduce 'false' counts generated
by external image referrals.
o Added reverse DNS lookup capability. This adds the command
line switches -D and -N, and configuration keywords "DNSCache"
and "DNSChildren". See the DNS.README for additional info.
Based in part on code submitted by Henning P. Schmiedehausen
(hps@tanstaafl.de).
o Added ability to dump Sites, URLs, Referrers, User Agents,
Usernames and Search Strings to tab delimited files, suitable
for import into most database and spreadsheet programs. The
location of this file may be specified using the "DumpPath"
configuration keyword, allowing the data to be kept someplace
outside the web servers document tree. The configuration
keywords "DumpSites", "DumpURLs", "DumpReferrers", "DumpAgents",
"DumpUsers" and "DumpSearchStr" have been added to control the
file dumps. Column headers can be included in the file with
the "DumpHeader" keyword. Dump filename extensions may be
specified using the "DumpExtension" keyword (default is .tab).
o Added username analysis, based on usernames found in the log,
and only available if username information is present in the
log (ie: http authentication or wu-ftpd xferlog). The keywords
'GroupUser', 'HideUser', 'IgnoreUser', 'IncludeUser', 'AllUsers',
and 'TopUsers' have been added to the configuration file code.
This change also modified the format of the incremental data file.
o Added the ability to display ALL sites, URLs, Referrers,
User Agents and Search Strings on a separate HTML page from
the normal statistics page. This adds the configuration
keywords 'AllSites', 'AllURLs', 'AllReferrers', 'AllAgents'
and 'AllSearchStr', which can have either a "yes" or "no"
value (default is "no"). Will add a "View All..." link to
the bottom of the appropriate "Top" table if enabled.
o Added support for squid proxy logs, thanks to code submitted
by Steinar H. Gunderson (sgunderson@bigfoot.com). To use
squid logs, specify a LogType of 'squid' in the configuration
file. This also changed the behaviour of the '-F' command
line switch, which now requires a second argument of either
'clf', 'ftp' or 'squid'.
o Completely modified the way the various TOP tables are handled
and sorted, which now allows extremely large top tables without
any performance degradation. Previously, tables greater than
a few hundred elements produced a noticeable performance penalty
during processing.
o Added the ability to group domains automatically and to hide
individual host names from the report, using the 'GroupDomains'
and 'HideAllSites' configuration keywords (-g and -X command
line options). Domain Grouping is configurable as to the level
of grouping (second level domain, third, etc...). HideAllSites
forces only grouped site records to be displayed if any. Based
on ideas/code by Michael Klemme (mklemme@gmx.de). This changes
the behaviour of the '-g' switch, which previously was used to
force the use of GMT time for reports.
o Added user configurable search engine specification, used for
search string analysis. This adds the 'SearchEngine' keyword
in configuration files. Based on idea/code by Alexey Kizilov.
o Changed code to use the latest version of GD which supports PNG
images instead of GIF images. Also included changes in configure
script to ensure the presence of the libpng and libz libraries.
o Added ability to override log file to STDIN by use of '-' on
the command line.
o Added gzipped logfile support. The program will automatically
detect logfiles with a '.gz' extension and uncompress on the
fly. Uses gz file support of zlib, since it's required for
our gd/png stuff anyway. Please note that using gzipped logs
will incur a small performance penalty.
o Minor changes to search string code to increase accuracy. This
also removes a previous condition that would occasionally cause
search strings to incorrectly be counted twice or to be counted
as different search strings when only differing by a space.
o Minor changes to URL parse code to allow additional characters.
Also changed unescape code to properly handle extended chars.
o Major changes to hash table node format for reduced memory usage.
Instead of fixed size strings, the new format will dynamically
allocate string memory and use pointers to existing table data
under certain circumstances. The memory savings is significant
and will be greatly noticed with large sites. Because of these
changes, the formatting of the incremental data file had to be
changed, therefore it is incompatible with previous versions.
o Major code reorganization and cleanup. This was to facilitate
future development and make things more manageable.
o Usual documentation updates for new features/functions.
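The log input handling described in this release ('-' for STDIN, automatic decompression of '.gz' files) can be sketched like this. It is a minimal Python illustration under stated assumptions, not the program's C implementation: the function name `open_log` and the latin-1 text encoding are choices made for the example only.

```python
import gzip
import sys


def open_log(path):
    """Return a text-mode file object for a log, picked from its name.

    '-' means standard input; a '.gz' suffix means gzip-compressed.
    """
    if path == "-":
        return sys.stdin
    if path.endswith(".gz"):
        # Decompress on the fly while reading.
        return gzip.open(path, "rt", encoding="latin-1")
    return open(path, "r", encoding="latin-1")
```

Detection here is purely by file extension, as the changelog describes, so a gzipped file without the '.gz' suffix would be read as plain text.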
--------------------------------------------------------------------
1.30-xx changes from 1.22-06
--------------------------------------------------------------------
Fixes:
o Fixed minor bug that would allow incorrect site totals for the
first day of the month under certain conditions.
Changes/Additions:
o Added Top Entry and Exit Page tables. Added configuration file
keywords TopEntry (-e command line) and TopExit (-E command line)
to specify the number of entries to display for each table. The
default for both is 10. See README for additional information.
o Added 'Group' labels. Allows display of a specified label for
grouped entries (in 'Top' tables). Based on patch submitted
by Oliver Graf (ograf@rhein-zeitung.de). See sample.conf for
examples.
o Added 'Visits' totals. The length of time that constitutes a
'visit' can be set using the VisitTimeout configuration keyword
(-m command line option). The value must be given in HHMMSS
format; leading zeros may be omitted. Default is 30 minutes (3000).
o Added 'Pages' totals, based on user specified extensions. Changes
made to generated graphs as well. Configuration keyword PageType
(and command line -P switch) allows specification of extensions
to use (defaults to 'htm*' and 'cgi'). Also called "pageviews".
o Added Search String analysis. Keyword 'TopSearch' defines how
many of the top search strings to display. Default is 20. Can
be disabled by using zero (0).
o Added native support for ftp logs (xferlog ala wu-ftpd). Added
'LogType' configuration file keyword (-F command line option)
to specify log type. Values can be either 'web' or 'ftp', with
the default of 'web'.
o Changed graphs to handle pages and visits totals. Also added
color coded legends, which can be disabled using the GraphLegend
configuration keyword (-L command line option). Default is to
display them.
o Added background lines to graphs. Default is 2 lines, and can
be set to any number using the GraphLines configuration keyword
(-l command line option). Can use anywhere from none (0) to
twenty lines. They will be drawn in all but the country graph.
o Added CountryGraph configuration file keyword (-Y command line
option) to enable/disable display of country usage pie chart.
o Added FoldSeqErr keyword (-f command line option). Normally,
the program will ignore log records that are out of sequence
(chronological order). This option lets them be folded into
the analysis anyway, as if they were the same date/time as the
last good record. Apache users can safely ignore :)
o Added additional 'Top' tables for SITES and URLs, sorted by
KBytes instead of hits. Two new configuration file keywords,
TopKSites and TopKURLs, can be used to specify the number of
entries for each (zero to disable). Default for both is 10.
o Added additional calculations for max/avg files, pages, visits
and KBytes in monthly statistics.
o Updated generated HTML code to fully comply with the HTML 4.0
Transitional spec. DOCTYPE header reflects this change as well.
o Changed code to use 4 digit years in filenames. Purely for the
Y2K phobes who couldn't deal with only two digits (even though
it was _purely_ for humans, the program couldn't care less).
Unfortunately, this means that you will have to rename previous
month files to the new format. Not a big deal if you plan on
re-running all your logs to take advantage of the new features.
o Major changes to both history file and incremental file formats
to handle additional totals (pages/visits data). As a result,
this version is INCOMPATIBLE with previous versions. See the
file README.FIRST for important information on upgrading.
o Language files and documentation updated for new functions.
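In this release VisitTimeout is given in HHMMSS format with optional leading zeros, while the 2.01 notes above say the value is later specified in seconds. A small conversion helper, sketched here in Python with the hypothetical name `hhmmss_to_seconds`, shows how an old-style value maps to the newer unit.

```python
def hhmmss_to_seconds(value):
    """Convert an HHMMSS value (leading zeros optional) to seconds."""
    n = int(value)
    hours, rest = divmod(n, 10000)      # everything above the last four digits
    minutes, seconds = divmod(rest, 100)
    return hours * 3600 + minutes * 60 + seconds
```

For the documented default, `hhmmss_to_seconds(3000)` gives 1800, which is 30 minutes.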
--------------------------------------------------------------------
1.22-xx changes from 1.20-11
--------------------------------------------------------------------
Fixes:
o Fixed bug in country total generation. Caused country table
to show bogus entries if logs contain hostnames that were not
fully qualified (ie: don't have the domain name/TLD portion).
o Changed/fixed incremental data I/O routines to better detect and
handle error conditions. This involved some minor incremental
data file format changes as well. Fixes problem large sites were
having where random tables were getting munged.
o Fixed record parse code to better detect and strip query portion
from URLs and Referrer strings.
o Fixed segfault condition when more than MAX_CTRY entries were
specified for the "Top Countries" table.
Changes/Additions:
o Added code to detect negative byte transfer sizes in logs (another
netscape server kludge :) Could cause KByte xfer sizes to become
corrupt.
o Several small changes (mostly ifdef/endif's) to make code compile
clean 'out-of-the-box' across more platforms (ala SunOS). Also
added a GNU autoconf 'configure' script which helps a bit as well.
o Added Include* keywords. Allows forcing the inclusion of specified
log records. Takes precedence over counterpart Ignore* keywords.
o Added HTMLPre, HTMLBody, HTMLEnd and HTMLExtension keywords, and
changed behaviour of HTMLHead keyword. Previous versions need
only change the 'HTMLHead' keword in existing files to 'HTMLBody'
to upgrade. Thanks to Colin Viebrock <cmv@privateworld.com> for
the idea and code examples.
o Changed mangle agent code to support Opera and other browsers.
Also updated response codes to IETF HTTP/1.1 Rev 6 draft.
Thanks to Yves Lafon <ylafon@w3.org> for these.
o Added HistoryName and IncrementalName keywords, which allow the
specification of the history and incremental data filenames.
o Added UseHTTPS keyword, which allows using 'https://' instead
of 'http://' for links to URLS in the 'Top URLs' table. Also
added check for URLs that already have the protocol specified
(such as on virtual web and proxy servers), and to use unmodified
if found (will only force to lowercase for matching).
o Added code to ignore out-of-sequence log records.
o Added code to force hostnames to lowercase (was causing country skew).
o Disabled display of blank (zero hit) days at start of daily stat table.
o Added records per second calculation to timing totals.
o ALT= tags now use translated strings instead of forcing english.
o Updated documentation for new functions/features.
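The precedence rule for the Include* keywords (an Include match wins over a matching Ignore) can be sketched as below. This is an illustration only: shell-style wildcard matching via Python's `fnmatch` is an assumption made for the example, and the function name `should_process` is hypothetical. The program's actual matching rules are described in its README.

```python
from fnmatch import fnmatchcase


def should_process(value, include_patterns, ignore_patterns):
    """Decide whether a log record field passes the Include/Ignore filters."""
    # An Include match forces the record in, even if an Ignore also matches.
    if any(fnmatchcase(value, p) for p in include_patterns):
        return True
    # Otherwise an Ignore match drops the record.
    if any(fnmatchcase(value, p) for p in ignore_patterns):
        return False
    # Records matched by neither list are processed normally.
    return True
```

Checking the Include list first is what gives it precedence; swapping the two checks would make Ignore win instead.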
--------------------------------------------------------------------
1.20-xx changes from 1.12-10
--------------------------------------------------------------------
Fixes:
o Modified record parse routine to not touch stuff between quotes
("). Was causing problems parsing some malformed request fields.
o Fixed memory leak in MangleAgent code, and relocated to eliminate
un-necessary processing (causing segfault on some machines).
Changes/Additions:
o Changed transfer totals on host/url structures to support large
groupings (such as *.gif) on heavily hit servers. Hopefully, this
should cure the 32bit overflow problem large sites were having.
o Changed daily transfer totals to support transfers greater than
roughly 4.2 gigabytes a day.
o Added some missing HTML tags and altered the way totals are
calculated on the 'Top' tables (to correct for grouped records).
o Added incremental run capability (-p command line option or
"Incremental" configuration file keyword).
--------------------------------------------------------------------
1.1x-xx changes from 1.00-05
--------------------------------------------------------------------
Fixes:
o Re-wrote the Group* logic, fixing a bug that allowed hiding of
objects when they shouldn't be.
o Fixed broken IgnoreReferrer code.
o Modified config parse code to handle extended characters.
o Misc. minor bug fixes/changes. Added a missing fclose.
o Cleaned up generated HTML.
o Fixed duplicate warnings on large referrer fields.
o Fixed country table bug adding grouped records to totals.
Changes/Additions:
o Added GroupSite, GroupReferrer and GroupAgent keywords to round
out the Group* configuration options.
o Added GroupShading and GroupHighlight keywords to allow selective
highlight and shading on grouped rows in table.
o Removed the '-L' command line option. Groupings can now only
be specified from a configuration file. Language files changed
to reflect change.
o Added '-V' command line option (identical to '-v') for version.
o Added additional language support. Language files will be marked
/* New for 1.1 */ where changes have been made.
o Various rewrites to streamline the code, accommodate the new
group options and make things easier down the road when I implement
incremental (partial log) processing.
o Usual README and CHANGES documentation updates.
--------------------------------------------------------------------
1.00-xx changes from 0.99-06
--------------------------------------------------------------------
Fixes:
o Modify record parser so that spaces in usernames (auth field)
don't cause record to be skipped (w/'Bad Record' message).
o Included various error conditions that were being ignored in
the timing statistics ('bad records' value) totals.
Changes/Additions:
o Added GMTTime (-g) option to force display of timestamps in
GMT (UTC) time instead of local timezone.
o Added GroupURL (-L) option for grouping of URLs as if they
were a single object. See README for details.
o Language support in the form of a language specific header
file containing all strings used by The Webalizer. English
file is used by default unless changed. Support for other
languages will be distributed as I receive them.
--------------------------------------------------------------------
0.99-xx changes from 0.98-16
--------------------------------------------------------------------
0.99 is mostly a bug-fix release, with a few added extra goodies.
Fixes:
o Fixed monthly total transfer size (silent) overflow problem.
o Fixed the numerous fprintf format errors. Only seemed to wreak havoc
on non-intel machines though.
o Fixed core dump condition on certain machines when using stdin for
input.
o Fixed floating point code that caused divide by zero errors on some
platforms (most noticeably on SCO OpenServer).
o Netscape server kludges: Added code to deal with Netscape log header
record gracefully. Also added workaround for timestamp error where
Netscape sometimes makes a day have 0-24 hours instead of 0-23. The
Webalizer will now treat anything greater than 23 as 0.
o Resized some fixed field sizes to gain memory usage improvements.
Changes/Additions:
o Ignore* config keywords added. This allows you to completely ignore
certain log records based on site name, URL, user agent or referrer.
* Use will cause inaccurate statistics results. See documentation.
o ReallyQuiet config keyword (-Q command line option) added. Causes
The Webalizer to suppress _all_ messages. Useful for cron jobs.
o Removed the "Sites" total at the bottom of the summary by month.
The total for sites is a useless number and produces a misleadingly
high value which detracts from the accuracy of the other totals.
o Updated README and CHANGES

339
database/webalizer/COPYING Normal file

@@ -0,0 +1,339 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
Appendix: How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) 19yy <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) 19yy name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.


@@ -0,0 +1,20 @@
webalizer - a web server log analysis program
Copyright (C) 1997-2011 Bradford L. Barrett
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version, and provided that the above
copyright and permission notice is included with all distributed
copies of this or derived software.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA


@@ -0,0 +1,295 @@
The Webalizer - A log file analysis program -- DNS information
The webalizer has the ability to perform reverse DNS lookups, and
fully supports both IPv4 and IPv6 addressing schemes. This document
attempts to explain how it works, and some things that you should be
aware of when using the DNS lookup features.
Note: The Reverse DNS feature may be enabled or disabled at compile
time. DNS lookup code is enabled by default. You can run The
Webalizer using the '-vV' command line options to determine what
options are enabled in the version you are using.
How it works
------------
DNS lookups are made against a DNS cache file containing IP addresses
and resolved names. If the IP address is not found in the cache file,
it will be left as an IP address. In order for lookups to be performed,
a cache file MUST be specified when the Webalizer is run, either using
the '-D' command line switch, or a "DNSCache" configuration file
keyword. If no cache file is specified, no DNS lookups will be
attempted. The cache file can be made three different ways.
1) You can have the Webalizer pre-process the specified log file at
run-time, creating the cache file before processing the log file
normally. This is done by setting the number of DNS Children
processes to run, either by using the '-N' command line switch or
the "DNSChildren" configuration keyword. This will cause the
Webalizer to spawn the specified number of processes which will
be used to do reverse DNS lookups. Generally, a larger number
of processes will result in faster resolution of the log; however,
setting the number too high may degrade overall system performance.
A setting of between 5 and 20 should be acceptable, and there is a
maximum limit of 100. If used, a cache filename MUST be specified also,
using either the '-D' command line switch, or the "DNSCache"
configuration keyword. Using this method, normal processing will
continue only after all IP addresses have been processed, and the
cache file is created/updated.
2) You can pre-process the log file as a standalone process, creating
the cache file that will be used later by the Webalizer. This is
done by running the Webalizer with a name of 'webazolver' (ie: the
name 'webazolver' is a symbolic link to 'webalizer') and specifying
the cache filename (either with '-D' or DNSCache). If the number
of child processes is not given, the default of 5 will be used. In
this mode, the log will be read and processed, creating a DNS cache
file or updating an existing one, and the program will then exit
without any further processing.
3) You can use The Webalizer (DNS) Cache file Manager program 'wcmgr'
to create and manipulate a cache file. A blank cache file can be
created which would be later populated, or data for the cache file
can be imported using tab delimited text files. See the wcmgr(1)
man page for usage information.
Run-time DNS cache file creation/update
---------------------------------------
The creation/update of a DNS cache file at run-time occurs as follows:
1) The log file is read, creating a list of all IP addresses that are
not already cached (or cached but expired) and need to be resolved.
Addresses are expired based on the TTL value specified using the
'CacheTTL' configuration option or after 7 days (default) if no TTL
is specified.
2) The specified number of children processes are forked, and are used
to perform DNS lookups.
3) Each IP address is given, one at a time, to the next available child
process until all IP addresses have been processed. Each child will
update the cache file when a result is returned. This may be either
a resolved name or a failed lookup, in which case the address will be
left unresolved. Unresolved addresses are not normally cached, but
can be, if enabled using the 'CacheIPs' configuration file keyword.
4) Once all IP addresses have been processed and the cache file updated,
the Webalizer will process the log normally. Each record it finds
that has an unresolved IP address will be looked up in the cache file
to see if a hostname is available (ie: was previously found).
Because there may be a significant amount of time between the initial
unresolved IP list and normal processing, the Webalizer should not be
run against live log files (ie: a log file that is actively being written
to by a server), otherwise there may be additional records present that
were not resolved.
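One way to follow the advice above is to work from a snapshot of the log
rather than the live file. The following is a minimal sketch and is not
part of the Webalizer distribution: the function name run_on_snapshot,
the WEBALIZER variable and the temporary file handling are assumptions
made for illustration. Only the '-N' and '-D' switches come from this
document.

```shell
#!/bin/sh
# Sketch only: copy a live log to a temporary snapshot and run the
# Webalizer against the copy, so that records written during DNS
# resolution cannot appear between the two passes over the file.
# WEBALIZER is a hypothetical override for the program to run.
WEBALIZER=${WEBALIZER:-webalizer}

run_on_snapshot() {
    # $1 = live log file, $2 = DNS cache file, $3 = number of DNS children
    snapshot=$(mktemp) || return 1
    cp "$1" "$snapshot" || { rm -f "$snapshot"; return 1; }
    "$WEBALIZER" -N "$3" -D "$2" "$snapshot"
    status=$?
    rm -f "$snapshot"
    return "$status"
}

# Example: run_on_snapshot /var/log/my_www_log dns_cache.db 10
```

A copy taken while the server is writing may still end with a partial
record, so rotating the log first is likely the more robust choice
where that is possible.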
Stand-Alone DNS cache file creation/update
------------------------------------------
The creation/update of the DNS cache file, when run in stand-alone mode,
occurs as follows:
1) The log file is read, creating a list of all IP addresses that are
not already cached (or cached but expired) and need to be resolved.
2) The specified number of children processes are forked, and are used
to perform DNS lookups. If the number of processes was not specified,
the default of 5 will be used.
3) Each IP address is given, one at a time, to the next available child
process until all IP addresses have been processed. Each child will
update the cache file when a result is returned.
4) Once all IP addresses have been processed and the cache file updated,
the program will terminate without any further processing.
Larger sites may prefer to use a stand-alone process to create the DNS
cache file, and then run the Webalizer against the cache file. This
allows a single cache file to be used for many virtual hosts, and reduces
the processing needed if many sites are being processed. The Webalizer
can be used in stand alone mode by running it as 'webazolver'. When
run in this fashion, it will only create the cache file and then exit
without any further processing. A cache filename MUST be specified;
however, unlike when running the Webalizer normally, the number of child
processes does not have to be given (it will default to 5). All normal
configuration and command line options are recognized, although many
of them will simply be ignored. This allows the use of a standard
configuration file for both normal use and stand alone use.
Examples:
---------
webalizer -c test.conf -N 10 -D dns_cache.db /var/log/my_www_log
This will use the configuration file 'test.conf' to obtain normal
configuration options such as hostname and output directory. It
will then either create or update the file 'dns_cache.db' in the
default output directory (using 10 child processes) based on the
IP addresses it finds in the log /var/log/my_www_log, and then
process that log file normally.
webalizer -o out -D dns_cache.db /var/log/my_www_log
This will process the log file /var/log/my_www_log, resolving IP
addresses from the cache file 'dns_cache.db' found in the default
output directory "out". The cache file must be present as it will
not be created with this command.
for i in /var/log/*/access_log; do
webazolver -N 20 -D /var/lib/dns_cache.db $i
done
The above is an example of how to run through multiple log files
creating a single DNS cache file. This might typically be used on
a larger site that has many virtual hosts, each keeping its log
file in a separate directory. It will process each access_log it
finds in /var/log/* and create a cache file (/var/lib/dns_cache.db).
This cache file can then be used to process the logs normally
with the Webalizer in a read-only fashion (see next example).
for i in /etc/webalizer/*.conf; do webalizer -c $i -D /etc/cache.db; done
This will process each configuration file found in /etc/webalizer,
using the DNS cache file /etc/cache.db. This will also typically be
used on a larger site with multiple hosts. Each configuration file
will specify a site specific log file, hostname, output directory, etc.
The cache file used will typically be created using a command similar
to the one previous to this example.
Cache File Maintenance
----------------------
The Webalizer DNS cache files generally require very little or no
special attention. There are times though when some maintenance
is required, such as occasional purging of very old cache entries.
The Webalizer never removes a record once it's inserted into the
cache. If a record expires based on its timestamp, the next time
that address is seen in a log, its name is looked up again and the
timestamp is updated. However, there will always be addresses that
are never seen again, which will cause the cache files to continue
to grow in size over time. On extremely busy sites or sites that
attract many one time visitors, the cache file may grow extremely
large, yet only contain a small number of valid entries. Using
The Webalizer (DNS) Cache file Manager ('wcmgr'), cache files can
be purged, removing expired entries and shrinking the file size.
A TTL (time to live) value can be specified, so the length of time
an entry remains in the cache can be varied depending on individual
site requirements. In addition to purging cache files, 'wcmgr' can
also be used to list cache file contents, import/export cache data,
lookup/add/delete individual entries and gather overall statistics
regarding the cache file (number of records, number expired, etc..).
To purge a cache file using 'wcmgr', an example command would be:
wcmgr -p31 /path/to/dns.cache
This would purge the 'dns.cache' cache file of any records that are
over 31 days old, and would reclaim the space that those records
were using in the file. If you would like to see the records that
get purged, adding the command line option '-v' (verbose) will cause
the program to print each entry and its age as they are removed.
You can also use the 'wcmgr' to display statistics on cache files
to aid in determining when a cache file should be purged. See the
'wcmgr' man page (wcmgr.1) for additional information on the various
options available.
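On a site with several cache files, the purge command above could be
wrapped in a small script and run periodically. This is a sketch under
stated assumptions rather than a supplied tool: the function name
purge_caches and the WCMGR variable are hypothetical, and only the '-p'
switch is taken from the example above.

```shell
#!/bin/sh
# Sketch only: purge several cache files of entries older than a given
# number of days, using the '-p' switch shown above.
# WCMGR is a hypothetical override for the program to run.
WCMGR=${WCMGR:-wcmgr}

purge_caches() {
    # $1 = maximum age in days, remaining arguments = cache files
    days=$1
    shift
    for cache in "$@"; do
        if [ -f "$cache" ]; then
            "$WCMGR" -p"$days" "$cache" || echo "purge failed: $cache" >&2
        else
            echo "skipping missing cache file: $cache" >&2
        fi
    done
}

# Example: purge_caches 31 /var/lib/dns_cache.db /etc/cache.db
```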
Stupid Cache Tricks
-------------------
The DNS cache files used by The Webalizer allow for efficient IP address
to name translations. Resolved names are normally generated by using an
existing DNS name server to query the address, either locally or over
the Internet. However, using The Webalizer (DNS) Cache file Manager,
almost any IP address to Name translation can be included in the cache.
One such example would be for mapping local network addresses to real
names, even though those addresses may not have real DNS entries on the
network (or may be 'local' addresses prohibited from use on the Internet).
A simple tab delimited text file can be created and imported into a cache
for use by The Webalizer, which will then be used to convert the local
IP addresses to real names. Additional configuration options for The
Webalizer can then be used as would be normally. For example, consider
a small business with 10 computers and a DSL router to the Internet.
Each machine on the local network would use a private IP address that
would not be resolved using an external (public) DNS server, so would
always be reported by The Webalizer as 'unknown/unresolved'. A simple
cache file could be created to map those unresolved addresses into more
meaningful names, which could then be further processed by the Webalizer.
An example might look something like:
# Local machines
192.168.123.254 0 0 gw.widgetsareus.lan
192.168.123.253 0 0 mail.widgetsareus.lan
192.168.123.250 0 0 sales.widgetsareus.lan
192.168.123.240 0 0 service.widgetsareus.lan
192.168.123.237 0 0 mgr.widgetsareus.lan
192.168.123.235 0 0 support1.widgetsareus.lan
192.168.123.234 0 0 support2.widgetsareus.lan
192.168.123.232 0 0 pres.widgetsareus.lan
192.168.123.230 0 0 vp.widgetsareus.lan
192.168.123.225 0 0 reception.widgetsareus.lan
192.168.123.224 0 0 finance.widgetsareus.lan
127.0.0.1 0 1 127.0.0.1
There are a couple of things here that should be noted. The first
is that the timestamps (first zero on each line above) are set to
zero. This tells The Webalizer that these cached entries are to
be considered 'permanent', and should never be expired (infinite
TTL or time to live). The second thing to note is that the resolved
names are using a non-standard TLD (top level domain) of '.lan'.
The Webalizer will map this special TLD to mean "Local Network" in
its reports, which allows local traffic to be grouped separately
from normal Internet traffic. Lastly, you may notice that the
last line of the file contains an entry with the same IP address
where a name should be. This entry will prevent the Webalizer
from ever trying to lookup 127.0.0.1, which is the 'localhost'
address, when it is found in a log. The second number after the IP
address (1) tells the Webalizer that it is an unresolved entry, not
a resolved hostname (ie: has no name). Entries such as this one can
be used to reduce DNS lookups on addresses that are known not to
resolve.
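A file in the layout described above could also be generated by a
script instead of being typed by hand. The sketch below assumes the
four tab separated fields are address, timestamp, unresolved flag and
name, as the explanation above suggests. The function names are
illustrative only, and the wcmgr(1) man page remains the reference for
the actual import command.

```shell
#!/bin/sh
# Sketch only: emit lines of a tab delimited import file in the
# layout shown above: address, timestamp, unresolved flag, name.
# A timestamp of 0 marks the entry as permanent; a flag of 1 marks it
# as unresolved. The function names are illustrative assumptions.
permanent_entry() {
    # $1 = IP address, $2 = host name
    printf '%s\t0\t0\t%s\n' "$1" "$2"
}

unresolved_entry() {
    # $1 = IP address that should never be looked up
    printf '%s\t0\t1\t%s\n' "$1" "$1"
}

make_import_file() {
    echo '# Local machines'
    permanent_entry 192.168.123.254 gw.widgetsareus.lan
    permanent_entry 192.168.123.253 mail.widgetsareus.lan
    unresolved_entry 127.0.0.1
}

# Example: make_import_file > local_hosts.txt
```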
Considerations
--------------
Processing of live log files is discouraged, as any log records written
between the time of DNS resolution and normal processing will not have
been resolved.
If you are using STDIN for the input stream (log file) and have run-time
DNS cache file creation/update enabled, the program will exit after the
cache file has been created/updated and no output will be produced. If
you must use STDIN for the input log, you will need to process the stream
twice, once to create/update the cache file, and again to produce the
reports. The reason for this is that stream inputs from STDIN cannot
be 'rewound' to the beginning like files can, so must be given twice.
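The two passes over a stream might be scripted along the following
lines. This is only a sketch: the function name two_pass and the
WEBALIZER and DECOMPRESS variables are hypothetical, the choice of zcat
is just one example of a program that writes a log to its standard
output, and it is assumed that the configuration file does not itself
set DNSChildren.

```shell
#!/bin/sh
# Sketch only: feed a compressed log through the Webalizer twice, once
# to create/update the DNS cache and once to produce the reports.
# WEBALIZER and DECOMPRESS are hypothetical overrides; this assumes the
# configuration file does not itself set DNSChildren, so that the
# second pass only reads the cache.
WEBALIZER=${WEBALIZER:-webalizer}
DECOMPRESS=${DECOMPRESS:-zcat}

two_pass() {
    # $1 = compressed log, $2 = configuration file, $3 = DNS cache file
    # Pass 1: resolve addresses; the program exits once the cache is updated.
    "$DECOMPRESS" "$1" | "$WEBALIZER" -c "$2" -N 10 -D "$3" || return 1
    # Pass 2: same stream again, this time producing the reports.
    "$DECOMPRESS" "$1" | "$WEBALIZER" -c "$2" -D "$3"
}

# Example: two_pass /var/log/access_log.gz test.conf dns_cache.db
```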
Cached DNS addresses have a default TTL (time to live) of 7 days. This
may be changed using the CacheTTL config file keyword to any value
from 1 to 100 (days). You may also specify, using the CacheIPs config
file keyword, whether unresolved addresses should be stored in the DNS
cache. Normally, unresolved IP addresses are NOT saved in the cache
and are looked up each time the program is run.
There is an absolute maximum of 100 child processes that may be created,
however the actual number of children should be significantly less than
the maximum.. typical usage should be between 5 and 20.
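The keywords discussed in this document could be collected in a
configuration fragment such as the one written by the sketch below.
The keyword names are taken from the text. The values chosen, the
"yes" spelling for CacheIPs, the function name and the output file name
are assumptions for illustration and should be checked against the
sample configuration file.

```shell
#!/bin/sh
# Sketch only: write the DNS related keywords discussed in this
# document to a configuration fragment. The keyword names come from
# the text; the values, the "yes" spelling and the output file name
# are illustrative assumptions.
write_dns_conf() {
    # $1 = file to write
    cat > "$1" <<'EOF'
DNSCache     dns_cache.db
DNSChildren  10
CacheTTL     14
CacheIPs     yes
EOF
}

# Example: write_dns_conf dns_fragment.conf
```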
Special thanks to Henning P. Schmiedehausen <hps@tanstaafl.de> for the
original dns-resolver code he submitted, which was the basis for this
implementation. Also thanks to Jose Carlos Medeiros for the initial IPv6
support code.

340
database/webalizer/INSTALL Normal file

@@ -0,0 +1,340 @@
Installation instructions for The Webalizer
The Webalizer is distributed in either source or binary distributions,
and installation is different for each type. Regardless of the type
of installation, you need to obtain and un-tar/un-zip the distribution.
For binary distributions, you should create a directory somewhere and
chdir to it before unpacking the file. Source distributions will
automagically create a directory for you (webalizer-x.xx-xx). If you
are upgrading from a previous version, check the CHANGES file, and the
README.FIRST file for important upgrade information.
For Binary distributions
------------------------
You should have all the files you need in the directory you created
when you un-tarred/un-zipped the distribution file. The file
'webalizer' in this directory is the binary executable. Copy this
someplace useful, like /usr/local/bin or /usr/bin. A man page for
The Webalizer is also supplied... If desired, copy the file
'webalizer.1' to your local man directory (ie: /usr/local/man/man1).
(You may also need to run 'makeinfo' or similar)
Note: There may also be platform specific installation instructions
and/or usage notes supplied with the binary distribution. You
should read them, as that will be your starting point if problems
are encountered. Most of the binary distributions are submitted
by users, and I cannot support them the same way I can the
Linux binary distribution and the source code itself.
For Source distributions
------------------------
The Webalizer requires, at a minimum, the GD graphics library
(http://www.libgd.org/), the PNG (portable network graphics)
graphics library ( http://www.libpng.org/pub/png/ ), the Zlib
compression library ( http://www.zlib.net/ ) and associated
header files for these libraries. Most modern systems will have
these libraries, but may or may not have the required header files
for them unless you installed the 'dev' (development) versions
(which include the required header files along with the libraries).
Consult your system's documentation for specifics.
For native DNS and Geolocation (GeoDB) support, the Berkeley DB
library (by sleepycat, now owned by Oracle) v4.1 or higher and
associated header file is required.
http://www.oracle.com/technology/products/berkeley-db/
For BZip2 support, the bzip2 compression library and header file is
required. http://www.bzip.org/
For GeoIP geolocation support, the GeoIP library (by MaxMind, Inc.)
and header file is required, along with a Country Edition database.
http://www.maxmind.com/app/ip-location
New style build:
The Webalizer source distribution now comes packaged with a GNU
autoconf 'configure' script, which should allow you to simply type:
./configure
make
make install
Normal configure options apply, type ./configure --help to get a
complete list. A few options in particular may be useful:
--sysconfdir=/etc
The sysconfdir switch specifies where the default configuration
file (webalizer.conf) should be looked for. If not specified, the
default of ${prefix}/etc is used.
--with-language=<language>
Allows you to specify the language to use. Check the /lang directory
to see the available language choices. As an example, you could use
./configure --with-language=french
to compile the program using French (webalizer_lang.french) for output.
You can also use the --without-language switch, which will use the
default language (english).
--enable-dns
DNS lookup and native geolocation features are added if the required
library (libdb) and header file (db.h) are found. DNS/GeoDB code is
enabled at compile time by using the -DUSE_DNS compiler switch. For
GeoDB lookups, a current geodb database is also required (available
at ftp://ftp.mrunix.net/pub/webalizer/geodb).
--with-geodb=<path>
The default location for the GeoDB database is /usr/share/GeoDB but
may be changed using this option.
--enable-bz2
BZip2 compression support will be added if the required library
(libbz2) and header file (bzlib.h) are found. BZip2 code is
enabled at compile time using the -DUSE_BZIP compiler switch.
--enable-geoip
GeoIP geolocation support will be added if the required library
(libGeoIP) and header file (GeoIP.h) are found. No attempt is
made to locate a valid Country Edition database, which is also
required for GeoIP lookups to be performed. GeoIP code is
enabled at compile time using the -DUSE_GEOIP compiler switch.
Some systems may require unusual settings that the configure script
cannot determine. You can pass values to the script by setting
environment variables. For example:
CC=c89 CFLAGS=-O LIBS=-lposix ./configure --with-language=german
Would allow you to set the compiler (c89) and various flags and
libraries to use, which would then be passed to the configure script
and eventually to the Makefile generated. It also will cause the
program to be compiled using German instead of the English default.
Additionally, the various --with-<package> and --with-<packagelib>
options allow specification of non-standard locations for the
various libraries and headers. For example, if you built the bzip2
library in /src/bzip2, you could use:
./configure --with-bz2=/src/bzip2 --with-bz2lib=/src/bzip2 --enable-bz2
to specify where the bz2 header files (--with-bz2) and library
(--with-bz2lib) are located. They should then be detected by
the configure script and enabled. Please note that if you are
linking against a shared library (ie: libbz2.so), then even though
the configure script finds the library, and The Webalizer compiles
successfully, the program may FAIL when run because the system's
run-time linking loader cannot find the library. If this happens,
you need to tell the loader where the library is; how to do so
depends upon the type of system being used. Some platforms
require the path to the library to be placed in the LD_LIBRARY_PATH
environment variable, while others (such as Linux based platforms) use
the ld.so.conf file and ldconfig program to configure the dynamic
linker run-time bindings. Consult the documentation for your
system specific requirements.
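On platforms that honor LD_LIBRARY_PATH, the directory could be added
as in the following sketch. The function name prepend_lib_path is an
illustrative assumption, and the /src/bzip2 path simply reuses the
example above. Systems that rely on ld.so.conf and ldconfig need a
different procedure.

```shell
#!/bin/sh
# Sketch only: prepend a directory to LD_LIBRARY_PATH so the run-time
# loader can find a shared library built in a non-standard location.
# The function name is an illustrative assumption, and not every
# platform uses this variable.
prepend_lib_path() {
    # $1 = directory containing the shared library
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

# Example: prepend_lib_path /src/bzip2 && webalizer -vV
```

The ${LD_LIBRARY_PATH:+...} expansion avoids leaving a trailing colon
when the variable was previously empty, since an empty path element may
be treated by the loader as the current directory.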
For package maintainers, the environment variable DESTDIR can be
used to specify a root directory for installation. This is the
top level directory under which all other directories will be
placed when 'make install' is invoked, and allows binary packages
to be easily built outside the normal root directory tree. For
example, if you wish to build a binary package of The Webalizer
under the /usr/pkg/webalizer-2.20 directory, you could type:
make install DESTDIR=/usr/pkg/webalizer-2.20
Which would then create the following directory tree:
/usr/pkg/webalizer-2.20/
/usr/pkg/webalizer-2.20/etc/
/usr/pkg/webalizer-2.20/etc/webalizer.conf.sample
/usr/pkg/webalizer-2.20/usr/
/usr/pkg/webalizer-2.20/usr/bin/
/usr/pkg/webalizer-2.20/usr/bin/webalizer
/usr/pkg/webalizer-2.20/usr/bin/webazolver -> webalizer
/usr/pkg/webalizer-2.20/usr/bin/wcmgr
/usr/pkg/webalizer-2.20/usr/man/
/usr/pkg/webalizer-2.20/usr/man/man1/
/usr/pkg/webalizer-2.20/usr/man/man1/webalizer.1
/usr/pkg/webalizer-2.20/usr/man/man1/webazolver.1 -> webalizer.1
/usr/pkg/webalizer-2.20/usr/man/man1/wcmgr.1
If the configure script doesn't work for you, please let me know
(along with relevant info like system type, compiler, etc.). If you
are able to tweak something to make it work, let me know as well.
Old style build:
If you have a platform that the configure script won't work on, or
some other situation where you have to configure and build the
source yourself, the file 'Makefile.std' is a "stock" Makefile
that you can use to build the Webalizer. Copy or rename the file
to 'Makefile', edit to match your system, and do the usual 'make'.
This is a very generic Makefile, so expect to have to tweak it for
your particular platform and configuration. If everything seems
to have gone well, next type 'make install' to do a stock install.
Again, you may want to tweak the Makefile for the install, or
skip the 'make install' step completely (see below).
This will install the Webalizer on your system, and put a sample
configuration file in /etc (named 'webalizer.conf.sample'). If
you don't want to use the 'make install' method, just copy the
file 'webalizer' to someplace useful, and you are ready to go :)
Usage
-----
When run, The Webalizer will read the specified log file and
produce HTML output in the directory specified (or current
directory if none). You may specify various configuration
options either on the command line or in a configuration file.
The format of the command line is:
webalizer [options] [log_file]
Where 'options' may be any of the valid command line options
described in the README file. If a log filename is not given,
input is taken from stdin. A typical command line might look
something similar to:
webalizer /var/lib/httpd/logs/access_log
This will produce output in the current directory based on the
logfile /var/lib/httpd/logs/access_log. Another example:
webalizer -c somehost.conf
This will read the configuration file somehost.conf, which
should specify, among other things, the log filename and
output directory to use. You can use 'webalizer -h' to get
a list of available command line options, or view the file
README for complete instructions on all available configuration
options. You should note that The Webalizer will _always_
look for a configuration file named 'webalizer.conf' in either
the current directory or in /etc/, and will process that file
_before_ any other configuration or command line options. If
you run a single server, you may want to create a default
configuration file and place it in the /etc/ directory. This
will allow you to simply type 'webalizer' without the need to
specify additional command line options.
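The default configuration lookup described above can be sketched as
follows. This is a rough illustration, not the program's actual code:
the function name find_default_config is made up, and the search
order (current directory first, then /etc) is the one stated in the
sample configuration file.

```python
import os


def find_default_config(cwd, etc_dir="/etc"):
    """Return the first existing default config file path, or None."""
    for directory in (cwd, etc_dir):
        candidate = os.path.join(directory, "webalizer.conf")
        if os.path.isfile(candidate):
            return candidate
    return None


# Prints the path that would be picked up, or None if there is none.
print(find_default_config(os.getcwd()))
```

Keep in mind that a default file found this way is processed before
any '-c' file or other command line options.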
Configuration
-------------
The Webalizer can be customized in many ways using either the
command line or configuration files. To test The Webalizer,
type: 'webalizer /var/lib/httpd/logs/access_log', changing the
directory to wherever your log files are. After processing,
you should have the output and a file named index.html which
can be viewed with any browser. The Webalizer can accept many
command line options as well, type 'webalizer -h' to view them.
In addition to the command line options, The Webalizer can
be customized using configuration files. There is a sample.conf
file that is part of both the source and binary distributions
that can be used as a 'template' for creating your own site
configuration file. Just make a copy of the file and name it
something like 'mysite.conf'. Edit the new file to match your
particular setup and taste.
To test the new configuration file, type 'webalizer -c mysite.conf'
(or whatever your configuration file is named). Fire up the
browser and look at the results. If you rename your new
configuration file to 'webalizer.conf', you will only need
to type 'webalizer', and The Webalizer will use it as the
default. See the README file for more on configuration and
use of configuration files.
Language Support
----------------
Language support is provided as language specific header
files that must be compiled into the program. If you don't
have the source code, get it. If you can't compile the
program yourself, ask a friend. The /lang/ directory of
the distribution contains all supported languages at the
time of release. Additional/updated language files will
be found at ftp://ftp.webalizer.org/pub/webalizer/lang and
are always the most current versions.
To build with language support, use the --with-language
option of the configure script. This will automagically
do for you the steps described below. If you can't use
the configure script, you can manually select the language
file to use.
In the webalizer source directory, you will find a symbolic
link for the file webalizer_lang.h, and it will be pointing
to the file webalizer_lang.english which is the default.
Delete the link (ie: rm webalizer_lang.h) and create a new
one to the language file you want The Webalizer to use
(ie: ln -s lang/webalizer_lang.spanish webalizer_lang.h)
and re-compile the program.
Note: The source distribution of The Webalizer contains all
language support files that were available at the time.
Additional/updated language files can be found at:
ftp://ftp.webalizer.org/pub/webalizer/lang where I will
put them as I receive them.
Common Questions
----------------
Q: Will it run on [some platform]?
A: If it is a *nix platform, it should without a problem. If it's
something different, probably not and you're on your own if you
want to try to make it work.
Q: When I compile, I get "file not found" errors?
A: Most likely, the compiler can't find the header files for one of
the required libraries. If they are someplace other than the
standard locations (ie: /usr/include), then you probably need
to specify an alternate location to look using one of the
--with-<package> command line switches when you run configure,
or edit the Makefile and specify the location with an '-I<path>'
compiler flag.
Q: I get "libgd not found" errors?
A: You don't have the GD graphics library located in a standard
library path, or you don't have the GD graphics library at all. If
the latter, go to http://www.boutell.com/gd/ and grab a copy.
If you do have it, add a -L switch in the Makefile to point
to the proper location.
Q: I get unresolved symbol errors when compiling, why?
A: This most often occurs when the GD library was built with
additional support for such things as TrueType fonts or
X11 graphics. The configure script for The Webalizer only
checks that the gd library is available, and does not check
any other dependencies it may have. Typically, to fix this
problem, you need to edit the Makefile and add the dependent
libraries to a compiler switch (or pass them on the command
line when running the configure script). For example, if
you are getting errors about not finding truetype routines,
you may need to add '-lttf' (for 'libttf', the truetype library)
to the "LIBS" variable.
Hint: I usually find it easier to just grab the GD library
source, and compile it myself locally as a static
library, in a directory just above where I compile The
Webalizer. Then, at configure time, just add the
'--with-gd=../gd' and '--with-gdlib=../gd' switches,
and the GD graphic stuff will be statically linked into
The Webalizer, eliminating any other library dependencies
that the normal, shared library on my system may have.

1949
database/webalizer/README Normal file

File diff suppressed because it is too large

@@ -0,0 +1,21 @@
Upgrade information for the Webalizer Version 2.2x
This release is, for the most part, a drop-in replacement for all
installations currently running 2.01, and all users are encouraged
to upgrade. See the 'CHANGES' file for a full list of changes
since version 2.01-10.
Note: The history file format has changed in v2.20 in order to keep
more than 12 months. Existing history files will be automatically
converted to the new format the first time they are read.
Note: This version redefines the '-v' command line switch to mean
'verbose', which will cause the program to display additional
informational and debugging messages at run-time. This should not
cause any major problems, as previously it would simply cause the
program to display its version information and then exit.
Report bugs to 'brad at mrunix dot net' with "Webalizer" somewhere
in the subject. Please do not send HTML formatted e-mails or e-mail
containing HTML tags as my mail server will reject them. Thanks!

BIN
database/webalizer/bgd.dll Normal file

Binary file not shown.

@@ -0,0 +1,279 @@
ac Ascension Island
ad Andorra
ae United Arab Emirates
af Afghanistan
ag Antigua and Barbuda
ai Anguilla
al Albania
am Armenia
an Netherlands Antilles
ao Angola
aq Antarctica
ar Argentina
as American Samoa
at Austria
au Australia
aw Aruba
ax Aland Islands
az Azerbaijan
ba Bosnia and Herzegovina
bb Barbados
bd Bangladesh
be Belgium
bf Burkina Faso
bg Bulgaria
bh Bahrain
bi Burundi
bj Benin
bl Saint Barthelemy
bm Bermuda
bn Brunei Darussalam
bo Bolivia
br Brazil
bs Bahamas
bt Bhutan
bv Bouvet Island
bw Botswana
by Belarus
bz Belize
ca Canada
cc Cocos (Keeling) Islands
cd Congo, Democratic Republic
cf Central African Republic
cg Congo
ch Switzerland
ci Cote D'Ivoire (Ivory Coast)
ck Cook Islands
cl Chile
cm Cameroon
cn China
co Colombia
cr Costa Rica
cu Cuba
cv Cape Verde
cx Christmas Island
cy Cyprus
cz Czech Republic
de Germany
dj Djibouti
dk Denmark
dm Dominica
do Dominican Republic
dz Algeria
ec Ecuador
ee Estonia
eg Egypt
eh Western Sahara
er Eritrea
es Spain
et Ethiopia
eu European Union
fi Finland
fj Fiji
fk Falkland Islands (Malvinas)
fm Micronesia
fo Faroe Islands
fr France
ga Gabon
gb Great Britain (UK)
gd Grenada
ge Georgia
gf French Guiana
gg Guernsey
gh Ghana
gi Gibraltar
gl Greenland
gm Gambia
gn Guinea
gp Guadeloupe
gq Equatorial Guinea
gr Greece
gs S. Georgia and S. Sandwich Isls.
gt Guatemala
gu Guam
gw Guinea-Bissau
gy Guyana
hk Hong Kong
hm Heard and McDonald Islands
hn Honduras
hr Croatia
ht Haiti
hu Hungary
id Indonesia
ie Ireland
il Israel
im Isle of Man
in India
io British Indian Ocean Territory
iq Iraq
ir Iran
is Iceland
it Italy
je Jersey
jm Jamaica
jo Jordan
jp Japan
ke Kenya
kg Kyrgyzstan
kh Cambodia
ki Kiribati
km Comoros
kn Saint Kitts and Nevis
kp Korea, Democratic Republic of
kr Korea, Republic of
kw Kuwait
ky Cayman Islands
kz Kazakhstan
la Laos
lb Lebanon
lc Saint Lucia
li Liechtenstein
lk Sri Lanka
lr Liberia
ls Lesotho
lt Lithuania
lu Luxembourg
lv Latvia
ly Libya
ma Morocco
mc Monaco
md Moldova
me Montenegro
mf Saint Martin (French part)
mg Madagascar
mh Marshall Islands
mk Macedonia
ml Mali
mm Myanmar
mn Mongolia
mo Macau
mp Northern Mariana Islands
mq Martinique
mr Mauritania
ms Montserrat
mt Malta
mu Mauritius
mv Maldives
mw Malawi
mx Mexico
my Malaysia
mz Mozambique
na Namibia
nc New Caledonia
ne Niger
nf Norfolk Island
ng Nigeria
ni Nicaragua
nl Netherlands
no Norway
np Nepal
nr Nauru
nu Niue
nz New Zealand (Aotearoa)
om Oman
pa Panama
pe Peru
pf French Polynesia
pg Papua New Guinea
ph Philippines
pk Pakistan
pl Poland
pm St. Pierre and Miquelon
pn Pitcairn
pr Puerto Rico
ps Palestinian Territory, Occupied
pt Portugal
pw Palau
py Paraguay
qa Qatar
re Reunion
ro Romania
rs Serbia
ru Russian Federation
rw Rwanda
sa Saudi Arabia
sb Solomon Islands
sc Seychelles
sd Sudan
se Sweden
sg Singapore
sh St. Helena
si Slovenia
sj Svalbard and Jan Mayen Islands
sk Slovakia
sl Sierra Leone
sm San Marino
sn Senegal
so Somalia
sr Suriname
st Sao Tome and Principe
su Soviet Union
sv El Salvador
sy Syrian Arab Republic
sz Swaziland
tc Turks and Caicos Islands
td Chad
tf French Southern Territories
tg Togo
th Thailand
tj Tajikistan
tk Tokelau
tl Timor-Leste
tm Turkmenistan
tn Tunisia
to Tonga
tp Portuguese Timor
tr Turkey
tt Trinidad and Tobago
tv Tuvalu
tw Taiwan
tz Tanzania
ua Ukraine
ug Uganda
uk United Kingdom
um US Minor Outlying Islands
us United States
uy Uruguay
uz Uzbekistan
va Vatican City State (Holy See)
vc Saint Vincent and the Grenadines
ve Venezuela
vg Virgin Islands (British)
vi Virgin Islands (U.S.)
vn Viet Nam
vu Vanuatu
wf Wallis and Futuna Islands
ws Samoa
ye Yemen
yt Mayotte
yu Yugoslavia
za South Africa
zm Zambia
zw Zimbabwe
com US Commercial (com)
edu US Educational (edu)
gov US Government (gov)
int International (int)
mil US Military (mil)
net Network (net)
org Non-Profit Organization (org)
biz Generic Business (biz)
cat Catalan Community (cat)
pro Professional (pro)
tel Ind. Contact Data (tel)
aero Air Transport Industry (aero)
asia Asia Pacific Community (asia)
coop Cooperative Association (coop)
info Generic TLD (info)
jobs Human Resources (jobs)
mobi Generic Mobile TLD (mobi)
name Individual (name)
arpa Address Routing (arpa)
nato Nato field (nato)
museum Museums (museum)
travel Travel Ind. (travel)
a1 Anonymous Proxy
a2 Satellite Provider
o1 Other
ap Asia/Pacific Region
lan Local Network (lan)

Binary file not shown.

Binary file not shown.

@@ -0,0 +1,779 @@
#
# Sample Webalizer configuration file
# Copyright 1997-2011 by Bradford L. Barrett
#
# Distributed under the GNU General Public License. See the
# files "Copyright" and "COPYING" provided with the webalizer
# distribution for additional information.
#
# This is a sample configuration file for the Webalizer (ver 2.20)
# Lines starting with pound signs '#' are comment lines and are
# ignored. Blank lines are skipped as well. Other lines are considered
# as configuration lines, and have the form "ConfigOption Value" where
# ConfigOption is a valid configuration keyword, and Value is the value
# to assign that configuration option. Invalid keyword/values are
# ignored, with appropriate warnings being displayed. There must be
# at least one space or tab between the keyword and its value.
#
# As of version 0.98, The Webalizer will look for a 'default' configuration
# file named "webalizer.conf" in the current directory, and if not found
# there, will look for "/etc/webalizer.conf".
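The line format described above (comment lines, blank lines, and
"ConfigOption Value" pairs separated by spaces or tabs) can be
sketched as a small parser. This is an illustration only; the name
parse_config is hypothetical, and treating indented '#' lines as
comments is an assumption. A real parser would also print a warning
for invalid lines, which this sketch silently skips.

```python
def parse_config(text):
    """Return (keyword, value) pairs from Webalizer-style config text."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are skipped
        parts = line.split(None, 1)  # split on the first run of spaces/tabs
        if len(parts) != 2:
            continue  # keyword without a value: ignored in this sketch
        entries.append((parts[0], parts[1].strip()))
    return entries


sample = """\
# a comment line
PageType   htm*
PageType\tcgi

#Quiet no
HostName www.webalizer.org
"""
print(parse_config(sample))
```

A list of pairs is used rather than a dictionary because keywords
such as PageType may legitimately appear more than once.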
# LogFile defines the web server log file to use. If not specified
# here or on the command line, input will default to STDIN. If
# the log filename ends in '.gz' (a gzip compressed file), or '.bz2'
# (bzip2 compressed file), it will be decompressed on the fly as it
# is being read.
#LogFile /var/lib/httpd/logs/access_log
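The LogFile behaviour above (STDIN when no file is given, and
decompression on the fly based on the filename suffix) can be
sketched as shown here. The function name open_log is hypothetical,
and the use of Python's gzip and bz2 modules is only a stand-in for
whatever the program does internally.

```python
import bz2
import gzip
import sys


def open_log(path=None):
    """Open a log for text reading, decompressing by filename suffix."""
    if path is None:
        return sys.stdin  # no log file given: read standard input
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")
```

The choice is made purely from the name, so a compressed log that
lacks the '.gz' or '.bz2' suffix would be read as plain text.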
# LogType defines the log type being processed. Normally, the Webalizer
# expects a CLF or Combined web server log as input. Using this option,
# you can process ftp logs (xferlog as produced by wu-ftp and others),
# Squid native logs or W3C extended format web logs. Values can be 'clf',
# 'ftp', 'squid' or 'w3c'. The default is 'clf'.
#LogType clf
# OutputDir is where you want to put the output files. This should
# be a full path name, however relative ones might work as well.
# If no output directory is specified, the current directory will be used.
#OutputDir /var/lib/httpd/htdocs/usage
# HistoryName allows you to specify the name of the history file produced
# by the Webalizer. The history file keeps the data for previous months,
# and is used for generating the main HTML page (index.html). The default
# is a file named "webalizer.hist", stored in the output directory being
# used. The name can include a path, which will be relative to the output
# directory unless absolute (starts with a leading '/').
#HistoryName webalizer.hist
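The "relative to the output directory unless absolute" rule used by
HistoryName (and by several other filename options in this file) can
be sketched as below. The helper name resolve_name is hypothetical;
the directory is the example OutputDir value shown above.

```python
import posixpath


def resolve_name(output_dir, name):
    """Resolve a configured filename against the output directory."""
    # posixpath.join() discards output_dir when name starts with '/'.
    return posixpath.join(output_dir, name)


print(resolve_name("/var/lib/httpd/htdocs/usage", "webalizer.hist"))
print(resolve_name("/var/lib/httpd/htdocs/usage", "/var/lib/webalizer.hist"))
```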
# Incremental processing allows multiple partial log files to be used
# instead of one huge one. Useful for large sites that have to rotate
# their log files more than once a month. The Webalizer will save its
# internal state before exiting, and restore it the next time run, in
# order to continue processing where it left off. This mode also causes
# The Webalizer to scan for and ignore duplicate records (records already
# processed by a previous run). See the README file for additional
# information. The value may be 'yes' or 'no', with a default of 'no'.
# The file 'webalizer.current' is used to store the current state data,
# and is located in the output directory of the program (unless changed
# with the IncrementalName option below). Please read at least the section
# on Incremental processing in the README file before you enable this option.
#Incremental no
# IncrementalName allows you to specify the filename for saving the
# incremental data in. It is similar to the HistoryName option where the
# name is relative to the specified output directory, unless an absolute
# filename is specified. The default is a file named "webalizer.current"
# kept in the normal output directory. If you don't specify "Incremental"
# as 'yes' then this option has no meaning.
#IncrementalName webalizer.current
# ReportTitle is the text to display as the title. The hostname
# (unless blank) is appended to the end of this string (separated with
# a space) to generate the final full title string.
# Default is (for english) "Usage Statistics for".
#ReportTitle Usage Statistics for
# HostName defines the hostname for the report. This is used in
# the title, and is prepended to the URL table items. This allows
# clicking on URLs in the report to go to the proper location in
# the event you are running the report on a 'virtual' web server,
# or for a server different than the one the report resides on.
# If not specified here, or on the command line, webalizer will
# try to get the hostname via a uname system call. If that fails,
# it will default to "localhost".
#HostName www.webalizer.org
# HTMLExtension allows you to specify the filename extension to use
# for generated HTML pages. Normally, this defaults to "html", but
# can be changed for sites that need it (such as for PHP embedded pages).
#HTMLExtension html
# PageType lets you tell the Webalizer what types of URLs you
# consider a 'page'. Most people consider html and cgi documents
# as pages, while not images and audio files. If no types are
# specified, defaults will be used ('htm*', 'cgi' and HTMLExtension
# if different for web logs, 'txt' for ftp logs).
PageType htm*
PageType cgi
#PageType phtml
#PageType php3
#PageType pl
# PagePrefix allows all requests with a specified prefix to be
# considered as 'pages'. Useful if you want everything under /documents
# to be treated as pages no matter what their extension is. Also
# useful if you have cgi-scripts with PATH_INFO.
#PagePrefix /documents
#PagePrefix /mycgi/parameters
# OmitPage lets you tell the Webalizer that certain URLs do not
# contain any pages. No URL matching an OmitPage value will be
# counted as a page, even if it matches a PageType above or has
# no extension (e.g., a directory). They will still be counted
# as a hit.
#OmitPage /render
# UseHTTPS should be used if the analysis is being run on a
# secure server, and links to urls should use 'https://' instead
# of the default 'http://'. If you need this, set it to 'yes'.
# Default is 'no'. This only changes the behaviour of the 'Top
# URLs' table.
#UseHTTPS no
# HTAccess allows the generation of a default .htaccess file in the
# output directory. If enabled, a default .htaccess file will be
# created (with a single "DirectoryIndex" directive), unless one
# already exists. Values may be 'yes' or 'no', with 'no'
# being the default (don't write .htaccess files).
#HTAccess no
# StripCGI determines if URL CGI variables should be stripped or not.
# Historically, the Webalizer stripped all CGI variables from the end
# of URLs to improve accuracy. Some sites may prefer to keep the CGI
# variables in place, particularly those with highly dynamic pages.
# Values may be 'yes' or 'no', with the default being 'yes'.
#StripCGI yes
# The TrimSquidURL option only has effect on squid type log files.
# When analyzing a squid log, it is usually desirable to have less
# granularity on the URLs. TrimSquidURL = n where n is a number > 0
# causes all URLs to be truncated after the nth '/' after the http://
# portion. Setting TrimSquidURL to one (1) will cause all URLs to be
# summarized by domain only. The default is zero (0), which disables
# any such truncation and preserves the URLs as they are in the log.
# TrimSquidURL 0
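The truncation rule for TrimSquidURL can be sketched as follows. This
is an illustration under assumptions: the function name
trim_squid_url is made up, the nth '/' itself is kept in the result,
and URLs with fewer than n slashes are left untouched. The real
implementation may differ in these details.

```python
def trim_squid_url(url, n):
    """Truncate url after the nth '/' that follows the scheme part."""
    if n <= 0:
        return url  # zero disables truncation
    scheme_end = url.find("://")
    pos = scheme_end + 3 if scheme_end != -1 else 0
    for _ in range(n):
        slash = url.find("/", pos)
        if slash == -1:
            return url  # fewer than n slashes: nothing to trim
        pos = slash + 1
    return url[:pos]


for depth in (0, 1, 2):
    print(depth, trim_squid_url("http://example.com/a/b/c.html", depth))
```

With a value of one, every URL on a host collapses to the same
string, which is the "summarized by domain" effect described above.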
# DNSCache specifies the DNS cache filename to use for reverse DNS lookups.
# This file must be specified if you wish to perform name lookups on any IP
# addresses found in the log file. If an absolute path is not given as
# part of the filename (ie: starts with a leading '/'), then the name is
# relative to the default output directory. See the DNS.README file for
# additional information.
#DNSCache dns_cache.db
# DNSChildren allows you to specify how many "children" processes are
# run to perform DNS lookups to create or update the DNS cache file.
# If a number is specified, the DNS cache file will be created/updated
# each time the Webalizer is run, immediately prior to normal processing,
# by running the specified number of "children" processes to perform
# DNS lookups. If used, the DNS cache filename MUST be specified as
# well. The default value is zero (0), which disables DNS cache file
# creation/updates at run time. The number of children processes to
# run may be anywhere from 1 to 100, however a large number may affect
# normal system operations. Reasonable values should be between 5 and
# 20. See the DNS.README file for additional information.
#DNSChildren 0
# CacheIPs allows unresolved IP addresses to be cached in the DNS
# database. Normally, only resolved addresses are saved. At some
# sites, particularly those with a large number of unresolvable IP
# addresses visiting, it may be useful to enable this feature so
# those addresses are not constantly looked up each time the program
# is run. Values can be 'yes' or 'no', with 'no' being the default.
#CacheIPs no
# CacheTTL specifies the time to live (TTL) value for cached DNS
# entries, in days. This value may be anywhere between 1 and 100
# with the default being 7 days (1 week).
#CacheTTL 7
# The GeoDB option enables or disables the use of the native
# Webalizer GeoDB geolocation services. This is the preferred
# geolocation option. Values may be 'yes' or 'no', with 'no'
# being the default.
#GeoDB no
# GeoDBDatabase specifies an alternate database to use. The
# default database is /usr/share/GeoDB/GeoDB.dat (however the
# path may be changed at compile time; use the -vV command
# line option to determine where). If a different database is
# to be used, it may be specified here. The name is relative
# to the output directory being used unless an absolute name
# (ie: starts with a leading '/') is specified.
#GeoDBDatabase /usr/share/GeoDB/GeoDB.dat
# The GeoIP option enables or disables the use of geolocation
# services provided by the GeoIP library (http://www.maxmind.com),
# if available. Values may be 'yes' or 'no', with 'no' being the
# default. Note: if GeoDB is enabled, then this option will have
# no effect (GeoDB will be used regardless of this setting).
#GeoIP no
# GeoIPDatabase specifies an alternate database filename to use by the
# GeoIP library. If an absolute path is not given as part of the name
# (ie: starts with a leading '/'), then the name is relative to the
# default output directory. This option should not normally be needed.
#GeoIPDatabase /usr/share/GeoIP/GeoIP.dat
# HTMLPre defines HTML code to insert at the very beginning of the
# file. Default is the DOCTYPE line shown below. Max line length
# is 80 characters, so use multiple HTMLPre lines if you need more.
#HTMLPre <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
# HTMLHead defines HTML code to insert within the <HEAD></HEAD>
# block, immediately after the <TITLE> line. Maximum line length
# is 80 characters, so use multiple lines if needed.
#HTMLHead <META NAME="author" CONTENT="The Webalizer">
#HTMLHead <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
# HTMLBody defines the HTML code to be inserted, starting with the
# <BODY> tag. If not specified, the default is shown below. If
# used, you MUST include your own <BODY> tag as the first line.
# Maximum line length is 80 char, use multiple lines if needed.
#HTMLBody <BODY BGCOLOR="#E8E8E8" TEXT="#000000" LINK="#0000FF" VLINK="#FF0000">
# HTMLPost defines the HTML code to insert immediately before the
# first <HR> on the document, which is just after the title and
# "summary period"-"Generated on:" lines. If anything, this should
# be used to clean up in case an image was inserted with HTMLBody.
# As with HTMLHead, you can define as many of these as you want and
# they will be inserted in the output stream in order of appearance.
# Max string size is 80 characters. Use multiple lines if you need to.
#HTMLPost <BR CLEAR="all">
# HTMLTail defines the HTML code to insert at the bottom of each
# HTML document, usually to include a link back to your home
# page or insert a small graphic. It is inserted as a table
# data element (ie: <TD> your code here </TD>) and is right
# aligned with the page. Max string size is 80 characters.
#HTMLTail <IMG SRC="msfree.png" ALT="100% Micro$oft free!">
# HTMLEnd defines the HTML code to add at the very end of the
# generated files. It defaults to what is shown below. If
# used, you MUST specify the </BODY> and </HTML> closing tags
# as the last lines. Max string length is 80 characters.
#HTMLEnd </BODY></HTML>
# The LinkReferrer option determines if entries in the referrer table
# should be plain text or a HTML link to the referrer. Values can be
# either 'yes' or 'no', with 'no' being the default.
#LinkReferrer no
# The Quiet option suppresses output messages... Useful when run
# as a cron job to prevent bogus e-mails. Values can be either
# "yes" or "no". Default is "no". Note: this does not suppress
# warnings and errors (which are printed to stderr).
#Quiet no
# ReallyQuiet will suppress all messages including errors and
# warnings. Values can be 'yes' or 'no' with 'no' being the
# default. If 'yes' is used here, it cannot be overridden from
# the command line, so use with caution. A value of 'no' has
# no effect.
#ReallyQuiet no
# TimeMe allows you to force the display of timing information
# at the end of processing. A value of 'yes' will force the
# timing information to be displayed. A value of 'no' has no
# effect.
#TimeMe no
# GMTTime allows reports to show GMT (UTC) time instead of local
# time. Default is to display the time the report was generated
# in the timezone of the local machine, such as EDT or PST. This
# keyword allows you to have times displayed in UTC instead. Use
# only if you really have a good reason, since it will probably
# screw up the reporting periods by however many hours your local
# time zone is off of GMT.
#GMTTime no
# Debug prints additional information for error messages. This
# will cause webalizer to dump bad records/fields instead of just
# telling you it found a bad one. As usual, the value can be
# either "yes" or "no". The default is "no". It shouldn't be
# needed unless you start getting a lot of Warning or Error
# messages and want to see why. (Note: warning and error messages
# are printed to stderr, not stdout like normal messages).
#Debug no
# FoldSeqErr forces the Webalizer to ignore sequence errors.
# This is useful for Netscape and other web servers that cache
# the writing of log records and do not guarantee that they
# will be in chronological order. The use of the FoldSeqErr
# option will cause out of sequence log records to be treated
# as if they had the same time stamp as the last valid record.
# Default is to ignore out of sequence log records. The use
# of this feature is strongly discouraged and rarely needed.
# (the webalizer already compensates for up to 60 minutes of
# difference between records).
#FoldSeqErr no
# VisitTimeout allows you to set the default timeout for a visit
# (sometimes called a 'session'). The default is 30 minutes,
# which should be fine for most sites.
# Visits are determined by looking at the time of the current
# request, and the time of the last request from the site. If
# the time difference is greater than the VisitTimeout value, it
# is considered a new visit, and visit totals are incremented.
# Value is the number of seconds to timeout (default=1800=30min)
#VisitTimeout 1800
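The visit rule above can be sketched as a small counter. This is a
simplified illustration, not the program's code: the name
count_visits is hypothetical, and the input is assumed to be the
request times (in seconds) for a single site.

```python
def count_visits(request_times, timeout=1800):
    """Count visits for one site from its request times in seconds."""
    visits = 0
    last = None
    for t in sorted(request_times):
        # A gap greater than the timeout starts a new visit.
        if last is None or t - last > timeout:
            visits += 1
        last = t
    return visits


print(count_visits([0, 100, 2000, 3900]))
```

The comparison is strictly "greater than", so two requests exactly
1800 seconds apart still belong to the same visit in this sketch.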
# IgnoreHist shouldn't be used in a config file, but it is here
# just because it might be useful in certain situations. If the
# history file is ignored, the main "index.html" file will only
# report on the current log files contents. Useful only when you
# want to reproduce the reports from scratch. USE WITH CAUTION!
# Valid values are "yes" or "no". Default is "no".
#IgnoreHist no
# IgnoreState also shouldn't be used, but is here anyway. It is
# similar to the IgnoreHist option, but for the incremental data
# file. If this is set to 'yes', any existing incremental data
# will be ignored and a new data file will be written at the end
# of processing. USE WITH CAUTION. By ignoring an existing
# incremental data file, all previous processing for the current
# month will be lost, and those logs must be re-processed.
# Valid values are "yes" or "no". Default is "no".
#IgnoreState no
# CountryGraph allows the usage by country graph to be disabled.
# Values can be 'yes' or 'no', default is 'yes'.
#CountryGraph yes
# CountryFlags allows flags to be displayed in the top country
# table in monthly reports. Values can be 'yes' or 'no', with
# the default being 'no'.
#CountryFlags no
# FlagDir specifies the location of flag graphics which will be
# used in the top country table. If not specified, the default
# is to look in the 'flags' directory directly under the output
# directory being used for the reports. If this option is used,
# the display of flag graphics will be enabled by default.
#FlagDir flags
# DailyGraph and DailyStats allow the daily statistics graph
# and statistics table to be disabled (not displayed). Values
# may be "yes" or "no". Default is "yes".
#DailyGraph yes
#DailyStats yes
# HourlyGraph and HourlyStats allow the hourly statistics graph
# and statistics table to be disabled (not displayed). Values
# may be "yes" or "no". Default is "yes".
#HourlyGraph yes
#HourlyStats yes
# GraphLegend allows the color coded legends to be turned on or off
# in the graphs. The default is for them to be displayed. This only
# toggles the color coded legends, the other legends are not changed.
# If you think they are hideous and ugly, say 'no' here :)
#GraphLegend yes
# GraphLines allows you to have index lines drawn behind the graphs.
# I personally am not crazy about them, but a lot of people requested
# them and they weren't a big deal to add. The number represents the
# number of lines you want displayed. Default is 2, you can disable
# the lines by using a value of zero ('0'). [max is 20]
# Note, due to rounding errors, some values don't work quite right.
# The lower the better, with 1,2,3,4,6 and 10 producing nice results.
#GraphLines 2
# IndexMonths defines the number of months to display in the main index
# (yearly summary) table. Value can be between 12 and 120, with the
# default being 12 months (1 year).
#IndexMonths 12
# YearHeaders enables/disables the display of year headers in the main
# index (yearly summary) table. If enabled, year headers will be shown
# when the table is displaying more than 16 months worth of data. Values
# can be 'yes' or 'no', with 'yes' being the default.
#YearHeaders yes
# YearTotals enables/disables the display of yearly totals in the main
# index (yearly summary) table. If enabled, year totals will be shown
# when the table is displaying more than 16 months worth of data. Values
# can be 'yes' or 'no', with 'yes' being the default.
#YearTotals yes
# GraphMonths defines the number of months to display in the main index
# (yearly summary) graph. Value can be between 12 and 72 months, with
# the default being 12 months.
#GraphMonths 12
# The "Top" options below define the number of entries for each table.
# Defaults are Sites=30, URLs=30, Referrers=30 and Agents=15, and
# Countries=30. TopKSites and TopKURLs (by KByte tables) both default
# to 10, as do the top entry/exit tables (TopEntry/TopExit). The top
# search strings and usernames default to 20. Tables may be disabled
# by using zero (0) for the value.
#TopSites 30
#TopKSites 10
#TopURLs 30
#TopKURLs 10
#TopReferrers 30
#TopAgents 15
#TopCountries 30
#TopEntry 10
#TopExit 10
#TopSearch 20
#TopUsers 20
# The All* keywords allow the display of all URLs, Sites, Referrers,
# User Agents, Search Strings and Usernames. If enabled, a separate
# HTML page will be created, and a link will be added to the bottom
# of the appropriate "Top" table. There are a couple of conditions
# for this to occur. First, there must be more items than will fit
# in the "Top" table (otherwise it would just be duplicating what is
# already displayed). Second, the listing will only show those items
# that are normally visible, which means it will not show any hidden
# items. Grouped entries will be listed first, followed by individual
# items. The value for these keywords can be either 'yes' or 'no',
# with the default being 'no'. Please be aware that these pages can
# be quite large in size, particularly the sites page, and separate
# pages are generated for each month, which can consume quite a lot
# of disk space depending on the traffic to your site.
#AllSites no
#AllURLs no
#AllReferrers no
#AllAgents no
#AllSearchStr no
#AllUsers no
# The Webalizer normally strips the string 'index.' off the end of
# URLs in order to consolidate URL totals. For example, the URL
# /somedir/index.html is turned into /somedir/ which is really the
# same URL. This option allows you to specify additional strings
# to treat in the same way. You don't need to specify 'index.' as
# it is always scanned for by The Webalizer, this option is just to
# specify _additional_ strings if needed. If you don't need any,
# don't specify any as each string will be scanned for in EVERY
# log record... A bunch of them will degrade performance. Also,
# the string is scanned for anywhere in the URL, so a string of
# 'home' would turn the URL /somedir/homepages/brad/home.html into
# just /somedir/ which is probably not what was intended.
#IndexAlias home.htm
#IndexAlias homepage.htm
# The DefaultIndex option is used to enable/disable the use of
# "index." as the default index name to be stripped off the end of
# a URL (as described above). Most sites will not need to use this
# option, but some may, such as those whose default index file name
# is different, or those that use "index.php" or similar URLs in a
# dynamic environment. Values can be 'yes' or 'no', with the default
# being 'yes'. This option does not affect any names added using the
# IndexAlias option, and those names will still function as described
# regardless of this setting.
#DefaultIndex yes
# The Hide*, Group*, Ignore* and Include* keywords allow you to
# change the way Sites, URLs, Referrers, User Agents and Usernames
# are manipulated. The Ignore* keywords will cause The Webalizer to
# completely ignore records as if they didn't exist (and thus they are
# not counted in the main site totals). The Hide* keywords will prevent
# things from being displayed in the 'Top' tables, but they will still
# be counted in the main totals. The Group* keywords allow grouping
# similar objects as if they were one. Grouped records are displayed
# in the 'Top' tables and can optionally be displayed in BOLD and/or
# shaded. Groups cannot be hidden, and are not counted in the main
# totals. The Group* options do not, by default, hide the items that
# they match. If you want to hide the records that match (so just
# the grouping record is displayed), follow with an identical Hide*
# keyword with the same value. (see example below) In addition,
# Group* keywords may have an optional label which will be displayed
# instead of the keyword's value. The label should be separated from
# the value by at least one 'white-space' character, such as a space
# or tab. If the match string contains whitespace (spaces or tabs),
# the string should be quoted with either single or double quotes.
#
# The value can have either a leading or trailing '*' wildcard
# character. If no wildcard is found, a match can occur anywhere
# in the string. Given a string "www.yourmama.com", the values "your",
# "*mama.com" and "www.your*" will all match.
# Your own site should be hidden
#HideSite *webalizer.org
#HideSite localhost
# Your own site gives most referrals
#HideReferrer webalizer.org/
# This one hides non-referrers ("-" Direct requests)
#HideReferrer Direct Request
# Usually you want to hide these
HideURL *.gif
HideURL *.GIF
HideURL *.jpg
HideURL *.JPG
HideURL *.png
HideURL *.PNG
HideURL *.ra
# Hiding agents is kind of futile
#HideAgent RealPlayer
# You can also hide based on authenticated username
#HideUser root
#HideUser admin
# Grouping options
#GroupURL /cgi-bin/* CGI Scripts
#GroupURL /images/* Images
#GroupSite *.aol.com
#GroupSite *.compuserve.com
#GroupReferrer yahoo.com/ Yahoo!
#GroupReferrer excite.com/ Excite
#GroupReferrer infoseek.com/ InfoSeek
#GroupReferrer webcrawler.com/ WebCrawler
#GroupUser root Admin users
#GroupUser admin Admin users
#GroupUser wheel Admin users
# The following is a great way to get an overall total
# for browsers, and not display all the detail records.
# (You should use MangleAgent to refine further...)
#GroupAgent Opera/ Opera
#HideAgent Opera/
#GroupAgent "MSIE 7" Microsoft Internet Exploder 7
#HideAgent MSIE 7
#GroupAgent "MSIE 6" Microsoft Internet Exploder 6
#HideAgent MSIE 6
#GroupAgent "MSIE " Older Microsoft Exploders
#HideAgent MSIE
#GroupAgent Firefox/2. Firefox 2
#HideAgent Firefox/2.
#GroupAgent Firefox/1. Firefox 1.x
#HideAgent Firefox/1.
#GroupAgent Konqueror Konqueror
#HideAgent Konqueror
#GroupAgent Safari Safari
#HideAgent Safari
#GroupAgent Lynx* Lynx
#HideAgent Lynx*
#GroupAgent Wget/ WGet
#HideAgent Wget/
#GroupAgent (compatible; Other Mozilla Compatibles
#HideAgent (compatible;
#GroupAgent Mozilla* Mozilla/Netscape
#HideAgent Mozilla*
# HideAllSites allows forcing individual sites to be hidden in the
# report. This is particularly useful when used in conjunction
# with the "GroupDomain" feature, but could be useful in other
# situations as well, such as when you only want to display grouped
# sites (with the GroupSite keywords...). The value for this
# keyword can be either 'yes' or 'no', with 'no' the default,
# allowing individual sites to be displayed.
#HideAllSites no
# The GroupDomains keyword allows you to group individual hostnames
# into their respective domains. The value specifies the level of
# grouping to perform, and can be thought of as 'the number of dots'
# that will be displayed. For example, if a visiting host is named
# cust1.tnt.mia.uu.net, a domain grouping of 1 will result in just
# "uu.net" being displayed, while a 2 will result in "mia.uu.net".
# The default value of zero disables this feature. Domains will only
# be grouped if they do not match any existing "GroupSite" records,
# which allows overriding this feature with your own if desired.
#GroupDomains 0
# GroupShading allows grouped rows to be shaded in the report.
# Useful if you have lots of groups and individual records that
# intermingle in the report, and you want to differentiate the group
# records a little more. Value can be 'yes' or 'no', with 'yes'
# being the default.
#GroupShading yes
# GroupHighlight allows the group record to be displayed in BOLD.
# Can be either 'yes' or 'no' with the default 'yes'.
#GroupHighlight yes
# The Ignore* keywords allow you to completely ignore log records based
# on hostname, URL, user agent, referrer or username. I hesitated in
# adding these, since the Webalizer was designed to generate _accurate_
# statistics about a web server's performance. By choosing to ignore
# records, the accuracy of reports becomes skewed, negating why I wrote
# this program in the first place. However, due to popular demand, here
# they are. Use them the same way as the Hide* keywords, where the value
# can have a leading or trailing wildcard '*'. Use at your own risk ;)
# Please remember, the use of these will MAKE YOUR STATS INACCURATE and
# you should consider using an equivalent 'Hide*' keyword instead.
#IgnoreSite bad.site.net
#IgnoreURL /test*
#IgnoreReferrer file:/*
#IgnoreAgent RealPlayer
#IgnoreUser root
# The Include* keywords allow you to force the inclusion of log records
# based on hostname, URL, user agent, referrer or username. They take
# precedence over the Ignore* keywords. Note: Using Ignore/Include
# combinations to selectively process parts of a web site is _extremely
# inefficient_!!! Avoid doing so if possible (e.g., grep the records to
# a separate file if you really want that kind of report).
# Example: Only show stats on Joe User's pages...
#IgnoreURL *
#IncludeURL ~joeuser*
# Or based on an authenticated username
#IgnoreUser *
#IncludeUser someuser
# MangleAgents allows you to specify how much, if any, The Webalizer
# should mangle user agent names. This allows several levels of detail
# to be produced when reporting user agent statistics. There are six
# levels that can be specified, which define different levels of detail
# suppression. Level 5 shows only the browser name (MSIE or Mozilla)
# and the major version number. Level 4 adds the minor version number
# (single decimal place). Level 3 displays the minor version to two
# decimal places. Level 2 will add any sub-level designation (such
# as Mozilla/3.01Gold or MSIE 3.0b). Level 1 will attempt to also add
# the system type if it is specified. The default Level 0 displays the
# full user agent field without modification and produces the greatest
# amount of detail. User agent names that can't be mangled will be
# left unmodified.
#MangleAgents 0
# The SearchEngine keywords allow specification of search engines and
# their query strings on the URL. These are used to locate and report
# what search strings are used to find your site. The first word is
# a substring to match in the referrer field that identifies the search
# engine, and the second is the URL variable used by that search engine
# to define its search terms.
#SearchEngine .google. q=
#SearchEngine yahoo.com p=
#SearchEngine altavista.com q=
#SearchEngine aolsearch. query=
#SearchEngine ask.co q=
#SearchEngine eureka.com q=
#SearchEngine lycos.com query=
#SearchEngine hotbot.com MT=
#SearchEngine msn.com q=
#SearchEngine infoseek.com qt=
#SearchEngine excite search=
#SearchEngine netscape.com query=
#SearchEngine mamma.com query=
#SearchEngine alltheweb.com q=
#SearchEngine northernlight.com qr=
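# As an illustration (the referrer below is made up), the '.google.'
# entry above, once uncommented, would match a referrer such as
#   http://www.google.com/search?q=webalizer+config
# because the substring '.google.' occurs in it. The text following
# the 'q=' variable in that referrer is then what gets reported as
# the search string.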
# Normally, search strings are converted to lower case in order to
# increase accuracy. The SearchCaseI option allows them to maintain
# case sensitivity, useful for some sites. The value can be 'yes'
# or 'no', with 'yes' (case insensitive) being the default.
#SearchCaseI yes
# The Dump* keywords allow the dumping of Sites, URLs, Referrers,
# User Agents, Usernames and Search strings to separate tab delimited
# text files, suitable for import into most database or spreadsheet
# programs.
# DumpPath specifies the path to dump the files. If not specified,
# it will default to the current output directory. Do not use a
# trailing slash ('/').
#DumpPath /var/lib/httpd/logs
# The DumpHeader keyword specifies if a header record should be
# written to the file. A header record is the first record of the
# file, and contains the labels for each field written. Normally,
# files that are intended to be imported into a database system
# will not need a header record, while spreadsheets usually do.
# Value can be either 'yes' or 'no', with 'no' being the default.
#DumpHeader no
# DumpExtension allows you to specify the dump filename extension
# to use. The default is "tab", but some programs are picky about
# the filenames they use, so you may change it here (for example,
# some people may prefer to use "csv").
#DumpExtension tab
# These control the dumping of each individual table. The value
# can be either 'yes' or 'no'; the default is 'no'.
#DumpSites no
#DumpURLs no
#DumpReferrers no
#DumpAgents no
#DumpUsers no
#DumpSearchStr no
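# Example (a sketch only, the directory name is made up): dump just
# the site and URL tables, with a header record so the files can be
# opened directly in a spreadsheet. The files remain tab delimited
# regardless of the extension chosen with DumpExtension.
#DumpPath /var/tmp/webalizer-dumps
#DumpHeader yes
#DumpSites yes
#DumpURLs yes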
# The custom graph colors are defined here. Declare them
# in the standard hexadecimal way (as HTML, without the '#')
# If none are given, you will get the standard default colors.
#ColorHit 00805c
#ColorFile 0040ff
#ColorSite ff8000
#ColorKbyte ff0000
#ColorPage 00e0ff
#ColorVisit ffff00
#ColorMisc 00e0ff
#PieColor1 800080
#PieColor2 80ffc0
#PieColor3 ff00ff
#PieColor4 ffc080
# End of configuration file... Have a nice day!

110
database/webalizer/wcmgr.1 Normal file

@@ -0,0 +1,110 @@
.TH wcmgr 1 "12-Jul-2008" "Version 1.00" "The Webalizer"
.SH NAME
wcmgr - Webalizer (DNS) Cache file Manager
.SH SYNOPSIS
.B wcmgr\fP [\fI option ... \fP] \fIcache-file\fP
.PP
.SH DESCRIPTION
\fIwcmgr\fP is a utility program which allows manipulation of the DNS cache
files used and produced by The \fIWebalizer\fP. Each record in the cache
file contains an IP address (either IPv4 or IPv6), a timestamp of when the
entry was added to the cache, a flag to indicate if the record contains
a resolved name or not, and either the same IP address or a resolved host
name. All records are accessed by their IP address.
.SH RUNNING WCMGR
\fIwcmgr\fP was designed to be run from the Unix shell command line. This
facilitates its use in shell scripts and other automated processes. A
valid DNS cache file \fBmust\fP be specified. Command line options are
optional, and if none are given, the default action is to list the
contents of the specified cache file.
.SH COMMAND LINE OPTIONS
Different functions are selected by using one or more of the following
command line options. If no options are given, the default is to display
the contents of the cache file to the screen (stdout).
.PP
.TP 8
.B \-h
Display all available command line options and exit.
.TP 8
.B \-v
Be verbose.
.TP 8
.B \-V
Display the program version and exit. Additional program specific
information will be displayed if \fIverbose\fP mode is also used
(e.g. '\fI-vV\fP'), which can be useful when submitting bug reports.
.TP 8
.B \-a \fIaddress\fP [\fI-n hostname\fP] [\fI-t0\fP]
Add a new record to the cache file. The IP \fIaddress\fP will be added to
the cache file using the current time as the timestamp and with a resolved
name \fIhostname\fP. If \fI-t0\fP is specified, the record will be
considered permanent, and will not be removed (during a purge) or expired.
If a \fIhostname\fP is not specified with the \fI-n\fP option, then the
\fIaddress\fP will be used instead, and the record will be flagged as
unresolved.
.TP 8
.B \-c
Create a new cache file. If used alone, this option will create a new,
empty cache file. If used with the \fIimport\fP option, a new cache
file will be created before importing the data. An error will occur
if the file \fIcache-file\fP already exists.
.TP 8
.B \-d \fIaddress\fP
Delete a record from the cache file using the specified \fIaddress\fP.
.TP 8
.B \-f \fIaddress\fP
Find and display information for \fIaddress\fP from the cache file.
A single line similar to that produced by the \fI-l\fP option will
be displayed unless \fIverbose\fP mode is enabled, in which case a
more detailed listing will be produced.
.TP 8
.B \-i \fIname\fP [\fI-c\fP]
Import data into the cache file from the file \fIname\fP. The import
file must be a valid tab delimited text file, such as that created by
the \fIexport\fP option. If the imported data contains records already
present in the cache file, those records will be overwritten by the
imported data. The cache file must exist unless the \fI-c\fP option
is specified, in which case, a new cache file will be created for the
imported data.
.TP 8
.B \-l
List the contents of the cache file. This is the default action of the
program, so does not necessarily need to be specified. If \fIverbose\fP
mode is enabled, a report title, column headers and summary totals will
also be displayed.
.TP 8
.B \-p \fInum\fP
Purge the cache file of entries older than \fInum\fP days. If \fInum\fP
is not specified, then a default of \fB7 days\fP will be used. If
\fIverbose\fP mode is enabled, each purged record will be printed and
the total number of purged records will be displayed.
.TP 8
.B \-s [\fI-t num\fP]
Display cache file information/statistics. If a TTL value (in days) is
specified using the \fI-t\fP option, it will be used to calculate how
many records are older than \fInum\fP days, otherwise, the default value
of \fB7 days\fP will be used.
.TP 8
.B \-n \fIname\fP
Specify the \fIname\fP to use as the resolved hostname when adding records
to the cache.
.TP 8
.B \-t \fInum\fP
Time to live (TTL) value. If used along with the \fI-p\fP (purge) option,
it specifies how many days a record will remain valid. Any record that is
older than \fInum\fP days is considered expired and will be purged. If
used with the \fI-a\fP (add) option, a zero value will cause the record
to be considered permanent.
.TP 8
.B \-x \fIname\fP
Export data from a cache file to a tab delimited text file named \fIname\fP.
If the text file \fIname\fP exists, it will be overwritten.
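.PP
The options above can be combined as in the following sketch. The cache
file names, the export file name, the address and the host name are
made-up values used only for illustration.
.PP
.nf
# create an empty cache file, then add a permanent record to it
wcmgr \-c dns_cache.db
wcmgr \-a 192.0.2.10 \-n www.example.com \-t0 dns_cache.db
# look up that record, with verbose detail
wcmgr \-v \-f 192.0.2.10 dns_cache.db
# purge entries older than 30 days, printing each purged record
wcmgr \-v \-p 30 dns_cache.db
# export to a tab delimited file, then import it into a new cache file
wcmgr \-x dns_cache.tab dns_cache.db
wcmgr \-i dns_cache.tab \-c new_cache.db
.fi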
.SH BUGS
Please report bugs to the author.
.SH COPYRIGHT
Copyright (C) 1997-2011 by Bradford L. Barrett. Distributed under
the GNU GPL. See the files "\fICOPYING\fP" and "\fICopyright\fP",
supplied with all distributions for additional information.
.SH AUTHOR
Bradford L. Barrett <\fIbrad at mrunix dot net\fP>

Binary file not shown.


@@ -0,0 +1,905 @@
.TH webalizer 1 "12-Jul-2008" "Version 2.20" "The Webalizer"
.SH NAME
webalizer - A web server log file analysis tool.
.SH SYNOPSIS
.B webalizer
[\fI option ... \fP] [\fI log-file \fP]
.PP
.B webazolver
[\fI option ... \fP] [\fI log-file \fP]
.PP
.SH DESCRIPTION
The \fIWebalizer\fP is a web server log file analysis program which produces
usage statistics in HTML format for viewing with a browser. The results
are presented in both columnar and graphical format, which facilitates
interpretation. Yearly, monthly, daily and hourly usage statistics are
presented, along with the ability to display usage by site, URL, referrer,
user agent (browser), username, search strings, entry/exit pages, and
country (some information may not be available if not present in the log
file being processed).
.PP
The \fIWebalizer\fP supports \fBCLF\fP (common log format) log files,
as well as \fBCombined\fP log formats as defined by NCSA and others,
and variations of these which it attempts to handle intelligently. In
addition, the \fIWebalizer\fP supports \fBxferlog\fP formatted (\fIFTP\fP)
log files, \fBsquid\fP proxy logs and \fBW3C\fP extended format logs.
Logs may also be compressed, via \fIgzip\fP (.gz) or, if enabled at compile
time, \fIbzip2\fP (.bz2). If a compressed log file is detected, it will be
automatically uncompressed while it is read. Compressed logs must have the
standard \fIgzip\fP extension of \fB.gz\fP or \fIbzip2\fP extension of
\fB.bz2\fP.
.PP
\fIwebazolver\fP is normally just a symbolic link to the \fIWebalizer\fP.
When run as \fIwebazolver\fP, only DNS file creation/updates are performed,
and the program will exit once complete. All normal options and
configuration directives are available, however many will not be used.
In addition, a DNS cache file must be specified. If the number of DNS
child processes to use is not specified, \fIwebazolver\fP will
default to \fB5\fP.
.PP
This documentation applies to The Webalizer Version 2.20
.SH RUNNING THE WEBALIZER
The \fIWebalizer\fP was designed to be run from a Unix command line prompt or
as a \fBcrond(8)\fP job. Once executed, the general flow of the program is:
.TP 8
.B o
A default configuration file is scanned for. A file named
\fIwebalizer.conf\fP is searched for in the current directory, and if
found, its configuration data is parsed. If the file is not
present in the current directory, the file \fI/etc/webalizer.conf\fP
is searched for and, if found, is used instead.
.TP 8
.B o
Any command line arguments given to the program are parsed. This
may include the specification of a configuration file, which is
processed at the time it is encountered.
.TP 8
.B o
If a log file was specified, it is opened and made ready for
processing. If no log file was given, \fISTDIN\fP is used for input.
If the log filename '\fB-\fP' is specified, \fISTDIN\fP will be forced.
.TP 8
.B o
If an output directory was specified, the program does a \fBchdir(2)\fP to
that directory in preparation for generating output. If no output
directory was given, the current directory is used.
.TP 8
.B o
If a non-zero number of DNS child processes was specified, they will
be started, and the specified log file will be processed, creating or
updating the specified DNS cache file.
.TP 8
.B o
If no hostname was given, the program attempts to get the hostname
using a \fBuname(2)\fP system call. If that fails, \fIlocalhost\fP
is used.
.TP 8
.B o
A history file is searched for in the current directory (output
directory) and read if found. This file keeps totals for previous
months, which is used in the main \fIindex.html\fP HTML document.
.B Note:
The file location can now be specified with the \fIHistoryName\fP
configuration option.
.TP 8
.B o
If incremental processing was specified, a data file is searched for
and loaded if found, containing the 'internal state' data of the
program at the end of a previous run.
.B Note:
The file location can now be specified with the \fIIncrementalName\fP
configuration option.
.TP 8
.B o
Main processing begins on the log file. If the log spans multiple
months, a separate HTML document is created for each month.
.TP 8
.B o
After main processing, the main \fIindex.html\fP page is created, which
has totals by month and links to each month's HTML document.
.TP 8
.B o
A new history file is saved to disk, which includes totals generated
by The \fIWebalizer\fP during the current run.
.TP 8
.B o
If incremental processing was specified, a data file is written that
contains the 'internal state' data at the end of this run.
.SH INCREMENTAL PROCESSING
The \fIWebalizer\fP supports incremental run capability. Simply
put, this allows processing large log files by breaking them up into
smaller pieces, and processing these pieces instead. What this means
in real terms is that you can now rotate your log files as often as you
want, and still be able to produce monthly usage statistics without the
loss of any detail. Basically, The \fIWebalizer\fP saves and restores all
internal data in a file named \fIwebalizer.current\fP. This allows the
program to 'start where it left off' so to speak, and allows the
preservation of detail from one run to the next. The data file is
placed in the current output directory, and is a plain ASCII text
file that can be viewed with any standard text editor. Its location
and name may be changed using the \fIIncrementalName\fP configuration
keyword.
.PP
Some special precautions need to be taken when using the incremental
run capability of The \fIWebalizer\fP. Configuration options should not be
changed between runs, as that could cause corruption of the internal
data stored. For example, changing the \fIMangleAgents\fP level will cause
different representations of user agents to be stored, producing invalid
results in the user agents section of the report. If you need to change
configuration options, do it at the end of the month after normal
processing of the previous month and before processing the current month.
You may also want to delete the \fIwebalizer.current\fP file as well.
.PP
The \fIWebalizer\fP also attempts to prevent data duplication by keeping
track of the timestamp of the last record processed. This timestamp
is then compared to current records being processed, and any records
that were logged previous to that timestamp are ignored. This, in
theory, should allow you to re-process logs that have already been
processed, or process logs that contain a mix of processed/not yet
processed records, and not produce duplication of statistics. The
only time this may break is if you have duplicate timestamps in two
separate log files... any records in the second log file that do have
the same timestamp as the last record in the previous log file processed,
will be discarded as if they had already been processed. There are
lots of ways to prevent this however, for example, stopping the web
server before rotating logs will prevent this situation. This setup
also necessitates that you always process logs in chronological order,
otherwise data loss will occur as a result of the timestamp compare.
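.PP
As a sketch of the chronological ordering described above, rotated logs
can be fed to the program oldest first with incremental mode (\fI-p\fP)
enabled. The log file names and output directory shown here are
assumptions, and which rotated file is the oldest depends on the log
rotation tool in use.
.PP
.nf
# oldest log first, newest last
webalizer \-p \-o /var/www/usage /var/log/httpd/access_log.2.gz
webalizer \-p \-o /var/www/usage /var/log/httpd/access_log.1.gz
webalizer \-p \-o /var/www/usage /var/log/httpd/access_log
.fi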
.SH REVERSE DNS LOOKUPS
The \fIWebalizer\fP fully supports IPv4 and IPv6 DNS lookups, and
maintains a cache of those lookups to reduce processing the same
addresses in subsequent runs. The cache file can be created at
run-time, or may be created before running the webalizer using either
the stand alone '\fIwebazolver\fP' program, or The Webalizer (DNS)
Cache file manager program '\fIwcmgr\fP'. In order to perform reverse
lookups, a \fBDNSCache\fP file must be specified, either on the command
line or in a configuration file. In order to create/update the cache
file at run-time, the number of \fBDNSChildren\fP must also be specified,
and can be anything between 1 and 100. This specifies the number of
child processes to be forked, each of which will perform network DNS
queries in order to look up the addresses and update the cache.
Cached entries that are older than a specified TTL (time to live)
will be expired, and if encountered again in a log, will be looked
up at that time in order to 'freshen' them (verify the name is still
the same and update its timestamp). The default TTL is 7 days, but it
may be set to anything between 1 and 100 days. Using the '\fIwcmgr\fP'
program, entries may also be marked as 'permanent', in which case
they will persist (with an infinite TTL) in the cache until manually
removed. See the file \fBDNS.README\fP for additional information
and examples.
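.PP
For illustration, the cache can be built ahead of time and then used
during normal processing, or both steps can be combined in a single run.
The file names below are hypothetical.
.PP
.nf
# build or update the cache only, using 10 child processes
webazolver \-N 10 \-D /var/lib/webalizer/dns_cache.db /var/log/httpd/access_log
# use the existing cache for lookups, without updating it
webalizer \-D /var/lib/webalizer/dns_cache.db /var/log/httpd/access_log
# update the cache and produce the report in one run
webalizer \-N 10 \-D /var/lib/webalizer/dns_cache.db /var/log/httpd/access_log
.fi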
.SH GEOLOCATION LOOKUPS
The \fIWebalizer\fP has the ability to perform geolocation lookups on
IP addresses using either its own internal \fIGeoDB\fP database, or
optionally the \fIGeoIP\fP database from MaxMind, Inc. (www.maxmind.com).
If used, unresolved addresses will be searched for in the database and
their country of origin will be returned if found. This actually produces
more accurate \fICountry\fP information than DNS lookups, since the DNS
name space has generic top level domains (\fIgTLDs\fP) that do not
necessarily map to a specific country (such as \fI.net\fP and \fI.com\fP). It is possible
to use both DNS lookups and geolocation lookups at the same time, which
will cause any addresses that could not be resolved using DNS lookups to
then be looked up in the database, greatly reducing the number of
\fIUnknown/Unresolved\fP entries in the generated reports. The native
\fIGeoDB\fP geolocation database provided by The \fIWebalizer\fP fully
supports both \fIIPv4\fP and \fIIPv6\fP lookups, is updated regularly and
is the preferred geolocation method for use with The \fIWebalizer\fP. The
most current version of the database can be obtained from our ftp site
(\fIftp://ftp.mrunix.net/\fP).
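.PP
A minimal sketch of the two modes described above, using the options
documented below. The file names are hypothetical.
.PP
.nf
# geolocation only, using the native GeoDB database
webalizer \-j /var/log/httpd/access_log
# DNS lookups first, with GeoDB consulted for addresses left unresolved
webalizer \-j \-N 10 \-D /var/lib/webalizer/dns_cache.db /var/log/httpd/access_log
.fi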
.SH COMMAND LINE OPTIONS
The \fIWebalizer\fP supports many different configuration options that will
alter the way the program behaves and generates output. Most of these
can be specified on the command line, while some can only be specified
in a configuration file. The command line options are listed below,
with references to the corresponding configuration file keywords.
.PP
.I General Options
.TP 8
.B \-h
Display all available command line options and exit program.
.TP 8
.B \-v
Be verbose. Will cause the program to output informational
and \fIDebug\fP messages at run-time.
.TP 8
.B \-V
Display the program version and exit. Additional program specific
information will be displayed if \fIverbose\fP mode is also used
(e.g. '\fI-vV\fP'), which can be useful when submitting bug reports.
.TP 8
.B \-d
\fBDebug\fP. Display debugging information for errors and warnings.
.TP 8
.B \-i
\fBIgnoreHist\fP. Ignore history. \fBUSE WITH CAUTION\fP. This
will cause The \fIWebalizer\fP to ignore any previous monthly history
file only. Incremental data (if present) is still processed.
.TP 8
.B \-b
\fBIgnoreState\fP. Ignore incremental data file. \fBUSE WITH CAUTION\fP.
This will cause The \fIWebalizer\fP to ignore any existing incremental
data file. By ignoring the incremental data file, all previous processing
for the current month will be lost and those logs must be re-processed.
.TP 8
.B \-p
\fBIncremental\fP. Preserve internal data between runs.
.TP 8
.B \-q
\fBQuiet\fP. Suppress informational messages. Does not suppress
warnings or errors.
.TP 8
.B \-Q
\fBReallyQuiet\fP. Suppress all messages including warnings and errors.
.TP 8
.B \-T
\fBTimeMe\fP. Force display of timing information at end of processing.
.TP 8
.B \-c \fIfile\fP
Use configuration file \fIfile\fP.
.TP 8
.B \-n \fIname\fP
\fBHostName\fP. Use the hostname \fIname\fP.
.TP 8
.B \-o \fIdir\fP
\fBOutputDir\fP. Use output directory \fIdir\fP.
.TP 8
.B \-t \fIname\fP
\fBReportTitle\fP. Use \fIname\fP for report title.
.TP 8
.B \-F \fP( \fBc\fPlf | \fBf\fPtp | \fBs\fPquid | \fBw\fP3c )
\fBLogType\fP. Specify log type to be processed. Value can be either
\fIc\fPlf, \fIf\fPtp, \fIs\fPquid or \fIw\fP3c format. If not specified,
will default to \fBCLF\fP format. \fIFTP\fP logs must be in standard
wu-ftpd \fIxferlog\fP format.
.TP 8
.B \-f
\fBFoldSeqErr\fP. Fold out of sequence log records back into analysis,
by treating as if they were the same date/time as the last good record.
Normally, out of sequence log records are simply ignored.
.TP 8
.B \-Y
\fBCountryGraph\fP. Suppress country graph.
.TP 8
.B \-G
\fBHourlyGraph\fP. Suppress hourly graph.
.TP 8
.B \-x \fIname\fP
\fBHTMLExtension\fP. Defines HTML file extension to use. If not
specified, defaults to \fIhtml\fP. Do not include the leading
period.
.TP 8
.B \-H
\fBHourlyStats\fP. Suppress hourly statistics.
.TP 8
.B \-K \fInum\fP
\fBIndexMonths\fP. Specify how many months should be displayed in the
main index (yearly summary) table. Default is 12 months. Can be set
to anything between 12 and 120 months (1 to 10 years).
.TP 8
.B \-k \fInum\fP
\fBGraphMonths\fP. Specify how many months should be displayed in the
main index (yearly summary) graph. Default is 12 months. Can be set
to anything between 12 and 72 months (1 to 6 years).
.TP 8
.B \-L
\fBGraphLegend\fP. Suppress color coded graph legends.
.TP 8
.B \-l \fInum\fP
\fBGraphLines\fP. Specify number of background lines. Default
is 2. Use zero ('0') to disable the lines.
.TP 8
.B \-P \fIname\fP
\fBPageType\fP. Specify file extensions that are considered \fIpages\fP.
Sometimes referred to as \fIpageviews\fP.
.TP 8
.B \-O \fIname\fP
\fBOmitPage\fP. Specify URLs to exclude from being counted as \fIpages\fP.
.TP 8
.B \-m \fInum\fP
\fBVisitTimeout\fP. Specify the Visit timeout period. Specified in
number of seconds. Default is 1800 seconds (30 minutes).
.TP 8
.B \-I \fIname\fP
\fBIndexAlias\fP. Use the filename \fIname\fP as an additional alias
for \fIindex.\fP.
.TP 8
.B \-M \fInum\fP
\fBMangleAgents\fP. Mangle user agent names according to the mangle
level specified by \fInum\fP. Mangle levels are:
.RS
.TP 12
.B 5
Browser name and major version.
.TP 12
.B 4
Browser name, major and minor version.
.TP 12
.B 3
Browser name, major version, minor version to two decimal places.
.TP 12
.B 2
Browser name, major and minor versions and sub-version.
.TP 12
.B 1
Browser name, version and machine type if possible.
.TP 12
.B 0
All information (left unchanged).
.RE
.TP 8
.B \-g \fInum\fP
\fBGroupDomains\fP. Automatically group sites by domain. The
grouping level specified by \fInum\fP can be thought of as 'the
number of dots' to display in the grouping. The default value
of \fB0\fP disables any domain grouping.
.TP 8
.B \-D \fIname\fP
\fBDNSCache\fP. Use the DNS cache file \fIname\fP.
.TP 8
.B \-N \fInum\fP
\fBDNSChildren\fP. Use \fInum\fP DNS child processes to perform DNS
lookups, either creating or updating the DNS cache file. Specify zero
(\fB0\fP) to disable cache file creation/updates. If given, a DNS cache
filename must be specified.
.TP 8
.B \-j
Enable \fIGeoDB\fP. This enables the internal GeoDB geolocation services
provided by The \fIWebalizer\fP.
.TP 8
.B \-J \fIname\fP
\fBGeoDBDatabase\fP. Use the alternate GeoDB database \fIname\fP.
.TP 8
.B \-w
Enable \fIGeoIP\fP. Enables GeoIP (by MaxMind Inc.) geolocation services.
If native \fIGeoDB\fP services are also enabled, then this option
will have no effect.
.TP 8
.B \-W \fIname\fP
\fBGeoIPDatabase\fP. Use the alternate GeoIP database \fIname\fP.
.TP 8
.B \-z \fIname\fP
\fBFlagDir\fP. Specify location of the country flag graphics and
enable their display in the top country table. The directory \fIname\fP
is relative to the output directory being used unless an absolute path
is given (ie: starts with a leading '/').
.PP
.I Hide Options
.TP 8
.B \-a \fIname\fP
\fBHideAgent\fP. Hide user agents matching \fIname\fP.
.TP 8
.B \-r \fIname\fP
\fBHideReferrer\fP. Hide referrer matching \fIname\fP.
.TP 8
.B \-s \fIname\fP
\fBHideSite\fP. Hide site matching \fIname\fP.
.TP 8
.B \-X
\fBHideAllSites\fP. Hide all individual sites (only display groups).
.TP 8
.B \-u \fIname\fP
\fBHideURL\fP. Hide URL matching \fIname\fP.
.PP
.I Table size options
.TP 8
.B \-A \fInum\fP
\fBTopAgents\fP. Display the top \fInum\fP user agents table.
.TP 8
.B \-R \fInum\fP
\fBTopReferrers\fP. Display the top \fInum\fP referrers table.
.TP 8
.B \-S \fInum\fP
\fBTopSites\fP. Display the top \fInum\fP sites table.
.TP 8
.B \-U \fInum\fP
\fBTopURLs\fP. Display the top \fInum\fP URLs table.
.TP 8
.B \-C \fInum\fP
\fBTopCountries\fP. Display the top \fInum\fP countries table.
.TP 8
.B \-e \fInum\fP
\fBTopEntry\fP. Display the top \fInum\fP entry pages table.
.TP 8
.B \-E \fInum\fP
\fBTopExit\fP. Display the top \fInum\fP exit pages table.
.SH CONFIGURATION FILES
Configuration files are standard \fBASCII(7)\fP text files that may be created
or edited using any standard editor. Blank lines and lines that begin
with a pound sign ('#') are ignored. Any other lines are considered to
be configuration lines, and have the form "Keyword Value", where the
\'Keyword\' is one of the currently available configuration keywords defined
below, and 'Value' is the value to assign to that particular option. Any
text found after the keyword up to the end of the line is considered the
keyword's value, so you should not include anything after the actual value
on the line that is not actually part of the value being assigned. The
file \fIsample.conf\fP provided with the distribution contains lots of useful
documentation and examples as well.
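.PP
As a minimal sketch of that form (the path and host name below are
placeholders chosen for illustration, not defaults):
.PP
.nf
.RS
# Lines beginning with a pound sign are ignored
LogFile     /var/log/httpd/access_log
OutputDir   /var/www/usage
HostName    www.example.com
.RE
.fi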
.PP
.I General Configuration Keywords
.TP 8
.B LogFile \fIname\fP
Use log file named \fIname\fP. If none specified, \fISTDIN\fP will be used.
.TP 8
.B LogType \fIname\fP
Specify log file type as \fIname\fP. Values can be either \fIclf\fP,
\fIsquid\fP, \fIftp\fP or \fIw3c\fP, with the default being \fBclf\fP.
.TP 8
.B OutputDir \fIdir\fP
Create output in the directory \fIdir\fP. If none specified, the current
directory will be used.
.TP 8
.B HistoryName \fIname\fP
Filename to use for history file. Relative to output directory unless
absolute name is given (ie: starts with '/'). Defaults to
\'\fBwebalizer.hist\fP' in the standard output directory.
.TP 8
.B ReportTitle \fIname\fP
Use the title string \fIname\fP for the report title. If none
specified, use the default of (in English) "\fIUsage Statistics for \fP".
.TP 8
.B HostName \fIname\fP
Set the hostname for the report as \fIname\fP. If none specified, an
attempt will be made to gather the hostname via a \fBuname(2)\fP system
call. If that fails, \fIlocalhost\fP will be used.
.TP 8
.B UseHTTPS \fP( yes | \fBno\fP )
Use \fIhttps://\fP on links to URLs, instead of the default \fIhttp://\fP,
in the '\fBTop URLs\fP' table.
.TP 8
.B HTAccess \fP( yes | \fBno\fP )
Enables the creation of a default .htaccess file in the output directory.
.TP 8
.B Quiet \fP( yes | \fBno\fP )
Suppress informational messages. Warning and Error messages will not be
suppressed.
.TP 8
.B ReallyQuiet \fP( yes | \fBno\fP )
Suppress all messages, including Warning and Error messages.
.TP 8
.B Debug \fP( yes | \fBno\fP )
Print extra debugging information on Warnings and Errors.
.TP 8
.B TimeMe \fP( yes | \fBno\fP )
Force timing information at end of processing.
.TP 8
.B GMTTime \fP( yes | \fBno\fP )
Use \fIGMT \fP(\fIUTC\fP) time instead of local timezone for reports.
.TP 8
.B IgnoreHist \fP( yes | \fBno\fP )
Ignore previous monthly history file. \fBUSE WITH CAUTION\fP. Does
not prevent \fIIncremental\fP file processing.
.TP 8
.B IgnoreState \fP( yes | \fBno\fP )
Ignore incremental data file. \fBUSE WITH CAUTION\fP. By ignoring
the incremental data file, all previous processing for the current
month will be lost and those logs must be re-processed.
.TP 8
.B FoldSeqErr \fP( yes | \fBno\fP )
Fold out of sequence log records back into analysis by treating them
as if they had the same date/time as the last good record. Normally,
out of sequence log records are ignored.
.TP 8
.B CountryGraph \fP( \fByes\fP | no )
Display Country Usage Graph in output report.
.TP 8
.B CountryFlags \fP( yes | \fBno\fP )
Enable or disable the display of flags in the top country table.
.TP 8
.B FlagDir \fIname\fP
Specifies the directory \fIname\fP where the flag graphics are located.
If not specified, the default is in the \fIflags\fP directory directly
under the output directory being used. If specified, the display of
country flags will be enabled by default. Using '\fIFlagDir flags\fP'
is identical to using '\fICountryFlags yes\fP'.
.TP 8
.B DailyGraph \fP( \fByes\fP | no )
Display Daily Graph in output report.
.TP 8
.B DailyStats \fP( \fByes\fP | no )
Display Daily Statistics in output report.
.TP 8
.B HourlyGraph \fP( \fByes\fP | no )
Display Hourly Graph in output report.
.TP 8
.B HourlyStats \fP( \fByes\fP | no )
Display Hourly Statistics in output report.
.TP 8
.B PageType \fIname\fP
Define the file extensions to consider as a \fIpage\fP. If a file
is found to have the same extension as \fIname\fP, it will be counted
as a \fIpage\fP (sometimes called a \fIpageview\fP).
.TP 8
.B PagePrefix \fIname\fP
Allows URLs with the prefix \fIname\fP to be counted as a \fIpage\fP
type regardless of actual file type. This allows you to treat contents
under specified directories as pages no matter what their extension is.
.TP 8
.B OmitPage \fIname\fP
Specifies URLs which should not be counted as pages, regardless of their
extension (or lack thereof).
.TP 8
.B GraphLegend \fP( \fByes\fP | no )
Allows the color coded graph legends to be enabled/disabled.
.TP 8
.B GraphLines \fInum\fP
Specify the number of background reference lines displayed on the
graphs produced. Disable by using zero ('\fB0\fP'), default is \fB2\fP.
.TP 8
.B IndexMonths \fInum\fP
Specify the number of months to display in the main index (yearly summary)
table. Default is 12 months. Can be set to anything between 12 and 120
months (1 to 10 years).
.TP 8
.B YearHeaders \fP( \fByes\fP | no )
Enable/disable the display of year headers in the main index (yearly
summary) table. If enabled, year headers will be shown when the table
is displaying more than 16 months worth of data. Values can be 'yes'
or 'no'. Default is 'yes'.
.TP 8
.B YearTotals \fP( \fByes\fP | no )
Enable/disable the display of year totals in the main index (yearly
summary) table. If enabled, year totals will be shown when the table
is displaying more than 16 months worth of data. Values can be 'yes'
or 'no'. Default is 'yes'.
.TP 8
.B GraphMonths \fInum\fP
Specify the number of months to display in the main index (yearly
summary) graph. Default is 12 months. Can be set to anything between
12 and 72 months (1 to 6 years).
.TP 8
.B VisitTimeout \fInum\fP
Specifies the visit timeout value. Default is \fI1800 seconds\fP (30
minutes). A visit is determined by looking at the difference in time
between the current and last request from a specific site. If the
difference is greater than or equal to the timeout value, the request is
counted as a new visit. Specified in seconds.
.TP 8
.B IndexAlias \fIname\fP
Use \fIname\fP as an additional alias for \fIindex.*\fP.
.TP 8
.B DefaultIndex \fP( \fByes\fP | no )
Enables or disables the use of '\fBindex.\fP' as a default index name
to be stripped from the end of URLs. This does not affect any index
names that may be defined with the \fIIndexAlias\fP option.
.TP 8
.B MangleAgents \fInum\fP
Mangle user agent names based on mangle level \fInum\fP. See the
\fI-M\fP command line switch for mangle levels and their meaning.
The default is \fB0\fP, which doesn't mangle user agents at all.
.TP 8
.B StripCGI \fP( \fByes\fP | no )
Determines if URL CGI variables should be stripped from the end of
URLs. Values may be 'yes' or 'no', with the default being 'yes'.
.TP 8
.B TrimSquidURL \fInum\fP
Allows squid log URLs to be reduced in granularity by truncating
them after \fInum\fP slashes ('/') after the http:// prefix. A
setting of one (1) will cause all URLs to be summarized by domain
only. The default value is zero (0), which will disable any URL
modifications and leave them exactly as found in the log file.
.TP 8
.B SearchEngine\fP \fIname\fP \fIvariable\fP
Allows the specification of search engines and their query strings.
The \fIname\fP is the name to match against the referrer string for
a given search engine. The \fIvariable\fP is the cgi variable that
the search engine uses for queries. See the \fBsample.conf\fP file
for example usage with common search engines.
.TP 8
.B SearchCaseI\fP ( \fByes\fP | no )
Determines if search strings should be treated as case insensitive or
not. The default is 'yes', which lowercases all search strings
(treat as case insensitive).
.TP 8
.B Incremental \fP( yes | \fBno\fP )
Enable Incremental mode processing.
.TP 8
.B IncrementalName \fIname\fP
Filename to use for incremental data. Relative to output directory unless
an absolute name is given (ie: starts with '/'). Defaults to
\'\fBwebalizer.current\fP' in the standard output directory.
.TP 8
.B DNSCache \fIname\fP
Filename to use for the DNS cache. Relative to output directory unless
an absolute name is given (ie: starts with '/').
.TP 8
.B DNSChildren \fInum\fP
Number of children DNS processes to run in order to create/update the
DNS cache file. Specify zero (\fB0\fP) to disable.
.TP 8
.B CacheIPs \fP( yes | \fBno\fP )
Cache unresolved IP addresses in the DNS database. Default is '\fBno\fP'.
.TP 8
.B CacheTTL \fInum\fP
DNS cache entry time to live (TTL) in days. Default is 7 days. May
be any value between 1 and 100.
.TP 8
.B GeoDB \fP( yes | \fBno\fP )
Allows native GeoDB geolocation services to be enabled or disabled.
Default value is '\fBno\fP'.
.TP 8
.B GeoDBDatabase \fIname\fP
Allows the use of an alternate GeoDB database \fIname\fP. If not
specified, the default database will be used.
.TP 8
.B GeoIP \fP( yes | \fBno\fP )
Allows GeoIP (by MaxMind Inc.) geolocation services to be enabled or
disabled. Default is '\fBno\fP'. If native \fIGeoDB\fP geolocation
services are also enabled, then this option will have no effect (and
the native \fIGeoDB\fP services will be used).
.TP 8
.B GeoIPDatabase \fIname\fP
Allows the use of an alternate GeoIP database \fIname\fP. If not
specified, the default database will be used.
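.PP
To illustrate how several of the keywords above combine, the following
sketch enables incremental processing and run-time DNS lookups. The
cache file name and the child count are illustrative assumptions, not
defaults. A DNS cache filename must accompany any non-zero
\fBDNSChildren\fP value.
.PP
.nf
.RS
Incremental   yes
DNSCache      dns_cache.db
DNSChildren   10
VisitTimeout  1800
.RE
.fi
.PP
With a \fBVisitTimeout\fP of 1800, two requests from the same site
made at 10:00:00 and 10:30:00 are exactly 1800 seconds apart, so the
second request is counted as a new visit.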
.PP
.I Top Table Keywords
.TP 8
.B TopAgents \fInum\fP
Display the top \fInum\fP User Agents table. Use zero to disable.
.TP 8
.B AllAgents \fP( yes | \fBno\fP )
Create separate HTML page with \fBAll\fP User Agents.
.TP 8
.B TopReferrers \fInum\fP
Display the top \fInum\fP Referrers table. Use zero to disable.
.TP 8
.B AllReferrers \fP( yes | \fBno\fP )
Create separate HTML page with \fBAll\fP Referrers.
.TP 8
.B TopSites \fInum\fP
Display the top \fInum\fP Sites table. Use zero to disable.
.TP 8
.B TopKSites \fInum\fP
Display the top \fInum\fP Sites (by KByte) table. Use zero to disable.
.TP 8
.B AllSites \fP( yes | \fBno\fP )
Create separate HTML page with \fBAll\fP Sites.
.TP 8
.B TopURLs \fInum\fP
Display the top \fInum\fP URLs table. Use zero to disable.
.TP 8
.B TopKURLs \fInum\fP
Display the top \fInum\fP URLs (by KByte) table. Use zero to disable.
.TP 8
.B AllURLs \fP( yes | \fBno\fP )
Create separate HTML page with \fBAll\fP URLs.
.TP 8
.B TopCountries \fInum\fP
Display the top \fInum\fP Countries in the table. Use zero to disable.
.TP 8
.B TopEntry \fInum\fP
Display the top \fInum\fP Entry Pages in the table. Use zero to disable.
.TP 8
.B TopExit \fInum\fP
Display the top \fInum\fP Exit Pages in the table. Use zero to disable.
.TP 8
.B TopSearch \fInum\fP
Display the top \fInum\fP Search Strings in the table. Use zero to disable.
.TP 8
.B AllSearchStr \fP( yes | \fBno\fP )
Create separate HTML page with \fBAll\fP Search Strings.
.TP 8
.B TopUsers \fInum\fP
Display the top \fInum\fP Usernames in the table. Use zero to disable.
Usernames are only available if using http based authentication.
.TP 8
.B AllUsers \fP( yes | \fBno\fP )
Create separate HTML page with \fBAll\fP Usernames.
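.PP
For instance, a report limited to twenty sites and URLs, with the full
site list on a separate page and the user agent table disabled, might
be sketched as follows (the counts are arbitrary examples):
.PP
.nf
.RS
TopSites    20
TopURLs     20
AllSites    yes
TopAgents   0
.RE
.fi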
.PP
.I Hide/Ignore/Group/Include Keywords
.TP 8
.B HideAgent \fIname\fP
Hide User Agents that match \fIname\fP.
.TP 8
.B HideReferrer \fIname\fP
Hide Referrers that match \fIname\fP.
.TP 8
.B HideSite \fIname\fP
Hide Sites that match \fIname\fP.
.TP 8
.B HideAllSites \fP( yes | \fBno\fP )
Hide all individual sites. This causes only grouped sites to be displayed.
.TP 8
.B HideURL \fIname\fP
Hide URLs that match \fIname\fP.
.TP 8
.B HideUser \fIname\fP
Hide Usernames that match \fIname\fP.
.TP 8
.B IgnoreAgent \fIname\fP
Ignore User Agents that match \fIname\fP.
.TP 8
.B IgnoreReferrer \fIname\fP
Ignore Referrers that match \fIname\fP.
.TP 8
.B IgnoreSite \fIname\fP
Ignore Sites that match \fIname\fP.
.TP 8
.B IgnoreURL \fIname\fP
Ignore URLs that match \fIname\fP.
.TP 8
.B IgnoreUser \fIname\fP
Ignore Usernames that match \fIname\fP.
.TP 8
.B GroupAgent \fIname\fP [\fILabel\fP]
Group User Agents that match \fIname\fP. Display \fILabel\fP in 'Top Agent'
table if given (instead of \fIname\fP). \fIname\fP may be enclosed in quotes.
.TP 8
.B GroupReferrer \fIname\fP [\fILabel\fP]
Group Referrers that match \fIname\fP. Display \fILabel\fP in 'Top Referrer'
table if given (instead of \fIname\fP). \fIname\fP may be enclosed in quotes.
.TP 8
.B GroupSite \fIname\fP [\fILabel\fP]
Group Sites that match \fIname\fP. Display \fILabel\fP in 'Top Site'
table if given (instead of \fIname\fP). \fIname\fP may be enclosed in quotes.
.TP 8
.B GroupDomains \fInum\fP
Automatically group sites by domain. The value \fInum\fP specifies the
level of grouping, and can be thought of as the 'number of dots' to
be displayed. The default value of \fB0\fP disables domain grouping.
.TP 8
.B GroupURL \fIname\fP [\fILabel\fP]
Group URLs that match \fIname\fP. Display \fILabel\fP in 'Top URL'
table if given (instead of \fIname\fP). \fIname\fP may be enclosed in quotes.
.TP 8
.B GroupUser \fIname\fP [\fILabel\fP]
Group Usernames that match \fIname\fP. Display \fILabel\fP in 'Top
Usernames' table if given (instead of \fIname\fP). \fIname\fP may be
enclosed in quotes.
.TP 8
.B IncludeSite \fIname\fP
Force inclusion of sites that match \fIname\fP. Takes precedence
over \fBIgnore*\fP keywords.
.TP 8
.B IncludeURL \fIname\fP
Force inclusion of URLs that match \fIname\fP. Takes precedence
over \fBIgnore*\fP keywords.
.TP 8
.B IncludeReferrer \fIname\fP
Force inclusion of Referrers that match \fIname\fP. Takes precedence
over \fBIgnore*\fP keywords.
.TP 8
.B IncludeAgent \fIname\fP
Force inclusion of User Agents that match \fIname\fP. Takes precedence
over \fBIgnore*\fP keywords.
.TP 8
.B IncludeUser \fIname\fP
Force inclusion of Usernames that match \fIname\fP. Takes precedence
over \fBIgnore*\fP keywords.
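.PP
The interaction of these keywords can be sketched as follows. The
host and URL patterns are hypothetical, and the use of an asterisk as
a wildcard is an assumption based on the \fIsample.conf\fP file rather
than on anything stated in this manual page.
.PP
.nf
.RS
# Add a grouped entry labelled CGI Scripts for URLs under /cgi-bin/
GroupURL     /cgi-bin/*   CGI Scripts
# Drop all example.net hosts from the analysis...
IgnoreSite   *.example.net
# ...except this one, since Include* takes precedence over Ignore*
IncludeSite  www.example.net
.RE
.fi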
.PP
.I HTML Generation Keywords
.TP 8
.B HTMLExtension \fItext\fP
Defines the HTML file extension to use. Default is \fIhtml\fP. Do not
include the leading period!
.TP 8
.B HTMLPre \fItext\fP
Insert \fItext\fP at the very beginning of the generated HTML file.
Defaults to a standard html 3.2 \fIDOCTYPE\fP record.
.TP 8
.B HTMLHead \fItext\fP
Insert \fItext\fP within the <HEAD></HEAD> block of the HTML file.
.TP 8
.B HTMLBody \fItext\fP
Insert \fItext\fP in HTML page, starting with the <BODY> tag. If used, the
first line must be a \fI<BODY ...>\fP tag. Multiple lines may be specified.
.TP 8
.B HTMLPost \fItext\fP
Insert \fItext\fP at top (before horizontal rule) of HTML pages. Multiple lines
may be specified.
.TP 8
.B HTMLTail \fItext\fP
Insert \fItext\fP at bottom of the HTML page. The \fItext\fP is top and
right aligned within a table column at the end of the report.
.TP 8
.B HTMLEnd \fItext\fP
Insert \fItext\fP at the very end of the HTML page. If not specified,
the default is to insert the ending </BODY> and </HTML> tags. If used,
you \fImust\fP supply these tags yourself.
.TP 8
.B LinkReferrer \fP( yes | \fBno\fP )
Determines if the referrers listed in the top referrers table should be
displayed as plain text, or as a link to the referrer URL.
.TP 8
.B ColorHit \fP( rrggbb | \fB00805c\fP )
Sets the graph's hit-color to the specified html color (no '#').
.TP 8
.B ColorFile \fP( rrggbb | \fB0040ff\fP )
Sets the graph's file-color to the specified html color (no '#').
.TP 8
.B ColorSite \fP( rrggbb | \fBff8000\fP )
Sets the graph's site-color to the specified html color (no '#').
.TP 8
.B ColorKbyte \fP( rrggbb | \fBff0000\fP )
Sets the graph's kilobyte-color to the specified html color (no '#').
.TP 8
.B ColorPage \fP( rrggbb | \fB00e0ff\fP )
Sets the graph's page-color to the specified html color (no '#').
.TP 8
.B ColorVisit \fP( rrggbb | \fBffff00\fP )
Sets the graph's visit-color to the specified html color (no '#').
.TP 8
.B ColorMisc \fP( rrggbb | \fB00e0ff\fP )
Sets the 'miscellaneous' color for table headers (not graphs) to
the specified html color (no '#').
.TP 8
.B PieColor1 \fP( rrggbb | \fB800080\fP )
Sets the pie's first optional color to the specified html color (no '#').
.TP 8
.B PieColor2 \fP( rrggbb | \fB80ffc0\fP )
Sets the pie's second optional color to the specified html color (no '#').
.TP 8
.B PieColor3 \fP( rrggbb | \fBff00ff\fP )
Sets the pie's third optional color to the specified html color (no '#').
.TP 8
.B PieColor4 \fP( rrggbb | \fBffc480\fP )
Sets the pie's fourth optional color to the specified html color (no '#').
.PP
.I Dump Object Keywords
.PP
The \fIWebalizer\fP allows you to export processed data to other programs by
using \fItab delimited\fP text files. The \fIDump*\fP commands specify
which files are to be written, and where.
.TP 8
.B DumpPath \fIname\fP
Save dump files in directory \fIname\fP. If not specified, the default
output directory will be used. Do not specify a trailing slash ('/').
.TP 8
.B DumpExtension \fIname\fP
Use \fIname\fP as the filename extension for dump files. If not given,
the default of \fBtab\fP will be used.
.TP 8
.B DumpHeader \fP( yes | \fBno\fP )
Print a column header as the first record of the file.
.TP 8
.B DumpSites \fP( yes | \fBno\fP )
Dump the sites data to a tab delimited file.
.TP 8
.B DumpURLs \fP( yes | \fBno\fP )
Dump the url data to a tab delimited file.
.TP 8
.B DumpReferrers \fP( yes | \fBno\fP )
Dump the referrer data to a tab delimited file. This data is only
available if using a log that contains referrer information
(ie: a combined format web log).
.TP 8
.B DumpAgents \fP( yes | \fBno\fP )
Dump the user agent data to a tab delimited file. This data is only
available if using a log that contains user agent information
(ie: a combined format web log).
.TP 8
.B DumpUsers \fP( yes | \fBno\fP )
Dump the username data to a tab delimited file. This data is only available
if processing a wu-ftpd xferlog or a web log that contains http authentication
information.
.TP 8
.B DumpSearchStr \fP( yes | \fBno\fP )
Dump the search string data to a tab delimited file. This data is only
available if processing a web log that contains referrer information and
had search string information present.
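.PP
As an illustration, the sketch below writes the site and URL data,
each with a column header, to a separate directory. The directory
name is a placeholder.
.PP
.nf
.RS
DumpPath       /var/lib/webalizer
DumpHeader     yes
DumpExtension  tab
DumpSites      yes
DumpURLs       yes
.RE
.fi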
.SH FILES
.TP 20
.I webalizer.conf
Default configuration file. Is searched for in the current directory
and if not found, in the \fI/etc/\fP directory.
.TP 20
.I webalizer.hist
Monthly history file for previous months. (can be changed)
.TP 20
.I webalizer.current
Current state data file (Incremental processing). (can be changed)
.TP 20
.I xxxxx_YYYYMM.html
Various monthly \fIHTML\fP output files produced. (extension can be changed)
.TP 20
.I xxxxx_YYYYMM.png
Various monthly image files used in the reports.
.TP 20
.I xxxxx_YYYYMM.tab
Monthly tab delimited text files. (extension can be changed)
.SH BUGS
Please report bugs to the author.
.SH COPYRIGHT
Copyright (C) 1997-2011 by Bradford L. Barrett. Distributed under
the GNU GPL. See the files "\fICOPYING\fP" and "\fICopyright\fP",
supplied with all distributions for additional information.
.SH AUTHOR
Bradford L. Barrett <\fIbrad at mrunix dot net\fP>

@@ -0,0 +1,26 @@
Begin3
Title: The Webalizer
Version: 2.20
Entered-date: 01JUN2008
Description: A fast, free web server log file analysis program. Produces
HTML output for viewing with a web browser. Written in C on
a Linux platform, however designed to be as ANSI/POSIX
compliant as possible so porting to other UNIX platforms should
be painless. Binary distributions for most popular platforms
are available. Features multiple language support, incremental
processing capabilities, reverse DNS lookup support, native
geolocation support as well as geolocation support via the
optional GeoIP library and database from MaxMind Inc., data
export via tab delimited ASCII files to popular databases and
spreadsheets, and much more. Supports all standard CLF and
combined web logs, wu-ftpd xferlog, squid proxy and extended
W3C format logs, all of which can be either in standard text
format or compressed using gzip or bzip2.
Keywords: Web Analysis, Log Analysis, Linux, Unix, apache, wcmgr, GeoDB
Author: Bradford L. Barrett
Maintained-by: Bradford L. Barrett
Primary-site: http://www.webalizer.org/
Original-site: ftp://ftp.webalizer.org/pub/webalizer/
Platforms: Linux/Unix, OS/2, Win32, MacOSX, POSIX
Copying-policy: GPL
End

@@ -0,0 +1,2 @@
@echo off
webalizer.exe -c webalizer.conf

@@ -0,0 +1,573 @@
#
# Sample Webalizer configuration file
# Copyright 1997-2000 by Bradford L. Barrett (brad@mrunix.net)
#
# Distributed under the GNU General Public License. See the
# files "Copyright" and "COPYING" provided with the webalizer
# distribution for additional information.
#
# This is a sample configuration file for the Webalizer (ver 2.01)
# Lines starting with pound signs '#' are comment lines and are
# ignored. Blank lines are skipped as well. Other lines are considered
# as configuration lines, and have the form "ConfigOption Value" where
# ConfigOption is a valid configuration keyword, and Value is the value
# to assign that configuration option. Invalid keyword/values are
# ignored, with appropriate warnings being displayed. There must be
# at least one space or tab between the keyword and its value.
#
# As of version 0.98, The Webalizer will look for a 'default' configuration
# file named "webalizer.conf" in the current directory, and if not found
# there, will look for "/etc/webalizer.conf".
# LogFile defines the web server log file to use. If not specified
# here or on the command line, input will default to STDIN. If
# the log filename ends in '.gz' (ie: a gzip compressed file), it will
# be decompressed on the fly as it is being read.
LogFile \xampp\apache\logs\access.log
# LogType defines the log type being processed. Normally, the Webalizer
# expects a CLF or Combined web server log as input. Using this option,
# you can process ftp logs as well (xferlog as produced by wu-ftp and
# others), or Squid native logs. Values can be 'clf', 'ftp' or 'squid',
# with 'clf' the default.
LogType clf
# OutputDir is where you want to put the output files. This should
# be a full path name, however relative ones might work as well.
# If no output directory is specified, the current directory will be used.
OutputDir \xampp\webalizer
# HistoryName allows you to specify the name of the history file produced
# by the Webalizer. The history file keeps the data for up to 12 months
# worth of logs, used for generating the main HTML page (index.html).
# The default is a file named "webalizer.hist", stored in the specified
# output directory. If you specify just the filename (without a path),
# it will be kept in the specified output directory. Otherwise, the path
# is relative to the output directory, unless absolute (leading /).
HistoryName webalizer.hist
# Incremental processing allows multiple partial log files to be used
# instead of one huge one. Useful for large sites that have to rotate
# their log files more than once a month. The Webalizer will save its
# internal state before exiting, and restore it the next time run, in
# order to continue processing where it left off. This mode also causes
# The Webalizer to scan for and ignore duplicate records (records already
# processed by a previous run). See the README file for additional
# information. The value may be 'yes' or 'no', with a default of 'no'.
# The file 'webalizer.current' is used to store the current state data,
# and is located in the output directory of the program (unless changed
# with the IncrementalName option below). Please read at least the section
# on Incremental processing in the README file before you enable this option.
Incremental no
# IncrementalName allows you to specify the filename for saving the
# incremental data in. It is similar to the HistoryName option where the
# name is relative to the specified output directory, unless an absolute
# filename is specified. The default is a file named "webalizer.current"
# kept in the normal output directory. If you don't specify "Incremental"
# as 'yes' then this option has no meaning.
#IncrementalName webalizer.current
# ReportTitle is the text to display as the title. The hostname
# (unless blank) is appended to the end of this string (separated with
# a space) to generate the final full title string.
# Default is (for english) "Usage Statistics for".
ReportTitle Usage Statistics for
# HostName defines the hostname for the report. This is used in
# the title, and is prepended to the URL table items. This allows
# clicking on URL's in the report to go to the proper location in
# the event you are running the report on a 'virtual' web server,
# or for a server different than the one the report resides on.
# If not specified here, or on the command line, webalizer will
# try to get the hostname via a uname system call. If that fails,
# it will default to "localhost".
HostName localhost
# HTMLExtension allows you to specify the filename extension to use
# for generated HTML pages. Normally, this defaults to "html", but
# can be changed for sites that need it (like for PHP embedded pages).
HTMLExtension html
# PageType lets you tell the Webalizer what types of URL's you
# consider a 'page'. Most people consider html and cgi documents
# as pages, while not images and audio files. If no types are
# specified, defaults will be used ('htm*', 'cgi' and HTMLExtension
# if different for web logs, 'txt' for ftp logs).
PageType htm*
PageType cgi
PageType phtml
PageType php*
PageType pl
# UseHTTPS should be used if the analysis is being run on a
# secure server, and links to urls should use 'https://' instead
# of the default 'http://'. If you need this, set it to 'yes'.
# Default is 'no'. This only changes the behaviour of the 'Top
# URL's' table.
#UseHTTPS no
# DNSCache specifies the DNS cache filename to use for reverse DNS lookups.
# This file must be specified if you wish to perform name lookups on any IP
# addresses found in the log file. If an absolute path is not given as
# part of the filename (ie: starts with a leading '/'), then the name is
# relative to the default output directory. See the DNS.README file for
# additional information.
#
# Note that this is not yet supported in the Windows port of Webalizer.
#DNSCache dns_cache.db
# DNSChildren allows you to specify how many "children" processes are
# run to perform DNS lookups to create or update the DNS cache file.
# If a number is specified, the DNS cache file will be created/updated
# each time the Webalizer is run, immediately prior to normal processing,
# by running the specified number of "children" processes to perform
# DNS lookups. If used, the DNS cache filename MUST be specified as
# well. The default value is zero (0), which disables DNS cache file
# creation/updates at run time. The number of children processes to
# run may be anywhere from 1 to 100, however a large number may affect
# normal system operations. Reasonable values should be between 5 and
# 20. See the DNS.README file for additional information.
#DNSChildren 0
# HTMLPre defines HTML code to insert at the very beginning of the
# file. Default is the DOCTYPE line shown below. Max line length
# is 80 characters, so use multiple HTMLPre lines if you need more.
#HTMLPre <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
# HTMLHead defines HTML code to insert within the <HEAD></HEAD>
# block, immediately after the <TITLE> line. Maximum line length
# is 80 characters, so use multiple lines if needed.
#HTMLHead <META NAME="author" CONTENT="The Webalizer">
# HTMLBody defines the HTML code to be inserted, starting with the
# <BODY> tag. If not specified, the default is shown below. If
# used, you MUST include your own <BODY> tag as the first line.
# Maximum line length is 80 char, use multiple lines if needed.
#HTMLBody <BODY BGCOLOR="#E8E8E8" TEXT="#000000" LINK="#0000FF" VLINK="#FF0000">
# HTMLPost defines the HTML code to insert immediately before the
# first <HR> on the document, which is just after the title and
# "summary period"-"Generated on:" lines. If anything, this should
# be used to clean up in case an image was inserted with HTMLBody.
# As with HTMLHead, you can define as many of these as you want and
# they will be inserted in the output stream in order of appearance.
# Max string size is 80 characters. Use multiple lines if you need to.
#HTMLPost <BR CLEAR="all">
# HTMLTail defines the HTML code to insert at the bottom of each
# HTML document, usually to include a link back to your home
# page or insert a small graphic. It is inserted as a table
# data element (ie: <TD> your code here </TD>) and is right
# aligned with the page. Max string size is 80 characters.
HTMLTail <IMG SRC="msfree.png" ALT="100% Micro$oft free!">
# HTMLEnd defines the HTML code to add at the very end of the
# generated files. It defaults to what is shown below. If
# used, you MUST specify the </BODY> and </HTML> closing tags
# as the last lines. Max string length is 80 characters.
#HTMLEnd </BODY></HTML>
# The Quiet option suppresses output messages... Useful when run
# as a cron job to prevent bogus e-mails. Values can be either
# "yes" or "no". Default is "no". Note: this does not suppress
# warnings and errors (which are printed to stderr).
Quiet no
# ReallyQuiet will suppress all messages including errors and
# warnings. Values can be 'yes' or 'no' with 'no' being the
# default. If 'yes' is used here, it cannot be overridden from
# the command line, so use with caution. A value of 'no' has
# no effect.
ReallyQuiet no
# TimeMe allows you to force the display of timing information
# at the end of processing. A value of 'yes' will force the
# timing information to be displayed. A value of 'no' has no
# effect.
#TimeMe no
# GMTTime allows reports to show GMT (UTC) time instead of local
# time. Default is to display the time the report was generated
# in the timezone of the local machine, such as EDT or PST. This
# keyword allows you to have times displayed in UTC instead. Use
# only if you really have a good reason, since it will probably
# screw up the reporting periods by however many hours your local
# time zone is off of GMT.
GMTTime no
# Debug prints additional information for error messages. This
# will cause webalizer to dump bad records/fields instead of just
# telling you it found a bad one. As usual, the value can be
# either "yes" or "no". The default is "no". It shouldn't be
# needed unless you start getting a lot of Warning or Error
# messages and want to see why. (Note: warning and error messages
# are printed to stderr, not stdout like normal messages).
Debug no
# FoldSeqErr forces the Webalizer to ignore sequence errors.
# This is useful for Netscape and other web servers that cache
# the writing of log records and do not guarantee that they
# will be in chronological order. The use of the FoldSeqErr
# option will cause out of sequence log records to be treated
# as if they had the same time stamp as the last valid record.
# Default is to ignore out of sequence log records.
FoldSeqErr no
# VisitTimeout allows you to set the default timeout for a visit
# (sometimes called a 'session'). The default is 30 minutes,
# which should be fine for most sites.
# Visits are determined by looking at the time of the current
# request, and the time of the last request from the site. If
# the time difference is greater than the VisitTimeout value, it
# is considered a new visit, and visit totals are incremented.
# Value is the number of seconds to timeout (default=1800=30min)
VisitTimeout 1800
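# Worked example (illustrative times, not taken from a real log): with
# VisitTimeout 1800, requests from the same site at 10:00:00, 10:20:00
# and 10:55:00 count as two visits. The 20 minute gap is under the
# timeout, while the 35 minute gap (2100 seconds) exceeds it.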
# IgnoreHist shouldn't be used in a config file, but it is here
# just because it might be useful in certain situations. If the
# history file is ignored, the main "index.html" file will only
# report on the current log file's contents. Useful only when you
# want to reproduce the reports from scratch. USE WITH CAUTION!
# Valid values are "yes" or "no". Default is "no".
IgnoreHist no
# CountryGraph allows the usage by country graph to be disabled.
# Values can be 'yes' or 'no', default is 'yes'.
CountryGraph yes
# DailyGraph and DailyStats allow the daily statistics graph
# and statistics table to be disabled (not displayed). Values
# may be "yes" or "no". Default is "yes".
DailyGraph yes
DailyStats yes
# HourlyGraph and HourlyStats allow the hourly statistics graph
# and statistics table to be disabled (not displayed). Values
# may be "yes" or "no". Default is "yes".
HourlyGraph yes
HourlyStats yes
# GraphLegend allows the color coded legends to be turned on or off
# in the graphs. The default is for them to be displayed. This only
# toggles the color coded legends, the other legends are not changed.
# If you think they are hideous and ugly, say 'no' here :)
GraphLegend yes
# GraphLines allows you to have index lines drawn behind the graphs.
# I personally am not crazy about them, but a lot of people requested
# them and they weren't a big deal to add. The number represents the
# number of lines you want displayed. Default is 2, you can disable
# the lines by using a value of zero ('0'). [max is 20]
# Note, due to rounding errors, some values don't work quite right.
# The lower the better, with 1,2,3,4,6 and 10 producing nice results.
GraphLines 2
# The "Top" options below define the number of entries for each table.
# Defaults are Sites=30, URLs=30, Referrers=30, Agents=15 and
# Countries=30. TopKSites and TopKURLs (by KByte tables) both default
# to 10, as do the top entry/exit tables (TopEntry/TopExit). The top
# search strings and usernames default to 20. Tables may be disabled
# by using zero (0) for the value.
TopSites 30
TopKSites 10
TopURLs 30
TopKURLs 10
TopReferrers 30
TopAgents 15
TopCountries 30
TopEntry 10
TopExit 10
TopSearch 20
TopUsers 20
# The All* keywords allow the display of all URL's, Sites, Referrers,
# User Agents, Search Strings and Usernames. If enabled, a separate
# HTML page will be created, and a link will be added to the bottom
# of the appropriate "Top" table. There are a couple of conditions
# for this to occur. First, there must be more items than will fit
# in the "Top" table (otherwise it would just be duplicating what is
# already displayed). Second, the listing will only show those items
# that are normally visible, which means it will not show any hidden
# items. Grouped entries will be listed first, followed by individual
# items. The value for these keywords can be either 'yes' or 'no',
# with the default being 'no'. Please be aware that these pages can
# be quite large in size, particularly the sites page, and separate
# pages are generated for each month, which can consume quite a lot
# of disk space depending on the traffic to your site.
AllSites no
AllURLs no
AllReferrers no
AllAgents no
AllSearchStr no
AllUsers no
# The Webalizer normally strips the string 'index.' off the end of
# URL's in order to consolidate URL totals. For example, the URL
# /somedir/index.html is turned into /somedir/ which is really the
# same URL. This option allows you to specify additional strings
# to treat in the same way. You don't need to specify 'index.' as
# it is always scanned for by The Webalizer, this option is just to
# specify _additional_ strings if needed. If you don't need any,
# don't specify any as each string will be scanned for in EVERY
# log record... A bunch of them will degrade performance. Also,
# the string is scanned for anywhere in the URL, so a string of
# 'home' would turn the URL /somedir/homepages/brad/home.html into
# just /somedir/ which is probably not what was intended.
#IndexAlias home.htm
#IndexAlias homepage.htm
# The Hide*, Group*, Ignore* and Include* keywords allow you to
# change the way Sites, URL's, Referrers, User Agents and Usernames
# are manipulated. The Ignore* keywords will cause The Webalizer to
# completely ignore records as if they didn't exist (and thus not
# counted in the main site totals). The Hide* keywords will prevent
# things from being displayed in the 'Top' tables, but will still be
# counted in the main totals. The Group* keywords allow grouping
# similar objects as if they were one. Grouped records are displayed
# in the 'Top' tables and can optionally be displayed in BOLD and/or
# shaded. Groups cannot be hidden, and are not counted in the main
# totals. The Group* options do not, by default, hide the items
# that they match. If you want to hide the records that match (so just
# the grouping record is displayed), follow with an identical Hide*
# keyword with the same value. (see example below) In addition,
# Group* keywords may have an optional label which will be displayed
# instead of the keyword's value. The label should be separated from
# the value by at least one 'white-space' character, such as a space
# or tab.
#
# The value can have either a leading or trailing '*' wildcard
# character. If no wildcard is found, a match can occur anywhere
# in the string. Given a string "www.yourmama.com", the values "your",
# "*mama.com" and "www.your*" will all match.
# Your own site should be hidden
#HideSite *mrunix.net
#HideSite localhost
# Your own site gives most referrals
#HideReferrer mrunix.net/
# This one hides non-referrers ("-" Direct requests)
#HideReferrer Direct Request
# Usually you want to hide these
HideURL *.gif
HideURL *.GIF
HideURL *.jpg
HideURL *.JPG
HideURL *.png
HideURL *.PNG
HideURL *.ra
# Hiding agents is kind of futile
#HideAgent RealPlayer
# You can also hide based on authenticated username
#HideUser root
#HideUser admin
# Grouping options
#GroupURL /cgi-bin/* CGI Scripts
#GroupURL /images/* Images
#GroupSite *.aol.com
#GroupSite *.compuserve.com
#GroupReferrer yahoo.com/ Yahoo!
#GroupReferrer excite.com/ Excite
#GroupReferrer infoseek.com/ InfoSeek
#GroupReferrer webcrawler.com/ WebCrawler
#GroupUser root Admin users
#GroupUser admin Admin users
#GroupUser wheel Admin users
# The following is a great way to get an overall total
# for browsers, and not display all the detail records.
# (You should use MangleAgent to refine further...)
#GroupAgent MSIE Micro$oft Internet Exploder
#HideAgent MSIE
#GroupAgent Mozilla Netscape
#HideAgent Mozilla
#GroupAgent Lynx* Lynx
#HideAgent Lynx*
# HideAllSites allows forcing individual sites to be hidden in the
# report. This is particularly useful when used in conjunction
# with the "GroupDomains" feature, but could be useful in other
# situations as well, such as when you only want to display grouped
# sites (with the GroupSite keywords...). The value for this
# keyword can be either 'yes' or 'no', with 'no' the default,
# allowing individual sites to be displayed.
#HideAllSites no
# The GroupDomains keyword allows you to group individual hostnames
# into their respective domains. The value specifies the level of
# grouping to perform, and can be thought of as 'the number of dots'
# that will be displayed. For example, if a visiting host is named
# cust1.tnt.mia.uu.net, a domain grouping of 1 will result in just
# "uu.net" being displayed, while a 2 will result in "mia.uu.net".
# The default value of zero disables this feature. Domains will only
# be grouped if they do not match any existing "GroupSite" records,
# which allows overriding this feature with your own if desired.
#GroupDomains 0
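# A possible combination, shown as a sketch (the values are illustrative,
# not defaults): group hosts to one 'dot' and hide the individual sites,
# so that cust1.tnt.mia.uu.net should appear only as "uu.net".
#GroupDomains 1
#HideAllSites yes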
# GroupShading allows grouped rows to be shaded in the report.
# Useful if you have lots of groups and individual records that
# intermingle in the report, and you want to differentiate the group
# records a little more. Value can be 'yes' or 'no', with 'yes'
# being the default.
#GroupShading yes
# GroupHighlight allows the group record to be displayed in BOLD.
# Can be either 'yes' or 'no' with the default 'yes'.
#GroupHighlight yes
# The Ignore* keywords allow you to completely ignore log records based
# on hostname, URL, user agent, referrer or username. I hesitated in
# adding these, since the Webalizer was designed to generate _accurate_
# statistics about a web server's performance. By choosing to ignore
# records, the accuracy of reports becomes skewed, negating why I wrote
# this program in the first place. However, due to popular demand, here
# they are. Use the same as the Hide* keywords, where the value can have
# a leading or trailing wildcard '*'. Use at your own risk ;)
#IgnoreSite bad.site.net
#IgnoreURL /test*
#IgnoreReferrer file:/*
#IgnoreAgent RealPlayer
#IgnoreUser root
# The Include* keywords allow you to force the inclusion of log records
# based on hostname, URL, user agent, referrer or username. They take
# precedence over the Ignore* keywords. Note: Using Ignore/Include
# combinations to selectively process parts of a web site is _extremely
# inefficient_!!! Avoid doing so if possible (ie: grep the records to a
# separate file if you really want that kind of report).
# Example: Only show stats on Joe User's pages...
#IgnoreURL *
#IncludeURL ~joeuser*
# Or based on an authenticated username
#IgnoreUser *
#IncludeUser someuser
# MangleAgents allows you to specify how much, if any, The Webalizer
# should mangle user agent names. This allows several levels of detail
# to be produced when reporting user agent statistics. There are six
# levels that can be specified, which define different levels of detail
# suppression. Level 5 shows only the browser name (MSIE or Mozilla)
# and the major version number. Level 4 adds the minor version number
# (single decimal place). Level 3 displays the minor version to two
# decimal places. Level 2 will add any sub-level designation (such
# as Mozilla/3.01Gold or MSIE 3.0b). Level 1 will attempt to also add
# the system type if it is specified. The default Level 0 displays the
# full user agent field without modification and produces the greatest
# amount of detail. User agent names that can't be mangled will be
# left unmodified.
#MangleAgents 0
# The SearchEngine keywords allow specification of search engines and
# their query strings on the URL. These are used to locate and report
# what search strings are used to find your site. The first word is
# a substring to match in the referrer field that identifies the search
# engine, and the second is the URL variable used by that search engine
# to define its search terms.
SearchEngine yahoo.com p=
SearchEngine altavista.com q=
SearchEngine google.com q=
SearchEngine eureka.com q=
SearchEngine lycos.com query=
SearchEngine hotbot.com MT=
SearchEngine msn.com MT=
SearchEngine infoseek.com qt=
SearchEngine webcrawler searchText=
SearchEngine excite search=
SearchEngine netscape.com search=
SearchEngine mamma.com query=
SearchEngine alltheweb.com query=
SearchEngine northernlight.com qr=
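# Illustration (hypothetical referrer, assuming the matching described
# above): a referrer of http://www.google.com/search?q=web+stats contains
# the substring "google.com", so the value of its "q=" variable would be
# reported as the search string.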
# The Dump* keywords allow the dumping of Sites, URL's, Referrers,
# User Agents, Usernames and Search strings to separate tab delimited
# text files, suitable for import into most database or spreadsheet
# programs.
# DumpPath specifies the path to dump the files. If not specified,
# it will default to the current output directory. Do not use a
# trailing slash ('/').
#DumpPath /var/lib/httpd/logs
# The DumpHeader keyword specifies if a header record should be
# written to the file. A header record is the first record of the
# file, and contains the labels for each field written. Normally,
# files that are intended to be imported into a database system
# will not need a header record, while spreadsheets usually do.
# Value can be either 'yes' or 'no', with 'no' being the default.
#DumpHeader no
# DumpExtension allows you to specify the dump filename extension
# to use. The default is "tab", but some programs are picky about
# the filenames they use, so you may change it here (for example,
# some people may prefer to use "csv").
#DumpExtension tab
# These control the dumping of each individual table. The value
# can be either 'yes' or 'no'; the default is 'no'.
#DumpSites no
#DumpURLs no
#DumpReferrers no
#DumpAgents no
#DumpUsers no
#DumpSearchStr no
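# Sketch of a spreadsheet-oriented setup (the path and choices are
# illustrative assumptions, not defaults): write the URL and site tables
# with a header record. Going by the descriptions above, the extension
# only changes the filename; the contents stay tab delimited.
#DumpPath /var/lib/httpd/logs
#DumpHeader yes
#DumpExtension csv
#DumpURLs yes
#DumpSites yes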
# End of configuration file... Have a nice day!

Binary file not shown.


@@ -0,0 +1,22 @@
<?php
$webalizer = "webalizer.bat";
system($webalizer);
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta name="author" content="Kai Oswald Seidler, Kay Vogelgesang, Carsten Wiedmann">
<link href="/xampp/xampp.css" rel="stylesheet" type="text/css">
<title></title>
</head>
<body>
&nbsp;<p>
<pre>
<script language="JavaScript" type="text/javascript">
document.location = "/webalizer/";
</script>
</pre>
</body>
</html>

Binary file not shown.


Binary file not shown.

Binary file not shown.