Name

sum-stats — generates proxy usage statistics from Kernun logs

Synopsis

sum-stats [-p period] [-t type] [-n name] [-l field=limit...] [--filter field=filter...] [--spam-threshold value] [--shift time_offset] [--start time_spec...] [--finish time_spec] [--entitle label] [--info list] [--db] -o outfile

Description

The sum-stats script reads a Kernun log from the standard input and generates proxy usage statistics. The exact contents of the output depend on the proxy type. However, the generated output always retains the following structure:

  • Summary: totals + Kernun Clear Web database hit-rate (for http-proxy and icap-server)

  • Histograms: per-hour, per-day, per-weekday (depends on period)

  • Hitparades: per-client, per-server, ... (depends on type)

Options

-p period

Sets the period (daily, weekly, monthly). Log items outside the date interval based on this period are filtered out.

Use --shift to specify which period is generated. The current period (day/week/month) is generated by default. For example, use -p weekly --shift -1w to generate the statistics for the previous week.

-t type

Sets the type of the proxy. If not set, the default value is proxy (does not assume any particular proxy type). A list of recognized proxy types can be found below.

-n name

Sets the name of the proxy (altname) to be included in the statistics (other proxies are filtered out). If not set, all proxies are included.

-l field=limit

Sets the limit for the given field (top N clients, servers, ...). If not set, the field is excluded from the statistics.

The special value 0 means that the field is not limited at all: all values are included in the statistics, regardless of their total count. Note that a field limit of 0 can produce very large statistics, which can cause problems when viewing them.

A list of available fields can be found below.

--filter field=filter

Sets the filter for the given field (clients, servers, ...). If set, only the log records that match the filter are taken into account, and the statistics for the filtered field itself are suppressed, since they would be degenerate.

--spam-threshold value

Sets the spam threshold; mails with a spam score above this level are considered spam. If not set, the default value is 5000.

--shift time_offset

Behaves as if the processing were executed earlier or later, as given by time_offset. The form of time_offset is [<SIGN>]<COUNT>[<UNIT>][_<ROUND>]:

  • SIGN: '-' to shift into the past, '+' to shift into the future. Defaults to '+'.

  • COUNT: the number of units (hours/days/weeks/months) to shift by. Can be 0 for no shift, which is useful in conjunction with ROUND.

  • UNIT: 'h' for hours, 'd' for days, 'w' for weeks, 'm' for months.

    If omitted, the default UNIT depends on the period selected by -p: 'm' for the monthly period, 'w' for the weekly period and 'd' for the daily period. If no period is selected, 'd' is used as the default UNIT.

  • ROUND: if given, the result is rounded up or down within the given unit. Use 'up' for round up, 'down' for round down.

For example, --shift -2w_up shifts two weeks back and rounds up to Sunday 23:59:59. The option can be given more than once, in which case the shifts are applied in sequence.

See also the TIME environment variable. Setting TIME has a similar effect to using --shift. By default, the base time is the system time at which the script is executed; this can be overridden by the TIME environment variable. The resulting value is then used as the base for the --shift options.
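The time_offset grammar described above can be illustrated with a small parser. This is only a sketch of the documented syntax, not the code used by sum-stats; the names parse_time_offset and TimeOffset are hypothetical, and the sketch only splits the string into its parts without applying the shift.

```python
import re
from typing import NamedTuple, Optional

# Grammar as documented: [<SIGN>]<COUNT>[<UNIT>][_<ROUND>]
_OFFSET_RE = re.compile(
    r"(?P<sign>[+-])?(?P<count>\d+)(?P<unit>[hdwm])?(?:_(?P<round>up|down))?"
)

# Default UNIT per period as documented; 'd' when no period is selected.
_DEFAULT_UNIT = {"monthly": "m", "weekly": "w", "daily": "d", None: "d"}


class TimeOffset(NamedTuple):
    count: int               # signed number of units (negative = into the past)
    unit: str                # 'h', 'd', 'w' or 'm'
    rounding: Optional[str]  # 'up', 'down' or None


def parse_time_offset(text: str, period: Optional[str] = None) -> TimeOffset:
    """Split a time_offset string into its components (hypothetical helper)."""
    match = _OFFSET_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid time_offset: {text!r}")
    count = int(match["count"])
    if match["sign"] == "-":
        count = -count
    unit = match["unit"] or _DEFAULT_UNIT[period]
    return TimeOffset(count, unit, match["round"])
```

With this sketch, parse_time_offset("-2w_up") yields a count of -2, unit 'w' and rounding 'up', while parse_time_offset("0_down", "monthly") falls back to the unit 'm'.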

--start time_spec, --finish time_spec

Explicitly sets the time interval to be used. The time_spec is one of the following:

  • ISO timestamp: one of YYYY-MM-DDTHH:MM:SS, YYYY-MM-DDTHH:MM, YYYY-MM-DDTHH, YYYY-MM-DD

  • Unix timestamp: the number of seconds since 1970-01-01 00:00:00 UTC

  • time_offset: time is given as an offset to the current time (possibly affected by the --shift option)

Options --start and --finish are mutually exclusive with the -p option, which sets the interval implicitly.
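As an illustration of the four ISO timestamp forms listed above, the following sketch tries each form in turn. It is not the parser used by sum-stats; the function name parse_iso_time_spec is hypothetical, and the Unix timestamp and time_offset forms are deliberately left out.

```python
from datetime import datetime

# The four ISO forms listed for time_spec, from most to least precise.
_ISO_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M",
    "%Y-%m-%dT%H",
    "%Y-%m-%d",
)


def parse_iso_time_spec(text: str) -> datetime:
    """Parse an ISO-style time_spec (hypothetical helper).

    Note: strptime is slightly more lenient than the documented forms,
    for example it also accepts month or day fields without zero padding.
    """
    for fmt in _ISO_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"not an ISO time_spec: {text!r}")
```

Fields missing from the shorter forms default to zero, so a date-only time_spec is read as midnight at the start of that day in this sketch.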

--info list

Instead of creating the statistics, reports the requested information, given as a comma-separated list of the following items:

  • fields: print the fields valid for the particular type

  • types: print the available types

  • results: print the available results

  • interval: print the time interval that would be used

  • log_files: list the filenames that likely contain the desired time interval, without any compression suffix.

  • log_files: list the filenames that likely contain the desired time interval.

  • log_files_ts: print the shell script that cats the files that likely contain the desired time interval.

  • period_inst_name: period instance name. Prints the suggested name of the periodic statistics that would be generated with the current arguments. The name is based on the beginning of the interval: 'YYYYMM' for monthly, 'YYYYWW' for weekly and 'YYYYMMDD' for daily statistics.

  • oldest_log: print the timestamp of the oldest line in the available logs.
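The naming scheme described for period_inst_name can be sketched as follows. The function name period_instance_name is hypothetical, and the week number is assumed to follow ISO 8601 numbering; the man page does not say which week-numbering convention sum-stats uses, so the weekly result may differ from the real tool.

```python
from datetime import datetime


def period_instance_name(start: datetime, period: str) -> str:
    """Build a period instance name from the beginning of the interval."""
    if period == "monthly":
        return start.strftime("%Y%m")        # YYYYMM
    if period == "weekly":
        # Assumption: ISO 8601 week numbering (not confirmed by the man page).
        iso_year, iso_week, _ = start.isocalendar()
        return f"{iso_year:04d}{iso_week:02d}"  # YYYYWW
    if period == "daily":
        return start.strftime("%Y%m%d")      # YYYYMMDD
    raise ValueError(f"unknown period: {period!r}")
```

For an interval beginning on 2023-05-01, this sketch gives '202305' for the monthly and '20230501' for the daily statistics.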

--db

If present, the newly created statistics are also indexed in the statistics index database.

-o outfile

The output will be saved to outfile.html, accompanied by its data file outfile.json.
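The options above can be combined into one invocation. The sketch below builds an argument list using only options documented here and feeds a log file to the script on standard input. The helper names and the log path are hypothetical, and repeating -l once per field is an assumed reading of the "field=limit..." notation in the synopsis.

```python
import subprocess


def build_sum_stats_argv(outfile, period=None, proxy_type=None,
                         name=None, limits=None, shift=None):
    """Assemble a sum-stats command line (hypothetical helper)."""
    argv = ["sum-stats"]
    if period:
        argv += ["-p", period]
    if proxy_type:
        argv += ["-t", proxy_type]
    if name:
        argv += ["-n", name]
    # Assumption: one -l option per field.
    for field, limit in (limits or {}).items():
        argv += ["-l", f"{field}={limit}"]
    if shift:
        argv += ["--shift", shift]
    argv += ["-o", outfile]
    return argv


def run_sum_stats(argv, log_path):
    """Run sum-stats with the log supplied on standard input."""
    with open(log_path, "rb") as log:
        subprocess.run(argv, stdin=log, check=True)


# Statistics for the previous week, top 10 clients and servers.
argv = build_sum_stats_argv(
    "weekly-report",
    period="weekly",
    proxy_type="http-proxy",
    limits={"client": 10, "server": 10},
    shift="-1w",
)
# run_sum_stats(argv, "/path/to/kernun.log")  # hypothetical log path
```

Going by the description of -o, this run would write weekly-report.html and weekly-report.json.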

Proxy Types

proxy

Fields: client, user, server

http-proxy, icap-server

Fields: client, user, group, server, category

smtp-proxy

Fields: client, server, sender, recipient, mime

dns-proxy

Fields: client, server, qname, qtype

sip-proxy

Fields: caller, receiver
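The type-to-field lists above can be kept as a lookup table, for example to check -l or --filter field names before running the script. This is only a sketch with hypothetical names (FIELDS_BY_TYPE, unknown_fields); the authoritative list for a given installation is the one printed by --info fields.

```python
# Fields per proxy type, copied from the lists above.
FIELDS_BY_TYPE = {
    "proxy": ("client", "user", "server"),
    "http-proxy": ("client", "user", "group", "server", "category"),
    "icap-server": ("client", "user", "group", "server", "category"),
    "smtp-proxy": ("client", "server", "sender", "recipient", "mime"),
    "dns-proxy": ("client", "server", "qname", "qtype"),
    "sip-proxy": ("caller", "receiver"),
}


def unknown_fields(proxy_type, fields):
    """Return the requested fields that the given proxy type does not list."""
    valid = FIELDS_BY_TYPE[proxy_type]
    return [field for field in fields if field not in valid]
```

For example, unknown_fields("dns-proxy", ["client", "user"]) returns ["user"], because the dns-proxy type has no user field.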

Environment Variables

TIME

The timestamp used to calculate the interval of dates to be included in the statistics (further affected by the period and --shift options). If not set, the current time is used.

Notes

Computing per-client, per-server, ... statistics (hitparades) can consume a large amount of memory. Memory usage can only be reduced by turning off individual fields (skip -l field, or set -l field=0). Merely limiting the number of top values reported does not reduce memory consumption.
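The general reason a top-N limit does not save memory can be shown with a short sketch. It does not reflect the internals of sum-stats; the function name top_values is hypothetical. To find the N most frequent values, a counter has to be kept for every distinct value seen, and the limit is applied only when the result is reported.

```python
from collections import Counter


def top_values(values, limit):
    """Return the `limit` most frequent values and the number of counters kept."""
    counts = Counter()
    for value in values:
        counts[value] += 1          # one counter per distinct value stays in memory
    tracked = len(counts)           # memory grows with this, not with `limit`
    return counts.most_common(limit), tracked


clients = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1", "10.0.0.2"]
top, tracked = top_values(clients, 1)
```

Here only one client is reported, yet three counters were held while processing the input.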

See Also

log-ts(1), switchlog(1), logging(7)

Authors

This man page is a part of Kernun Firewall.
Copyright © 2000–2023 Trusted Network Solutions, a. s.
All rights reserved.