data-matching — generic data matching and processing in proxies
In addition to checking compliance with an application protocol specification, a proxy can also scan protocol payload data. This capability includes features described elsewhere, such as HTML filtration and MIME processing, and, in some proxies (only http-proxy(8) at the time of this writing), the generic configurable data processing described in this manual page.
Generic data processing is performed by the software module mod-match. Its
parameters are specified in the configuration section data-match. The section
can be referenced by an ACL of a proxy that should use the data matching
feature. The matching module is inserted into the data flow between a client
and the server, separately in each direction. The module scans only the
initial part of the data; the size of this part can be set in the module
configuration using the max-size item.
As blocks of data are received by the proxy, the scanning process is
repeated for newly arriving data, in accordance with the parameters
step-size and step-match. A sequence of checks, defined by repeatable items
test, is executed.
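The interplay between incoming blocks and the scanned initial part can be pictured with a short Python sketch. It is an illustration only, not the mod-match implementation: the class PrefixScanner, its parameter step, and the rule "rescan once at least step new bytes have arrived" are assumptions made for this example. The exact meaning of step-size and step-match is given in mod-match(5).

```python
import re


class PrefixScanner:
    """Scan only the first max_size bytes of a stream, rescanning as data arrives."""

    def __init__(self, pattern, max_size, step):
        self.regex = re.compile(pattern)
        self.max_size = max_size  # hypothetical counterpart of max-size
        self.step = step          # assumption: new bytes required before a rescan
        self.buffer = b""
        self.scanned = 0          # buffer length at the time of the last scan
        self.scans = 0            # number of scans performed so far
        self.matched = False

    def feed(self, block):
        """Add a received block; return True once the pattern has matched."""
        room = self.max_size - len(self.buffer)
        if room > 0:
            # Data beyond the first max_size bytes is never stored or scanned.
            self.buffer += block[:room]
        has_new = len(self.buffer) > self.scanned
        enough_new = len(self.buffer) - self.scanned >= self.step
        full = len(self.buffer) >= self.max_size
        if not self.matched and has_new and (enough_new or full):
            self.scans += 1
            self.scanned = len(self.buffer)
            self.matched = self.regex.search(self.buffer) is not None
        return self.matched
```

In this sketch, a pattern split across two blocks is still found, because each scan covers the whole buffered prefix rather than only the newest block.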
Each test can accept or deny the data. A test of the html-alert type can
either report a match to the proxy log only, or report and deny, depending
on the deny flag.
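A sequence of tests with accept, deny, and alert outcomes can be sketched as follows. This is a simplified model under stated assumptions, not the actual evaluation logic: the names Test and run_tests, the action strings, the rule that the first decisive test ends the sequence, and the default of accepting unmatched data are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Test:
    pattern: bytes
    action: str         # "accept", "deny", or "alert" (hypothetical names)
    deny: bool = False  # only meaningful for "alert": report and deny


def run_tests(tests, data, log):
    """Return "accept" or "deny" for data; append alert messages to log."""
    for test in tests:
        if re.search(test.pattern, data) is None:
            continue  # this test does not apply to the data
        if test.action == "alert":
            log.append(f"alert: {test.pattern!r} matched")
            if test.deny:
                return "deny"
            continue  # report only, keep checking later tests
        return test.action  # explicit accept or deny
    return "accept"  # assumption: data matched by no test passes


# Usage: an alert without the deny flag is logged but does not block the data.
tests = [
    Test(rb"password=", "alert"),
    Test(rb"<script", "deny"),
    Test(rb"card=\d{16}", "alert", deny=True),
]
log = []
verdict = run_tests(tests, b"user=a&password=x", log)
```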
The decisions are based on regular expression matching. There are also some
more complex tests, particularly suitable for processing submitted HTML
form values in http-proxy. For a detailed description of the available test
types, see mod-match(5).
Database files used by the tests html-hash, html-alert, and html-replace are
managed by the program html-match-db(1).
The test type html-save saves values of HTML forms to a text file as
hexadecimal strings in a format compatible with Snort rule syntax. The test
type html-hash saves hashes of HTML form values, so that it is later
possible to test whether a value is stored in the database, but it is
impossible to get the original values from the database. The test type
html-replace uses a database with encrypted replacement data and decrypts
the data with a key obtained from the HTML form values being replaced;
therefore, the replacement data cannot be obtained from the database without
knowing the corresponding data to be replaced.
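The one-way properties described for html-hash and html-replace can be illustrated with a toy Python model. The source does not state the database format, the hash function, or the cipher, so everything here is an assumption: SHA-256 stands in for the hash, a SHAKE-256 keystream XOR stands in for the cipher (it is not suitable for real use), and the names digest, xor_stream, HashDb, and ReplaceDb are hypothetical.

```python
import hashlib


def digest(label, value):
    """Derive a labelled SHA-256 digest from a form value."""
    return hashlib.sha256(label + b"\0" + value).digest()


def xor_stream(key, data):
    """Toy cipher: XOR data with a SHAKE-256 keystream (illustration only)."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class HashDb:
    """Stores only hashes: membership can be tested, values cannot be read back."""

    def __init__(self):
        self.hashes = set()

    def add(self, value):
        self.hashes.add(digest(b"index", value))

    def contains(self, value):
        return digest(b"index", value) in self.hashes


class ReplaceDb:
    """Maps hash(original) to a replacement encrypted under a key derived from original."""

    def __init__(self):
        self.entries = {}

    def add(self, original, replacement):
        key = digest(b"key", original)
        self.entries[digest(b"index", original)] = xor_stream(key, replacement)

    def replace(self, value):
        encrypted = self.entries.get(digest(b"index", value))
        if encrypted is None:
            return value  # unknown values pass through unchanged
        return xor_stream(digest(b"key", value), encrypted)
```

Because both the lookup index and the decryption key are derived from the submitted value, someone holding only the database in this model sees neither the original values nor the replacements.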
It is possible to scan HTTP request and response bodies. Body processing is
enabled and configured by the configuration items
request-acl.request-body-match and doc-acl.response-body-match. The actions
html-save, html-hash, html-alert, and html-replace are most useful when used
for processing an HTTP request body.
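To show why these actions fit request bodies, the following Python sketch extracts form values from a URL-encoded request body and renders each one in the hexadecimal |xx xx| notation used by the Snort content option. It is an illustration under assumptions: the function names snort_hex and form_values_as_hex are hypothetical, and the exact layout of the file written by html-save is not specified here.

```python
from urllib.parse import parse_qsl


def snort_hex(value):
    """Render bytes in Snort's |xx xx| hexadecimal content notation."""
    return "|" + " ".join(f"{byte:02X}" for byte in value) + "|"


def form_values_as_hex(body):
    """Decode an application/x-www-form-urlencoded body; return (name, hex) pairs."""
    pairs = parse_qsl(body.decode("ascii"), keep_blank_values=True)
    return [(name, snort_hex(value.encode("utf-8"))) for name, value in pairs]


# Usage: percent-encoded characters are decoded before conversion to hex.
result = form_values_as_hex(b"user=bob&pw=a%20b%21")
```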