HTTP Message Processing in Lua

In this section

General Information

Script for Message Processing

Tables Used in Scripts

Available Auxiliary Modules

General Information

The Dr.Web ICAPD component can run scripts through the Lua interpreter (version 5.3.4, supplied together with Dr.Web for UNIX Internet Gateways). The component can use scripts written in Lua to analyze and process HTTP messages.

An HTTP message (request or response) received for scanning from the proxy server via the ICAP protocol is analyzed by the Lua script specified in the Dr.Web ICAPD settings as the value of the MessageHook parameter. The value can be either a snippet of Lua code or a path to a file that contains the processing program.

Script for Message Processing

Requirements for the Script

The script must contain a global function that serves as the entry point to the message scanning module (Dr.Web ICAPD calls this function to process each newly received message). The processing function must follow these calling conventions:

1. The function name is message_hook;

2. The only argument is the MessageContext table (it gives the function access to the information about the processed HTTP message);

3. The single return value is a string. The return value determines the verdict for the scanned message, i.e. whether to pass it or to block it. Possible values:

"pass": the message is passed to the recipient (an HTTP request to the server, an HTTP response to the client).

"block": the HTTP message is not delivered to the recipient; the client receives an HTTP response with a block page.

If the function returns any other value, or if an error occurs while the function is running, this is treated as a scanning error; in that case the response to the client depends on the value of the BlockUnchecked configuration parameter.

Below is an example of a correct function definition that always returns the pass verdict to Dr.Web ICAPD for every HTTP message received for scanning (here and below, the ctx argument is an instance of the MessageContext table):

function message_hook(ctx)
  return "pass"
end
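Because any other return value or a runtime error is treated as a scanning error, a script may prefer to decide the outcome itself rather than leave it to BlockUnchecked. The following is a minimal sketch of that idea, not a recommended policy: decide is a hypothetical local function that holds the real checks (blocking TRACE requests is purely illustrative), and the wrapper relies only on the standard pcall function to fail closed.

```lua
-- Hypothetical inner policy; replace the body with real checks.
local function decide(ctx)
  if ctx.direction == "request" and ctx.request.method == "TRACE" then
    return "block"
  end
  return "pass"
end

function message_hook(ctx)
  -- pcall catches any runtime error raised inside decide
  local ok, verdict = pcall(decide, ctx)
  if ok and (verdict == "pass" or verdict == "block") then
    return verdict
  end
  -- Error or unexpected value: fail closed instead of reporting a scanning error
  return "block"
end
```

Returning "pass" in the last line instead would make the same wrapper fail open.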

The script in the next example blocks access to all resources except the Dr.Web documentation website for all users except the members of the WebAdmins group in Active Directory:

local dwl = require "drweb.lookup"

function message_hook(ctx)

  -- Do not block access to the Doctor Web
  -- documentation website
  if ctx.request.url.in_list{"download.geo.drweb.com"} then
    return "pass"
  end

  -- Allow access for users from the WebAdmins group
  -- in Active Directory
  if dwl.check("WebAdmins", "AD@WinRoot", ctx.icap.user) then
    return "pass"
  end

  -- Block access for everyone else (to all resources)
  return "block"

end

Tables Used in Scripts

Table MessageContext

It is used as an input argument of the message_hook function. It provides access to the information about the processed HTTP message (its type, headers, body, information about the sender and the recipients, if available).

| Field | Description | Data type |
|---|---|---|
| direction | The HTTP message type. It can have the following values: "request" (an HTTP request) or "response" (an HTTP response). | String |
| icap | Information on ICAP request headers. | Table ICAP |
| request | Information on HTTP request headers. | Table Request |
| response | Information on HTTP response headers. | Table Response |
| body | Information on HTTP message body. | Table Body |

Overridden metamethods: None

Table ICAP

The table is used as the icap field of the MessageContext table. It contains the data on ICAP requests from an HTTP proxy server.

| Field | Description | Data type |
|---|---|---|
| user | Information about the user obtained from the X-Client-Username header of the ICAP request. | Table User |
| src | IP address of the client that sent the request (obtained from the X-Client-IP header of the ICAP request sent by the proxy server), or nil if the address is unknown. | Table IpAddress |
| field | Array of ICAP request headers. | Array of HeaderField tables |
| search | Function that searches the headers by regular expression. It takes a required patterns argument: one regular expression (a string) or several (an array of strings) in the Perl syntax (PCRE). All available headers are searched. Note that in quoted strings the backslash character must be escaped. Returns true if the string field.name .. ": " .. field.value.decoded matches at least one of the specified regular expressions for at least one of the headers, and false if no matches have been found. | Function |
| value | Function that takes a single argument, the name of the header (a string). It returns the value of the first header with the specified name, or nil if there is no header with this name. | Function |

Overridden metamethods: None
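As an illustration of the ICAP table, the sketch below passes a message only when the client address is known, lies in one subnet, and the proxy supplied a user name. The subnet 10.0.0.0/8 and the policy itself are assumptions made for the example; belongs is the IpAddress function described in the drweb module section, called with a dot in the same style as the other examples in this section.

```lua
-- Hypothetical policy: only identified clients from 10.0.0.0/8 may pass.
function message_hook(ctx)
  local src = ctx.icap.src

  -- src is nil when the client address is unknown
  if src == nil or not src.belongs("10.0.0.0/8") then
    return "block"
  end

  -- value() returns nil when the header is absent
  if ctx.icap.value("X-Client-Username") == nil then
    return "block"
  end

  return "pass"
end
```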

Table User

The table contains the name and the domain of the user; both fields are optional.

| Field | Description | Data type |
|---|---|---|
| user | User name. | String |
| domain | User domain. | String |

Overridden metamethods:

__tostring: the function returns the User content as a string (in UTF-8);

__concat: the function concatenates the User string value and another string.

Table HeaderField

The table describes the HTTP or ICAP message header.

| Field | Description | Data type |
|---|---|---|
| name | Header name. | String |
| value | Header value. | String |

Overridden metamethods: None

Table Request

The table describes the headers of the HTTP request.

| Field | Description | Data type |
|---|---|---|
| method | HTTP method used in the request (for example, "POST"), or nil if the ICAP request does not contain the HTTP request header. | String |
| url | URL of the resource to which the HTTP request is sent. | Table Url |
| content_type | Information from the Content-Type header of the HTTP request. | Table ContentType |
| field | Array of HTTP request headers. | Array of HeaderField tables |
| search | Function that searches the headers by regular expression. It takes a required patterns argument: one regular expression (a string) or several (an array of strings) in the Perl syntax (PCRE). All available headers are searched. Note that in quoted strings the backslash character must be escaped. Returns true if the string field.name .. ": " .. field.value.decoded matches at least one of the specified regular expressions for at least one of the headers, and false if no matches have been found. | Function |
| value | Function that takes a single argument, the name of the header (a string). It returns the value of the first header with the specified name, or nil if there is no header with this name. | Function |

Overridden metamethods: None
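The escaping note for search is easiest to see side by side. In a quoted Lua string the backslash that introduces a PCRE class such as \d has to be doubled (Lua 5.3 rejects an unknown escape such as "\d" when the script is loaded), while a long-bracket string needs no escaping. The sketch below assumes a hypothetical rule that blocks requests whose User-Agent header starts with curl; the pattern is anchored at both ends so that it reads the same whether the component matches the whole header line or a part of it.

```lua
-- Both literals denote the same pattern text: ^User-Agent: curl/\d.*$
local quoted = "^User-Agent: curl/\\d.*$"   -- backslash doubled in a quoted string
local long   = [[^User-Agent: curl/\d.*$]]  -- long brackets need no escaping

function message_hook(ctx)
  -- Hypothetical rule: block requests sent by curl.
  if ctx.direction == "request" and ctx.request.search(quoted) then
    return "block"
  end
  return "pass"
end
```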

Table ContentType

The table describes a value obtained from the Content-Type header.

| Field | Description | Data type |
|---|---|---|
| type | MIME type of the message part. | String |
| subtype | Subtype of the message part. | String |
| param | Header parameters as an array of tables with the following fields: name, the name of a parameter (string); value, the value of a parameter (string). | Array of tables |
| match | Function that takes a single argument media_types, an array of strings describing MIME types. Each string in the list must have one of the following forms: "type/subtype", "type/*" or "*/*". Returns true if the MIME type of the body matches any of the specified strings (not case-sensitive) or the array contains the string "*/*", and false otherwise. | Function |

Overridden metamethods:

__tostring is the function that returns the decoded header value;

__concat is the function that concatenates the decoded value of the header and a string.
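The matching rule of the match function can be modeled in plain Lua. This is only a reading of the description above, not the component's implementation; media_type_matches is a hypothetical name, and the type and subtype are passed in explicitly so the sketch can run outside Dr.Web ICAPD.

```lua
-- Reference model of the documented rule: "type/subtype", "type/*" or "*/*",
-- compared without regard to case.
local function media_type_matches(type_, subtype, media_types)
  local t, s = type_:lower(), subtype:lower()
  for _, pattern in ipairs(media_types) do
    local p = pattern:lower()
    if p == "*/*" or p == t .. "/*" or p == t .. "/" .. s then
      return true
    end
  end
  return false
end

print(media_type_matches("Text", "HTML", { "text/html" }))
print(media_type_matches("application", "zip", { "text/*", "image/png" }))
```

Inside a hook the built-in function would be called directly, for example ctx.body.content_type.match{"application/zip", "image/*"}.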

Table Url

The table describes a URL.

| Field | Description | Data type |
|---|---|---|
| scheme | Scheme (protocol) prefix, for example, "http". If the prefix is missing, the value is nil. | String |
| host | Host name or IP address, for example, "example.com". If the host name is missing, the value is nil. | String |
| port | Port number, for example, 80. If the number is missing, the value is nil. | Number |
| path | Path to a resource, for example, "index.html". If the path is missing, the value is nil. | String |
| query | Decoded request parameters. If the parameters are missing, the value is nil. | String |
| legal_url | If the URL belongs to the owners_notice category, the field contains a URL of the owner’s website; otherwise, it is nil. | String |
| in_list | Function that takes a single argument hosts, the host list (an array of strings). Returns true if host is a subdomain of one of the specified domains or matches one of them, and false otherwise. | Function |
| categories | Function that takes one optional argument filter, a UrlCategoryFilter table (omitting the argument is equivalent to passing an empty table). It returns an iterator function that iterates over all categories of the URL that meet the conditions specified by filter. | Function |
| in_categories | Function that takes a single argument categories, the URL category list (an array of strings). Returns true if the URL falls under at least one of the specified categories, and false if it does not fall under any of them. If the categories array is empty, it always returns false. See the possible category values in the description of the category field in the UrlCategoryFilter table. | Function |
| raw | A raw, undecoded URL. | Table RawUrl |

Overridden metamethods:

__tostring: the function returns the Url content as a string (in UTF-8);

__concat: the function concatenates the Url string value and another string.
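A typical use of the Url table combines in_list and in_categories. The sketch below is an assumption-laden illustration: the host names in allowed_hosts and the choice of blocked_categories are hypothetical, and only requests are inspected.

```lua
-- Hypothetical allowlist and category policy, for illustration only.
local allowed_hosts = { "example.com", "intranet.example.org" }
local blocked_categories = { "adult_content", "gambling", "anonymizers" }

function message_hook(ctx)
  if ctx.direction ~= "request" then
    return "pass"
  end

  local url = ctx.request.url

  -- Hosts on the allowlist (and their subdomains) always pass
  if url.in_list(allowed_hosts) then
    return "pass"
  end

  -- Block the URL if it falls under any of the listed categories
  if url.in_categories(blocked_categories) then
    return "block"
  end

  return "pass"
end
```

Checking the allowlist first means an allowed host passes even if it is also listed in a blocked category.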

Table RawUrl

The table contains the undecoded URL data.

| Field | Description | Data type |
|---|---|---|
| scheme | Scheme (protocol) prefix, for example, "http". If the prefix is missing, the value is nil. | String |
| host | Host name or IP address, for example, "example.com". If it is missing, the value is nil. | String |
| port | Port number, for example, 80. If the port number is missing, the value is nil. | Number |
| path | Path to a resource, for example, "index.html". If the path is missing, the value is nil. | String |
| query | Request parameters (not decoded). If the parameters are missing, the value is nil. | String |

Overridden metamethods:

__tostring: the function returns the RawUrl content as a string (in UTF-8);

__concat: the function concatenates the RawUrl string value and another string.

Table UrlCategoryFilter

A table describing a filter for URL categories (all fields are optional):

Field

Description

Data type

category

The list of categories that the URL must match (not case-sensitive). The list may contain the following values:

"infection_source"—an infection source;

"not_recommended"—a source that is not recommended for visiting;

"adult_content"—adult content;

"violence"—violence;

"weapons"—weapons;

"gambling"—gambling;

"drugs"—drugs;

"obscene_language"—obscene language;

"chats"—chats;

"terrorism"—terrorism;

"free_email"—free email;

"social_networks"—social networks;

"owners_notice"—websites added due to a notice from copyright owner;

"online_games"—online games;

"anonymizers"—anonymizers;

"cryptocurrency_mining_pools"—cryptocurrency mining pools;

"jobs"—job search websites;

"black_list"—black list.

String or table of strings

category_not

The list of categories that the URL must not match (not case-sensitive).

String or table of strings

Overridden metamethods: None

If a filter field is not specified (the value is nil), any URL matches the filter. If several filter fields are specified, the conditions are combined by conjunction (logical AND). If a filter field is a table (list), the object must match at least one of the table (list) items.
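The iterator returned by the categories function of the Url table can be consumed with a generic for loop. The sketch below assumes that each iteration yields one category name as a string, which the description implies but does not state; collect_categories is a hypothetical helper, and the policy (block any categorized URL except those that are only in owners_notice) is for illustration.

```lua
-- Collects the categories yielded by url.categories(filter) into an array.
-- Assumption: each iteration yields one category name as a string.
local function collect_categories(url, filter)
  local found = {}
  for category in url.categories(filter) do
    found[#found + 1] = category
  end
  return found
end

function message_hook(ctx)
  if ctx.direction ~= "request" then
    return "pass"
  end
  -- Every category except owners_notice counts against the URL here
  local hits = collect_categories(ctx.request.url, { category_not = "owners_notice" })
  if #hits > 0 then
    return "block"
  end
  return "pass"
end
```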

Table Response

The table describes the HTTP response headers.

| Field | Description | Data type |
|---|---|---|
| status | HTTP response code, or nil if the ICAP request does not contain the HTTP response header. | Number |
| reason | Comment to the response code, or nil if the comment is missing. | String |
| content_type | Information obtained from the Content-Type header of the HTTP response. | Table ContentType |
| field | Array of HTTP response headers. | Array of HeaderField tables |
| search | Function that searches the headers by regular expression. It takes a required patterns argument: one regular expression (a string) or several (an array of strings) in the Perl syntax (PCRE). All available headers are searched. Note that in quoted strings the backslash character must be escaped. Returns true if the string field.name .. ": " .. field.value.decoded matches at least one of the specified regular expressions for at least one of the headers, and false if it does not match any regular expression for any of the headers. | Function |
| value | Function that takes a single argument, the header name (a string). It returns the value of the first header with the specified name, or nil if there is no header with this name. | Function |

Overridden metamethods: None

Table Body

A table describing the HTTP message body.

Field

Description

Data type

has_threat

The function that takes one optional argument filter, i.e. the ThreatFilter table (omitting the argument is equivalent to passing an empty table). It returns a Boolean value:

true: if the HTTP message body contains a threat that meets the specified filter condition;

false: if there is no threat that meets the specified condition in the message body.

Function

threats

The function that takes one optional argument filter, i.e. the ThreatFilter table (omitting the argument is equivalent to passing an empty table). It returns an iterator function that iterates over all threats detected in the HTTP message body. The threats are described using the Virus table.

Function

content_type

Contains the information on MIME type of the body obtained from the Content-Type header of the HTTP request or response (depending on what type of message is being analyzed).

Table ContentType

scan_error

Body scan error, if occurred, otherwise nil. Possible values:

"path_not_absolute"—the path indicated is not absolute;

"file_not_found"—file was not found;

"file_not_regular"—the file is not a regular file;

"file_not_block_device"—it is not a block device;

"name_too_long"—the name is too long;

"no_access"—access denied;

"read_error"—reading error occurred;

"write_error"—a writing error;

"file_too_large"—the file is too large;

"file_busy"—file is being used;

"unpacking_error"—an unpacking error;

"password_protected"—the archive is password protected;

"arch_crc_error"—CRC archive error;

"arch_invalid_header"—invalid archive header;

"arch_no_memory"—not enough memory to unpack archive;

"arch_incomplete"—incomplete archive;

"can_not_be_cured"—file cannot be cured;

"packer_level_limit"—packed object nesting level limit exceeded;

"archive_level_limit"—archive nesting level limit exceeded;

"mail_level_limit"—mail file nesting level limit exceeded;

"container_level_limit"—container nesting level limit exceeded;

"compression_limit"—compression rate limit exceeded;

"report_size_limit"—report size limit exceeded;

"scan_timeout"—scan timeout limit exceeded;

"engine_crash"—scan engine failure;

"engine_hangup"—scan engine hangup;

"engine_error"—scan engine error;

"no_license"—no active license found;

"multiscan_too_late"—multiscanning error;

"curing_limit_reached"—cure attempts limit exceeded;

"non_supported_disk"—disk type is not supported;

"unexpected_error"—an unexpected error.

String

Overridden metamethods: None
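The has_threat function and the scan_error field are often checked together. In the sketch below, which scan errors count as reason enough to block is a hypothetical choice (the uninspectable set); the sketch also assumes that ctx.body is always present, as the other examples in this section do.

```lua
-- Hypothetical policy: scan errors that mean the body could not be inspected.
local uninspectable = {
  password_protected = true,
  archive_level_limit = true,
  scan_timeout = true,
}

function message_hook(ctx)
  local body = ctx.body

  -- Any detected threat blocks the message (no filter: every threat matches)
  if body.has_threat() then
    return "block"
  end

  -- scan_error is nil when the body was scanned without errors
  local err = body.scan_error
  if err ~= nil and uninspectable[err] then
    return "block"
  end

  return "pass"
end
```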

Table Virus

Table that describes a threat.

Field

Description

Data type

name

Threat name (according to the Doctor Web classification)

String

type

Threat type (according to the Doctor Web classification). Possible values:

"known_virus"—a known threat (a threat that has a description in the virus databases);

"virus_modification"—a modification of the known threat;

"unknown_virus"—an unknown threat, suspicious object;

"adware"—an advertising program;

"dialer"—a dialer program;

"joke"—a joke program;

"riskware"—a potentially dangerous program;

"hacktool"—a hacktool.

String

Overridden metamethods: None

Table ThreatFilter

The table describes the filter for threats; all fields are optional.

| Field | Description | Data type |
|---|---|---|
| category | List of categories that the threat must match (not case-sensitive). See the list of categories in the description of the type field of the Virus table. | String or table of strings |
| category_not | List of categories that the threat must not match (not case-sensitive). | String or table of strings |

Overridden metamethods: None

If the filter field is not specified (the value is nil), any threat matches the filter. If several filter fields are specified, then the condition is combined by a conjunction (logical AND). If the filter field is a table (list), the object must match at least one of the table (list) items.

Usage examples:

1. Writing to the log the names of all threats found in the HTTP message:

local dw = require "drweb"

function message_hook(ctx)
  for virus in ctx.body.threats() do
    dw.notice("threat found: " .. virus.name)
  end
  return "pass"
end

2. Writing to the log the names of the threats that match the category filter:

local dw = require "drweb"

function message_hook(ctx)
  for v in ctx.body.threats({category = "known_virus"}) do
    dw.notice("found known virus: " .. v.name)
  end
  return "pass"
end
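The filter rule stated for the ThreatFilter table (an omitted field matches anything, several fields are combined with AND, a list matches if any of its items matches, comparison is not case-sensitive) can be modeled in plain Lua. This is a sketch of the documented behaviour, not the component's code: threat_matches, as_list and contains are hypothetical names, and for category_not the sketch assumes that a threat is rejected when its type equals any listed item.

```lua
-- Normalizes a filter field: nil stays nil, a string becomes a one-item list.
local function as_list(v)
  if v == nil then return nil end
  if type(v) == "string" then return { v } end
  return v
end

-- Case-insensitive membership test; value must already be lowercase.
local function contains(list, value)
  for _, item in ipairs(list) do
    if item:lower() == value then return true end
  end
  return false
end

-- Returns true when a threat of the given type satisfies the filter.
local function threat_matches(filter, threat_type)
  filter = filter or {}
  local t = threat_type:lower()
  local must, must_not = as_list(filter.category), as_list(filter.category_not)
  if must and not contains(must, t) then return false end
  if must_not and contains(must_not, t) then return false end
  return true
end

print(threat_matches({ category = { "adware", "joke" } }, "joke"))
print(threat_matches({ category_not = "joke" }, "joke"))
```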

Available Auxiliary Modules

The following Dr.Web-specific modules can be imported into a Lua program to interact with Dr.Web for UNIX Internet Gateways.

| Name of the module | Function |
|---|---|
| drweb | The module that provides functions to record messages from the Lua program to the log of the Dr.Web for UNIX Internet Gateways component which has launched the Lua program and the means of asynchronous execution of Lua procedures. |
| drweb.lookup | The module that provides tools to request data from external sources by calling the Dr.Web LookupD module. |
| drweb.regex | The module that provides an interface to match strings and regular expressions. |
| drweb.config | The module that provides a table with the Dr.Web ICAPD configuration parameter values. |

Contents of the drweb Module

1. Functions

The module provides a set of functions.

Saving messages from the Lua program in the Dr.Web for UNIX Internet Gateways component log:

log(<level>, <message>) writes the <message> string to the Dr.Web for UNIX Internet Gateways log at the <level> level (the level is specified with one of the strings "debug", "info", "notice", "warning", or "error");

debug(<message>) writes the <message> string to the Dr.Web for UNIX Internet Gateways log at the DEBUG level;

info(<message>) writes the <message> string to the Dr.Web for UNIX Internet Gateways log at the INFO level;

notice(<message>) writes the <message> string to the Dr.Web for UNIX Internet Gateways log at the NOTICE level;

warning(<message>) writes the <message> string to the Dr.Web for UNIX Internet Gateways log at the WARNING level;

error(<message>) writes the <message> string to the Dr.Web for UNIX Internet Gateways log at the ERROR level.

Managing the synchronization of Lua procedures:

sleep(<sec.>) pauses the execution of a Lua procedure instance for a specified number of seconds.

async(<Lua function>[, <argument list>]) launches the specified function asynchronously and passes to it the specified argument list. The async function call completes immediately, and the return value (the table Future) allows you to obtain the result of the <Lua function>.

Representing IP addresses as IpAddress tables:

ip(<address>) returns the IP address passed as the <address> string in the form of an IpAddress table. Both IPv4 and IPv6 addresses can be used.

Loading external data from a text file:

load_set(<file path>) generates a table with true values from the contents of the specified text file; the lines read from the file are used as keys. Empty lines and lines consisting only of whitespace characters are ignored;

load_array(<file path>) generates an array of strings from the contents of the specified text file. Empty lines and lines consisting only of whitespace characters are ignored and are not included in the array.

2. Tables

The Future table describes the pending result of a function started using the async function.

| Field | Description | Data type |
|---|---|---|
| wait | A function that returns the result of the function started using the async function. If the started function has not completed its execution yet, wait waits for the completion and returns the result. If the function has completed before wait is called, the result is returned immediately. If the started function fails, the wait call generates the same error. | Function |

Overridden metamethods: None

The IpAddress table describes an IP address.

| Field | Description | Data type |
|---|---|---|
| belongs | Function that checks whether the IP address stored in the IpAddress table belongs to the specified subnets (IP address ranges). It takes a single argument, a string of the form "<IP address>" or "<IP address>/<mask>", where <IP address> is a host address or a network address (for example, "127.0.0.1") and <mask> is a subnet mask (specified either as an IP address, for example, "255.0.0.0", or in numerical form, for example, "8"). Returns true if the address equals at least one of the specified addresses or belongs to at least one of the specified subnets (ranges of IP addresses), and false otherwise. | Function |

Overridden metamethods:

__tostring is a function that converts IpAddress to a string, for example: "127.0.0.1" (IPv4) or "::1" (IPv6);

__concat is a function that joins IpAddress to a string;

__eq is a function that checks the equality of two IpAddress tables;

__band is a function that applies a mask, for example: dw.ip('192.168.1.2') & dw.ip('255.255.254.0').

3. Examples

Writing to the log the messages generated by a procedure started asynchronously:

local dw = require "drweb"

-- This function waits two seconds and returns a string,
-- received as an argument
function out_msg(message)
 dw.sleep(2)
 return message
end

-- "Main" function
function intercept(ctx)
 -- Output of a string at the NOTICE level to the Dr.Web for UNIX Internet Gateways log
 dw.notice("Intercept function started.")

 -- An asynchronous start of two copies of the out_msg function
 local f1 = dw.async(out_msg, "Hello,")
 local f2 = dw.async(out_msg, " world!")

 -- Wait for both copies of the out_msg function to complete
 -- and write their results to the Dr.Web for UNIX Internet Gateways log
 -- at the DEBUG level
 dw.log("debug", f1.wait() .. f2.wait())
end

Creating a scheduled procedure:

local dw = require "drweb"

-- Save the Future table in the global variable future
-- to prevent its removal by the garbage collector
future = dw.async(function()
   while true do
     -- Once a day, the following message is written to the log
     dw.sleep(60 * 60 * 24)
     dw.notice("A brand new day began")
   end
end)

Converting an IP address represented as a string into an IpAddress table:

local dw = require "drweb"

local ipv4 = dw.ip("127.0.0.1")
local ipv6 = dw.ip("::1")
local mapped = dw.ip("::ffff:127.0.0.1")

 

Contents of the drweb.lookup Module

1. Functions

The module provides the following functions:

lookup(<request>, <parameters>) requests data from an external storage available via the Dr.Web LookupD module. The <request> argument must correspond to a section in the Dr.Web LookupD settings (the string <type>@<tag>). The <parameters> argument is optional. It describes substitutions that will be used to generate a request. The following markers are substituted automatically:

$u, $U are automatically replaced with user, the user name sent by the client component;

$d, $D are automatically replaced with domain, the domain sent by the client component.

These arguments are set as a table. Keys and values of this table must be strings. The function returns an array of strings that are results of the request;

check(<checked string>, <request>, <parameters>) returns true if <checked string> is found in the external repository, available via the Dr.Web LookupD module. The arguments <request> and <parameters> are equivalent to the arguments of the lookup function (see above). The <checked string> argument is supposed to be a string or a table with the __tostring metamethod (i.e. that can be formatted into a string).

2. Examples

Writing to the log the list of users retrieved from the LookupD.LDAP.users data source:

local dw = require "drweb"
local dwl = require "drweb.lookup"

-- "Main" function
function intercept(ctx)
 -- Writing the string at the NOTICE level to the Dr.Web for UNIX Internet Gateways log
 dw.notice("Intercept function started.")

 -- Writing the results of the request to the 'ldap@users' data source
 -- to the Dr.Web for UNIX Internet Gateways log
 for _, s in ipairs(dwl.lookup("ldap@users", {user="username"})) do
   dw.notice("Result for request to 'ldap@users': " .. s)
 end

end

Contents of the drweb.regex Module

1. Functions

The module provides the following functions:

search(<template>, <text>[, <flags>]) returns true if the <text> string contains a substring that matches the <template> regular expression. The optional <flags> parameter (integer) is a set of flags that affect the function behavior, combined with a bitwise OR.

match(<template>, <text>[, <flags>])—the same as search except that the <template> regular expression must match the entire <text> string, not only its substring.

2. Available flags

ignore_case ignores text case.

3. Examples

local rx = require "drweb.regex"

rx.search("te.?t", "some TexT") -- false
rx.search("te.?t", "some TexT", rx.ignore_case) -- true

rx.match("some.+", "some TexT") -- true

Contents of the drweb.config Module

1. Functions

The module does not provide any functions.

2. Available tables

The module provides a table with the following fields:

| Field | Description | Data type |
|---|---|---|
| whitelist | Value of the Whitelist configuration parameter. | String array |
| blacklist | Value of the Blacklist configuration parameter. | String array |
| adlist | Value of the Adlist configuration parameter. | String array |
| block_url_categories | List of blocked URL categories (based on the Block* parameters set to Yes). | String array |
| block_threats | List of blocked threat categories (based on the Block* parameters set to Yes). | String array |
| block_unchecked | Value of the BlockUnchecked configuration parameter. | Boolean |

Overridden metamethods: None

3. Example

local cfg = require "drweb.config"

function message_hook(ctx)

  -- Block messages containing threats
  -- from the list of threats to be blocked
  if ctx.body.has_threat{category = cfg.block_threats} then
    return "block"
  end

  -- Permit access to all other resources
  return "pass"

end