Please help to improve this documentation.

/!\ Further notice: this page already documents some planned features that do not exist yet.

RXPD Documentation

Overview

The RegexPolicyDaemon (rxpd) can be used to efficiently check data against different lists of regular expressions. This can be used to build whitelists/blacklists to protect many kinds of Internet services. It uses a simple textual protocol that is easy to implement in scripting languages.

Example usages are access and content control (spam filtering) for CGI scripts, wikis, email, revision control systems, IRC servers and clients, and so on.

Building and Installing

Release Tarballs

Release tarballs are attached to the wiki at:

 * http://www.pipapo.org/pipawiki/RegexPolicyDaemon?action=AttachFile

The tarballs are distributed gpg signed. As a first step, check the signature:

  $ gpg rxpd-X.Y.tar.gz.gpg

This will produce rxpd-X.Y.tar.gz and report whether the signature could be validated.

Since the package is built with GNU autotools, the usual build and install procedure works:

  $ tar xzvf rxpd-X.Y.tar.gz
  $ cd rxpd-X.Y
  $ mkdir build && cd build    # using a build directory is optional
  $ ../configure
  $ make
  $ make install

Development Version via git

The development version is available via git from 'git://git.pipapo.org/rxpd' or mirrored at repo.or.cz 'git://repo.or.cz/rxpd.git'.

After cloning the repository you need to bootstrap the autotools first:

  $ autoreconf -i

Then the usual configure / make will work.

There is a special makefile target, 'make meta', which brings several files (README, AUTHORS, NEWS, TODO) in sync with the Rxpd Documentation wiki and updates the ChangeLog.

Dependencies

Rxpd requires libevent and its development headers.

What gets installed

A single executable called 'rxpd' will be installed in $prefix/bin.

Concepts

Rxpd aims to be simple and to validate data efficiently against regular expressions. It has no configuration file for the daemon itself (yet) and is controlled by command line options.
Most management of regular expression lists can be done remotely over a simple protocol. By itself rxpd has no authentication, but there is a 'policy' check which validates incoming requests against a special regex list, which then defines whether the client is allowed to do a certain task. Any further management, such as distributing the lists or authenticating sessions more strongly, should be done by other means and is not planned to be included in rxpd.

The goal is to create a common place which applications can use to validate any kind of data. This works efficiently because short-lived programs like CGI scripts take advantage of regular expressions which are kept precompiled in core, and such lists can generally be shared between different applications.

Commandline Options and Starting

WIP

-v  Increase verbosity level. Verbosity levels correspond to syslog levels, where LOG_WARNING is the default. Thus -v can be given up to three times to get LOG_DEBUG.
-V  Just show version information and then exit.
-d  Detach from the terminal and run in the background.
-D  Run in debug mode, that is, increase verbosity to at least LOG_INFO and don't detach. An additional -v can be used for LOG_DEBUG.
-b dir  Give the basedir for rules. Rules have to be in a single directory; rxpd will never access any data outside of this directory.
-q  Turn rxpd quiet; only LOG_ALERT or worse will be logged.
-t port  Give a port number for a tcp port to listen on. This option can appear multiple times. Only root can listen on ports less than 1024.
-u name  Path for a unix local socket to listen on. This option can appear multiple times.
-p  Define which rules list will be used to limit access to the rxpd itself. If not given, no access restrictions apply (everything allowed!). Policy matching will be described in detail later.
-i  Turn all regular expressions case insensitive.
-4  Use only IPv4. The default is both IPv4 and IPv6, when compiled in.
-6  Use only IPv6.
-h  Give a short usage notice and then exit.
-U user  When started as root, switch to 'user'. If not given, 'nobody' is used.

List Syntax

There are only two things allowed in a list file:

 * Comments
   + Beginning with a '#' in the first column, followed by arbitrary text. Comments are preserved and have semantic meaning, as they can be used to organize the data. Comments beginning with #OK: and #ERROR: have special meaning, see below.
 * Rules
   + Starting with an optional access time (atime) entry, then a name, followed by a regex. These three parts are delimited by colons.
   + 'atime' is maintained by the daemon to reflect the last time the rule matched some data. This is the time in seconds since the epoch in UTC.
   + 'name' is an arbitrary string which has no special meaning for rxpd but is sent back to the calling application, which can use it to classify results.
   + The regex is a POSIX extended regular expression. Regexes are currently case-insensitive; this will become configurable later.

Lines can be at most 4095 bytes long.

Example list file:

  # Free things are good!
  :accept:GNU|Linux
  0:accept:FreeBSD
  # Bad things
  0:reject:M.*soft

Matches report the line matched, without the atime and the first colon. "Macrosoft" matches "M.*soft", thus "reject:M.*soft" will be returned.

Note that the first 'accept' rule has no atime. To start tracking atimes, they can be initialized with '0'; the daemon will update them on access and rewrite the list files on the 'SAVE' command or when it receives a SIGTERM.

When there is an error in a regular expression, the rule will be replaced with '#ERROR:', followed by the cause of the error, followed by the rule's string in quotes.

Protocol

Rxpd uses a simple line-based text protocol. The first line is always the command and the list which will be used on the following data; it is not possible to change the command throughout a session. Each session will generate at least one line of response.
When no other output is available, '#OK:' is sent; in case of an error, a line starting with '#ERROR:' is sent. Lines end with any combination of the 'newline' and/or 'carriage return' characters. Lines which are longer than 4095 characters are broken (in future they may be word-wrapped at the last whitespace character in the line).

Commands:

 * 'CHECK:list\n..data..'
   + Check all following data against the list. Returns the first matching rule (excluding the 'atime' field), if any. When an empty line is sent, the daemon answers with "#OK:". This can be used to synchronize the queries before sending new data.
 * 'APPEND:list\n..rules..'
   + Append the following lines to list.
 * 'PREPEND:list\n..rules..'
   + Prepend the following lines to list.
 * 'REMOVE:list\n..rules..'
   + Remove all matching lines from list.
 * 'REPLACE:list\nrule\n..replacements..'
   + Find the position matching the first line, which can be a rule or a comment, and replace it with the following rules. Updates are atomic and done when either an empty line is sent or the connection gets closed.
 * 'LOAD:list\n'
   + Reload list from disk. This resets the 'atime' to the values stored on disk.
 * 'SAVE:list\n'
   + Save list to disk, including the new atime records.
 * 'EXPIRE:list\nseconds'
   + Remove all rules from list which are subject to atime updates and were not touched for the given number of seconds.
 * 'SYNC:list\nremote'
   + Fetch a list from remote, which has the form address/listname, where address is either 'ip:port' or a path to a unix domain socket. Then update the atimes in 'list' to newer ones from the remote list. Idea: do we want 'SYNC:list\nremote:policylist', which gives a local list filtering remote first?
 * 'MERGE:list\nremote'
   + Fetch a list from remote, which has the form address/listname, where address is either 'ip:port' or a path to a unix domain socket. Then merge new rules from remote. Idea: do we want 'SYNC:list\nremote:policylist', which gives a local list filtering remote first?
 * 'DUMP:list\n'
   + Dump the content of list.
 * 'LIST:\n'
   + List all loaded lists.
 * 'SHUTDOWN:\n'
   + Exit the daemon gracefully. Pending connections will still be served, but no new connections are accepted.
 * 'VERSION:\n'
   + Print package and version information.
 * 'HELP:\n'
   + Give a short list of available commands.

Using the rxpd

WIP

Access Policies

One list of rules can be used to define access policies for the rxpd itself (-p option). Each command is extended with the access protocol (one of tcp4, tcp6 or unix) and the peer address, and then checked against this policy list. When this check yields an 'ACCEPT:..' rule, the command is allowed; anything else results in an error and the connection is dropped.

For example, if '-p policy' is used:

  # Syntax:
  # [atime]:rulename:command:list:proto:address
  #
  # Allow dumping of the 'policy' list itself
  :ACCEPT:DUMP:policy
  # Clients from local network are allowed to do anything
  :ACCEPT:.*:tcp.:10\..*$
  # Forbid all others to do anything else with the policy
  :REJECT:.*:policy
  # Finally allow anything else
  :ACCEPT:.*

Example

We want to protect a wiki or similar against vandalism. Blacklists are in $blacklists.d/, let's say /etc/blacklists.d/.

The wiki engine builds a tuple hostname;ip which is checked against a blacklist that classifies the hosts. This is /etc/blacklists.d/hosts:

  # local access is always trusted, that's localhost and my local network
  :allow:localhost;127.0.0.1
  :allow:mydomain.org;10.10.
  # some really bad guys are put on a blacklist which never ever shall get access
  :deny:.*aol.com;
  # everyone else shall just get the content checked
  :check:

So printf("CHECK:hosts\n%s;%s\n", hostname, ipaddr) sent to the blacklist daemon will result in a rule named either 'allow', 'deny' or 'check' being sent back. The first two results (allow/deny) are handled in the obvious way. With the 'check' result, the edited content will be filtered against another list, '/etc/blacklists.d/content':

  #example ..
  # see BadContent on this wiki
  :deny:sex.com
  :deny:warez

Demonstration

WIP

Note: there is an almost unrestricted rxpd running here as a demo (not always running):

  $ echo "LIST:" | nc www.pipapo.org 2345
  policy
  $ echo "DUMP:policy" | nc www.pipapo.org 2345
  # syntax:
  # rule:command:list:proto:address
  #
  # reject this policy file itself to external users, except dumping
  ACCEPT:DUMP:policy
  REJECT:.*:policy
  # accept all other for tests
  ACCEPT:.*

This documentation is maintained at:

 * http://www.pipapo.org/pipawiki/RegexPolicyDaemon/Documentation

RegexPolicyDaemon/Documentation (last edited 2007-10-16 08:15:35 by ct)
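
Client sketch

As an illustration of the list syntax and the CHECK command described in this document, a client could look roughly like the following Python sketch. It is not part of rxpd. All function names are hypothetical, and it rests on several assumptions that the documentation does not confirm: that data is sent as UTF-8, that the daemon answers after the client closes its sending side (as the 'echo ... | nc' demo suggests), and that the first response line is the one of interest.

```python
import socket


def build_check_request(list_name, data_lines):
    """Encode a CHECK session: the command line first, then the data lines."""
    lines = ["CHECK:" + list_name] + list(data_lines)
    return ("\n".join(lines) + "\n").encode("utf-8")


def parse_response_line(line):
    """Classify one response line as 'ok', 'error', 'match' or 'closed'."""
    if not line:
        # Assumption: an empty read means the peer closed without answering.
        return ("closed", "")
    line = line.rstrip("\r\n")
    if line.startswith("#OK:"):
        return ("ok", line[len("#OK:"):])
    if line.startswith("#ERROR:"):
        return ("error", line[len("#ERROR:"):])
    # A match is reported as 'name:regex', without the atime field.
    name, _, regex = line.partition(":")
    return ("match", (name, regex))


def parse_rule_line(line):
    """Split a list file line into a comment or an (atime, name, regex) rule.

    Raises ValueError if a rule line has fewer than two colons.
    """
    line = line.rstrip("\r\n")
    if line.startswith("#"):
        return ("comment", line)
    # Only the first two colons delimit fields; the regex may contain more.
    atime, name, regex = line.split(":", 2)
    return ("rule", (int(atime) if atime else None, name, regex))


def check(address, list_name, data, timeout=5.0):
    """Send one data line to a running daemon and parse the first reply line.

    'address' is a (host, port) tuple for a daemon started with -t.
    """
    with socket.create_connection(address, timeout=timeout) as sock:
        sock.sendall(build_check_request(list_name, [data]))
        # Assumption: half-closing tells the daemon that the session is done.
        sock.shutdown(socket.SHUT_WR)
        with sock.makefile("r", encoding="utf-8", errors="replace") as reply:
            return parse_response_line(reply.readline())
```

With a daemon listening locally (the host and port here are placeholders), the wiki example could then be queried with check(("127.0.0.1", 2345), "hosts", "localhost;127.0.0.1"), and the rule name in the result used to decide between allowing, denying, or checking the content against a second list.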