bawk is a wrapper script around GNU awk (gawk) that makes it easier to run awk scripts on standard logs generated by Bro NSM/IDS.
This software is currently considered alpha/beta, and using it on production infrastructure is not recommended. At this point I'm releasing it to seek review and input from users. If you have comments, questions or input, please create an issue. So far it has only been tested with logs in the default Bro format.
Copyright 2017 Mark Krenz ([email protected]) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
I'm looking into making this available through bro-pkg, but for now, just follow these directions.
- Download this repo to the host where your Bro logs are located.
- Copy the bin/bawk script to /opt/bro/bin or to somewhere in your PATH.
- Copy the lib directory and its contents to /opt/bro/lib/bawk or some other accessible library directory.
- If you've changed the library directory from /opt/bro/lib/bawk to something else, then adjust the directory paths in the lib/load file to match the new path.
The bawk script handles the format of the Bro logs on your behalf. You don't need to specify the field separator, and you can reference the fields in a Bro-formatted log by name.
For instance, to access the id.orig_h field, just use $_b["id.orig_h"]
Here is an example of running bawk against an ssh.log file to print lines where the origin IP is 10.0.5.7 and the authentication was successful:
bawk '$_b["id.orig_h"]=="10.0.5.7" && $_b["auth_success"]=="T"' ssh.log
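For comparison, here is a rough plain-awk sketch of the same filter without bawk. It is not bawk's actual implementation; it only assumes the log carries a tab-separated #fields header, as default Bro logs do. The three-line sample log and the helper names sample_log and by_name are made up for illustration.

```shell
# Hypothetical sample: a "#fields" header followed by two tab-separated records.
sample_log() {
  printf '#fields\tts\tid.orig_h\tauth_success\n'
  printf '1500000000.0\t10.0.5.7\tT\n'
  printf '1500000001.0\t10.0.5.8\tF\n'
}

# Plain awk stand-in for the name-to-column mapping that bawk automates.
by_name() {
  awk -F '\t' '
    # "#fields" itself occupies column 1, so header name i is data column i-1.
    $1 == "#fields" { for (i = 2; i <= NF; i++) _b[$i] = i - 1; next }
    /^#/ { next }   # skip any other header or footer line
    $_b["id.orig_h"] == "10.0.5.7" && $_b["auth_success"] == "T"
  '
}

sample_log | by_name
```

Only the first record is printed. With bawk, the header handling and the separator setting disappear and just the pattern remains.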
In addition to the _b[] array, which is keyed by the name of each field, bawk sets variables from the values defined at the top of the log file:
- _b[] field array
- _log_separator
- _log_set_separator
- _log_empty_field
- _log_unset_field
- _log_path
- _log_open_time
- _log_close_time
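As an illustration of where such values come from, the sketch below reads a few header lines with plain awk and uses the unset-field marker in a filter. The mapping from header line to variable name (for example #unset_field to _log_unset_field, or #open to _log_open_time) is an assumption made for this example, not something confirmed from bawk's source, and the sample log contents are invented.

```shell
# Hypothetical sample log with a default-style Bro header and two records.
sample_header_log() {
  printf '#separator \\x09\n'
  printf '#set_separator\t,\n'
  printf '#empty_field\t(empty)\n'
  printf '#unset_field\t-\n'
  printf '#path\tssh\n'
  printf '#open\t2017-06-01-00-00-00\n'
  printf '#fields\tts\tuid\tclient\n'
  printf '1496275200.0\tCabc123\t-\n'
  printf '1496275201.0\tCdef456\tSSH-2.0-OpenSSH_7.4\n'
}

# Plain awk sketch: copy header values into variables, then use them.
report_unset() {
  awk -F '\t' '
    $1 == "#unset_field" { _log_unset_field = $2; next }
    $1 == "#path"        { _log_path = $2; next }
    $1 == "#open"        { _log_open_time = $2; next }   # assumed mapping
    $1 == "#fields"      { for (i = 2; i <= NF; i++) _b[$i] = i - 1; next }
    /^#/ { next }        # remaining header lines are ignored in this sketch
    # Report records whose client column holds the unset marker.
    $_b["client"] == _log_unset_field {
      print _log_path ": " $_b["uid"] " has no client string"
    }
  '
}

sample_header_log | report_unset
```

Comparing against the marker read from the header, instead of hard-coding "-", keeps a filter working on logs written with a different unset-field setting.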
There are a few functions available through the included library files.
Usage: geoip(IP)
This function is available if you install the geoiplookup binary. It keeps a cache file in your home directory called .gawk-geoip-cache that stores the results of lookups, so that the next time you run the program it doesn't have to run geoiplookup again for the same lookups.
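The caching idea can be sketched in plain awk as follows. This is only an illustration of a file-backed lookup cache, not bawk's code: the cache format (one "address, tab, result" pair per line), the helper name cached_lookup, and passing the lookup command as a parameter are all assumptions. The address is handed to a shell command, so the sketch assumes trusted input.

```shell
# usage: cached_lookup CACHEFILE LOOKUP_CMD IP   (hypothetical helper)
cached_lookup() {
  awk -v cache="$1" -v cmd="$2" -v ip="$3" '
    BEGIN {
      # Load earlier results; a missing cache file simply ends the loop.
      while ((getline line < cache) > 0) {
        split(line, kv, "\t")
        seen[kv[1]] = kv[2]
      }
      close(cache)

      if (!(ip in seen)) {
        # Cache miss: run the external command and keep its first output line.
        lookup = cmd " " ip
        if ((lookup | getline result) > 0) {
          seen[ip] = result
          print ip "\t" result >> cache
        }
        close(lookup)
      }
      print seen[ip]
    }
  '
}

# "echo" stands in for a real lookup program such as geoiplookup here.
demo_cache=$(mktemp)
cached_lookup "$demo_cache" "echo looked-up" 10.0.5.7
cached_lookup "$demo_cache" "echo not-run-again" 10.0.5.7
rm -f "$demo_cache"
```

Both calls print the same line, because the second call finds the address in the cache file and never runs its command.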
Usage: addrtype(address)
Determines whether the address is IPv4 or IPv6. It returns the string v4 or v6 accordingly, or an empty string if it can't determine the type.
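A minimal sketch of this kind of classification, assuming a purely syntactic check (the real addrtype may be stricter or looser). The wrapper name addrtype_sketch is made up; the patterns do not validate octet ranges or IPv6 group counts.

```shell
# usage: addrtype_sketch ADDRESS   (hypothetical wrapper, prints v4, v6 or nothing)
addrtype_sketch() {
  awk -v addr="$1" '
    function addrtype(a) {
      # Four dot-separated decimal groups: treat as IPv4.
      if (a ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) return "v4"
      # Hex digits with at least one colon (dots allowed for mapped forms): IPv6.
      if (a ~ /^[0-9A-Fa-f:]*:[0-9A-Fa-f:.]*$/) return "v6"
      return ""
    }
    BEGIN { print addrtype(addr) }
  '
}

addrtype_sketch 10.0.5.7
addrtype_sketch 2001:db8::1
```

In a bawk script the same test would typically sit in a pattern, for example selecting only the records whose originating address is v6.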
Usage: classC(ipv4address)
These functions are useful for getting the old classful-style network prefixes. They return the first three (classC), first two (classB), or first one (classA) parts of the IPv4 dotted quad.
Example: classC("192.0.2.150") --> 192.0.2
This can be useful for consolidating several IPs from the same network into one entry.
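The behaviour described above can be approximated in plain awk by splitting on dots. This is a sketch under that assumption, not the library's own code, and the wrapper name class_demo is hypothetical.

```shell
# usage: class_demo IPV4   (hypothetical wrapper, prints the three prefixes)
class_demo() {
  awk -v ip="$1" '
    # Each function keeps the leading 1, 2 or 3 parts of the dotted quad.
    function classA(a,    o) { split(a, o, "."); return o[1] }
    function classB(a,    o) { split(a, o, "."); return o[1] "." o[2] }
    function classC(a,    o) { split(a, o, "."); return o[1] "." o[2] "." o[3] }
    BEGIN { print classA(ip), classB(ip), classC(ip) }
  '
}

class_demo 192.0.2.150
```

Using the prefix as an array key, instead of the full address, is one way to count connections per network rather than per host.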
Usage: addr_in_cidr(ipv4address, cidr_netblock)
This function returns true if the given IPv4 address is in the given CIDR-formatted netblock. You can use any CIDR prefix length from /0 to /32.
Example: addr_in_cidr($_b["id.resp_h"], "192.168.0.0/24")
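One way such a check can be done without bitwise operators is to convert both addresses to numbers and compare them after integer division by the block size. The sketch below assumes that approach; it is not taken from bawk's source, and the names cidr_check, ip2num and in_cidr are made up for the example.

```shell
# usage: cidr_check IPV4 NETBLOCK   (hypothetical wrapper, prints 1 or 0)
cidr_check() {
  awk -v addr="$1" -v block="$2" '
    # Dotted quad to a 32-bit number, using plain arithmetic.
    function ip2num(ip,    o) {
      split(ip, o, ".")
      return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
    }
    function in_cidr(ip, cidr,    p, size) {
      split(cidr, p, "/")
      size = 2 ^ (32 - p[2])   # number of addresses the prefix covers
      # Same block when both addresses fall in the same size-aligned slot.
      return int(ip2num(ip) / size) == int(ip2num(p[1]) / size)
    }
    BEGIN { print in_cidr(addr, block) }
  '
}

cidr_check 192.168.0.5 192.168.0.0/24
cidr_check 192.168.1.5 192.168.0.0/24
```

The first call prints 1 and the second prints 0. A /0 block matches every address and a /32 block matches only the exact address.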