initial drop of zq
Andrew Swan committed Nov 11, 2019
1 parent c65d65b commit 5fc47f7
Showing 103 changed files with 9,151 additions and 0 deletions.
138 changes: 138 additions & 0 deletions README.md
@@ -0,0 +1,138 @@
# zq

zq is a command-line tool for processing
[zeek](https://www.zeek.org) logs. If you are familiar with
[zeek-cut](https://github.com/zeek/zeek-aux/tree/master/zeek-cut),
you can think of zq as zeek-cut on steroids. If you missed
[the name change](https://blog.zeek.org/2018/10/renaming-bro-project_11.html),
zeek was formerly known as "bro".

zq consists of
* an [execution engine](pkg/proc) for log pattern search and analytics,
* a [query language](pkg/zql/README.md) that compiles into a program that runs on
the execution engine, and
* an open specification for structured logs called [zson](pkg/zson/docs/spec.md).

zq takes zeek/zson logs as input and filters, transforms, and performs
analytics using the
[zq log query language](pkg/zql/README.md),
producing a log stream as its output.

## Install

We don't yet distribute pre-built binaries, so to install zq, you must
clone the repo and compile the source.
To install the binaries in `$GOPATH/bin`, grab this repo and
execute a good old-fashioned make install:

```
git clone https://github.com/mccanne/zq
cd zq
make install
```
## Usage

For zq command usage, see the built-in help by running
```
zq help
```
zq program syntax and semantics are documented in the
[zq language README](pkg/zql/README.md).

### Examples

Here are a few examples.

To cut the columns of a conn log like
[zeek-cut](https://github.com/zeek/zeek-aux/tree/master/zeek-cut) does, run:
```
zq conn.log "* | cut ts,id.orig_h,id.orig_p"
```
The "*" tells zq to match every line, which is sent to the cut processor
using the unix-like pipe syntax.

The default output is a zson file. If you want just the tab-separated lines
like zeek-cut, you can specify text output:
```
zq -f text conn.log "* | cut ts,id.orig_h,id.orig_p"
```
If you want the old-style zeek log format, run the command with the -f flag
specifying "zeek" for the output format:
```
zq -f zeek conn.log "* | cut ts,id.orig_h,id.orig_p"
```
To summarize data, use an aggregate function over one or more fields,
e.g., summing a field, counting records, or computing an average.
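
As a rough illustration of what such an aggregate computes, the Go sketch below keeps a running count and sum for one numeric field and derives the average from them. The `summary` type and its methods are hypothetical names for this example and are not the interface used in zq's reducer package.

```go
package main

import "fmt"

// summary is a hypothetical accumulator for one numeric field.
type summary struct {
	count int
	sum   float64
}

// add folds one value into the running count and sum.
func (s *summary) add(v float64) {
	s.count++
	s.sum += v
}

// avg returns the mean of the values seen so far, or 0 if none were added.
func (s *summary) avg() float64 {
	if s.count == 0 {
		return 0
	}
	return s.sum / float64(s.count)
}

func main() {
	var s summary
	// Made-up sample values standing in for a numeric log field.
	for _, v := range []float64{120, 80, 100} {
		s.add(v)
	}
	fmt.Printf("count=%d sum=%g avg=%g\n", s.count, s.sum, s.avg())
}
```

Because the accumulator only stores a count and a sum, it can process a log stream of any length in constant memory.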

TBD: keep going here... explain _path, zq *.log > all.zson

## Development

Zq is a [Go module](https://github.com/golang/go/wiki/Modules), so
dependencies are specified in the [go.mod file](/go.mod) and managed
automatically by commands like `go build` and `go test`. No explicit
fetch commands are necessary. However, you must set the environment
variable `GO111MODULE=on` if your repo is at
`$GOPATH/src/github.com/mccanne/zq`.

Zq currently requires Go 1.13 or later, so make sure your installation is up to date.

When go.mod or its companion go.sum is modified during development, run
`go mod tidy` and then commit the changes to both files.

To use a local checkout of a dependency, use `go mod edit`:
```
go mod edit -replace=github.com/org/repo=../repo
```

Note that local checkouts must have a go.mod file, so it may be
necessary to create a temporary one:
```
echo 'module github.com/org/repo' > ../repo/go.mod
```

### Testing

Before any PRs are merged to master, all tests must pass.

To run unit tests in your local repo, execute
```
make test-unit
```

And to run system tests, execute
```
make test-system
```

### Profiling

To use the [Go profiler](https://golang.org/pkg/net/http/pprof/) to see where CPU
time is being spent, see the built-in help for the profiling command *-P*.

It prints a pprof command that you can run to view the profile, for example:

```
go tool pprof -http localhost:8081 localhost:9867
open http://localhost:8081/ui/
```

The flame graph is usually pretty helpful.

## Contributing

zq is developed on GitHub by its community. We welcome contributions.

Feel free to
[post an issue](https://github.com/mccanne/zq/issues),
fork the repo, or send us a pull request.

zq is early in its life cycle and will be expanding quickly. Please star and/or
watch the repo so you can follow and track our progress.

In particular, we will be adding many more processors and aggregate functions.
If you want a fun small project to help out, pick some functionality that is missing and
add a processor in
[zq/pkg/proc](pkg/proc)
or an aggregate function in
[zq/pkg/reducer](pkg/reducer).
215 changes: 215 additions & 0 deletions cmd/zq.go
@@ -0,0 +1,215 @@
package cmd

import (
"context"
"flag"
"fmt"
"os"
"path/filepath"

"github.com/mccanne/zq/emitter"
"github.com/mccanne/zq/pkg/zsio"
"github.com/mccanne/zq/pkg/zson"
"github.com/mccanne/zq/pkg/zson/resolver"
"github.com/mccanne/zq/proc"
"github.com/mccanne/zq/scanner"
"github.com/looky-cloud/lookytalk/ast"
"github.com/looky-cloud/lookytalk/parser"
"github.com/mccanne/charm"
"go.uber.org/zap"
)

type errInvalidFile string

func (reason errInvalidFile) Error() string {
return fmt.Sprintf("invalid file %s", string(reason))
}

var Zq = &charm.Spec{
Name: "zq",
Usage: "zq [options] <search> [file...]",
Short: "command line zeek processor",
Long: "",
New: func(parent charm.Command, flags *flag.FlagSet) (charm.Command, error) {
return New(flags)
},
}

func init() {
Zq.Add(charm.Help)
}

type Command struct {
dt *resolver.Table
format string
dir string
path string
outputFile string
verbose bool
reverse bool
stats bool
warnings bool
showTypes bool
showFields bool
epochDates bool
}

func New(f *flag.FlagSet) (charm.Command, error) {
cwd, _ := os.Getwd()
c := &Command{dt: resolver.NewTable()}
f.StringVar(&c.format, "f", "text", "format for output data [zson,zeek,ndjson,json,text,table,raw]")
f.StringVar(&c.path, "p", cwd, "path for input")
f.StringVar(&c.dir, "d", "", "directory for output data files")
f.StringVar(&c.outputFile, "o", "", "write data to output file")
f.BoolVar(&c.verbose, "v", false, "show verbose details")
f.BoolVar(&c.reverse, "R", false, "reverse search order (from oldest to newest)")
f.BoolVar(&c.stats, "S", false, "display search stats on stderr")
f.BoolVar(&c.warnings, "W", false, "display warnings on stderr")
f.BoolVar(&c.showTypes, "T", false, "display field types in text output")
f.BoolVar(&c.showFields, "F", false, "display field names in text output")
f.BoolVar(&c.epochDates, "E", false, "display epoch timestamps in text output")
return c, nil
}

func (c *Command) compile(p ast.Proc, reader zson.Reader) (*proc.MuxOutput, error) {
ctx := &proc.Context{
Context: context.Background(),
Resolver: resolver.NewTable(),
Logger: zap.NewNop(),
Reverse: c.reverse,
Warnings: make(chan string, 5),
}
scr := scanner.NewScanner(reader)
leaves, err := proc.CompileProc(nil, p, ctx, scr)
if err != nil {
return nil, err
}
return proc.NewMuxOutput(ctx, leaves), nil
}

func (c *Command) Run(args []string) error {
if len(args) == 0 {
return Zq.Exec(c, []string{"help"})
}

query, err := parser.ParseProc(args[0])
if err != nil {
return fmt.Errorf("parse error: %w", err)
}
// XXX c.format should really implement the flag.Value interface.
if err := checkFormat(c.format); err != nil {
return err
}
paths := args[1:]
var reader zson.Reader
if len(paths) > 0 {
if reader, err = c.loadFiles(paths); err != nil {
return err
}
} else {
// XXX lookup reader based on specified input type or just
// use a TBD zsio.Peeker to delay creation of the reader until it reads
// a few lines and infers the right type
reader = zsio.LookupReader("zeek", os.Stdin, c.dt)
}
writer, err := c.openOutput()
if err != nil {
return err
}
defer writer.Close()
output := emitter.NewEmitter(writer)
mux, err := c.compile(query, reader)
if err != nil {
return err
}
return output.Run(mux)
}

func extension(format string) string {
switch format {
case "zeek":
return ".log"
case "ndjson":
return ".ndjson"
case "json":
return ".json"
default:
return ".txt"
}
}

func (c *Command) loadFile(path string) (zson.Reader, error) {
info, err := os.Stat(path)
if err != nil {
return nil, err
}
if info.IsDir() {
return nil, errInvalidFile("is a directory")
}
// XXX this should go away soon
if filepath.Ext(path) != ".log" {
return nil, errInvalidFile("does not have .log extension")
}
f, err := os.Open(path)
if err != nil {
return nil, err
}
return zsio.LookupReader("zeek", f, c.dt), nil
}

func (c *Command) errorf(format string, args ...interface{}) {
_, _ = fmt.Fprintf(os.Stderr, format, args...)
}

func (c *Command) loadFiles(paths []string) (zson.Reader, error) {
var readers []zson.Reader
for _, path := range paths {
r, err := c.loadFile(path)
if err != nil {
if _, ok := err.(errInvalidFile); ok {
c.errorf("skipping file: %s\n", err)
continue
}
return nil, err
}
readers = append(readers, r)
}
if len(readers) == 1 {
return readers[0], nil
}
return scanner.NewCombiner(readers), nil
}

func (c *Command) openOutput() (zson.WriteCloser, error) {
if c.dir != "" {
return c.openOutputDir()
}
file, err := c.openOutputFile()
if err != nil {
return nil, err
}
// XXX need to create writer based on output format flag
writer := zsio.LookupWriter("zeek", file)
return writer, nil
}

func (c *Command) openOutputFile() (*os.File, error) {
if c.outputFile == "" {
return os.Stdout, nil
}
flags := os.O_WRONLY | os.O_CREATE | os.O_EXCL
return os.OpenFile(c.outputFile, flags, 0600)
}

func (c *Command) openOutputDir() (*emitter.Dir, error) {
ext := extension(c.format)
return emitter.NewDir(c.dir, c.outputFile, ext, os.Stderr)
}

func checkFormat(f string) error {
switch f {
case "zson", "zeek", "ndjson", "json", "text", "table", "raw":
return nil
}
return fmt.Errorf("invalid format: %s", f)
}