miniexpect - A very simple expect library
#include <errno.h>
#include <sys/wait.h>
#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
#include <miniexpect.h>
mexp_h *h;
h = mexp_spawnl ("ssh", "ssh", "host", NULL);
switch (mexp_expect (h, regexps, match_data)) {
...
}
mexp_close (h);
cc prog.c -o prog -lminiexpect -lpcre2-8
Miniexpect is a very simple expect-like library for C. Expect is a way to control an external program that wants to be run interactively.
Miniexpect has a saner interface than libexpect, and doesn't depend on Tcl. It is also thread safe, const-correct and uses modern C standards.
Miniexpect is a standalone library, except for a single dependency: it requires the PCRE2 (Perl Compatible Regular Expressions) library from http://www.pcre.org/. The PCRE2 dependency is fundamental because we want to offer the most powerful regular expression syntax to match on, but more importantly because PCRE2 has a convenient way to detect partial matches which made this library very simple to implement.
This manual page documents the API. Examples of how to use the API can be found in the source directory.
Miniexpect lets you start up an external program, control it (by sending commands to it), and close it down gracefully. Two things make this different from other APIs like popen(3) and system(3): Firstly miniexpect creates a pseudoterminal (pty). Secondly miniexpect lets you match the output of the program using regular expressions. Both of these are handy for controlling interactive programs that might (for example) ask for passwords, but you can use miniexpect on just about any external program.
You can control multiple programs at the same time.
There are four calls for creating a subprocess:
mexp_h *mexp_spawnl (const char *file, const char *arg, ...);
This creates a subprocess running the external program file (the current $PATH is searched unless you give an absolute path). arg, ... are the arguments to the program. You should terminate the list of arguments with NULL. Usually the first argument should be the name of the program.
The return value is a handle (see next section).
If there was an error running the subprocess, NULL is returned and the error is available in errno.
For example, to run an ssh subprocess you could do:
h = mexp_spawnl ("ssh", "ssh", "-l", "root", "host", NULL);
or to run a particular ssh binary:
h = mexp_spawnl ("/usr/local/bin/ssh", "ssh", "-l", "root", "host", NULL);
An alternative to mexp_spawnl is:
mexp_h *mexp_spawnv (const char *file, char **argv);
This is the same as mexp_spawnl except that you pass the arguments in a NULL-terminated array.
There are also two versions of the above calls which take flags:
mexp_h *mexp_spawnlf (unsigned flags, const char *file, const char *arg, ...);
mexp_h *mexp_spawnvf (unsigned flags, const char *file, char **argv);
The flags may contain the following values, logically ORed together:
- MEXP_SPAWN_KEEP_SIGNALS
  Do not reset signal handlers to SIG_DFL in the subprocess.
- MEXP_SPAWN_KEEP_FDS
  Do not close file descriptors ≥ 3 in the subprocess.
- MEXP_SPAWN_COOKED_MODE or MEXP_SPAWN_RAW_MODE
  Configure the pty in cooked mode or raw mode. Raw mode is the default.
After spawning a subprocess, you get back a handle which is a pointer to a struct:
struct mexp_h;
typedef struct mexp_h mexp_h;
Various methods can be used on the handle:
int mexp_get_fd (mexp_h *h);
Return the file descriptor of the pty of the subprocess. You can read and write to this if you want, although convenience functions are also provided (see below).
pid_t mexp_get_pid (mexp_h *h);
Return the process ID of the subprocess. You can send it signals if you want.
int mexp_get_timeout_ms (mexp_h *h);
void mexp_set_timeout_ms (mexp_h *h, int millisecs);
void mexp_set_timeout (mexp_h *h, int secs);
Get or set the timeout used by mexp_expect (see below). The resolution is milliseconds (1/1000th of a second). Set this before calling mexp_expect. Passing -1 to either of the set_ methods means no timeout. The default setting is 60000 milliseconds (60 seconds).
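The timeout convention above (milliseconds, with -1 meaning "wait forever") is the same one poll(2) uses. The following is a minimal sketch of that convention only. It does not claim to show how miniexpect waits internally, and wait_readable and secs_to_ms are hypothetical helper names, not part of the library.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not part of miniexpect): wait until fd is readable.
 * timeout_ms uses the convention described above: milliseconds, with -1
 * meaning no timeout.
 * Returns 1 if fd is readable (or at EOF), 0 on timeout, -1 on error. */
static int
wait_readable (int fd, int timeout_ms)
{
  struct pollfd pfd = { .fd = fd, .events = POLLIN };
  int r;

  assert (timeout_ms >= -1);

  r = poll (&pfd, 1, timeout_ms);
  if (r == -1)
    return -1;                  /* error is in errno */
  return r > 0 ? 1 : 0;
}

/* Hypothetical helper: convert whole seconds to milliseconds, passing
 * the "no timeout" value -1 through unchanged (an assumption about how
 * a seconds-based setter would map onto a milliseconds-based one). */
static int
secs_to_ms (int secs)
{
  return secs == -1 ? -1 : secs * 1000;
}
```

With these helpers, wait_readable (fd, secs_to_ms (60)) would wait for up to the default 60 seconds, and wait_readable (fd, -1) would block until data arrives.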
size_t mexp_get_read_size (mexp_h *h);
void mexp_set_read_size (mexp_h *h, size_t read_size);
Get or set the natural size (in bytes) for reads from the subprocess. The default is 1024. Most callers will not need to change this.
int mexp_get_pcre_error (mexp_h *h);
When mexp_expect (see below) calls the PCRE function pcre2_match(3), it stashes the return value in the pcre_error field in the handle, and that field is returned by this method.
If mexp_expect returns MEXP_PCRE_ERROR, then the actual PCRE error code returned by pcre2_match(3) is available by calling this method. For a list of PCRE error codes, see pcre2api(3).
void mexp_set_debug_file (mexp_h *h, FILE *fp);
FILE *mexp_get_debug_file (mexp_h *h);
Set or get the debug file of the handle. To enable debugging, pass a non-NULL file handle, eg. stderr. To disable debugging, pass NULL. Debugging messages are printed on the file handle.
Note that all output and input gets printed, including passwords. To prevent passwords from being printed, modify your code to call mexp_printf_password instead of mexp_printf.
The following fields in the handle do not have methods, but can be accessed directly instead:
char *buffer;
size_t len;
size_t alloc;
If mexp_expect returns a match then these variables contain the read buffer. Note this buffer does not contain the full input from the process, but it will contain at least the part matched by the regular expression (and maybe some more). buffer is the read buffer and len is the number of bytes of data in the buffer.
ssize_t next_match;
If mexp_expect returns a match, then next_match points to the first byte in the buffer after the fully matched expression. (It may be -1 which means it is invalid). The next time that mexp_expect is called, it will start by consuming the data buffer[next_match...len-1]. Callers may also need to read from that point in the buffer before calling read(2) on the file descriptor. Callers may also set this, for example setting it to -1 in order to ignore the remainder of the buffer. In most cases callers can ignore this field, and mexp_expect will just do the right thing when called repeatedly.
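To make the buffer[next_match...len-1] range concrete, here is a small sketch that works on a stand-in struct holding only the three documented fields. struct demo_buf and unconsumed are hypothetical names for illustration. The real mexp_h has more members and this is not library code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for the documented handle fields; for illustration only. */
struct demo_buf {
  char *buffer;        /* read buffer */
  size_t len;          /* number of bytes of data in buffer */
  ssize_t next_match;  /* first byte after the match, or -1 if invalid */
};

/* Point *start at the unconsumed bytes buffer[next_match...len-1] and
 * return how many there are.  If next_match is -1 (invalid) or there is
 * nothing after the match, set *start to NULL and return 0. */
static size_t
unconsumed (const struct demo_buf *b, const char **start)
{
  assert (b != NULL && start != NULL);

  if (b->next_match < 0 || (size_t) b->next_match >= b->len) {
    *start = NULL;
    return 0;
  }
  *start = b->buffer + b->next_match;
  return b->len - (size_t) b->next_match;
}
```

A caller that wants to read the pty directly after a match would process these leftover bytes first, and only then call read(2) on the file descriptor.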
void *user1;
void *user2;
void *user3;
Opaque pointers for use by the caller. The library will not touch these.
To close the handle and clean up the subprocess, call:
int mexp_close (mexp_h *h);
This returns the status code from the subprocess. This is in the form of a waitpid(2)/system(3) status so you have to use the macros WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG etc defined in <sys/wait.h> to parse it.
If there was a system call error, then -1 is returned. The error will be in errno.
Notes:
Even in error cases, the handle is always closed and its memory is freed by this call.
It is normal for the kernel to send SIGHUP to the subprocess.
If the subprocess doesn't catch the SIGHUP, then it will die with status:
WIFSIGNALED (status) && WTERMSIG (status) == SIGHUP
This case should not necessarily be considered an error.
This is how code should check for and print errors from mexp_close:
status = mexp_close (h);
if (status == -1) {
  perror ("mexp_close");
  return -1;
}
if (WIFSIGNALED (status) && WTERMSIG (status) == SIGHUP)
  goto ignore; /* not an error */
if (!WIFEXITED (status) || WEXITSTATUS (status) != 0) {
  /* You could use the W* macros to print a better error message. */
  fprintf (stderr, "error: subprocess failed, status = %d\n", status);
  return -1;
}
ignore:
/* no error case */
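The comment in the code above suggests using the W* macros to print a better message. One possible way to do that is sketched below. describe_status is a hypothetical helper, not part of miniexpect, and it treats death by SIGHUP as success following the note above.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper (not part of miniexpect): classify a status in
 * waitpid(2) form and write a human-readable description to buf.
 * Returns 0 if the subprocess exited with code 0 or was killed by
 * SIGHUP, and -1 in every other case. */
static int
describe_status (int status, char *buf, size_t buflen)
{
  assert (buf != NULL && buflen > 0);

  if (WIFEXITED (status)) {
    int code = WEXITSTATUS (status);
    snprintf (buf, buflen, "exited with code %d", code);
    return code == 0 ? 0 : -1;
  }
  if (WIFSIGNALED (status)) {
    int sig = WTERMSIG (status);
    snprintf (buf, buflen, "killed by signal %d", sig);
    return sig == SIGHUP ? 0 : -1;
  }
  snprintf (buf, buflen, "unexpected status 0x%x", (unsigned) status);
  return -1;
}
```

The -1 returned by mexp_close for a system call error is not a wait status, so check for it before passing the value to a helper like this.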
Miniexpect contains a powerful regular expression matching function based on pcre2(3):
int mexp_expect (mexp_h *h, const mexp_regexp *regexps, pcre2_match_data *match_data);
The output of the subprocess is matched against the list of PCRE regular expressions in regexps. regexps is a list of regular expression structures:
struct mexp_regexp {
int r;
const pcre2_code *re;
int options;
};
typedef struct mexp_regexp mexp_regexp;
r is the integer code returned from mexp_expect if this regular expression matches. It must be > 0. r == 0 indicates the end of the list of regular expressions. re is the compiled regular expression.
Possible return values are:
- MEXP_TIMEOUT
  No input matched before the timeout (h->timeout) was reached.
- MEXP_EOF
  The subprocess closed the connection.
- MEXP_ERROR
  There was a system call error (eg. from the read call). The error is returned in errno.
- MEXP_PCRE_ERROR
  There was a pcre2_match error. h->pcre_error is set to the error code. See pcre2api(3) for a list of the PCRE2_ERROR_* error codes and what they mean.
- r > 0
  If any regexp matches, the associated integer code (regexps[].r) is returned.
Notes:
regexps may be NULL or an empty list, which means we don't match against a regular expression. This is useful if you just want to wait for EOF or timeout.
regexps[].re, regexps[].options and match_data are passed through to the pcre2_match(3) function.
If multiple regular expressions are passed, then they are checked in turn and the first regular expression that matches is returned even if the match happens later in the input than another regular expression.
For example if the input is "hello world" and you pass the two regular expressions:
regexps[0].re = world
regexps[1].re = hello
then the first regular expression ("world") may match and the "hello" part of the input may be ignored.
In some cases this can even lead to unpredictable matching. In the case above, if we only happened to read "hello wor", then the second regular expression ("hello") would match.
If this is a concern, combine your regular expressions into a single one, eg. (hello)|(world).
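The ordering behaviour described in the notes above can be reproduced outside the library. The sketch below uses POSIX regcomp(3)/regexec(3) rather than PCRE2, purely to keep it self-contained. first_in_list and match_offset are hypothetical helpers, and the loop only imitates the "check each expression in turn" rule; it is not miniexpect's matching code.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return the index of the first pattern in the list that matches input.
 * Patterns are tried in list order, not in order of where they match in
 * the input.  Returns -1 if none match or a pattern fails to compile. */
static int
first_in_list (const char *const *patterns, size_t n, const char *input)
{
  assert (patterns != NULL && input != NULL);

  for (size_t i = 0; i < n; i++) {
    regex_t re;
    int r;

    if (regcomp (&re, patterns[i], REG_EXTENDED) != 0)
      return -1;
    r = regexec (&re, input, 0, NULL, 0);
    regfree (&re);
    if (r == 0)
      return (int) i;
  }
  return -1;
}

/* Return the byte offset at which pattern first matches input, or -1. */
static long
match_offset (const char *pattern, const char *input)
{
  regex_t re;
  regmatch_t m[1];
  long off = -1;

  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  if (regexec (&re, input, 1, m, 0) == 0)
    off = (long) m[0].rm_so;
  regfree (&re);
  return off;
}
```

With the list { "world", "hello" }, the input "hello world" selects index 0 ("world") while the shorter read "hello wor" selects index 1 ("hello"). The combined pattern (hello)|(world) matches at offset 0 for both inputs, which is why combining expressions gives predictable results.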
It is easier to understand mexp_expect by considering a simple example.
In this example we are waiting for ssh to either send us a password prompt, or (if no password was required) a command prompt, and based on the output we will either send back a password or a command.
The unusual (mexp_regexp[]){...} syntax is called a "compound literal" and is available in C99. If you need to use an older compiler, you can just use a local variable instead.
mexp_h *h;               /* handle returned earlier by mexp_spawnl */
int errcode;
PCRE2_SIZE offset;       /* pcre2_compile reports the error offset as PCRE2_SIZE */
pcre2_code *password_re, *prompt_re;
pcre2_match_data *match_data = pcre2_match_data_create (4, NULL);
password_re = pcre2_compile ((PCRE2_SPTR) "assword", PCRE2_ZERO_TERMINATED,
                             0, &errcode, &offset, NULL);
prompt_re = pcre2_compile ((PCRE2_SPTR) "[$#] ", PCRE2_ZERO_TERMINATED,
                           0, &errcode, &offset, NULL);
switch (mexp_expect (h,
                     (mexp_regexp[]) {
                       { 100, .re = password_re },
                       { 101, .re = prompt_re },
                       { 0 },
                     }, match_data)) {
case 100:
  /* server printed a password prompt, so send a password */
  mexp_printf_password (h, "%s", password);
  break;
case 101:
  /* server sent a shell prompt, send a command */
  mexp_printf (h, "ls\n");
  break;
case MEXP_EOF:
  fprintf (stderr, "error: ssh closed the connection unexpectedly\n");
  exit (EXIT_FAILURE);
case MEXP_TIMEOUT:
  fprintf (stderr, "error: timeout before reaching the prompt\n");
  exit (EXIT_FAILURE);
case MEXP_ERROR:
  perror ("mexp_expect");
  exit (EXIT_FAILURE);
case MEXP_PCRE_ERROR:
  fprintf (stderr, "error: PCRE error: %d\n", h->pcre_error);
  exit (EXIT_FAILURE);
}
You can write to the subprocess simply by writing to h->fd. However we also provide two convenience functions:
int mexp_printf (mexp_h *h, const char *fs, ...);
int mexp_printf_password (mexp_h *h, const char *fs, ...);
These return the number of bytes written, if the whole message was written OK. If there was an error, -1 is returned and the error is available in errno.
Notes:
mexp_printf will not do a partial write. If it cannot write all the data, then it will return an error.
This function does not write a newline automatically. If you want to send a command followed by a newline you have to do something like:
mexp_printf (h, "exit\n");
mexp_printf_password works identically to mexp_printf except that the output is not sent to the debugging file if debugging is enabled. As the name suggests, use this for passwords so that they don't appear in debugging output.
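If you write to the file descriptor yourself instead of using mexp_printf, write(2) may accept only part of the data. The sketch below shows one common way to give callers the same "everything or an error" result. write_all is a hypothetical helper and is not taken from miniexpect's source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not miniexpect's code): write all len bytes of
 * buf to fd.  Returns len if everything was written, or -1 with the
 * error in errno.  Short writes are continued, EINTR is retried. */
static ssize_t
write_all (int fd, const char *buf, size_t len)
{
  size_t done = 0;

  assert (buf != NULL || len == 0);

  while (done < len) {
    ssize_t r = write (fd, buf + done, len - done);
    if (r == -1) {
      if (errno == EINTR)
        continue;               /* interrupted by a signal: retry */
      return -1;                /* real error: report it to the caller */
    }
    done += (size_t) r;
  }
  return (ssize_t) len;
}
```

As with mexp_printf, no newline is added, so a command would be sent as write_all (fd, "exit\n", 5).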
int mexp_send_interrupt (mexp_h *h);
Send the interrupt character (^C, Ctrl-C, \003). This is like pressing ^C - the subprocess (or remote process, if using ssh) is gracefully killed.
Note this only works if the pty is in cooked mode (ie. MEXP_SPAWN_COOKED_MODE was passed to mexp_spawnlf or mexp_spawnvf). In raw mode, all characters are passed through without any special interpretation.
Source is available from: http://git.annexia.org/?p=miniexpect.git;a=summary
pcre2(3), pcre2_match(3), pcre2api(3), waitpid(2), system(3).
Richard W.M. Jones (rjones at redhat dot com)
The library is released under the Library GPL (LGPL) version 2 or at your option any later version.
Copyright (C) 2014-2022 Red Hat Inc.