batch pandoc

Benct Philip Jonsson edited this page Jul 6, 2015 · 3 revisions

This is the documentation for batch-pandoc.pl.

NAME

batch-pandoc.pl - batch-convert files under a directory with pandoc.

SYNOPSIS

perl batch-pandoc.pl -i INPUT_DIR [OPTIONS] [PANDOC_OPTIONS]

DESCRIPTION

batch-pandoc.pl is a Perl script that uses pandoc to batch-convert all files with a given extension in a directory and its descendants, creating a mirror directory structure containing the converted files.

Suppose you have a directory structure like:

site-md/
    index.md
    introduction/
        bits.md
        image.png
        index.md
        pieces.md
    usage/
        index.md
        that.md
        this.md
        other/
            index.md
            more.md

Then the invocation

perl batch-pandoc.pl -i site-md -o site -f md -t html -c '\.png\z'

will create a directory tree under site which mirrors site-md, except that each .md file has been replaced by an .html file converted with pandoc. The file image.png is copied unchanged because it matches the -c regex.
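
The mapping from input paths to output paths described above can be sketched as follows. This is a Python illustration of the documented behaviour, not the script's actual Perl code; the function name output_path and its parameters are hypothetical.

```python
from pathlib import PurePosixPath

def output_path(input_file, input_dir, output_dir,
                from_ext=".md", to_ext=".html"):
    """Map a file below input_dir to its mirror position below output_dir.

    Files with the from-extension get the to-extension; other files
    (e.g. those copied via -c) keep their names. Hypothetical helper.
    """
    # Path of the file relative to the input directory.
    rel = PurePosixPath(input_file).relative_to(input_dir)
    if rel.suffix == from_ext:
        rel = rel.with_suffix(to_ext)
    return str(PurePosixPath(output_dir) / rel)

print(output_path("site-md/usage/other/more.md", "site-md", "site"))
print(output_path("site-md/introduction/image.png", "site-md", "site"))
```

With the tree above, the first call yields site/usage/other/more.html and the second yields site/introduction/image.png.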

Any commandline argument which is not recognised as a batch-pandoc.pl option is passed on to pandoc. If, for example, you want to use a certain pandoc template in the conversion, you can just add the pandoc option --template=mytemplate.html to your commandline.

This script doesn't try to be smarter than that. If, for example, you want to include a navigation bar or document-specific CSS, you will have to solve that with a pandoc template and document metadata.

OPTIONS

batch-pandoc.pl recognises the following options. All other commandline arguments are passed along to pandoc. Where there is a name-clash between batch-pandoc.pl and pandoc short options, you have to use the long pandoc option names or, in the case of -f and -t, the synonymous -r and -w. Note that it is useless to specify a pandoc -o/--output option, as one will be added automatically to the end of the pandoc commandline, overriding it.

-i path/to/dir, --input-dir=path/to/dir

(Required)

The directory containing the input files. This option has no default, and thus it is an error to omit it.

-o path/to/dir, --output-dir=path/to/dir

The directory below which to put the output files. If omitted, a sibling of --input-dir with -TO-EXTENSION appended to its name will be used. This directory and its descendant directories will be created as needed.
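
As a rough illustration of the default, here is a Python sketch. It assumes the extension is appended without its leading dot, which is what the myproject.wiki-pdf example under --wikilinks suggests; the function name default_output_dir is hypothetical and the script's real logic may differ in detail.

```python
def default_output_dir(input_dir, to_extension=".html"):
    """Derive the default output directory name (assumed behaviour)."""
    # Drop a trailing slash from the directory and the leading dot
    # from the extension, then join the two with a hyphen.
    return input_dir.rstrip("/") + "-" + to_extension.lstrip(".")

print(default_output_dir("myproject.wiki", "pdf"))
print(default_output_dir("site-md/"))
```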

-f .ext, --from-extension=.ext

(Default: .md)

The file extension for input files. The leading dot will be added if missing.

-t .ext, --to-extension=.ext

(Default: .html)

The file extension for output files. The leading dot will be added if missing.
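
The leading-dot rule for -f and -t amounts to something like the following Python sketch (normalize_extension is a hypothetical name, not a function of the script):

```python
def normalize_extension(ext):
    """Return ext with a leading dot, adding one if it is missing."""
    return ext if ext.startswith(".") else "." + ext

print(normalize_extension("md"))
print(normalize_extension(".html"))
```

So -f md and -f .md should be treated alike.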

-c regex, --copy-matching=regex

A Perl regular expression. If supplied, any file below --input-dir with a name matching the regex will be copied to the corresponding position below --output-dir. Typically it should match one or more file extensions and be anchored to the end of the string, e.g. \.(?:jpe?g|png|gif)\z.
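
To see which names such a regex selects, here is a small Python sketch. Python's re module spells Perl's \z as \Z (match only at the very end of the string), so the pattern is adapted accordingly. The file names are made up, and whether the script matches against the bare file name or the whole path is an assumption here (bare names are used).

```python
import re

# Python equivalent of the Perl regex \.(?:jpe?g|png|gif)\z
copy_re = re.compile(r"\.(?:jpe?g|png|gif)\Z")

names = ["image.png", "photo.jpeg", "scan.jpg", "index.md", "image.png.bak"]

# Keep only the names that the regex matches somewhere (here: at the end).
to_copy = [name for name in names if copy_re.search(name)]
print(to_copy)
```

Because of the end-of-string anchor, image.png.bak is not selected even though it contains .png.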

-W [sprintf-format], --wikilinks[=sprintf-format]

If present, with or without an argument, any input file containing wikilinks of the form [[LINK TEXT|WIKILINK]] or [[WIKILINK]] will be copied to a temporary file in which those links are replaced by the return value of sprintf($sprintf_format, $linktext, $wikilink). The temporary file is then used as the input file instead of the original.

The sprintf-format defaults to "[%s](%s.$to_extension)" and the link text defaults to the wikilink text. Any whitespace in the wikilink text is replaced with hyphens. Thus, with the defaults, a wikilink like [[normal usage]] will become [normal usage](normal-usage.html) and a wikilink like [[when used normally|normal usage]] will become [when used normally](normal-usage.html).
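
The substitution can be approximated in Python as below. This is only a sketch of the documented behaviour, not the script's Perl code: the regex, the function name convert_wikilinks and the hard-coded html extension in the format are assumptions, as is the choice to collapse each run of whitespace into a single hyphen.

```python
import re

# [[LINK TEXT|WIKILINK]] or [[WIKILINK]]; the link text part is optional.
WIKILINK_RE = re.compile(r"\[\[(?:([^\]|]+)\|)?([^\]|]+)\]\]")

def convert_wikilinks(text, fmt="[%s](%s.html)"):
    """Replace wikilinks with links built from a printf-style format."""
    def replace(match):
        wikilink = match.group(2)
        # The link text defaults to the wikilink text.
        link_text = match.group(1) or wikilink
        # Whitespace in the link target becomes hyphens (runs collapsed).
        target = re.sub(r"\s+", "-", wikilink)
        return fmt % (link_text, target)
    return WIKILINK_RE.sub(replace, text)

print(convert_wikilinks("See [[normal usage]]."))
print(convert_wikilinks("See [[when used normally|normal usage]]."))
```

Both calls produce a link to normal-usage.html, with the link text taken from the wikilink itself in the first case and from the part before the bar in the second.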

This option is useful for example if you have cloned a GitHub wiki and want to convert it to some other format:

perl batch-pandoc.pl -i myproject.wiki -t pdf -W -r markdown_github

will create a directory myproject.wiki-pdf containing the wiki pages in PDF format.

--titles[=0|1]

If the argument is true (!=0) or missing, a title will be constructed from the input filename by removing the --from-extension, replacing hyphens with spaces and capitalizing the first word; it is then included in the pandoc arguments as -M title=TITLE. This option is useful when converting a cloned GitHub wiki, but should not be used when the source files contain their own title information, e.g. as pandoc metadata or as HTML <title> elements.
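
A minimal Python sketch of the title construction, following the three steps just listed. The function name title_from_filename is hypothetical, and the script itself may treat edge cases differently.

```python
from pathlib import PurePosixPath

def title_from_filename(path, from_extension=".md"):
    """Build a title from a file name as described for --titles."""
    name = PurePosixPath(path).name
    # 1. Remove the from-extension.
    if name.endswith(from_extension):
        name = name[: -len(from_extension)]
    # 2. Replace hyphens with spaces.
    title = name.replace("-", " ")
    # 3. Capitalize the first word only, leaving the rest untouched.
    return title[:1].upper() + title[1:]

title = title_from_filename("myproject.wiki/normal-usage.md")
print(["-M", "title=" + title])
```

For normal-usage.md this gives the title Normal usage.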

-C [0|1], --include-css[=0|1]

If the argument is true (!=0) or missing, and there is a sibling file with the same basename as the source file but a .css extension, that file will be copied and linked by including --css=FILENAME in the pandoc argument list.
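
The sibling lookup can be pictured with this Python sketch. The helper name css_args is hypothetical, and it assumes FILENAME is the bare name of the copied stylesheet, which the documentation does not spell out.

```python
from pathlib import Path

def css_args(source_file):
    """Return the extra pandoc arguments for a sibling .css file, if any."""
    # usage/this.md -> usage/this.css
    css_file = Path(source_file).with_suffix(".css")
    if css_file.is_file():
        return ["--css=" + css_file.name]
    return []
```

For a source file usage/this.md this would return ["--css=this.css"] when usage/this.css exists, and an empty list otherwise.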

-Y [0|1], --include-yaml[=0|1]

If the argument is true (!=0) or missing, and there is a sibling file with the same basename as the source file but a .yaml extension, that file will be included as an input file on the pandoc argument list. This is useful if you want to make metadata accessible to external tools by keeping it in a separate file. To work as a pandoc metadata block, the file has to begin and end with the --- and ... delimiters.

Conversely, you can copy YAML metadata from markdown files to external files by saving a file yaml.markdown with the contents
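
A sibling .yaml file could be checked for the required delimiters before running the script, for instance with a Python sketch like this one. The function name is_metadata_block and the example metadata are made up for illustration.

```python
def is_metadata_block(text):
    """Check that text begins with '---' and ends with '...'."""
    lines = [line.rstrip() for line in text.strip().splitlines()]
    return len(lines) >= 2 and lines[0] == "---" and lines[-1] == "..."

# Hypothetical contents of a file such as this.yaml
example = """---
title: Normal usage
author: A. N. Author
...
"""

print(is_metadata_block(example))
print(is_metadata_block("title: Normal usage\n"))
```

The first check succeeds and the second fails, because the bare key lacks both delimiters.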

$if(titleblock)$
$titleblock$
$else$
--- {}
$endif$

and then run with the commandline

perl batch-pandoc.pl -i sourcedir -o sourcedir -f .md -t .yaml \
    -w markdown --template=yaml.markdown

-P path/to/pandoc, --pandoc=path/to/pandoc

(Default: pandoc)

Gives the path to the pandoc executable. Useful if you have several versions installed or use a wrapper script (in which case you would give the name or path of the wrapper).

AUTHOR

Benct Philip Jonsson [email protected]

COPYRIGHT

Copyright 2015- Benct Philip Jonsson

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Pandoc