pandoc (1.19.1)
* Set `PANDOC_VERSION` environment variable for filters (#2640).
This allows filters to check the pandoc version that produced
the JSON they are receiving.
* Docx reader: Ensure one-row tables don't have header (#3285,
Jesse Rosenthal). Tables in MS Word are set by default to have
special first-row formatting, which pandoc uses to determine whether
or not they have a header. This means that one-row tables will, by
default, have only a header -- which we imagine is not what people
want. This change ensures that a one-row table is not understood to
be a header only. Note that this means that it is impossible to
produce a header-only table from docx, even though it is legal
pandoc. But we believe that in nearly all cases, it will be an
accidental (and unwelcome) result.
* HTML reader:
+ Fixed some bad regressions in HTML table parser (#3280).
This regression leads to the introduction of empty rows
in some circumstances.
+ Understand `style=width:` as well as `width` in `col` (#3286).
* RST reader:
+ Print warnings when keys, substitutions, or notes are not found.
Previously the parsers failed and we got raw text. Now we get a
link with an empty URL, or empty inlines in the case of a note or
substitution.
+ Fix hyperlink aliases (#3283).
* Man writer: Ensure that periods are escaped at beginning of line
(#3270).
* LaTeX writer: Fix unnumbered headers when used with `--top-level`
(#3272, Albert Krewinkel). Fix interaction of top-level divisions
`part` or `chapter` with unnumbered headers when emitting LaTeX. Headers
are ensured to be written using starred commands (like `\subsection*{}`).
* LaTeX template: use comma not semicolon to separate keywords for
`pdfkeywords`. Thanks to Wandmalfarbe.
* Markdown writer: Fixed incorrect word wrapping (#3277). Previously pandoc
would sometimes wrap lines too early due to this bug.
* Text.Pandoc.Pretty: Added `afterBreak` [API change]. This makes it
possible to insert escape codes for content that needs escaping at the
beginning of a line.
* Removed old MathMLInHTML.js from 2004, which should no longer
be needed for MathML with modern browsers.
* Fixed tests with dynamic linking (#2709).
* Makefile: Use stack instead of cabal for targets. This is just
a convenience for developers.
* Fixed bash completion of filenames with space (#2749).
* MANUAL: improved documentation on how to create a custom `reference.docx`.
* Fix minor spelling typos in the manual (#3273, Anthony Geoghegan).
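As an illustration of the `PANDOC_VERSION` entry above, a filter could read the variable along these lines. This is a minimal Python sketch, not part of pandoc: the helper names `pandoc_version` and `at_least` are hypothetical, and it assumes the value is a dot-separated list of integers such as `1.19.1`.

```python
import os


def pandoc_version(environ=os.environ):
    """Return the pandoc version as a tuple of ints, or None if unset.

    Assumes PANDOC_VERSION looks like "1.19.1" (dot-separated integers).
    """
    raw = environ.get("PANDOC_VERSION")
    if not raw:
        return None
    return tuple(int(part) for part in raw.split("."))


def at_least(minimum, environ=os.environ):
    """True if PANDOC_VERSION is set and is >= the given version tuple."""
    version = pandoc_version(environ)
    return version is not None and version >= minimum
```

A filter might call `at_least((1, 19, 1))` before relying on newer behavior, and treat an unset variable as an older pandoc that does not export it.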
pandoc (1.19)
* Changed resolution of filter paths.
+ We now first treat the argument of `--filter` as a full (absolute
or relative) path, looking for a program there. If it's found, we
run it.
+ If not, and if it is a simple program name or a relative path, we
try resolving it relative to `$DATADIR/filters`.
+ If this fails, then we treat it as a program name and look in the
user's PATH.
+ Removed a hardcoded '/' that may have caused problems with
Windows paths.
Previously if you did `--filter foo` and you had `foo` in your path and
also an executable `foo` in your working directory, the one in the path
would be used. Now the one in the working directory is used.
In addition, when you do `--filter foo/bar.hs`, pandoc will now find a
filter `$DATADIR/filters/foo/bar.hs` -- assuming there isn't a
`foo/bar.hs` relative to the working directory.
* Allow `file://` URIs as arguments (#3196). Also improved default reader
format detection. Previously with a URI ending in .md or .markdown,
pandoc would assume HTML input. Now it treats these as markdown.
* Allow overriding the top-level division type heuristics (#3258,
Albert Krewinkel). Pandoc uses heuristics to determine the most
reasonable top-level division type when emitting LaTeX or
DocBook markup. It is now possible to override this implicitly set
top-level division via the `--top-level-division` command line option.
* Text.Pandoc.Options [API changes]:
+ Removed `writerStandalone` field in `WriterOptions`, made
`writerTemplate` a `Maybe` value. Previously setting
`writerStandalone = True` did nothing unless a template was provided
in writerTemplate. Now a fragment will be generated if
`writerTemplate` is `Nothing`; otherwise, the specified template
will be used and standalone output generated.
+ `Division` has been renamed `TopLevelDivision` (#3197). The
`Section`, `Chapter`, and `Part` constructors were renamed to
`TopLevelSection`, `TopLevelChapter`, and
`TopLevelPart`, respectively. An additional `TopLevelDefault`
constructor was added, which is now also the new default value of
the `writerTopLevelDivision` field in `WriterOptions`.
* Improved error message when a wrong argument is given to `--top-level-division`.
* Use new module from texmath to lookup MS font codepoints in Docx reader.
Removed unexported module Text.Pandoc.Readers.Docx.Fonts. Its code now
lives in texmath (0.9).
* DocBook reader: Fixed xref lookup (#3243). It previously only worked
when the qnames lacked the docbook namespace URI.
* HTML reader:
+ Improved table parsing (#3027). We now check explicitly for non-1
rowspan or colspan attributes, and fail when we encounter them.
Previously we checked that each row had the same number of cells,
but that could be true even with rowspans/colspans. And there are
cases where it isn't true in tables that we can handle fine -- e.g.
when a tr element is empty. So now we just pad rows with empty cells
when needed.
+ Treat `<math>` as MathML by default unless something else is
explicitly specified in xmlns. Provided it parses as MathML,
of course. Also fixed default which should be to inline math if no
display attribute is used.
+ Only treat "a" element as link if it has href (#3226). Otherwise
treat as span.
* Docx reader (Jesse Rosenthal):
+ Add a placeholder value for CHART. We wrap `[CHART]` in a
`<span class="chart">`. Note that it maps to inlines because, in
docx, anything in a drawing tag can be part of a larger paragraph.
+ Be more specific in parsing images. We don't want to match every `w:drawing`,
because that could also include charts. Now we specify
`w:drawing/pic:pic`. This shouldn't change behavior at all, but it's
a first step toward allowing other sorts of drawing data as well.
+ Abstract out function to avoid code repetition.
+ Update tests for img title and alt (#3204).
+ Handle Alt text and titles in images. We use the "description" field
as alt text and the "title" field as title. These can be accessed
through the "Format Picture" dialog in Word.
+ Docx reader utils: handle empty namespace in `elemName`. Previously,
if given an empty namespace `(elemName ns "" "foo")` `elemName`
would output a QName with a `Just ""` namespace. This is never what
we want. Now we output a `Nothing`. If someone *does* want a
`Just ""` in the namespace, they can enter the QName
value explicitly.
* ODT reader/writer:
+ Inline code when text has a special style (Hubert Plociniczak). When
a piece of text has the text style `Source_Text`, we assume that it
is a piece of the document that represents code that needs to
be inlined. Adapted the writer to also reflect that change.
Previously it was just writing a 'preformatted' text using a
non-distinguishable font style. Code blocks are still not recognized
by the ODT reader. That's a separate issue.
+ Infer table's caption from the paragraph (#3224,
Hubert Plociniczak). ODT's reader always put empty captions for the
parsed tables. This commit
1. checks paragraphs that follow the table definition
2. treats specially a paragraph with a style named 'Table'
3. does some postprocessing of the paragraphs that combines tables
followed immediately by captions
The ODT writer used the `TableCaption` style for the caption
paragraph. This commit follows the OpenOffice approach which allows
for appending captions to table but uses a built-in style named
`Table` instead of `TableCaption`. Users of a custom `reference.odt`
should change the style's name from `TableCaption` to `Table`.
* ODT reader: Infer tables' header props from rows (#3199,
Hubert Plociniczak). The ODT reader simply provided an empty header list,
which meant that the contents of the whole table, even if not empty, were
simply ignored. While we still do not infer headers, we at least have to
provide default properties of columns.
* Markdown reader:
+ Allow reference link labels starting with `@...` if `citations`
extension disabled (#3209). Example: in
    [link text][@a]
`link text` isn't hyperlinked because `[@a]` is parsed as
a citation. Previously this happened whether or not the `citations`
extension was enabled. Now it happens only if the `citations`
extension is enabled.
+ Allow alignments to be specified in Markdown grid tables. For
example,
    +---------------+---------------+--------------------+
    | Right         | Left          | Centered           |
    +==============:+:==============+:==================:+
    | Bananas       | $1.34         | built-in wrapper   |
    +---------------+---------------+--------------------+
+ Allow Small Caps elements to be created using bracketed spans (as
they already can be using HTML-syntax spans) (#3191, Kolen Cheung).
* LaTeX reader:
+ Don't treat `\vspace` and `\hspace` as block commands (#3256).
Fixed an error which came up, for example, with `\vspace` inside
a caption. (Captions expect inlines.)
+ Improved table handling. We can now parse all of the tables emitted
by pandoc in our tests. The only thing we don't get yet are
alignments and column widths in more complex tables. See #2669.
+ Limited support for minipage.
+ Allow for `[]`s inside LaTeX optional args. Fixes cases like:
+ Handle BVerbatim from fancyvrb (#3203).
+ Handle hungarumlaut (#3201).
+ Allow beamer-style `<...>` options in raw LaTeX (also in Markdown)
(#3184). This allows use of things like `\only<2,3>{my content}` in
Markdown that is going to be converted to beamer.
* Use pre-wrap for code in dzslides template (Nicolas Porcel). Otherwise
overly long code will appear on every slide.
* Org reader (Albert Krewinkel):
+ Respect column width settings (#3246). Table column properties can
optionally specify a column's width with which it is displayed in
the buffer. Some exporters, notably the ODT exporter in org-mode
v9.0, use these values to calculate relative column widths. The org
reader now implements the same behavior. Note that the org-mode
LaTeX and HTML exporters in Emacs don't support this feature yet,
which should be kept in mind by users who use the column
widths parameters.
+ Allow HTML attribs on non-figure images (#3222). Images which are
the only element in a paragraph can still be given HTML attributes,
even if the image does not have a caption and is hence not a figure.
The following will set the `width` attribute of the image to
`50%`:
    #+ATTR_HTML: :width 50%
    [[file:image.jpg]]
+ Support `ATTR_HTML` for special blocks (#3182). Special
blocks (i.e. blocks with unrecognized names) can be prefixed with an
`ATTR_HTML` block attribute. The attributes defined in that
meta-directive are added to the `Div` which is used to represent the
special block.
+ Support the `todo` export option. The `todo` export option toggles
the inclusion of TODO keywords in the output. Setting this to
`nil` causes TODO keywords to be dropped from headlines. The default
is to include the keywords.
+ Add support for todo-markers. Headlines can have optional
todo-markers which can be controlled via the `#+TODO`, `#+SEQ_TODO`,
or `#+TYP_TODO` meta directive. Multiple such directives can be
given, each adding a new set of recognized todo-markers. If no
custom todo-markers are defined, the default `TODO` and `DONE`
markers are used. Todo-markers are conceptually separate from
headline text and are hence excluded when autogenerating
headline IDs. The markers are rendered as spans and labelled with
two classes: One class is the marker's name, the other signals the
todo-state of the marker (either `todo` or `done`).
* LaTeX writer:
+ Use `\autocites*` when "suppress-author" citation used.
+ Ensure that simple tables have simple cells (#2666). If cells
contain more than a single Plain or Para, then we need to set
nonzero widths and put contents into minipages.
+ Remove invalid inlines in sections (#3218, Hubert Plociniczak).
* Markdown writer:
+ Fix calculation of column widths for aligned multiline tables
(#1911, Björn Peemöller). This also fixes excessive CPU and memory
usage for tables when `--columns` is set in such a way that cells
must be very tiny. Now cells are guaranteed to be big enough so that
single words don't need to line break, even if this pushes the line
length above the column width.
+ Use bracketed form for native spans when `bracketed_spans`
enabled (#3229).
+ Fixed inconsistent spacing issue (#3232). Previously a tight bullet
sublist got rendered with a blank line after, while a tight ordered
sublist did not. Now we don't get the blank line in either case.
+ Fix escaping of spaces in super/subscript (#3225). Previously two
backslashes were inserted, which gave a literal backslash.
+ Adjust widths in Markdown grid tables so that they match
on round-trip.
* Docx writer:
+ Give full detail when there are errors converting tex math.
+ Handle title text in images (Jesse Rosenthal). We already handled
alt text. This just puts the image "title" into the docx
"title" attr.
+ Fixed XML markup for empty cells (#3238). Previously the Compact
style wasn't being applied properly to empty cells.
* HTML writer:
+ Updated `renderHtml` import from blaze-html.
* Text.Pandoc.Pretty:
+ Fixed some bugs that caused blank lines in tables (#3251). The bugs
caused spurious blank lines in grid tables when we had things like
`blankline $$ blankline`.
+ Add exported function `minOffset` [API change] (Björn Peemöller).
+ Added error message for illegal call to `block` (Björn Peemöller).
* Text.Pandoc.Shared:
+ Put `warn` in MonadIO.
+ `fetchItem`: Better handling of protocol-relative URL (#2635). If
URL starts with `//` and there is no "base URL" (as there would be
if a URL were used on the command line), then default to http:.
* Export Text.Pandoc.getDefaultExtensions [API change] (#3178).
* In --version, trap error in `getAppUserDataDirectory` (#3241). This
fixes a crash with `pandoc --version` on unusual systems with no real
user (e.g. SQL Server 2016).
* Added weigh-pandoc for memory usage diagnostics (#3169).
* Use correct mime types for woff and woff2 (#3228).
* Remove make_travis_yml.hs (#3235, Kolen Cheung).
* changelog: Moved an item that was misplaced in the 1.17.2 section to the
1.18 section where it belongs.
* CONTRIBUTING.md: minor change in wording and punctuation (#3252,
Kolen Cheung).
* Further revisions to manual for `--version` changes (#3244).
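The filter lookup order described under "Changed resolution of filter paths" in this release can be sketched as follows. This is a rough Python illustration under stated assumptions, not pandoc's implementation: `resolve_filter` and its parameters are hypothetical names, `datadir` stands in for `$DATADIR`, and the function only locates the filter without running it.

```python
import os
import shutil


def resolve_filter(arg, datadir, search_path=None):
    """Return the path that `--filter ARG` would resolve to, or None.

    Follows the three steps listed in the changelog entry; the details
    (e.g. what counts as "a program") are simplified for illustration.
    """
    # Step 1: treat ARG as a full (absolute or relative) path.
    if os.path.isfile(arg):
        return arg
    # Step 2: a simple name or relative path is tried under DATADIR/filters.
    if not os.path.isabs(arg):
        candidate = os.path.join(datadir, "filters", arg)
        if os.path.isfile(candidate):
            return candidate
    # Step 3: fall back to a program lookup on the user's PATH
    # (search_path=None means "use the PATH environment variable").
    return shutil.which(arg, path=search_path)
```

With this ordering, a `foo` in the working directory wins over a `foo` on the PATH, and `foo/bar.hs` falls through to `$DATADIR/filters/foo/bar.hs` when no such file exists relative to the working directory.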
pandoc (1.18)
* Added `--list-input-formats`, `--list-output-formats`,
`--list-extensions`, `--list-highlight-languages`, and
`--list-highlight-styles` (#3173). Removed list of highlighting
languages from `--version` output. Removed list of input and output
formats from default `--help` output.
* Added `--reference-location=block|section|document` option
(Jesse Rosenthal). This determines whether Markdown link references
and footnotes are placed at the end of the document, the end of the
section, or the end of the top-level block.
* Added `--top-level-division=section|chapter|part` (Albert Krewinkel).
This determines what a level-1 header corresponds to in LaTeX,
ConTeXt, DocBook, and TEI output. The default is `section`.
The `--chapters` option has been deprecated in favor of
`--top-level-division=chapter`.
* Added `LineBlock` constructor for `Block` (Albert Krewinkel). This
is now used in parsing RST and Markdown line blocks, DocBook
`linegroup`/`line` combinations, and Org-mode `VERSE` blocks.
Previously `Para` blocks with hard linebreaks were used. `LineBlock`s
are handled specially in the following output formats: AsciiDoc
(as `[verse]` blocks), ConTeXt (`\startlines`/`\endlines`),
HTML (`div` with a style), Markdown (line blocks if `line_blocks`
is enabled), Org-mode (`VERSE` blocks), RST (line blocks). In
other output formats, a paragraph with hard linebreaks is emitted.
* Allow binary formats to be written to stdout (but not to tty) (#2677).
Only works on posix, since we use the unix library to check whether
output is to tty. On Windows, pandoc works as before and always requires
an output file parameter for binary formats.
* Changed JSON output format (Jesse Rosenthal). Previously we used
generically generated JSON, but this was subject to change depending
on the version of aeson pandoc was compiled with. To ensure stability,
we switched to using manually written ToJSON and FromJSON
instances, and encoding the API version. **Note:** pandoc filter
libraries will need to be revised to handle the format change.
Here is a summary of the essential changes:
+ The toplevel JSON format is now `{"pandoc-api-version" :
[MAJ, MIN, REV], "meta" : META, "blocks": BLOCKS}`
instead of `[{"unMeta": META}, [BLOCKS]]`.
Decoding fails if the major and minor version numbers don't
match.
+ Leaf nodes no longer have an empty array for their "c" value.
Thus, for example, a `Space` is encoded as `{"t":"Space"}`
rather than `{"t":"Space","c":[]}` as before.
* Removed `tests/Tests/Arbitrary.hs` and added a `Text.Pandoc.Arbitrary`
module to pandoc-types (Jesse Rosenthal). This makes it easier
to use QuickCheck with pandoc types outside of pandoc itself.
* Add `bracketed_spans` Markdown extension, enabled by default
in pandoc `markdown`. This allows you to create a native span
using this syntax: `[Here is my span]{#id .class key="val"}`.
* Added `angle_brackets_escapable` Markdown extension (#2846).
This is needed because github flavored Markdown has a slightly
different set of escapable symbols than original Markdown;
it includes angle brackets.
* Export `Text.Pandoc.Error` in `Text.Pandoc` [API change].
* Print highlighting-kate version in `--version`.
* `Text.Pandoc.Options`:
+ `Extension` has new constructors `Ext_bracketed_spans` and
`Ext_angle_brackets_escapable` [API change].
+ Added `ReferenceLocation` type [API change] (Jesse Rosenthal).
+ Added `writerReferenceLocation` field to `WriterOptions` (Jesse
Rosenthal).
* `--filter`: we now check `$DATADIR/filters` for filters before
looking in the path (#3127, Jesse Rosenthal, thanks to Jakob
Voß for the idea). Filters placed in this directory need not
be executable; if the extension is `.hs`, `.php`, `.pl`, `.js`,
or `.rb`, pandoc will run the right interpreter.
* For `--webtex`, replace deprecated Google Chart API by CodeCogs as
default (Kolen Cheung).
* Removed `raw_tex` extension from `markdown_mmd` defaults (Kolen Cheung).
* Execute .js filters with node (Jakob Voß).
* Textile reader:
+ Support `bc..` extended code blocks (#3037). Also, remove trailing
newline in code blocks (consistently with Markdown reader).
+ Improve table parsing. We now handle cell and row attributes, mostly
by skipping them. However, alignments are now handled properly.
Since in pandoc alignment is per-column, not per-cell, we
try to divine column alignments from cell alignments.
Table captions are also now parsed, and textile indicators
for thead and tfoot no longer cause parse failure. (However,
a row designated as tfoot will just be a regular row in pandoc.)
+ Improve definition list parsing. We now allow multiple terms
(which we concatenate with linebreaks). An exponential parsing
bug (#3020) is also fixed.
+ Disallow empty URL in explicit link (#3036).
* RST reader:
+ Use Div instead of BlockQuote for admonitions (#3031).
The Div has class `admonition` and (if relevant) one of the
following: `attention`, `caution`, `danger`, `error`, `hint`,
`important`, `note`, `tip`, `warning`. **Note:** This will change
the rendering of some RST documents! The word ("Warning", "Attention",
etc.) is no longer added; that must be done with CSS or a filter.
+ A Div is now used for `sidebar` as well.
+ Skip whitespace before note (Jesse Rosenthal, #3163). RST requires a
space before a footnote marker. We discard those spaces so that footnotes
will be adjacent to the text that comes before it. This is in line with
what rst2latex does.
+ Allow empty lines when parsing line blocks (Albert Krewinkel).
* Markdown reader:
+ Allow empty lines when parsing line blocks (Albert Krewinkel).
+ Allow attributes on autolinks (#3183, Daniele D'Orazio).
* LaTeX reader:
+ More robust parsing of unknown environments (#3026).
We no longer fail on things like `^` inside options for tikz.
+ Be more forgiving of non-standard characters, e.g. `^` outside of math.
Some custom environments give these a meaning, so we should try not to
fall over when we encounter them.
+ Drop duplicate `*` in bibtexKeyChars (Albert Krewinkel).
* MediaWiki reader:
+ Fix for unquoted attribute values in mediawiki tables (#3053).
Previously an unquoted attribute value in a table row
could cause parsing problems.
+ Improved treatment of verbatim constructions (#3055).
Previously these yielded strings of alternating Code and Space
elements; we now incorporate the spaces into the Code. Emphasis
etc. is still possible inside these.
+ Properly interpret XML tags in pre environments (#3042). They are meant
to be interpreted as literal text.
* EPUB reader: don't add root path to data: URIs (#3150).
Thanks to @lep for the bug report and patch.
* Org reader (Albert Krewinkel):
+ Preserve indentation of verse lines (#3064). Leading spaces in verse
lines are converted to non-breaking spaces, so indentation is preserved.
+ Ensure image sources are proper links. Image sources, such as those in plain
images, image links, or figures, must be proper URIs or relative file
paths to be recognized as images. This restriction is now enforced
for all image sources. This also fixes the reader's usage of uncleaned
image sources, leading to `file:` prefixes not being deleted from
figure images. Thanks to @bsag for noticing this bug.
+ Trim verse lines properly (Albert Krewinkel).
+ Extract meta parsing code to module. Parsing of meta-data is well
separable from other block parsing tasks. Moving into new module to
get small files and clearly arranged code.
+ Read markup only for special meta keys. Most meta-keys should be read
as normal string values, only a few are interpreted as marked-up text.
+ Allow multiple, comma-separated authors. Multiple authors can be
specified in the `#+AUTHOR` meta line if they are given as a
comma-separated list.
+ Give precedence to later meta lines. The last meta-line of any given
type is the significant line. Previously the value of the first line
was kept, even if more lines of the same type were encountered.
+ Read LaTeX_header as header-includes. LaTeX-specific header commands
can be defined in `#+LaTeX_header` lines. They are parsed as
format-specific inlines to ensure that they will only show up in LaTeX
output.
+ Set documentclass meta from LaTeX_class.
+ Set classoption meta from LaTeX_class_options.
+ Read HTML_head as header-includes. HTML-specific head content can be
defined in `#+HTML_head` lines. They are parsed as format-specific
inlines to ensure that they will only show up in HTML output.
+ Respect `author` export option. The `author` option controls whether
the author should be included in the final markup. Setting
`#+OPTIONS: author:nil` will drop the author from the final meta-data
output.
+ Respect `email` export option. The `email` option controls whether the
email meta-field should be included in the final markup. Setting
`#+OPTIONS: email:nil` will drop the email field from the final
meta-data output.
+ Respect `creator` export option. The `creator` option controls whether
the creator meta-field should be included in the final markup. Setting
`#+OPTIONS: creator:nil` will drop the creator field from the final
meta-data output. Org-mode recognizes the special value `comment` for
this field, causing the creator to be included in a comment. This is
difficult to translate to Pandoc internals and is hence interpreted the
same as other truish values (i.e. the meta field is kept if it's
present).
+ Respect unnumbered header property (#3095). Sections with the `unnumbered`
property should, as the name implies, be excluded from the automatic
numbering of sections provided by some output formats. The Pandoc
convention for this is to add an "unnumbered" class to the header. The
reader treats properties as key-value pairs per default, so a special
case is added to translate the above property to a class instead.
+ Allow figure with empty caption (Albert Krewinkel, #3161).
A `#+CAPTION` attribute before an image is enough to turn an image into
a figure. This wasn't the case because the `parseFromString` function,
which processes the caption value, would fail on empty values. Adding
a newline character to the caption value fixes this.
* Docx reader:
+ Use XML convenience functions (Jesse Rosenthal).
The functions `isElem` and `elemName` (defined in Docx/Util.hs) make
the code a lot cleaner than the original XML.Light functions, but they
had been used inconsistently. This puts them in wherever applicable.
+ Handle anchor spans with content in headers. Previously, we would only
be able to figure out internal links to a header in a docx if the
anchor span was empty. We change that to read the inlines out of the
first anchor span in a header.
+ Let headers use existing id. Previously we always generated an id for
headers (since they wouldn't bring one from Docx). Now we let it use an
existing one if possible. This should allow us to recurse through anchor
spans.
+ Use all anchor spans for header ids. Previously we only used the first
anchor span to affect header ids. This allows us to use all the anchor
spans in a header, whether they're nested or not (#3088).
+ Test for nested anchor spans in header. This ensures that anchor spans
in header with content (or with other anchor spans inside) will resolve
to links to a header id properly.
* ODT reader (Hubert Plociniczak):
+ Include list's starting value. Previously the starting value of
the lists' items has been hardcoded to 1. In reality ODT's list
style definition can provide a new starting value in one of its
attributes.
+ Infer caption from the text following the image.
Frame can contain other frames with the text boxes.
+ Add `fig:` to title for Image with a caption (as expected
by pandoc's writers).
+ Basic support for images in ODT documents.
+ Don't duplicate text for anchors (#3143). When creating an anchor
element we were adding its representation as well as the original
content, leading to text duplication.
* DocBook writer:
+ Include an anchor element when a div or span has an id (#3102).
Note that DocBook does not have a class attribute, but at least this
provides an anchor for internal links.
* LaTeX writer:
+ Don't use * for unnumbered paragraph, subparagraph. The starred
variants don't exist. This helps with part of #3058...it gets rid of
the spurious `*`s. But we still have numbers on the 4th and 5th level
headers.
+ Properly escape backticks in verbatim (#3121, Jesse Rosenthal).
Otherwise they can cause unintended ligatures like `` ?` ``.
+ Handle NARROW NO-BREAK SPACE in LaTeX as `\,` (Vaclav Zeman).
+ Don't include `[htbp]` placement for figures (#3103, Václav Haisman).
This allows figure placement defaults to be changed by the user
in the template.
* HTML writer (slide show formats): In slide shows, don't change slide title
to level 1 header (#2221).
* TEI writer: remove heuristic to detect book template (Albert Krewinkel).
TEI doesn't have `<book>` elements but only generic `<divN>` division
elements. Checking the template for a trailing `</book>` is nonsensical.
* MediaWiki writer: transform filename with underscores in images (#3052).
`foo bar.jpg` becomes `foo_bar.jpg`. This was already done
for internal links, but it also needs to happen for images.
* ICML writer: replace partial function (!!) in table handling (#3175,
Mauro Bieg).
* Man writer: allow section numbers that are not a single digit (#3089).
* AsciiDoc writer: avoid unnecessary use of "unconstrained" emphasis
(#3068). In AsciiDoc, you must use a special form of emphasis
(double `__`) for intraword emphasis. Pandoc was previously using
this more than necessary.
* EPUB writer: use stringify instead of plain writer for metadata
(#3066). This means that underscores won't be used for emphasis,
or CAPS for bold. The metadata fields will just have unadorned
text.
* Docx Writer:
+ Implement user-defined styles (Jesse Rosenthal). Divs and Spans
with a `custom-style` key in the attributes will apply the corresponding
style to the contained blocks or inlines.
+ Add ReaderT env to the docx writer (Jesse Rosenthal).
+ Clean up and streamline RTL behavior (Jesse Rosenthal, #3140).
You can set `dir: rtl` in YAML metadata, or use `-M dir=rtl`
on the command line. For finer-grained control, you can set
the `dir` attribute in Div or Span elements.
* Org writer (Albert Krewinkel):
+ Remove blank line after figure caption. Org-mode only treats an image
as a figure if it is directly preceded by a caption.
+ Ensure blank line after figure. An Org-mode figure should be surrounded
by blank lines. The figure would be recognized regardless, but images
in the following line would unintentionally be treated as figures as
well.
+ Ensure link targets are paths or URLs. Org-mode treats links as
document internal searches unless the link target looks like a URL or
file path, either relative or absolute. This change ensures that this
is always the case.
+ Translate language identifiers. Pandoc and Org-mode use different
programming language identifiers. An additional translation between
those identifiers is added to avoid unexpected behavior. This fixes a
problem where language specific source code would sometimes be output
as example code.
+ Drop space before footnote markers (Albert Krewinkel, #3162).
The writer no longer adds an extra space before footnote markers.
* Markdown writer:
+ Don't emit HTML for tables unless `raw_html` extension is set (#3154).
Emit `[TABLE]` if no suitable table formats are enabled and raw HTML
is disabled.
+ Check for the `raw_html` extension before emitting a raw HTML block.
+ Abstract out note/ref function (Jesse Rosenthal).
+ Add ReaderT monad for environment variables (Jesse Rosenthal).
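The table fallback order described in the Markdown writer entries above can be sketched roughly as follows. This is an illustrative Python sketch, not pandoc's Haskell code: the function `table_strategy`, its parameters, and the preference order among the native table extensions are assumptions. Only the fallback to raw HTML when `raw_html` is set, and to `[TABLE]` otherwise, comes from the entries.

```python
# Native table extensions, in an assumed (hypothetical) preference order.
TABLE_EXTENSIONS = ["simple_tables", "pipe_tables", "multiline_tables", "grid_tables"]

def table_strategy(extensions: set, usable: set) -> str:
    """Pick how to render one table.

    extensions: names of the enabled Markdown extensions.
    usable: table extensions able to represent this particular table.
    """
    # Prefer a native table syntax that is both enabled and suitable.
    for ext in TABLE_EXTENSIONS:
        if ext in extensions and ext in usable:
            return ext
    # Fall back to an HTML table only when raw HTML is allowed.
    if "raw_html" in extensions:
        return "raw_html"
    # Otherwise emit the placeholder.
    return "[TABLE]"
```

For example, a table that needs grid syntax, written with only `pipe_tables` enabled and `raw_html` disabled, ends up as `[TABLE]`.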
* HTML, EPUB, slidy, revealjs templates: Use `<p>` instead of `<h1>` for
subtitle, author, date (#3119). Note that, as a result of this change,
authors may need to update CSS.
* revealjs template: Added `notes-server` option
(jgm/pandoc-templates#212, Yoan Blanc).
* Beamer template:
+ Restore whitespace between paragraphs. This was
a regression in the last release (jgm/pandoc-templates#207).
+ Added `themeoptions` variable (Carsten Gips).
+ Added `beamerarticle` variable. This causes the `beamerarticle`
package to be loaded in beamer, to produce an article from beamer
slides. (Carsten Gips)
+ Added support for `fontfamilies` structured variable
(Artem Klevtsov).
+ Added hypersetup options (Jake Zimmerman).
* LaTeX template:
+ Added dummy definition for `\institute`.
This isn't a standard command, and we want to avoid a crash when
`institute` is used with the default template.
+ Define default figure placement (Václav Haisman), since pandoc
no longer includes `[htbp]` for figures. Users with custom templates
will want to add this. See #3103.
+ Use footnote package to fix notes in tables (jgm/pandoc-templates#208,
Václav Haisman).
* Moved template compiling/rendering code to a separate library,
`doctemplates`. This allows the pandoc templating system to be
used independently.
* Text.Pandoc.Error: Fix out of index error in `handleError`
(Matthew Pickering). The fix is to not try to show the exact line when
it would cause an out-of-bounds error as a result of included files.
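The idea behind the `handleError` fix above can be sketched as follows. This is a hypothetical Python illustration, not the Haskell implementation: the function name `format_parse_error`, its parameters, and the report layout are all assumptions. It shows only the guard: quote the offending line when the reported position actually exists in the text at hand.

```python
def format_parse_error(source: str, line: int, column: int, message: str) -> str:
    """Describe a parse error, quoting the offending line only when it exists."""
    report = f"Error at line {line}, column {column}: {message}"
    lines = source.splitlines()
    # Positions reported from included files may lie beyond the text we hold,
    # so check the index before quoting the line.
    if 1 <= line <= len(lines):
        caret = " " * max(column - 1, 0) + "^"
        report += "\n" + lines[line - 1] + "\n" + caret
    return report
```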
* Text.Pandoc.Shared: Add `linesToBlock` function (Albert Krewinkel).
* Text.Pandoc.Parsing.emailAddress: tighten up parsing of email
addresses. Technically `**@user` is a valid email address, but if we
allow things like this, we get bad results in markdown flavors
that autolink raw email addresses (see #2940). So we exclude a few
valid email addresses in order to avoid these more common bad cases.
* Text.Pandoc.PDF: Don't crash with nonexistent image (#3100). Instead,
emit the alt text, emphasized. This accords with what the ODT writer
currently does. The user will still get a warning about a nonexistent
image.
* Fix example in API documentation (#3176, Thomas Weißschuh).
* Tell where to get tarball in INSTALL (#3062).
* Rename README to MANUAL.txt and add GitHub-friendly README.md
(Albert Krewinkel, Kolen Cheung).
* Replace COPYING with Markdown version COPYING.md from GNU (Kolen Cheung).
* MANUAL.txt:
+ Put note on structured vars in separate paragraph (#2148, Albert
Krewinkel). Make it clearer that structured author variables require a
custom template.
+ Note that `--katex` works best with `html5` (#3077).
+ Fix the LaTeX and EPUB links in manual (Morton Fox).
+ Document `biblio-title` variable.
* Improve spacing of footnotes in `--help` output (Waldir Pimenta).
* Update KaTeX to v0.6.0 (Kolen Cheung).
* Allow latest dependencies.
* Use texmath 0.8.6.6 (#3040).
* Allow http-client 0.4.30, which is the version in stackage lts.
Previously we required 0.5.
Remove CPP conditionals for earlier versions.
* Remove support for GHC < 7.8 (Jesse Rosenthal).
+ Remove Compat.Monoid.
+ Remove an inline monad compatibility macro.
+ Remove Text.Pandoc.Compat.Except.
+ Remove directory compat.
+ Change constraint on mtl.
+ Remove unnecessary CPP condition in UTF8.
+ Bump base lower bound to 4.7.
+ Remove 7.6 build from .travis.yaml.
+ Bump supported ghc version in CONTRIBUTING.md.
+ Add note about GHC version support to INSTALL.
+ Remove GHC 7.6 from list of tested versions (Albert Krewinkel).
+ Remove TagSoup compat.
+ Add EOL note to time compat module. Because time 1.4 is a boot library
for GHC 7.8, we will support the compatibility module as long as we
support 7.8. But we should be clear about when we will no longer need
it.
+ Remove blaze-html CPP conditional.
+ Remove unnecessary CPP in custom Prelude.
pandoc (1.17.2)
* Added Zim Wiki writer, template and tests. `zimwiki` is now
a valid output format. (Alex Ivkin)
* Changed email-obfuscation default to no obfuscation (#2988).
+ `writerEmailObfuscation` in `defaultWriterOptions` is now
`NoObfuscation`.
+ the default for the command-line `--email-obfuscation` option is
now `none`.
* Docbook writer: Declare xlink namespace in Docbook5 output (Ivo Clarysse).
* Org writer:
+ Support arbitrary raw inlines (Albert Krewinkel).
Org mode allows arbitrary raw inlines ("export snippets" in Emacs
parlance) to be included as `@@format:raw foreign format text@@`.
+ Improve Div handling (Albert Krewinkel). Div blocks handling is
changed to make the output look more like idiomatic org mode:
- Div-wrapped content is output as-is if the div's attribute is the
null attribute.
- Div containers with an id but neither classes nor key-value pairs
are unwrapped and the id is added as an anchor.
- Divs with classes associated with greater block elements are
wrapped in a `#+BEGIN`...`#+END` block.
- The old behavior for Divs with more complex attributes is kept.
* HTML writer:
+ Better support for raw LaTeX environments (#2758).
Previously we just passed all raw TeX through when MathJax was used for
HTML math. This passed through too much. With this patch, only raw
LaTeX environments that MathJax can handle get passed through.
This patch also causes raw LaTeX environments to be treated
as math, when possible, with MathML and WebTeX output.
* Markdown writer: use raw HTML for simple, pipe tables with linebreaks
(#2993). Markdown line breaks involve a newline, and simple and pipe
tables can't contain one.
* Make --webtex work with the Markdown writer (#1177).
This is a convenient option for people using
websites whose Markdown flavors don't provide for math.
* Docx writer:
+ Set paragraph to FirstPara after display math (Jesse Rosenthal).
We treat display math like block quotes, and apply FirstParagraph style
to paragraphs that follow them. These can be styled as the user
wishes. (But, when the user is using indentation, this allows for
paragraphs to continue after display math without indentation.)
+ Use actual creation time as doc prop (Jesse Rosenthal).
Previously, we had used the user-supplied date, if available, for Word's
document creation metadata. This could lead to weird results, as in
cases where the user post-dates a document (so the modification might be
prior to the creation). Here we use the actual computer time to set the
document creation.
* LaTeX writer:
+ Don't URI-escape image source (#2825). Usually this is a local file,
and replacing spaces with `%20` ruins things.
+ Allow 'standout' as a beamer frame option (#3007).
`## Slide title {.standout}`.
* RST reader: Fixed links with no explicit link text. The link
`` `<foo>`_ `` should have `foo` as both its link text and its URL.
See the RST spec at <http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#embedded-uris-and-aliases>.
Closes Debian #828167 -- reported by Christian Heller.
* Textile reader:
+ Fixed attributes (#2984). Attributes can't be followed by
a space. So `_(class)emph_` carries a class attribute, but
`_(noclass) emph_` does not.
+ Fixed exponential parsing bug (#3020).
+ Fix overly aggressive interpretation as images (#2998).
Spaces are not allowed in the image URL in textile.
* LaTeX reader:
+ Fix `\cite` so it is a NormalCitation not AuthorInText.
+ Strip off double quotes around image source if present (#2825).
Avoids interpreting these as part of the literal filename.
* Org reader:
+ Add semicolon to list of special chars (Albert Krewinkel).
Semicolons are used as special characters in citation syntax. This
ensures the correct parsing of Pandoc-style citations: `[prefix; @key;
suffix]`. Previously, parsing would have failed unless there was a space
or other special character as the last <prefix> character.
+ Add support for "Berkeley-style" cites (Albert Krewinkel, #1978).
A specification for an official Org-mode citation syntax was drafted by
Richard Lawrence and enhanced with the help of others on the orgmode
mailing list. Basic support for this citation style is added to the
reader.
+ Support arbitrary raw inlines (Albert Krewinkel).
Org mode allows arbitrary raw inlines ("export snippets" in Emacs
parlance) to be included as `@@format:raw foreign format text@@`.
+ Remove partial functions (Albert Krewinkel, #2991).
Partial functions like `head` lead to avoidable errors and should be
avoided. They are replaced with total functions.
+ Support figure labels (Albert Krewinkel, #2496, #2999).
Figure labels given as `#+LABEL: thelabel` are used as the ID of the
respective image. This allows e.g. the LaTeX writer to add proper `\label`
markup.
+ Improve tag and properties type safety (Albert Krewinkel).
Specific newtype definitions are used to replace stringly typing of tags
and properties. Type safety is increased while readability is improved.
+ Parse as headlines, convert to blocks (Albert Krewinkel).
Emacs org-mode is based on outline-mode, which treats documents as trees
with headlines as nodes. The reader is refactored to parse into a
similar tree structure. This simplifies transformations acting on
document (sub-)trees.
+ Refactor comment tree handling (Albert Krewinkel).
Comment trees were handled after parsing, as pattern matching on lists
is easier than matching on sequences. The new method of reading
documents as trees allows for more elegant subtree removal.
+ Support archived trees export options (Albert Krewinkel).
Handling of archived trees can be modified using the `arch` option.
Archived trees are either dropped, exported completely, or collapsed to
include just the header when the `arch` option is nil, non-nil, or
`headline`, respectively.
+ Put export setting parser into module (Albert Krewinkel).
Export option parsing is distinct enough from general block parsing to
justify putting it into a separate module.
+ Support headline levels export setting (Albert Krewinkel).
The depths of headlines can be modified using the `H` option. Deeper
headlines will be converted to lists.
+ Replace ugly code with view pattern (Albert Krewinkel).
Some less-than-smart code required a pragma switching off overlapping
pattern warnings in order to compile seamlessly. Using view patterns
makes the code easier to read and also doesn't require overlapping
pattern checks to be disabled.
+ Fix parsing of verbatim inlines (Albert Krewinkel, #3016).
Org rules for allowed characters before or after markup chars were not
checked for verbatim text. This resulted in wrong parsing outcomes
if the verbatim text contained e.g. space-enclosed markup characters as
part of the text (`=is_substr = True=`). Forcing the parser to update
the positions of allowed/forbidden markup border characters fixes this.
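The three-way handling of archived trees under the `arch` export option, described in the Org reader entries above, can be sketched like this. It is a minimal Python illustration rather than the reader's actual Haskell code: the function `export_archived_tree`, the string representation of the option value, and the list-of-lines tree model are assumptions made for the example.

```python
def export_archived_tree(arch: str, headline: str, body: list) -> list:
    """Return the lines to export for one archived subtree."""
    if arch == "nil":
        return []                  # dropped entirely
    if arch == "headline":
        return [headline]          # collapsed to just the header
    return [headline, *body]       # any other non-nil value: exported completely
```

Note that `headline` is itself a non-nil value, so it has to be checked before the general non-nil case.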
* LaTeX template: fix for obscure hyperref/xelatex issue.
Here's a minimal case:
\documentclass[]{article}
\usepackage{hyperref}
\begin{document}
\section{\%á}
\end{document}
Without this change, this fails on the second invocation of xelatex.
This affects inputs like `# %á` with pdf output via xelatex.
* trypandoc: call results 'html' instead of 'result'.
This is for better compatibility with babelmark2.
* Document MultiMarkdown as input/output format (Albert Krewinkel, #2973).
MultiMarkdown was only mentioned as a supported Markdown dialect but not
as a possible input or output format. A brief mention is added
everywhere the other supported markdown dialects are mentioned.
* Document Org mode as a format containing raw HTML (Albert Krewinkel).
Raw HTML is kept when the output format is Emacs Org mode.
* Implement `RawInline` and `RawBlock` in sample lua custom writer (#2985).
* Text.Pandoc.Shared:
+ Introduce blocksToInlines function (Jesse Rosenthal).
This is a lossy function for converting `[Block] -> [Inline]`. Its main
use, at the moment, is for docx comments, which can contain arbitrary
blocks (except for footnotes), but which will be converted to spans.
This is, at the moment, pretty useless for everything but the basic
`Para` and `Plain` comments. It can be improved, but the docx reader
should probably emit a warning if the comment contains more than this.
+ Add BlockQuote to blocksToInlines (Jesse Rosenthal).
+ Add further formats for `normalizeDate` (Jesse Rosenthal).
We want to avoid illegal dates -- in particular years with more than
four digits. We attempt to parse series of digits first as `%Y%m%d`, then
`%Y%m`, and finally `%Y`.
+ `normalizeDate` should reject illegal years (Jesse Rosenthal).
We only allow years between 1601 and 9999, inclusive. ISO 8601
actually says that years are supposed to start with 1583, but MS Word
only allows 1601-9999. This should stop corrupted Word files if the date
is out of that range, or is parsed incorrectly.
+ Improve year sanity check in normalizeDate (Jesse Rosenthal).
Previously we parsed a list of dates, took the first one, and then
tested its year range. That meant that if the first one failed, we
returned nothing, regardless of what the others did. Now we test for
sanity before running `msum` over the list of Maybe values. Anything
failing the test will be Nothing, so will not be a candidate.
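The `normalizeDate` entries above can be illustrated with a rough Python sketch. This is not pandoc's Haskell code: the names `parse_sane` and `normalize_date`, the leading `%Y-%m-%d` format, and the ISO output format are assumptions for illustration. What is taken from the entries is the order of the all-digit formats (`%Y%m%d`, then `%Y%m`, then `%Y`), the 1601-9999 year range, and the idea of checking each candidate for sanity before picking the first success.

```python
from datetime import datetime
from typing import Optional

# Hypothetical format list; only the three all-digit formats are named above.
FORMATS = ["%Y-%m-%d", "%Y%m%d", "%Y%m", "%Y"]

def parse_sane(text: str, fmt: str) -> Optional[datetime]:
    """Parse with one format; return None on failure or an out-of-range year."""
    try:
        parsed = datetime.strptime(text, fmt)
    except ValueError:
        return None
    return parsed if 1601 <= parsed.year <= 9999 else None

def normalize_date(text: str) -> Optional[str]:
    """Return the first sane parse as YYYY-MM-DD, or None if no format fits."""
    for fmt in FORMATS:
        # The sanity check happens per candidate, so one bad parse
        # does not prevent a later format from succeeding.
        parsed = parse_sane(text.strip(), fmt)
        if parsed is not None:
            return parsed.date().isoformat()
    return None
```

Because the range check is applied to each candidate rather than to the first successful parse only, a string such as `1500` is rejected, while `201608` still falls through to the `%Y%m` format.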
* Docx reader:
+ Add simple comment functionality. (Jesse Rosenthal).
This adds simple track-changes comment parsing to the docx reader. It is
turned on with `--track-changes=all`. All comments are converted to
inlines, which can list some information. In the future a warning will be
added for comments with formatting that seems like it will be excessively
denatured. Note that comments can extend across blocks. For that reason
there are two spans: `comment-start` and `comment-end`. `comment-start`
will contain the comment. `comment-end` will always be empty. The two
will be associated by a numeric id.
+ Enable warnings in top-level reader (Jesse Rosenthal).
Previously we had only allowed for warnings in the parser. Now we allow
for them in `Docx.hs` as well. The warnings are simply concatenated.
+ Add warning for advanced comment formatting. (Jesse Rosenthal).
We can't guarantee we'll convert every comment correctly, though we'll
do the best we can. This warns if the comment includes something other
than Para or Plain.
+ Add tests for warnings. (Jesse Rosenthal).
+ Add tests for comments (Jesse Rosenthal).
We test for comments, using all track-changes options. Note that we
should only output comments if `--track-changes=all`. We also test for
emitting warnings if there is complicated formatting.
* README: update to include track-changes comments. (Jesse Rosenthal)
* Improved Windows installer - don't ignore properties set on command-line.
See #2708. Needs testing to see if this resolves the issue.
Thanks to @nkalvi.