more update fr v1.18.7

pymupdf · Feb 2, 2021 · 60d5ad1
1 parent 394bf7c
commit 60d5ad1
Showing 40 changed files with 2,304 additions and 1,751 deletions.
diff --git a/docs/annot.rst b/docs/annot.rst
@@ -195,7 +195,7 @@ There is a parent-child relationship between an annotation and its page. If the
 
       Three overlapping 'Circle' annotations with each opacity set to 0.5:
 
-      .. image:: images/img-opacity.jpg
+      .. image:: images/img-opacity.*
 
    .. attribute:: blendmode
 
@@ -322,7 +322,7 @@ There is a parent-child relationship between an annotation and its page. If the
           * 'Line', 'Polyline', 'Polygon' annotations: use it to give applicable line end symbols a fill color other than that of the annotation *(changed in v1.16.16)*.
 
       :arg bool cross_out: *(new in v1.17.2)* add two diagonal lines to the annotation rectangle. 'Redact' annotations only. If not desired, *False* must be specified even if the annotation was created with *False*.
-      :arg int rotate: new rotation value. Default (-1) means no change. Supports 'FreeText' and several other annotation types (see :meth:`Annot.setRotation`), [#f1]_. Only choose 0, 90, 180, or 270 degrees for 'FreeText'. Otherwise any integer is acceptable.
+      :arg int rotate: new rotation value. Default (-1) means no change. Supports 'FreeText' and several other annotation types (see :meth:`Annot.set_rotation`), [#f1]_. Only choose 0, 90, 180, or 270 degrees for 'FreeText'. Otherwise any integer is acceptable.
 
       :rtype: bool
 
@@ -515,7 +515,7 @@ Annotation Icons in MuPDF
 -------------------------
 This is a list of icons referencable by name for annotation types 'Text' and 'FileAttachment'. You can use them via the *icon* parameter when adding an annotation, or use the as argument in :meth:`Annot.setName`. It is left to your discretion which item to choose when -- no mechanism will keep you from using e.g. the "Speaker" icon for a 'FileAttachment'.
 
-.. image:: images/mupdf-icons.jpg
+.. image:: images/mupdf-icons.*
 
 
 Example
@@ -547,7 +547,7 @@ This is how the circle annotation looks like before and after the change (pop-up
 
 |circle|
 
-.. |circle| image:: images/img-circle.png
+.. |circle| image:: images/img-circle.*
 
 
 .. rubric:: Footnotes

diff --git a/docs/app1.rst b/docs/app1.rst
@@ -12,7 +12,7 @@ Following are three sections that deal with different aspects of performance:
 
 In each section, the same fixed set of PDF files is being processed by a set of tools. The set of tools varies -- for reasons we will explain in the section.
 
-.. |fsizes| image:: images/img-filesizes.png
+.. |fsizes| image:: images/img-filesizes.*
 
 Here is the list of files we are using. Each file name is accompanied by further information: **size** in bytes, number of **pages**, number of bookmarks (**toc** entries), number of **links**, **text** size as a percentage of file size, **KB** per page, PDF **version** and remarks. **text %** and **KB index** are indicators for whether a file is text or graphics oriented.
 |fsizes|
@@ -72,8 +72,8 @@ This is how each of the tools was used:
 
 **Observations**
 
-.. |cpyspeed1| image:: images/img-copy-speed-1.png
-.. |cpyspeed2| image:: images/img-copy-speed-2.png
+.. |cpyspeed1| image:: images/img-copy-speed-1.*
+.. |cpyspeed2| image:: images/img-copy-speed-2.*
 
 These are our run time findings (in **seconds**, please note the European number convention: meaning of decimal point and comma is reversed):
 
@@ -115,7 +115,7 @@ All tools have been used with their most basic, fanciless functionality -- no la
 
 For demonstration purposes, we have included a version of *GetText(doc, output = "json")*, that also re-arranges the output according to occurrence on the page.
 
-.. |textperf| image:: images/img-textperformance.png
+.. |textperf| image:: images/img-textperformance.*
 
 Here are the results using the same test files as above (again: decimal point and comma reversed):
 
@@ -141,7 +141,7 @@ We have tested rendering speed of MuPDF against the *pdftopng.exe*, a command li
      print "processing:", datei
      doc=fitz.open(datei)
      for p in fitz.Pages(doc):
-         pix = p.getPixmap(matrix=mat, alpha = False)
+         pix = p.get_pixmap(matrix=mat, alpha = False)
          pix.writePNG("t-%s.png" % p.number)
          pix = None
      doc.close()
@@ -151,7 +151,7 @@ We have tested rendering speed of MuPDF against the *pdftopng.exe*, a command li
 ::
  pdftopng.exe file.pdf ./
 
-.. |renderspeed| image:: images/img-render-speed.png
+.. |renderspeed| image:: images/img-render-speed.*
 
 The resulting runtimes can be found here (again: meaning of decimal point and comma reversed):
 

diff --git a/docs/app2.rst b/docs/app2.rst
@@ -33,18 +33,18 @@ A **span** consists of adjacent characters with identical font properties: name,
 Plain Text
 ~~~~~~~~~~
 
-Function :meth:`TextPage.extractText` (or *Page.getText("text")*) extracts a page's plain **text in original order** as specified by the creator of the document (which may not equal a natural reading order).
+Function :meth:`TextPage.extractText` (or *Page.get_text("text")*) extracts a page's plain **text in original order** as specified by the creator of the document (which may not equal a natural reading order).
 
 An example output::
 
-    >>> print(page.getText("text"))
+    >>> print(page.get_text("text"))
     Some text on first page.
 
 
 BLOCKS
 ~~~~~~~~~~
 
-Function :meth:`TextPage.extractBLOCKS` (or *Page.getText("blocks")*) extracts a page's text blocks as a list of items like::
+Function :meth:`TextPage.extractBLOCKS` (or *Page.get_text("blocks")*) extracts a page's text blocks as a list of items like::
 
     (x0, y0, x1, y1, "lines in block", block_type, block_no)
 
@@ -54,15 +54,15 @@ This is a high-speed method with enough information to re-arrange the page's tex
 
 Example output::
 
-    >>> print(page.getText("blocks"))
+    >>> print(page.get_text("blocks"))
     [(50.0, 88.17500305175781, 166.1709747314453, 103.28900146484375,
     'Some text on first page.', 0, 0)]
 
 
 WORDS
 ~~~~~~~~~~
 
-Function :meth:`TextPage.extractWORDS` (or *Page.getText("words")*) extracts a page's text **words** as a list of items like::
+Function :meth:`TextPage.extractWORDS` (or *Page.get_text("words")*) extracts a page's text **words** as a list of items like::
 
     (x0, y0, x1, y1, "word", block_no, line_no, word_no)
 
@@ -72,7 +72,7 @@ This is a high-speed method with enough information to extract text contained in
 
 Example output::
 
-    >>> for word in page.getText("words"):
+    >>> for word in page.get_text("words"):
             print(word)
     (50.0, 88.17500305175781, 78.73200225830078, 103.28900146484375,
     'Some', 0, 0, 0)
@@ -88,9 +88,9 @@ Example output::
 HTML
 ~~~~
 
-:meth:`TextPage.extractHTML` (or *Page.getText("html")* output fully reflects the structure of the page's *TextPage* -- much like DICT / JSON below. This includes images, font information and text positions. If wrapped in HTML header and trailer code, it can readily be displayed by an internet browser. Our above example::
+:meth:`TextPage.extractHTML` (or *Page.get_text("html")* output fully reflects the structure of the page's *TextPage* -- much like DICT / JSON below. This includes images, font information and text positions. If wrapped in HTML header and trailer code, it can readily be displayed by an internet browser. Our above example::
 
-    >>> for line in page.getText("html").splitlines():
+    >>> for line in page.get_text("html").splitlines():
             print(line)
 
     <div id="page0" style="position:relative;width:300pt;height:350pt;
@@ -153,7 +153,7 @@ To address the font issue, you can use a simple utility script to scan through t
 DICT (or JSON)
 ~~~~~~~~~~~~~~~~
 
-:meth:`TextPage.extractDICT` (or *Page.getText("dict")*) output fully reflects the structure of a *TextPage* and provides image content and position details (*bbox* -- boundary boxes in pixel units) for every block and line. This information can be used to present text in another reading order if required (e.g. from top-left to bottom-right). Images are stored as *bytes* (*bytearray* in Python 2) for DICT output and base64 encoded strings for JSON output.
+:meth:`TextPage.extractDICT` (or *Page.get_text("dict")*) output fully reflects the structure of a *TextPage* and provides image content and position details (*bbox* -- boundary boxes in pixel units) for every block and line. This information can be used to present text in another reading order if required (e.g. from top-left to bottom-right). Images are stored as *bytes* (*bytearray* in Python 2) for DICT output and base64 encoded strings for JSON output.
 
 For a visuallization of the dictionary structure have a look at :ref:`textpagedict`.
 
@@ -183,7 +183,7 @@ Here is how this looks like::
 
 RAWDICT
 ~~~~~~~~~~~~~~~~
-:meth:`TextPage.extractRAWDICT` (or *Page.getText("rawdict")*) is an **information superset of DICT** and takes the detail level one step deeper. It looks exactly like the above, except that the *"text"* items (*string*) are replaced by *"chars"* items (*list*). Each *"chars"* entry is a character *dict*. For example, here is what you would see in place of item *"text": "Text in black color."* above::
+:meth:`TextPage.extractRAWDICT` (or *Page.get_text("rawdict")*) is an **information superset of DICT** and takes the detail level one step deeper. It looks exactly like the above, except that the *"text"* items (*string*) are replaced by *"chars"* items (*list*). Each *"chars"* entry is a character *dict*. For example, here is what you would see in place of item *"text": "Text in black color."* above::
 
     "chars": [{
         "origin": [50.0, 100.0],
@@ -216,9 +216,9 @@ RAWDICT
 XML
 ~~~
 
-The :meth:`TextPage.extractXML` (or *Page.getText("xml")*) version extracts text (no images) with the detail level of RAWDICT::
+The :meth:`TextPage.extractXML` (or *Page.get_text("xml")*) version extracts text (no images) with the detail level of RAWDICT::
   
-    >>> for line in page.getText("xml").splitlines():
+    >>> for line in page.get_text("xml").splitlines():
         print(line)
 
     <page id="page0" width="300" height="350">
@@ -249,7 +249,7 @@ The :meth:`TextPage.extractXML` (or *Page.getText("xml")*) version extracts text
 
 XHTML
 ~~~~~
-:meth:`TextPage.extractXHTML` (or *Page.getText("xhtml")*) is a variation of TEXT but in HTML format, containing the bare text and images ("semantic" output)::
+:meth:`TextPage.extractXHTML` (or *Page.get_text("xhtml")*) is a variation of TEXT but in HTML format, containing the bare text and images ("semantic" output)::
 
     <div id="page0">
     <p>Some text on first page.</p>
@@ -259,7 +259,7 @@ XHTML
 
 Text Extraction Flags Defaults
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-*(New in version 1.16.2)* Method :meth:`Page.getText` supports a keyword parameter *flags* *(int)* to control the amount and the quality of extracted data. The following table shows the defaults settings (flags parameter omitted or None) for each extraction variant. If you specify flags with a value other than *None*, be aware that you must set **all desired** options. A description of the respective bit settings can be found in :ref:`TextPreserve`.
+*(New in version 1.16.2)* Method :meth:`Page.get_text` supports a keyword parameter *flags* *(int)* to control the amount and the quality of extracted data. The following table shows the defaults settings (flags parameter omitted or None) for each extraction variant. If you specify flags with a value other than *None*, be aware that you must set **all desired** options. A description of the respective bit settings can be found in :ref:`TextPreserve`.
 
 =================== ==== ==== ===== === ==== ======= ===== ======
 Indicator           text html xhtml xml dict rawdict words blocks
@@ -277,14 +277,14 @@ dehyphenate         0    0    0     0   0    0       0     0
 
 To show the effect of *TEXT_INHIBIT_SPACES* have a look at this example::
 
-    >>> print(page.getText("text"))
+    >>> print(page.get_text("text"))
     H a l l o !
     Mo r e  t e x t
     i s  f o l l o w i n g
     i n  E n g l i s h
     . . .  l e t ' s  s e e
     w h a t  h a p p e n s .
-    >>> print(page.getText("text", flags=fitz.TEXT_INHIBIT_SPACES))
+    >>> print(page.get_text("text", flags=fitz.TEXT_INHIBIT_SPACES))
     Hallo!
     More text
     is following
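
Note on the `docs/app2.rst` hunks above: the "blocks" section says the tuples carry enough information to re-arrange a page's text. A minimal sketch of that idea, using only the tuple layout `(x0, y0, x1, y1, "lines in block", block_type, block_no)` shown in the diff, could look as follows. It runs on plain tuples, so no PDF is needed. The helper name `sort_blocks_top_left` and the second and third sample blocks are hypothetical. Treating `block_type == 0` as "text block" is an assumption taken from the PyMuPDF documentation, not from this diff.

```python
def sort_blocks_top_left(blocks):
    """Return text blocks ordered top-to-bottom, then left-to-right.

    Each block is assumed to be a tuple shaped like the items returned by
    page.get_text("blocks"): (x0, y0, x1, y1, text, block_type, block_no).
    """
    # Assumption: block_type 0 marks a text block; other values are skipped.
    text_blocks = [b for b in blocks if b[5] == 0]
    # Sort by the top edge (y0) first, then by the left edge (x0).
    return sorted(text_blocks, key=lambda b: (b[1], b[0]))


# The second tuple is the example output from the diff;
# the other two are made-up neighbours for illustration.
sample_blocks = [
    (50.0, 200.0, 160.0, 215.0, "Third block.", 0, 2),
    (50.0, 88.17500305175781, 166.1709747314453, 103.28900146484375,
     "Some text on first page.", 0, 0),
    (180.0, 88.17500305175781, 260.0, 103.28900146484375,
     "Right column.", 0, 1),
]

ordered_texts = [b[4] for b in sort_blocks_top_left(sample_blocks)]
print(ordered_texts)
```

With a real document, the list passed in would come from `page.get_text("blocks")`. Sorting on exact `y0` values is a simplification: blocks on the same visual line may differ slightly in `y0`, so a tolerance may be needed in practice.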

diff --git a/docs/app3.rst b/docs/app3.rst
@@ -29,4 +29,4 @@ PyMuPDF Support
 ------------------
 We continue to support the full old API with respect to embedded files -- with only minor, cosmetic changes.
 
-There even also is a new function, which delivers a list of all names under which embedded data are resgistered in a PDF, :meth:`Document.embeddedFileNames`.
+There even also is a new function, which delivers a list of all names under which embedded data are resgistered in a PDF, :meth:`Document.embfile_names`.
diff --git a/docs/app4.rst b/docs/app4.rst
@@ -113,7 +113,7 @@ Python on the other hand implements the OO-model in a very clean way. The interf
 
 When you use one of PyMuPDF's objects or methods, this will result in excution of some code in *fitz.py*, which in turn will call some C code compiled with *fitz_wrap.c*.
 
-Because SWIG goes a long way to keep the Python and the C level in sync, everything works fine, if a certain set of rules is being strictly followed. For example: **never access** a :ref:`Page` object, after you have closed (or deleted or set to *None*) the owning :ref:`Document`. Or, less obvious: **never access** a page or any of its children (links or annotations) after you have executed one of the document methods *select()*, *deletePage()*, *insert_page()* ... and more.
+Because SWIG goes a long way to keep the Python and the C level in sync, everything works fine, if a certain set of rules is being strictly followed. For example: **never access** a :ref:`Page` object, after you have closed (or deleted or set to *None*) the owning :ref:`Document`. Or, less obvious: **never access** a page or any of its children (links or annotations) after you have executed one of the document methods *select()*, *delete_page()*, *insert_page()* ... and more.
 
 But just no longer accessing invalidated objects is actually not enough: They should rather be actively deleted entirely, to also free C-level resources (meaning allocated memory).
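
Note on the renames in this excerpt: the hunks shown here replace several camelCase method names with snake_case ones. As a rough sketch, the same substitutions could be applied to user scripts with a small helper. The mapping below contains only the pairs visible in this excerpt; the commit touches 40 files and may rename more. The names `RENAMES` and `modernize` are hypothetical and not part of PyMuPDF.

```python
import re

# Old name -> new name, taken only from the hunks shown in this excerpt.
RENAMES = {
    "getPixmap": "get_pixmap",
    "getText": "get_text",
    "setRotation": "set_rotation",
    "embeddedFileNames": "embfile_names",
    "deletePage": "delete_page",
}

# Match whole identifiers only, so e.g. "getTextbox" is left untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def modernize(source):
    """Return source text with the old method names replaced."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], source)


old_line = "pix = p.getPixmap(matrix=mat, alpha = False)"
new_line = modernize(old_line)
print(new_line)
```

This is a plain text substitution. It does not check that a matched name is really a PyMuPDF method, so the result should be reviewed before it is committed.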