diff --git a/.gitignore b/.gitignore index 5af2d1ec9..0096220ac 100644 --- a/.gitignore +++ b/.gitignore @@ -71,6 +71,14 @@ instance/ # Sphinx documentation docs/_build/ +# Sphinx examples rst files which are generated by the template +docs/source/quick_start/examples/BSTLD.rst +docs/source/quick_start/examples/DogsVsCats.rst +docs/source/quick_start/examples/LeedsSportsPose.rst +docs/source/quick_start/examples/Newsgroups20.rst +docs/source/quick_start/examples/NeolixOD.rst +docs/source/quick_start/examples/THCHS30.rst + # PyBuilder target/ diff --git a/docs/code/LeedsSportsPose.py b/docs/code/LeedsSportsPose.py index 67e43dca9..4bb983b83 100644 --- a/docs/code/LeedsSportsPose.py +++ b/docs/code/LeedsSportsPose.py @@ -43,6 +43,10 @@ dataset = Dataset("LeedsSportsPose", gas) """""" +"""Read Dataset / list segment names""" +dataset.keys() +"""""" + """Read Dataset / get segment""" segment = dataset[0] """""" diff --git a/docs/code/NeolixOD.py b/docs/code/NeolixOD.py index eb60f677b..5909bc30e 100644 --- a/docs/code/NeolixOD.py +++ b/docs/code/NeolixOD.py @@ -42,6 +42,10 @@ dataset = Dataset("NeolixOD", gas) """""" +"""Read Dataset / list segment names""" +dataset.keys() +"""""" + """Read Dataset / get segment""" segment = dataset[0] """""" diff --git a/docs/source/__init__.py b/docs/source/__init__.py new file mode 100644 index 000000000..aa384de5a --- /dev/null +++ b/docs/source/__init__.py @@ -0,0 +1,6 @@ +#!/usr/bin/env python3 +# +# Copyright 2021 Graviti. Licensed under MIT License. +# + +"""source.""" diff --git a/docs/source/_templates/__init__.py b/docs/source/_templates/__init__.py new file mode 100644 index 000000000..f0ff03ea1 --- /dev/null +++ b/docs/source/_templates/__init__.py @@ -0,0 +1,6 @@ +#!/usr/bin/env python3 +# +# Copyright 2021 Graviti. Licensed under MIT License. 
+# + +"""template.""" diff --git a/docs/source/_templates/examples.py b/docs/source/_templates/examples.py new file mode 100644 index 000000000..5256d8b98 --- /dev/null +++ b/docs/source/_templates/examples.py @@ -0,0 +1,176 @@ +"""The template for example rst files.""" + +EXAMPLES_TEMPLATE = ''' +################### + {dataset_name} +################### + +This topic describes how to manage the `{dataset_name} Dataset `_, which is a dataset with +:ref:`reference/label_format/{label_type}:{label_type}` label +{figure_description} + +***************************** + Authorize a Client Instance +***************************** + +An :ref:`reference/glossary:accesskey` is needed to authenticate identity when using TensorBay. + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Authorize a Client Instance""" + :end-before: """""" + +**************** + Create Dataset +**************** + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Create Dataset""" + :end-before: """""" + +****************** + Organize Dataset +****************** + +Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "{dataset_name}" +dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. +In this example, they are stored in the same directory like:: + + {dataset_name}/ + catalog.json + dataloader.py + +Step 1: Write the Catalog +========================= + +A :ref:`reference/dataset_structure:catalog` contains all label information of one dataset, which +is typically stored in a json file like ``catalog.json``. +{catalog_description} + +{category_attribute_description} + +.. note:: + + By passing the path of the ``catalog.json``, + :func:`~tensorbay.dataset.dataset.DatasetBase.load_catalog` supports loading the catalog into dataset. + +.. important:: + + See :ref:`catalog table ` for more catalogs with different + label types. 
+ +Step 2: Write the Dataloader +============================ + +A :ref:`reference/glossary:dataloader` is needed to organize the dataset into a +:class:`~tensorbay.dataset.dataset.Dataset` instance. + +.. literalinclude:: ../../../../tensorbay/opendataset/{file_name}/loader.py + :language: python + :name: {file_name}-dataloader + :linenos: + +See :ref:`{label_type} annotation ` for more +details. + +There are already a number of dataloaders in TensorBay SDK provided by the community. +Thus, instead of writing, importing an available dataloader is also feasible. + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Organize dataset / import dataloader""" + :end-before: """""" + +.. note:: + + Note that catalogs are automatically loaded in available dataloaders, users do not have to write + them again. + +.. important:: + + See :ref:`dataloader table ` for dataloaders with different label + types. + +******************* + Visualize Dataset +******************* + +Optionally, the organized dataset can be visualized by **Pharos**, which is a TensorBay SDK plug-in. +This step can help users to check whether the dataset is correctly organized. +Please see :ref:`features/visualization:Visualization` for more details. + +**************** + Upload Dataset +**************** + +The organized "{dataset_name}" dataset can be uploaded to TensorBay for sharing, reuse, etc. + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Upload Dataset""" + :end-before: """""" + +.. note:: + Set ``skip_uploaded_files=True`` to skip uploaded data. + The data will be skipped if its name and segment name are the same as remote data. + +Similar to Git, the commit step after uploading can record changes to the dataset as a version. +If needed, do the modifications and commit again. +Please see :ref:`features/version_control/index:Version Control` for more details. 
+ +************** + Read Dataset +************** + +Now "{dataset_name}" dataset can be read from TensorBay. + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Read Dataset / get dataset""" + :end-before: """""" + +Get the segment names by listing them all. + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Read Dataset / list segment names""" + :end-before: """""" + +Get a segment by passing the required segment name. + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Read Dataset / get segment""" + :end-before: """""" + +In the :ref:`reference/dataset_structure:segment`, there is a sequence of +:ref:`reference/dataset_structure:data`, which can be obtained by index. + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Read Dataset / get data""" + :end-before: """""" + +In each :ref:`reference/dataset_structure:data`, +there is a sequence of :ref:`reference/label_format/{label_type}:{label_type}` annotations, +which can be obtained by index. + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Read Dataset / get label""" + :end-before: """""" + +There is only one label type in "{dataset_name}" dataset, which is ``{label_type}``. +{information_description} + +**************** + Delete Dataset +**************** + +.. literalinclude:: ../../../../docs/code/{file_name}.py + :language: python + :start-after: """Delete Dataset""" + :end-before: """""" +''' diff --git a/docs/source/conf.py b/docs/source/conf.py index 3bb1f2b56..bec568d55 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -15,11 +15,14 @@ # documentation root, use os.path.abspath to make it absolute, like shown here. 
# """Configuration file for the Sphinx documentation builder.""" +import os import sys from pathlib import Path sys.path.insert(0, str(Path(__file__).parents[2])) - +from docs.source._templates.examples import ( # noqa: E402 # pylint: disable=wrong-import-position + EXAMPLES_TEMPLATE, +) # -- Project information ----------------------------------------------------- @@ -79,3 +82,132 @@ # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". # html_static_path = ["_static"] + +source_path = os.path.dirname(os.path.abspath(__file__)) +example_path = os.path.join(source_path, "quick_start", "examples") +dataset_names = ( + "Dogs Vs Cats", + "20 Newsgroups", + "BSTLD", + "Neolix OD", + "Leeds Sports Pose", + "THCHS-30", +) +label_types = ( + "Classification", + "Classification", + "Box2D", + "Box3D", + "Keypoints2D", + "Sentence", +) +file_names = ("DogsVsCats", "Newsgroups20", "BSTLD", "NeolixOD", "LeedsSportsPose", "THCHS30") + +dataset_with_images = ("BSTLD", "Neolix OD", "Leeds Sports Pose") + +figure_description = """(:numref:`Fig. %s `). + +.. _example-{file_name}: + +.. figure:: ../../images/example-{label_type}.png + :scale: 50 % + :align: center + + The preview of a cropped image with labels from "{dataset_name}". +""" + +category_attribute_descriptions = {} +category_attribute_descriptions[ + "BSTLD" +] = """ +The only annotation type for "{dataset_name}" is +:ref:`reference/label_format/{label_type}:{label_type}`, and there are 13 +:ref:`reference/label_format/CommonLabelProperties:category` types and one +:ref:`reference/label_format/CommonLabelProperties:attributes` type. +""" + +category_attribute_descriptions[ + "Dogs Vs Cats" +] = """ +The only annotation type for "{dataset_name}" is +:ref:`reference/label_format/{label_type}:{label_type}`, and there are 2 +:ref:`reference/label_format/CommonLabelProperties:category` types. 
+""" + +category_attribute_descriptions[ + "Leeds Sports Pose" +] = """ +The only annotation type for "{dataset_name}" is +:ref:`reference/label_format/{label_type}:{label_type}`. +""" + +category_attribute_descriptions[ + "Neolix OD" +] = """ +The only annotation type for "{dataset_name}" is +:ref:`reference/label_format/{label_type}:{label_type}`, and there are 15 +:ref:`reference/label_format/CommonLabelProperties:category` types and 3 +:ref:`reference/label_format/CommonLabelProperties:attributes` types. +""" + +category_attribute_descriptions[ + "20 Newsgroups" +] = """ +The only annotation type for "{dataset_name}" is +:ref:`reference/label_format/{label_type}:{label_type}`, and there are 20 +:ref:`reference/label_format/CommonLabelProperties:category` types. +""" + +category_attribute_descriptions["THCHS-30"] = "" + +# from docs.source._templates.examples import EXAMPLES_TEMPLATE +for dataset_name, label_type, file_name in zip(dataset_names, label_types, file_names): + if dataset_name == "THCHS-30": + catalog_description = """However, the catalog of THCHS-30 is too large, so instead of +reading it from a json file, we read it by mapping from the subcatalog that is loaded from +the raw file. Check the :ref:`dataloader ` below for more details. +""" + information_description = """It contains ``sentence``, ``spell`` and ``phone`` information. +See :ref:`Sentence ` label format for +more details. +""" + else: + catalog_description = """ +.. literalinclude:: ../../../../tensorbay/opendataset/{file_name}/catalog.json + :language: json + :name: {file_name}-catalog + :linenos: +""" + information_description = """The information stored in +:ref:`reference/label_format/CommonLabelProperties:category` is one of the names in "categories" +list of :ref:`catalog.json <{file_name}-catalog>`. The information stored in +:ref:`reference/label_format/CommonLabelProperties:attributes` is one or several of +the attributes in "attributes" list of :ref:`catalog.json <{file_name}-catalog>`. 
+See :ref:`reference/label_format/{label_type}:{label_type}` label format for more details. +""" + + if dataset_name in dataset_with_images: + figure_description_tmp = figure_description.format( + dataset_name=dataset_name, file_name=file_name, label_type=label_type + ) + else: + figure_description_tmp = "" + catalog_description_tmp = catalog_description.format(file_name=file_name) + information_description_tmp = information_description.format( + label_type=label_type, file_name=file_name + ) + category_attribute_description = category_attribute_descriptions[dataset_name].format( + dataset_name=dataset_name, label_type=label_type + ) + with open(os.path.join(example_path, f"{file_name}.rst"), "w", encoding="utf-8") as fp: + fp.write( + EXAMPLES_TEMPLATE.format( + dataset_name=dataset_name, + file_name=file_name, + label_type=label_type, + figure_description=figure_description_tmp, + catalog_description=catalog_description_tmp, + category_attribute_description=category_attribute_description, + information_description=information_description_tmp, + ) + ) diff --git a/docs/source/quick_start/examples/BSTLD.rst b/docs/source/quick_start/examples/BSTLD.rst index 062b22be7..66425628c 100644 --- a/docs/source/quick_start/examples/BSTLD.rst +++ b/docs/source/quick_start/examples/BSTLD.rst @@ -1,11 +1,14 @@ -######## + +################### BSTLD -######## +################### -This topic describes how to manage the `BSTLD Dataset `_, -which is a dataset with :ref:`reference/label_format/Box2D:Box2D` label(:numref:`Fig. %s `). +This topic describes how to manage the `BSTLD Dataset `_, which is a dataset with +:ref:`reference/label_format/Box2D:Box2D` label +(:numref:`Fig. %s `). -.. _example-bstld: +.. _example-BSTLD: .. figure:: ../../images/example-Box2D.png :scale: 50 % @@ -13,6 +16,7 @@ which is a dataset with :ref:`reference/label_format/Box2D:Box2D` label(:numref: The preview of a cropped image with labels from "BSTLD". 
+ ***************************** Authorize a Client Instance ***************************** @@ -37,7 +41,8 @@ An :ref:`reference/glossary:accesskey` is needed to authenticate identity when u Organize Dataset ****************** -Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "BSTLD" dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. +Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "BSTLD" +dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. In this example, they are stored in the same directory like:: BSTLD/ @@ -51,32 +56,41 @@ A :ref:`reference/dataset_structure:catalog` contains all label information of o is typically stored in a json file like ``catalog.json``. .. literalinclude:: ../../../../tensorbay/opendataset/BSTLD/catalog.json - :language: json - :name: BSTLD-catalog - :linenos: + :language: json + :name: BSTLD-catalog + :linenos: + + + +The only annotation type for "BSTLD" is +:ref:`reference/label_format/Box2D:Box2D`, and there are 13 +:ref:`reference/label_format/CommonLabelProperties:category` types and one +:ref:`reference/label_format/CommonLabelProperties:attributes` type. -The only annotation type for "BSTLD" is :ref:`reference/label_format/Box2D:Box2D`, and there are 13 -:ref:`reference/label_format/CommonLabelProperties:category` types and one :ref:`reference/label_format/CommonLabelProperties:attributes` type. .. note:: - By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase.load_catalog` supports loading the catalog into dataset. + By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase. + load_catalog` supports loading the catalog into dataset. .. important:: - See :ref:`catalog table ` for more catalogs with different label types. + See :ref:`catalog table ` for more catalogs with different + label types. 
Step 2: Write the Dataloader ============================ -A :ref:`reference/glossary:dataloader` is needed to organize the dataset into a :class:`~tensorbay.dataset.dataset.Dataset` instance. +A :ref:`reference/glossary:dataloader` is needed to organize the dataset into a :class:`~tensorbay. +dataset.dataset.Dataset` instance. .. literalinclude:: ../../../../tensorbay/opendataset/BSTLD/loader.py :language: python :name: BSTLD-dataloader :linenos: -See :ref:`Box2D annotation ` for more details. +See :ref:`Box2D annotation ` for more +details. There are already a number of dataloaders in TensorBay SDK provided by the community. Thus, instead of writing, importing an available dataloader is also feasible. @@ -88,11 +102,13 @@ Thus, instead of writing, importing an available dataloader is also feasible. .. note:: - Note that catalogs are automatically loaded in available dataloaders, users do not have to write them again. + Note that catalogs are automatically loaded in available dataloaders, users do not have to write + them again. .. important:: - See :ref:`dataloader table ` for dataloaders with different label types. + See :ref:`dataloader table ` for dataloaders with different label + types. ******************* Visualize Dataset @@ -114,7 +130,7 @@ The organized "BSTLD" dataset can be uploaded to TensorBay for sharing, reuse, e :end-before: """""" .. note:: - Set `skip_uploaded_files=True` to skip uploaded data. + Set ``skip_uploaded_files=True`` to skip uploaded data. The data will be skiped if its name and segment name is the same as remote data. Similar with Git, the commit step after uploading can record changes to the dataset as a version. @@ -132,8 +148,6 @@ Now "BSTLD" dataset can be read from TensorBay. :start-after: """Read Dataset / get dataset""" :end-before: """""" -In :ref:`reference/dataset_structure:dataset` "BSTLD", there are three -:ref:`segments `: ``train``, ``test`` and ``additional``. Get the segment names by listing them all. .. 
literalinclude:: ../../../../docs/code/BSTLD.py @@ -148,9 +162,8 @@ Get a segment by passing the required segment name. :start-after: """Read Dataset / get segment""" :end-before: """""" - -In the train :ref:`reference/dataset_structure:segment`, there is a sequence of :ref:`reference/dataset_structure:data`, -which can be obtained by index. +In the :ref:`reference/dataset_structure:segment`, there is a sequence of +:ref:`reference/dataset_structure:data`, which can be obtained by index. .. literalinclude:: ../../../../docs/code/BSTLD.py :language: python @@ -166,12 +179,15 @@ which can be obtained by index. :start-after: """Read Dataset / get label""" :end-before: """""" -There is only one label type in "BSTLD" dataset, which is ``box2d``. -The information stored in :ref:`reference/label_format/CommonLabelProperties:category` is -one of the names in "categories" list of :ref:`catalog.json `. The information stored -in :ref:`reference/label_format/CommonLabelProperties:attributes` is one or several of the attributes in "attributes" list of :ref:`catalog.json `. +There is only one label type in "BSTLD" dataset, which is ``Box2D``. +The information stored in +:ref:`reference/label_format/CommonLabelProperties:category` is one of the names in "categories" +list of :ref:`catalog.json `. The information stored in +:ref:`reference/label_format/CommonLabelProperties:attributes` is one or several of +the attributes in "attributes" list of :ref:`catalog.json `. See :ref:`reference/label_format/Box2D:Box2D` label format for more details. 
+ **************** Delete Dataset **************** diff --git a/docs/source/quick_start/examples/DogsVsCats.rst b/docs/source/quick_start/examples/DogsVsCats.rst index c166acc7a..e4417fe69 100644 --- a/docs/source/quick_start/examples/DogsVsCats.rst +++ b/docs/source/quick_start/examples/DogsVsCats.rst @@ -1,9 +1,12 @@ -############## - Dogs vs Cats -############## -This topic describes how to manage the `Dogs vs Cats Dataset `_, -which is a dataset with :ref:`reference/label_format/Classification:Classification` label. +################### + Dogs Vs Cats +################### + +This topic describes how to manage the `Dogs Vs Cats Dataset `_, which is a dataset with +:ref:`reference/label_format/Classification:Classification` label + ***************************** Authorize a Client Instance @@ -29,10 +32,11 @@ An :ref:`reference/glossary:accesskey` is needed to authenticate identity when u Organize Dataset ****************** -Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "Dogs vs Cats" dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. +Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "Dogs Vs Cats" +dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. In this example, they are stored in the same directory like:: - Dogs vs Cats/ + Dogs Vs Cats/ catalog.json dataloader.py @@ -43,37 +47,43 @@ A :ref:`reference/dataset_structure:catalog` contains all label information of o is typically stored in a json file like ``catalog.json``. .. 
literalinclude:: ../../../../tensorbay/opendataset/DogsVsCats/catalog.json - :language: json - :name: dogsvscats-catalog - :linenos: + :language: json + :name: DogsVsCats-catalog + :linenos: -The only annotation type for "Dogs vs Cats" is :ref:`reference/label_format/Classification:Classification`, and there are 2 + + +The only annotation type for "Dogs Vs Cats" is +:ref:`reference/label_format/Classification:Classification`, and there are 2 :ref:`reference/label_format/CommonLabelProperties:category` types. + .. note:: - By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase.load_catalog` supports loading the catalog into dataset. + By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase. + load_catalog` supports loading the catalog into dataset. .. important:: - See :ref:`catalog table ` for more catalogs with different label types. + See :ref:`catalog table ` for more catalogs with different + label types. Step 2: Write the Dataloader ============================ -A :ref:`reference/glossary:dataloader` is needed to organize the dataset into -a :class:`~tensorbay.dataset.dataset.Dataset` instance. +A :ref:`reference/glossary:dataloader` is needed to organize the dataset into a :class:`~tensorbay. +dataset.dataset.Dataset` instance. .. literalinclude:: ../../../../tensorbay/opendataset/DogsVsCats/loader.py :language: python - :name: dogsvscats-dataloader + :name: DogsVsCats-dataloader :linenos: -See :ref:`Classification annotation ` for more details. - +See :ref:`Classification annotation ` for more +details. There are already a number of dataloaders in TensorBay SDK provided by the community. -Thus, instead of writing, importing an available dataloadert is also feasible. +Thus, instead of writing, importing an available dataloader is also feasible. .. 
literalinclude:: ../../../../docs/code/DogsVsCats.py :language: python @@ -82,11 +92,13 @@ Thus, instead of writing, importing an available dataloadert is also feasible. .. note:: - Note that catalogs are automatically loaded in available dataloaders, users do not have to write them again. + Note that catalogs are automatically loaded in available dataloaders, users do not have to write + them again. .. important:: - See :ref:`dataloader table ` for more examples of dataloaders with different label types. + See :ref:`dataloader table ` for dataloaders with different label + types. ******************* Visualize Dataset @@ -100,13 +112,17 @@ Please see :ref:`features/visualization:Visualization` for more details. Upload Dataset **************** -The organized "Dogs vs Cats" dataset can be uploaded to TensorBay for sharing, reuse, etc. +The organized "Dogs Vs Cats" dataset can be uploaded to TensorBay for sharing, reuse, etc. .. literalinclude:: ../../../../docs/code/DogsVsCats.py :language: python :start-after: """Upload Dataset""" :end-before: """""" +.. note:: + Set ``skip_uploaded_files=True`` to skip uploaded data. + The data will be skiped if its name and segment name is the same as remote data. + Similar with Git, the commit step after uploading can record changes to the dataset as a version. If needed, do the modifications and commit again. Please see :ref:`features/version_control/index:Version Control` for more details. @@ -115,15 +131,13 @@ Please see :ref:`features/version_control/index:Version Control` for more detail Read Dataset ************** -Now "Dogs vs Cats" dataset can be read from TensorBay. +Now "Dogs Vs Cats" dataset can be read from TensorBay. .. literalinclude:: ../../../../docs/code/DogsVsCats.py :language: python :start-after: """Read Dataset / get dataset""" :end-before: """""" -In :ref:`reference/dataset_structure:dataset` "Dogs vs Cats", there are two -:ref:`segments `: ``train`` and ``test``. Get the segment names by listing them all. 
.. literalinclude:: ../../../../docs/code/DogsVsCats.py @@ -138,8 +152,8 @@ Get a segment by passing the required segment name. :start-after: """Read Dataset / get segment""" :end-before: """""" -In the train :ref:`reference/dataset_structure:segment`, there is a sequence of :ref:`reference/dataset_structure:data`, -which can be obtained by index. +In the :ref:`reference/dataset_structure:segment`, there is a sequence of +:ref:`reference/dataset_structure:data`, which can be obtained by index. .. literalinclude:: ../../../../docs/code/DogsVsCats.py :language: python @@ -155,10 +169,15 @@ which can be obtained by index. :start-after: """Read Dataset / get label""" :end-before: """""" -There is only one label type in "Dogs vs Cats" dataset, which is ``classification``. The information stored in :ref:`reference/label_format/CommonLabelProperties:category` is -one of the names in "categories" list of :ref:`catalog.json `. +There is only one label type in "Dogs Vs Cats" dataset, which is ``Classification``. +The information stored in +:ref:`reference/label_format/CommonLabelProperties:category` is one of the names in "categories" +list of :ref:`catalog.json `. The information stored in +:ref:`reference/label_format/CommonLabelProperties:attributes` is one or several of +the attributes in "attributes" list of :ref:`catalog.json `. See :ref:`reference/label_format/Classification:Classification` label format for more details. 
+ **************** Delete Dataset **************** diff --git a/docs/source/quick_start/examples/LeedsSportsPose.rst b/docs/source/quick_start/examples/LeedsSportsPose.rst index fa8ef1ad7..409594156 100644 --- a/docs/source/quick_start/examples/LeedsSportsPose.rst +++ b/docs/source/quick_start/examples/LeedsSportsPose.rst @@ -1,17 +1,21 @@ + ################### Leeds Sports Pose ################### -This topic describes how to manage the `Leeds Sports Pose Dataset `_, -which is a dataset with :ref:`reference/label_format/Keypoints2D:Keypoints2D` label(:numref:`Fig. %s `). +This topic describes how to manage the `Leeds Sports Pose Dataset `_, which is a dataset with +:ref:`reference/label_format/Keypoints2D:Keypoints2D` label +(:numref:`Fig. %s `). -.. _example-leedssportspose: +.. _example-LeedsSportsPose: .. figure:: ../../images/example-Keypoints2D.png - :scale: 80 % + :scale: 50 % :align: center - The preview of an image with labels from "Leeds Sports Pose". + The preview of a cropped image with labels from "Leeds Sports Pose". + ***************************** Authorize a Client Instance @@ -37,7 +41,8 @@ An :ref:`reference/glossary:accesskey` is needed to authenticate identity when u Organize Dataset ****************** -Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "Leeds Sports Pose" dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. +Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "Leeds Sports Pose" +dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. In this example, they are stored in the same directory like:: Leeds Sports Pose/ @@ -51,33 +56,39 @@ A :ref:`reference/dataset_structure:catalog` contains all label information of o is typically stored in a json file like ``catalog.json``. .. 
literalinclude:: ../../../../tensorbay/opendataset/LeedsSportsPose/catalog.json - :language: json - :name: LeedsSportsPose-catalog - :linenos: + :language: json + :name: LeedsSportsPose-catalog + :linenos: + + + +The only annotation type for "Leeds Sports Pose" is +:ref:`reference/label_format/Keypoints2D:Keypoints2D`. -The only annotation type for "Leeds Sports Pose" is :ref:`reference/label_format/Keypoints2D:Keypoints2D`. .. note:: - By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase.load_catalog` supports loading the catalog into dataset. + By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase. + load_catalog` supports loading the catalog into dataset. .. important:: - See :ref:`catalog table ` for more catalogs with different label types. + See :ref:`catalog table ` for more catalogs with different + label types. Step 2: Write the Dataloader ============================ -A :ref:`reference/glossary:dataloader` is needed to organize the dataset into -a :class:`~tensorbay.dataset.dataset.Dataset` instance. +A :ref:`reference/glossary:dataloader` is needed to organize the dataset into a :class:`~tensorbay. +dataset.dataset.Dataset` instance. .. literalinclude:: ../../../../tensorbay/opendataset/LeedsSportsPose/loader.py :language: python :name: LeedsSportsPose-dataloader :linenos: -See :ref:`Keipoints2D annotation ` for more details. - +See :ref:`Keypoints2D annotation ` for more +details. There are already a number of dataloaders in TensorBay SDK provided by the community. Thus, instead of writing, importing an available dataloader is also feasible. @@ -89,11 +100,13 @@ Thus, instead of writing, importing an available dataloader is also feasible. .. note:: - Note that catalogs are automatically loaded in available dataloaders, users do not have to write them again. + Note that catalogs are automatically loaded in available dataloaders, users do not have to write + them again. .. 
important:: - See :ref:`dataloader table ` for dataloaders with different label types. + See :ref:`dataloader table ` for dataloaders with different label + types. ******************* Visualize Dataset @@ -107,13 +120,17 @@ Please see :ref:`features/visualization:Visualization` for more details. Upload Dataset **************** -The organized "BSTLD" dataset can be uploaded to TensorBay for sharing, reuse, etc. +The organized "Leeds Sports Pose" dataset can be uploaded to TensorBay for sharing, reuse, etc. .. literalinclude:: ../../../../docs/code/LeedsSportsPose.py :language: python :start-after: """Upload Dataset""" :end-before: """""" +.. note:: + Set ``skip_uploaded_files=True`` to skip uploaded data. + The data will be skiped if its name and segment name is the same as remote data. + Similar with Git, the commit step after uploading can record changes to the dataset as a version. If needed, do the modifications and commit again. Please see :ref:`features/version_control/index:Version Control` for more details. @@ -129,16 +146,22 @@ Now "Leeds Sports Pose" dataset can be read from TensorBay. :start-after: """Read Dataset / get dataset""" :end-before: """""" -In :ref:`reference/dataset_structure:dataset` "Leeds Sports Pose", there is one -:ref:`reference/dataset_structure:segment` named ``default``. Get it by passing the segment name or the index. +Get the segment names by listing them all. + +.. literalinclude:: ../../../../docs/code/LeedsSportsPose.py + :language: python + :start-after: """Read Dataset / list segment names""" + :end-before: """""" + +Get a segment by passing the required segment name. .. literalinclude:: ../../../../docs/code/LeedsSportsPose.py :language: python :start-after: """Read Dataset / get segment""" :end-before: """""" -In the default :ref:`reference/dataset_structure:segment`, there is a sequence of :ref:`reference/dataset_structure:data`, -which can be obtained by index. 
+In the :ref:`reference/dataset_structure:segment`, there is a sequence of +:ref:`reference/dataset_structure:data`, which can be obtained by index. .. literalinclude:: ../../../../docs/code/LeedsSportsPose.py :language: python @@ -154,10 +177,14 @@ which can be obtained by index. :start-after: """Read Dataset / get label""" :end-before: """""" -There is only one label type in "Leeds Sports Pose" dataset, which is ``keypoints2d``. The information stored in ``x`` (``y``) is -the x (y) coordinate of one keypoint of one keypoints list. The information stored in ``v`` is -the visible status of one keypoint of one keypoints list. See :ref:`reference/label_format/Keypoints2D:Keypoints2D` -label format for more details. +There is only one label type in "Leeds Sports Pose" dataset, which is ``Keypoints2D``. +The information stored in +:ref:`reference/label_format/CommonLabelProperties:category` is one of the names in "categories" +list of :ref:`catalog.json `. The information stored in +:ref:`reference/label_format/CommonLabelProperties:attributes` is one or several of +the attributes in "attributes" list of :ref:`catalog.json `. +See :ref:`reference/label_format/Keypoints2D:Keypoints2D` label format for more details. + **************** Delete Dataset diff --git a/docs/source/quick_start/examples/NeolixOD.rst b/docs/source/quick_start/examples/NeolixOD.rst index 4107765db..6f10bbba0 100644 --- a/docs/source/quick_start/examples/NeolixOD.rst +++ b/docs/source/quick_start/examples/NeolixOD.rst @@ -1,20 +1,21 @@ -########### - Neolix OD -########### -This topic describes how to manage the `Neolix OD dataset`_, -which is a dataset with :ref:`reference/label_format/Box3D:Box3D` label type -(:numref:`Fig. %s `). +################### + Neolix OD +################### -.. 
_Neolix OD dataset: https://gas.graviti.cn/dataset/graviti-open-dataset/NeolixOD +This topic describes how to manage the `Neolix OD Dataset `_, which is a dataset with +:ref:`reference/label_format/Box3D:Box3D` label +(:numref:`Fig. %s `). -.. _example-neolixod: +.. _example-NeolixOD: .. figure:: ../../images/example-Box3D.png :scale: 50 % :align: center - The preview of a point cloud from "Neolix OD" with Box3D labels. + The preview of a cropped image with labels from "Neolix OD". + ***************************** Authorize a Client Instance @@ -31,7 +32,6 @@ An :ref:`reference/glossary:accesskey` is needed to authenticate identity when u Create Dataset **************** - .. literalinclude:: ../../../../docs/code/NeolixOD.py :language: python :start-after: """Create Dataset""" @@ -41,7 +41,8 @@ An :ref:`reference/glossary:accesskey` is needed to authenticate identity when u Organize Dataset ****************** -Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "Neolix OD" dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. +Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "Neolix OD" +dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. In this example, they are stored in the same directory like:: Neolix OD/ @@ -51,38 +52,45 @@ In this example, they are stored in the same directory like:: Step 1: Write the Catalog ========================= -A :ref:`Catalog ` contains all label information of one dataset, -which is typically stored in a json file like ``catalog.json``. +A :ref:`reference/dataset_structure:catalog` contains all label information of one dataset, which +is typically stored in a json file like ``catalog.json``. .. 
literalinclude:: ../../../../tensorbay/opendataset/NeolixOD/catalog.json - :language: json - :name: neolixod-catalog - :linenos: + :language: json + :name: NeolixOD-catalog + :linenos: + + + +The only annotation type for "Neolix OD" is +:ref:`reference/label_format/Box3D:Box3D`, and there are 15 +:ref:`reference/label_format/CommonLabelProperties:category` types and 3 +:ref:`reference/label_format/CommonLabelProperties:attributes` type. -The only annotation type for "Neolix OD" is :ref:`reference/label_format/Box3D:Box3D`, and there are 15 -:ref:`reference/label_format/CommonLabelProperties:Category` types and 3 :ref:`reference/label_format/CommonLabelProperties:Attributes` types. .. note:: - By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase.load_catalog` supports loading the catalog into dataset. + By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase. + load_catalog` supports loading the catalog into dataset. .. important:: - See :ref:`catalog table ` for more catalogs with different label types. + See :ref:`catalog table ` for more catalogs with different + label types. Step 2: Write the Dataloader ============================ -A :ref:`reference/glossary:dataloader` is needed to organize the dataset into -a :class:`~tensorbay.dataset.dataset.Dataset` instance. +A :ref:`reference/glossary:dataloader` is needed to organize the dataset into a :class:`~tensorbay. +dataset.dataset.Dataset` instance. .. literalinclude:: ../../../../tensorbay/opendataset/NeolixOD/loader.py :language: python - :name: neolixod-dataloader + :name: NeolixOD-dataloader :linenos: -See :ref:`Box3D annotation ` for more details. - +See :ref:`Box3D annotation ` for more +details. There are already a number of dataloaders in TensorBay SDK provided by the community. Thus, instead of writing, importing an available dataloader is also feasible. 
@@ -94,11 +102,13 @@ Thus, instead of writing, importing an available dataloader is also feasible. .. note:: - Note that catalogs are automatically loaded in available dataloaders, users do not have to write them again. + Note that catalogs are automatically loaded in available dataloaders, users do not have to write + them again. .. important:: - See :ref:`dataloader table ` for dataloaders with different label types. + See :ref:`dataloader table ` for dataloaders with different label + types. ******************* Visualize Dataset @@ -112,13 +122,17 @@ Please see :ref:`features/visualization:Visualization` for more details. Upload Dataset **************** -The organized "Neolix OD" dataset can be uploaded to tensorBay for sharing, reuse, etc. +The organized "Neolix OD" dataset can be uploaded to TensorBay for sharing, reuse, etc. .. literalinclude:: ../../../../docs/code/NeolixOD.py :language: python :start-after: """Upload Dataset""" :end-before: """""" +.. note:: + Set ``skip_uploaded_files=True`` to skip uploaded data. + The data will be skipped if its name and segment name are the same as remote data. + Similar with Git, the commit step after uploading can record changes to the dataset as a version. If needed, do the modifications and commit again. Please see :ref:`features/version_control/index:Version Control` for more details. @@ -134,39 +148,46 @@ Now "Neolix OD" dataset can be read from TensorBay. :start-after: """Read Dataset / get dataset""" :end-before: """""" -In :ref:`reference/dataset_structure:Dataset` "Neolix OD", there is only one -:ref:`segment `: ``default``. -Get a segment by passing the required segment name or the index. +Get the segment names by listing them all. + +.. literalinclude:: ../../../../docs/code/NeolixOD.py + :language: python + :start-after: """Read Dataset / list segment names""" + :end-before: """""" + +Get a segment by passing the required segment name. .. 
literalinclude:: ../../../../docs/code/NeolixOD.py :language: python :start-after: """Read Dataset / get segment""" :end-before: """""" -In the default :ref:`reference/dataset_structure:Segment`, -there is a sequence of :ref:`reference/dataset_structure:Data`, -which can be obtained by index. +In the :ref:`reference/dataset_structure:segment`, there is a sequence of +:ref:`reference/dataset_structure:data`, which can be obtained by index. .. literalinclude:: ../../../../docs/code/NeolixOD.py :language: python :start-after: """Read Dataset / get data""" :end-before: """""" -In each :ref:`reference/dataset_structure:Data`, +In each :ref:`reference/dataset_structure:data`, there is a sequence of :ref:`reference/label_format/Box3D:Box3D` annotations, +which can be obtained by index. .. literalinclude:: ../../../../docs/code/NeolixOD.py :language: python :start-after: """Read Dataset / get label""" :end-before: """""" -There is only one label type in "Neolix OD" dataset, which is ``box3d``. -The information stored in :ref:`reference/label_format/CommonLabelProperties:Category` is -one of the category names in "categories" list of :ref:`catalog.json `. -The information stored in :ref:`reference/label_format/CommonLabelProperties:Attributes` -is one of the attributes in "attributes" list of :ref:`catalog.json `. +There is only one label type in "Neolix OD" dataset, which is ``Box3D``. +The information stored in +:ref:`reference/label_format/CommonLabelProperties:category` is one of the names in "categories" +list of :ref:`catalog.json `. The information stored in +:ref:`reference/label_format/CommonLabelProperties:attributes` is one or several of +the attributes in "attributes" list of :ref:`catalog.json `. See :ref:`reference/label_format/Box3D:Box3D` label format for more details. 
+ **************** Delete Dataset **************** diff --git a/docs/source/quick_start/examples/Newsgroups20.rst b/docs/source/quick_start/examples/Newsgroups20.rst index 02045b430..0c33baa31 100644 --- a/docs/source/quick_start/examples/Newsgroups20.rst +++ b/docs/source/quick_start/examples/Newsgroups20.rst @@ -1,11 +1,12 @@ -############### + +################### 20 Newsgroups -############### +################### -This topic describes how to manage the `20 Newsgroups dataset`_, which is a dataset -with :ref:`reference/label_format/Classification:Classification` label type. +This topic describes how to manage the `20 Newsgroups Dataset `_, which is a dataset with +:ref:`reference/label_format/Classification:Classification` label -.. _20 Newsgroups dataset: https://gas.graviti.cn/dataset/data-decorators/Newsgroups20 ***************************** Authorize a Client Instance @@ -21,7 +22,7 @@ An :ref:`reference/glossary:accesskey` is needed to authenticate identity when u **************** Create Dataset **************** - + .. literalinclude:: ../../../../docs/code/Newsgroups20.py :language: python :start-after: """Create Dataset""" @@ -31,61 +32,55 @@ An :ref:`reference/glossary:accesskey` is needed to authenticate identity when u Organize Dataset ****************** -Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "20 Newsgroups" dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. +Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "20 Newsgroups" +dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. In this example, they are stored in the same directory like:: 20 Newsgroups/ catalog.json dataloader.py - -It takes the following steps to organize the "20 Newsgroups" dataset by -the :class:`~tensorbay.dataset.dataset.Dataset` instance. 
- Step 1: Write the Catalog ========================= -A :ref:`Catalog ` contains all label information of one dataset, -which is typically stored in a json file like ``catalog.json``. +A :ref:`reference/dataset_structure:catalog` contains all label information of one dataset, which +is typically stored in a json file like ``catalog.json``. .. literalinclude:: ../../../../tensorbay/opendataset/Newsgroups20/catalog.json - :language: json - :name: Newsgroups20-catalog - :linenos: + :language: json + :name: Newsgroups20-catalog + :linenos: + + + +The only annotation type for "20 Newsgroups" is +:ref:`reference/label_format/Classification:Classification`, and there are 20 +:ref:`reference/label_format/CommonLabelProperties:category` types -The only annotation type for "20 Newsgroups" is :ref:`reference/label_format/Classification:Classification`, -and there are 20 :ref:`reference/label_format/CommonLabelProperties:Category` types. .. note:: - * The :ref:`categories` in - :ref:`reference/dataset_structure:Dataset` "20 Newsgroups" have parent-child relationship, - and it use "." to sparate different levels. - * By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase.load_catalog` supports loading the catalog into dataset. + By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase. + load_catalog` supports loading the catalog into dataset. .. important:: - See :ref:`catalog table ` for more catalogs with different label types. + See :ref:`catalog table ` for more catalogs with different + label types. Step 2: Write the Dataloader ============================ -A :ref:`reference/glossary:Dataloader` is neeeded to organize the dataset into a -:class:`~tensorbay.dataset.dataset.Dataset` instance. +A :ref:`reference/glossary:dataloader` is needed to organize the dataset into a :class:`~tensorbay. +dataset.dataset.Dataset` instance. .. 
literalinclude:: ../../../../tensorbay/opendataset/Newsgroups20/loader.py :language: python :name: Newsgroups20-dataloader :linenos: -See :ref:`Classification annotation ` for more details. - -.. note:: - - The data in "20 Newsgroups" do not have extensions - so that a "txt" extension is added to the remote path of each data file - to ensure the loaded dataset could function well on TensorBay. - +See :ref:`Classification annotation ` for more +details. There are already a number of dataloaders in TensorBay SDK provided by the community. Thus, instead of writing, importing an available dataloader is also feasible. @@ -97,11 +92,13 @@ Thus, instead of writing, importing an available dataloader is also feasible. .. note:: - Note that catalogs are automatically loaded in available dataloaders, users do not have to write them again. + Note that catalogs are automatically loaded in available dataloaders, users do not have to write + them again. .. important:: - See :ref:`dataloader table ` for dataloaders with different label types. + See :ref:`dataloader table ` for dataloaders with different label + types. ******************* Visualize Dataset @@ -122,6 +119,10 @@ The organized "20 Newsgroups" dataset can be uploaded to TensorBay for sharing, :start-after: """Upload Dataset""" :end-before: """""" +.. note:: + Set ``skip_uploaded_files=True`` to skip uploaded data. + The data will be skipped if its name and segment name are the same as remote data. + Similar with Git, the commit step after uploading can record changes to the dataset as a version. If needed, do the modifications and commit again. Please see :ref:`features/version_control/index:Version Control` for more details. @@ -137,9 +138,6 @@ Now "20 Newsgroups" dataset can be read from TensorBay. 
:start-after: """Read Dataset / get dataset""" :end-before: """""" -In :ref:`reference/dataset_structure:Dataset` "20 Newsgroups", there are four -:ref:`Segments `: ``20news-18828``, -``20news-bydate-test`` and ``20news-bydate-train``, ``20_newsgroups``. Get the segment names by listing them all. .. literalinclude:: ../../../../docs/code/Newsgroups20.py @@ -154,15 +152,15 @@ Get a segment by passing the required segment name. :start-after: """Read Dataset / get segment""" :end-before: """""" -In the 20news-18828 :ref:`reference/dataset_structure:Segment`, there is a sequence of :ref:`reference/dataset_structure:Data`, -which can be obtained by index. +In the :ref:`reference/dataset_structure:segment`, there is a sequence of +:ref:`reference/dataset_structure:data`, which can be obtained by index. .. literalinclude:: ../../../../docs/code/Newsgroups20.py :language: python :start-after: """Read Dataset / get data""" :end-before: """""" -In each :ref:`reference/dataset_structure:Data`, +In each :ref:`reference/dataset_structure:data`, there is a sequence of :ref:`reference/label_format/Classification:Classification` annotations, which can be obtained by index. @@ -172,11 +170,14 @@ which can be obtained by index. :end-before: """""" There is only one label type in "20 Newsgroups" dataset, which is ``Classification``. -The information stored in :ref:`reference/label_format/CommonLabelProperties:Category` is -one of the category names in "categories" list of :ref:`catalog.json `. -See :ref:`this page ` for more details about the -structure of Classification. - +The information stored in +:ref:`reference/label_format/CommonLabelProperties:category` is one of the names in "categories" +list of :ref:`catalog.json `. The information stored in +:ref:`reference/label_format/CommonLabelProperties:attributes` is one or several of +the attributes in "attributes" list of :ref:`catalog.json `. 
+See :ref:`reference/label_format/Classification:Classification` label format for more details. + + **************** Delete Dataset **************** diff --git a/docs/source/quick_start/examples/THCHS30.rst b/docs/source/quick_start/examples/THCHS30.rst index 66020089a..0be63b145 100644 --- a/docs/source/quick_start/examples/THCHS30.rst +++ b/docs/source/quick_start/examples/THCHS30.rst @@ -1,11 +1,12 @@ -########### + +################### THCHS-30 -########### +################### -This topic describes how to manage the `THCHS-30 Dataset`_, -which is a dataset with :ref:`reference/label_format/Sentence:Sentence` label +This topic describes how to manage the `THCHS-30 Dataset `_, which is a dataset with +:ref:`reference/label_format/Sentence:Sentence` label -.. _THCHS-30 Dataset: https://www.graviti.com/open-datasets/data-decorators/THCHS30 ***************************** Authorize a Client Instance @@ -28,18 +29,33 @@ An :ref:`reference/glossary:accesskey` is needed to authenticate identity when u :end-before: """""" ****************** -Organize Dataset + Organize Dataset ****************** -It takes the following steps to organize the “THCHS-30” dataset by the :class:`~tensorbay.dataset.dataset.Dataset` instance. +Normally, ``dataloader.py`` and ``catalog.json`` are required to organize the "THCHS-30" +dataset into the :class:`~tensorbay.dataset.dataset.Dataset` instance. +In this example, they are stored in the same directory like:: + + THCHS-30/ + catalog.json + dataloader.py Step 1: Write the Catalog ========================= -A :ref:`Catalog ` contains all label information of one -dataset, which is typically stored in a json file. However the catalog of THCHS-30 is too -large, instead of reading it from json file, we read it by mapping from subcatalog that is -loaded by the raw file. Check the :ref:`dataloader ` below for more details. 
+A :ref:`reference/dataset_structure:catalog` contains all label information of one dataset, which +is typically stored in a json file like ``catalog.json``. +However the catalog of THCHS-30 is too large, instead of +reading it from json file, we read it by mapping from subcatalog that is loaded by +the raw file. Check the :ref:`dataloader ` below for more details. + + + + +.. note:: + + By passing the path of the ``catalog.json``, :func:`~tensorbay.dataset.dataset.DatasetBase. + load_catalog` supports loading the catalog into dataset. .. important:: @@ -49,19 +65,19 @@ loaded by the raw file. Check the :ref:`dataloader ` below f Step 2: Write the Dataloader ============================ -A :ref:`dataloader ` is needed to organize the dataset -into a :class:`~tensorbay.dataset.dataset.Dataset` instance. +A :ref:`reference/glossary:dataloader` is needed to organize the dataset into a :class:`~tensorbay. +dataset.dataset.Dataset` instance. .. literalinclude:: ../../../../tensorbay/opendataset/THCHS30/loader.py :language: python :name: THCHS30-dataloader :linenos: -See :ref:`Sentence annotation ` for more details. - +See :ref:`Sentence annotation ` for more +details. There are already a number of dataloaders in TensorBay SDK provided by the community. -Thus, instead of writing, importing an available dataloadert is also feasible. +Thus, instead of writing, importing an available dataloader is also feasible. .. literalinclude:: ../../../../docs/code/THCHS30.py :language: python @@ -70,11 +86,13 @@ Thus, instead of writing, importing an available dataloadert is also feasible. .. note:: - Note that catalogs are automatically loaded in available dataloaders, users do not have to write them again. + Note that catalogs are automatically loaded in available dataloaders, users do not have to write + them again. .. important:: - See :ref:`dataloader table ` for dataloaders with different label types. + See :ref:`dataloader table ` for dataloaders with different label + types. 
 ******************* Visualize Dataset @@ -85,7 +103,7 @@ This step can help users to check whether the dataset is correctly organized. Please see :ref:`features/visualization:Visualization` for more details. **************** -Upload Dataset + Upload Dataset **************** The organized "THCHS-30" dataset can be uploaded to TensorBay for sharing, reuse, etc. @@ -95,12 +113,16 @@ The organized "THCHS-30" dataset can be uploaded to TensorBay for sharing, reuse :start-after: """Upload Dataset""" :end-before: """""" +.. note:: + Set ``skip_uploaded_files=True`` to skip uploaded data. + The data will be skipped if its name and segment name are the same as remote data. + Similar with Git, the commit step after uploading can record changes to the dataset as a version. If needed, do the modifications and commit again. Please see :ref:`features/version_control/index:Version Control` for more details. ************** -Read Dataset + Read Dataset ************** Now "THCHS-30" dataset can be read from TensorBay. @@ -110,9 +132,6 @@ Now "THCHS-30" dataset can be read from TensorBay. :start-after: """Read Dataset / get dataset""" :end-before: """""" -In :ref:`reference/dataset_structure:Dataset` "THCHS-30", there are three -:ref:`Segments `: -``dev``, ``train`` and ``test``. Get the segment names by listing them all. .. literalinclude:: ../../../../docs/code/THCHS30.py @@ -127,16 +146,15 @@ Get a segment by passing the required segment name. :start-after: """Read Dataset / get segment""" :end-before: """""" -In the dev :ref:`reference/dataset_structure:Segment`, -there is a sequence of :ref:`reference/dataset_structure:Data`, -which can be obtained by index. +In the :ref:`reference/dataset_structure:segment`, there is a sequence of +:ref:`reference/dataset_structure:data`, which can be obtained by index. .. 
literalinclude:: ../../../../docs/code/THCHS30.py :language: python :start-after: """Read Dataset / get data""" :end-before: """""" -In each :ref:`reference/dataset_structure:Data`, +In each :ref:`reference/dataset_structure:data`, there is a sequence of :ref:`reference/label_format/Sentence:Sentence` annotations, which can be obtained by index. @@ -145,12 +163,14 @@ which can be obtained by index. :start-after: """Read Dataset / get label""" :end-before: """""" -There is only one label type in "THCHS-30" dataset, which is ``Sentence``. It contains -``sentence``, ``spell`` and ``phone`` information. See :ref:`Sentence ` -label format for more details. +There is only one label type in "THCHS-30" dataset, which is ``Sentence``. +It contains ``sentence``, ``spell`` and ``phone`` information. +See :ref:`Sentence ` label format for +more details. + **************** -Delete Dataset + Delete Dataset **************** .. literalinclude:: ../../../../docs/code/THCHS30.py