Updated deeplake.random.seed documentation #2593

Merged
merged 10 commits into from
Sep 15, 2023
35 changes: 35 additions & 0 deletions deeplake/core/seed.py
@@ -4,13 +4,47 @@

class DeeplakeRandom(object):
    def __new__(cls):
        """Returns the singleton :class:`~deeplake.core.seed.DeeplakeRandom` instance."""
        if not hasattr(cls, "instance"):
            cls.instance = super(DeeplakeRandom, cls).__new__(cls)
            cls.instance.internal_seed = None
            cls.instance.indra_api = None
        return cls.instance

    def seed(self, seed: Optional[int] = None):
        """Sets the random seed for the Deep Lake engines.

        Args:
            seed (int, optional): Integer seed for initializing the computational engines, used to make random operations reproducible.
                Set to ``None`` to reset the seed. Defaults to ``None``.

        Raises:
            TypeError: If the provided value type is not supported.

        Background
        ----------

        Specify a seed to train models and run randomized Deep Lake operations reproducibly. The affected features are:

            - Dataloader shuffling
            - Sampling and random operations in Tensor Query Language (TQL)
            - :meth:`Dataset.random_split <deeplake.core.dataset.Dataset.random_split>`


        The random seed can be specified using ``deeplake.random.seed``:

            >>> import deeplake
            >>> deeplake.random.seed(0)

        Random number generators in other libraries
        -------------------------------------------

        The Deep Lake random seed does not affect random number generators in other libraries such as ``numpy``.

        Seeds set in other libraries still affect code where Deep Lake calls those libraries, but they do not affect
        the features listed above, which use Deep Lake's internal seed.

        """
        if seed is None or isinstance(seed, int):
            self.internal_seed = seed
            if self.indra_api is None:  # type: ignore
@@ -27,4 +61,5 @@ def seed(self, seed: Optional[int] = None):
)

    def get_seed(self) -> Optional[int]:
        """Returns the seed currently set in Deep Lake to control random operations."""
        return self.internal_seed
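
The singleton and seed-validation behavior in this file can be illustrated with a small standalone sketch. It does not import Deep Lake: `SeedHolder` is a hypothetical stand-in for `DeeplakeRandom`, the `indra_api` handling is omitted, and the error message is an assumption, since the diff elides the branch that raises.

```python
from typing import Optional


class SeedHolder:
    """Hypothetical stand-in for DeeplakeRandom; not the library class."""

    def __new__(cls):
        # Create the shared instance on first use, then keep returning it.
        if not hasattr(cls, "instance"):
            cls.instance = super().__new__(cls)
            cls.instance.internal_seed = None
        return cls.instance

    def seed(self, seed: Optional[int] = None) -> None:
        # Accept an int, or None to reset; reject every other type.
        if seed is None or isinstance(seed, int):
            self.internal_seed = seed
        else:
            # Assumed message; the real wording is not shown in the diff.
            raise TypeError(f"seed must be an int or None, got {type(seed).__name__}")

    def get_seed(self) -> Optional[int]:
        return self.internal_seed


first = SeedHolder()
second = SeedHolder()

first.seed(0)
same_object = first is second        # both names refer to one instance
seen_by_second = second.get_seed()   # the seed set through `first`

second.seed(None)                    # reset through the other name
after_reset = first.get_seed()

try:
    first.seed("0")                  # a string is not a supported type
    rejected = False
except TypeError:
    rejected = True
```

Because every call to the constructor returns the same object, a seed set anywhere in a program is visible everywhere else, which is what lets a single `deeplake.random.seed(0)` call act as a global setting.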
1 change: 1 addition & 0 deletions docs/source/Datasets.rst
@@ -69,6 +69,7 @@ Dataset Operations
Dataset.flush
Dataset.clear_cache
Dataset.size_approx
Dataset.random_split

Dataset Visualization
~~~~~~~~~~~~~~~~~~~~~
7 changes: 7 additions & 0 deletions docs/source/deeplake.random.rst
@@ -0,0 +1,7 @@
deeplake.random.seed
====================

.. currentmodule:: deeplake.core.seed

.. autoclass:: DeeplakeRandom
    :members:
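
The two claims documented on this page (a seed makes random operations such as a random split reproducible, and an internal seed is independent of other libraries' generators) can be sketched with the Python standard library alone. This is an analogy under stated assumptions, not how Deep Lake implements seeding: `seeded_split` is a hypothetical helper, and the global `random` module stands in for "another library".

```python
import random
from typing import List, Sequence, Tuple


def seeded_split(items: Sequence[int], fraction: float, seed: int) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: shuffle with a private generator, then cut in two."""
    rng = random.Random(seed)  # private generator; global state is untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


data = list(range(10))

# Same seed, same split.
split_a = seeded_split(data, 0.8, seed=0)
split_b = seeded_split(data, 0.8, seed=0)

# The global generator is unaffected by calls that use the private one.
random.seed(123)
expected_next = random.random()
random.seed(123)
seeded_split(data, 0.8, seed=0)
actual_next = random.random()
```

Keeping the generator private to the helper is the design choice that gives both properties at once: the split depends only on the seed passed in, and seeding or drawing from the global generator elsewhere cannot change it.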
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -58,6 +58,7 @@ Deep Lake is an open-source database for AI.
deeplake.client.log
deeplake.core.transform
deeplake.core.vectorstore
deeplake.random


Indices and tables