You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
sdf = soma.DataFrame.create(...) and sdf = soma.DataFrame.open(...)
sdf.write(...)
The former two are Python class methods (in particular, non-instance methods): the thing before the . is the class name. The latter is a Python instance method: the thing before the . is the name of an instance of the class.
R6, by contrast, only has the latter and not the former.
The best option seems to be to use top-level function names like SOMADataFrameCreate and SOMADataFrameOpen, etc.. This (moreover) mimics Seurat's Load_10X (note the leading capital L).
We're in need of free-standing helpers that put a shim on the existing functionality and return such R6 objects by hiding some 'details' we need but don't want to expose. These can be called "free-standing helpers" or "non-instance factory methods" -- whatever we call them, say SOMADataFrameCreate and SOMADataFrameOpen etc., they will (a) do the creating or the opening, (b) invoke $new(), and (c) return that to the caller.
SOMAOpen should check an object's metadata and do return SOMATypeNameGoesHereOpen for the corresponding type -- this is [r] Have a general-purpose opener #1009
$new() should be informatively blocked as described above
Something that can be done over time but needn't be done straightaway:
Do more caching for objects opened in read mode -- profiling/benchmarking can help guide us here
Benefits will be less obvious for working on small local-disk data but will become more salient for working on larger datasets which are stored remotely
Implementation: this means a tiledb handle (for writes) or a SOMAReader instance (for array reads) needs to be stored within a given object
This is what obj$close() must close
In Python the semantics are that writes aren't finalized until obj$close() and that should apply here as well
Why:
API parity with tiledbsoma-py
Once we have things explicitly opened for read or explicitly opened for write at the externally visible tiledbsoma API level, then at the internal tiledb/libtiledbsoma level we won't have to do open/close on arrays/group so often -- this will help performance. Another benefit of this is that if an object is opened for read, then we can confidently cache things about it -- e.g. an array's metadata, or a group's names-to-uris mapping, etc
The text was updated successfully, but these errors were encountered:
johnkerl
changed the title
[r] Make create and open non-instance factory methods
[r] Make create and open into non-instance factory methods
Feb 27, 2023
Context: R/Python parity as tracked by #839.
In the current Python implementation we have:
sdf = soma.DataFrame.create(...)
andsdf = soma.DataFrame.open(...)
sdf.write(...)
The former two are Python class methods (in particular, non-instance methods): the thing before the
.
is the class name. The latter is a Python instance method: the thing before the.
is the name of an instance of the class.R6, by contrast, only has the latter and not the former.
The best option seems to be to use top-level function names like
SOMADataFrameCreate
andSOMADataFrameOpen
, etc.. This (moreover) mimics Seurat'sLoad_10X
(note the leading capitalL
).See also #1009 and #1010 for follow-on work.
Some more wording worked out between @eddelbuettel and @johnkerl :
SOMADataFrameCreate
andSOMADataFrameOpen
etc., they will (a) do the creating or the opening, (b) invoke$new()
, and (c) return that to the caller.$new()
can't be called directly by users, and if they do, give them an informative message telling them to use the create/open methods. Example here: https://github.com/single-cell-data/TileDB-SOMA/blob/1.0.0/apis/python/src/tiledbsoma/_tiledb_object.py#L90-L95SOMADataFrameCreate
,SOMADataFrameOpen
SOMASparseNDArrayCreate
,SOMASparseNDArrayOpen
SOMADenseNDArrayCreate
,SOMADenseNDArrayOpen
SOMACollectionCreate
,SOMACollectionOpen
SOMAMeasurementCreate
,SOMAMeasurementOpen
SOMAExperimentCreate
,SOMAExperimentOpen
SOMAOpen
should check an object's metadata and do returnSOMATypeNameGoesHereOpen
for the corresponding type -- this is [r] Have a general-purpose opener #1009$close()
needs to be exposed -- this is issue [r] Makeclose
methods public #1010 -- that work could be done on this issue if you prefer$new()
should be informatively blocked as described aboveSOMAReader
instance (for array reads) needs to be stored within a given objectobj$close()
must closeobj$close()
and that should apply here as wellThe text was updated successfully, but these errors were encountered: