Add docs for annotations #332

Merged
merged 2 commits into from
Dec 1, 2023
61 changes: 31 additions & 30 deletions docs/files/tutorial-create.py
@@ -30,37 +30,37 @@
reads.name = 'My first data'
reads.save()

# define the chosen descriptor schema
reads.descriptor_schema = 'reads'

# define the descriptor
reads.descriptor = {
    'description': 'Some free text...',
}

# Very important: save changes!
reads.save()

reads.sample.descriptor_schema = 'sample'

reads.sample.descriptor = {
    'general': {
        'description': 'This is a sample...',
        'species': 'Homo sapiens',
        'strain': 'F1 hybrid FVB/N x 129S6/SvEv',
        'cell_type': 'glioblastoma',
    },
    'experiment': {
        'assay_type': 'rna-seq',
        'molecule': 'total_rna',
    },
reads.sample.set_annotation("general.species", "Homo sapiens")

# Get the field by its group and name:
field = res.annotation_field.get(group__name="general", name="species")
# Same thing, but in shorter syntax
field = res.annotation_field.from_path("general.species")
# Examine some of the field attributes
field.name
field.group
field.description

res.annotation_field.all()
# You can also filter the results
res.annotation_field.filter(group__name="general")

# Get an AnnotationValue
ann_value = reads.sample.get_annotation("general.species")
# The actual value
ann_value.value
# The corresponding field
ann_value.field
# The corresponding sample
ann_value.sample

reads.sample.annotations
reads.sample.get_annotations()
annotations = {
    "general.species": "Homo sapiens", "general.description": "Description"
}

reads.sample.save()




reads.sample.set_annotations(annotations)
reads.sample.set_annotation("general.description", None, force=True)



@@ -115,3 +115,4 @@

# Access process' execution errors
alignment.process_error

6 changes: 3 additions & 3 deletions docs/metadata.rst
@@ -4,9 +4,9 @@
Metadata
========

Samples are normally annotated with the use of ``descriptor`` and
``descriptor_schema``. However, in some cases the fields defined in
``DescriptorSchema`` do not suffice and it comes in handy to upload sample
Samples are normally annotated with the use of ``AnnotationField``\ s and
``AnnotationValue``\ s. However, in some cases the available
``AnnotationField``\ s do not suffice and it comes in handy to upload sample
annotations in a table where each row holds information about some
sample in the collection. In general, there can be multiple rows referring
to the same sample in the collection (for example one sample received
104 changes: 71 additions & 33 deletions docs/tutorial-create.rst
@@ -74,50 +74,88 @@ applied on the server.
modify ``created`` or ``contributor`` fields. You will get an error if you
try.

Annotate Samples and Data
=========================

The obvious next thing to do after uploading some data is to annotate it.
Annotations are encoded as bundles of descriptors, where each descriptor
references a value in a descriptor schema (*i.e.* a template). Annotations for
data objects, samples, and collections each follow a different descriptor
format. For example, a reads data object can be annotated with the 'reads'
descriptor schema, while a sample can be annotated by the 'sample' annotation
schema. Each data object that is associated with the sample is also connected
to the sample's annotation, so that the annotation for a sample (or collection)
represents all Data objects attached to it. `Descriptor schemas`_ are described
in detail (with `accompanying examples`_) in the
`Resolwe Bioinformatics documentation`_.
Annotate Samples
================

.. _Resolwe Bioinformatics documentation: http://resolwe-bio.readthedocs.io
.. _Descriptor schemas: https://resolwe-bio.readthedocs.io/en/latest/descriptor.html
.. _accompanying examples: https://github.com/genialis/resolwe-bio/tree/master/resolwe_bio/descriptors
The next thing to do after uploading some data is to annotate the samples this
data belongs to. This is done by assigning a value to a predefined field on a
given sample. See the example below.

Here, we show how to annotate the ``reads`` data object by defining the
descriptor information that populates the annotation fields as defined in the
'reads' descriptor schema:
Each sample should be assigned a species. This is done by attaching the
``general.species`` field to a sample and assigning it a value, e.g.
``Homo sapiens``.

.. literalinclude:: files/tutorial-create.py
   :lines: 33-42
   :lines: 33


Annotation Fields
-----------------

We can annotate the sample object using a similar process with a 'sample'
descriptor schema:
You might be wondering why the example above requires the string
``general.species`` instead of, e.g., just ``species``. The answer lies in
``AnnotationField``\ s. These are predefined *objects* that are available to
annotate samples. Each of them has a name and belongs to a group, and it also
carries other properties. Let's examine some of those:

.. literalinclude:: files/tutorial-create.py
   :lines: 44-59
   :lines: 35-42

.. warning::

   Many descriptor schemas have required fields with a limited set of choices
   that may be applied as annotations. For example, the 'species' annotation
   in a sample descriptor must be selected from the list of options in the
   `sample descriptor schema`_, represented by its Latin name.
.. note::

   Each field is uniquely defined by the combination of ``name`` and ``group``.
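
To make the dotted path concrete, here is a minimal stand-alone sketch of how a
path such as ``general.species`` can be read as a group name followed by a
field name. The ``split_path`` helper is hypothetical and only illustrates the
naming convention; it is not how ReSDK implements ``from_path``, and it does
not contact a server.

```python
def split_path(path):
    """Split a dotted annotation path into (group, name).

    Hypothetical helper for illustration only; not part of ReSDK.
    """
    # Everything before the first dot is the group, the rest is the name.
    group, separator, name = path.partition(".")
    if not separator or not group or not name:
        raise ValueError(f"Expected '<group>.<name>', got {path!r}")
    return group, name


group, name = split_path("general.species")
print(group, name)
```

Because a field is identified by both parts, a bare ``species`` is ambiguous in
this model, which is why the sketch rejects it.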

.. _sample descriptor schema: https://github.com/genialis/resolwe-bio/blob/master/resolwe_bio/descriptors/sample.yml
If you wish to see which fields are available, use a query:

.. literalinclude:: files/tutorial-create.py
   :lines: 44-46


You may be wondering whether you can create your own fields or groups. The
answer is no. Experience has shown that keeping things organized requires a
selected set of predefined fields. If you feel that you absolutely need an
additional annotation field, let us know or use resources such as :ref:`metadata`.


Annotation Values
-----------------

As mentioned before, fields are only one part of the annotation. The other part
is the annotation values, stored as a standalone resource ``AnnotationValue``.
Each one connects a field and a sample with the actual value.

.. literalinclude:: files/tutorial-create.py
   :lines: 48-55

We can also define descriptors and descriptor schema directly when calling
``res.run`` function. This is described in the section about the ``run()``
method below.

As a shortcut, you can get all the ``AnnotationValue``\ s for a given sample with:

.. literalinclude:: files/tutorial-create.py
   :lines: 57


Helper methods
--------------

Sometimes it is convenient to represent the annotations as a dictionary,
where keys are field paths and values are annotation values. You can get all
the annotations for a given sample in this format by calling:

.. literalinclude:: files/tutorial-create.py
   :lines: 58
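
The relationship between values, fields, and the dictionary form can be
sketched without a server connection. The ``Field`` and ``Value`` classes and
the ``to_dict`` function below are hypothetical stand-ins written for this
illustration; they are not the ReSDK classes, and the real
``get_annotations()`` may differ in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Stand-in for an annotation field (hypothetical, not the ReSDK class)."""

    group: str
    name: str


@dataclass(frozen=True)
class Value:
    """Stand-in for an annotation value that links a field to a value."""

    field: Field
    value: object


def to_dict(values):
    """Return {"<group>.<name>": value} for an iterable of Value objects."""
    return {f"{v.field.group}.{v.field.name}": v.value for v in values}


values = [
    Value(Field("general", "species"), "Homo sapiens"),
    Value(Field("general", "description"), "Description"),
]
print(to_dict(values))
```

Keying the dictionary by the full ``group.name`` path keeps entries unique even
if two groups were to contain a field with the same name.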

Multiple annotations stored in a dictionary can be assigned to a sample with:

.. literalinclude:: files/tutorial-create.py
   :lines: 59-62

An annotation is deleted from the sample by setting its value to ``None`` when
calling the ``set_annotation`` or ``set_annotations`` helper methods. To skip
the confirmation prompt, set ``force=True``.

.. literalinclude:: files/tutorial-create.py
   :lines: 63
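
The "``None`` means delete" convention can be modelled on a plain dictionary.
This is only a sketch under that assumption: ``apply_annotations`` is a
hypothetical function, it works purely in memory, and it leaves out the
confirmation prompt and the server requests that the real helper methods
perform.

```python
def apply_annotations(current, updates):
    """Return a copy of ``current`` with ``updates`` applied.

    A value of None removes the key; any other value sets it.
    Hypothetical in-memory model, not the ReSDK implementation.
    """
    result = dict(current)
    for path, value in updates.items():
        if value is None:
            # In this sketch, deleting a missing annotation is a no-op.
            result.pop(path, None)
        else:
            result[path] = value
    return result


current = {
    "general.species": "Homo sapiens",
    "general.description": "Description",
}
updated = apply_annotations(current, {"general.description": None})
print(updated)
```

Returning a new dictionary instead of mutating ``current`` makes it easy to
compare the state before and after the update.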

Run analyses
============
2 changes: 1 addition & 1 deletion src/resdk/resources/sample.py
@@ -281,7 +281,7 @@ def set_annotation(
return None
field = self.resolwe.annotation_field.from_path(full_path)
annotation_value = self.resolwe.annotation_value.create(
sample=self, field=field, value=value
sample=self.id, field=field.id, value=value
)
return annotation_value
