
Tissue Region masking in scPortrait #50

Open
sophiamaedler opened this issue Aug 17, 2024 · 5 comments · May be fixed by #104

@sophiamaedler
Collaborator

Aim: when working with tissue data, large parts of an acquired image often constitute background that is not relevant for later processing. These regions should be masked from the input to limit the amount of data that needs to be processed.

Example: m3B dataset from publicly available scDVP manuscript
[Screenshot: example tissue image from the m3B dataset]

Required implemented methods:

  1. A streamlined workflow where the user can interactively annotate a region in the napari-spatialdata viewer, which is saved under a defined label (maybe "ImageArea"?). This region is then applied as a mask to the input image: all pixel values outside of this region are set to 0 and the image area is also compressed to the minimal bounding box surrounding all non-zero values.
  • We definitely want to support a workflow where a full non-cropped and non-masked image can be loaded into scPortrait and subsequently interactively viewed and a masking region defined. This means that this process would have to destructively overwrite the already loaded input image. I think a destructive process here would be most suitable to limit data duplication, but am open to a discussion.
  2. Alternatively, this region can be loaded from a saved file (not sure what formats we should support? pickle?) for the same downstream processing.
  3. A loaded mask applied to an existing segmentation mask (this should not be destructive).
  4. A loaded mask applied to an existing single-cell HDF5 dataset to generate a new one containing only cells from the selected area (this should not be destructive).
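The masking-plus-crop step from point 1 could be sketched roughly like this. This is a minimal sketch, not existing scPortrait API: the function name and the assumption that the image is a (C, H, W) array with a boolean (H, W) region mask are illustrative.

```python
import numpy as np

def apply_region_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out pixels outside `mask` and crop to the minimal bounding box.

    `image` is (C, H, W); `mask` is a boolean (H, W) array, e.g. rasterized
    from a napari-spatialdata shape annotation.
    """
    # Set everything outside the annotated region to 0
    masked = np.where(mask[np.newaxis, ...], image, 0)
    # Minimal bounding box of the non-zero (masked-in) area
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    return masked[:, rmin:rmax + 1, cmin:cmax + 1]
```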
@LLehner
Collaborator

LLehner commented Aug 27, 2024

the image area is also compressed to the minimal bounding box surrounding all non-zero values.

and

This means that this process would have to destructively overwrite the already loaded input image.

Would this mean that the image should be reduced to the unmasked area, i.e. all zero-valued pixels are completely removed and the original image gets cropped? Or should the image keep its original size, i.e. keep the zero-valued pixels?

Another question: depending on the image it might be easier to annotate a region that gets set to 0 instead of annotating the region which should be kept. So should there be an option to mask either the region one wants to keep or the region one wants to remove?

@sophiamaedler
Collaborator Author

sophiamaedler commented Aug 27, 2024

Would this mean that the image should be reduced to the unmasked area, i.e. all zero-valued pixels are completely removed and the original image gets cropped? Or should the image keep its original size, i.e. keep the zero-valued pixels?

I think it would be better if the image changes its original size, because cropping reduces the resources required for all downstream operations. There might be some disadvantages to reducing the size, but at the moment I can't think of any.
Maybe while implementing this we could define a boolean parameter like crop_image defaulting to True, so that if we encounter scenarios where cropping is not ideal at some point in the future, we can easily change this behaviour without refactoring the entire code.

Another question: depending on the image it might be easier to annotate a region that gets set to 0 instead of annotating the region which should be kept. So should there be an option to mask either the region one wants to keep or the region one wants to remove?

That makes sense! Let's do it :)
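The keep/remove toggle discussed above could be a single boolean flag on the masking function, which then composes with the crop_image option. A minimal sketch (function and parameter names are illustrative, not existing scPortrait API):

```python
import numpy as np

def select_region(image: np.ndarray, region: np.ndarray, keep: bool = True) -> np.ndarray:
    """Zero out pixels of a (C, H, W) image based on an annotated (H, W) region.

    keep=True retains only the region; keep=False removes it instead,
    which is handy when the background is easier to annotate.
    """
    mask = region if keep else ~region
    return np.where(mask[np.newaxis, ...], image, 0)
```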

@sophiamaedler
Collaborator Author

sophiamaedler commented Aug 27, 2024

We could also implement a method that automatically detects tissue using a thresholding + watershed approach. Could be a nice additional feature.

I was working with a CODEX dataset today and quickly implemented an automatic detection method. Not ideal yet, I'm sure, but maybe a starting point for this.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from skimage.filters import gaussian, threshold_otsu
from skimage.measure import label, regionprops
from skimage.morphology import dilation
from skimage.transform import resize
from tifffile import imwrite
# percentile_normalization is assumed to live alongside downsample_img_padding
from scportrait.processing.preprocessing import downsample_img_padding, percentile_normalization

# Downsample the first channel and normalize to a percentile range
downsampled_images = downsample_img_padding(image[0:1], 25)
downsampled_images = percentile_normalization(downsampled_images, 0.05, 0.4)

plt.figure()
plt.imshow(downsampled_images[0], cmap="gray")

# Apply Gaussian blur to the downsampled images
blurred_images = gaussian(downsampled_images, sigma=2)

plt.figure()
plt.imshow(blurred_images[0], cmap="gray")

# Perform thresholding with global Otsu on the blurred images
threshold = threshold_otsu(blurred_images)
binary_image = blurred_images > threshold

plt.figure()
plt.imshow(binary_image[0], cmap="gray")

# Perform object detection via connected-component labeling
labeled_image = label(binary_image)
regions = regionprops(labeled_image[0])

# Plot the labeled image
fig, ax = plt.subplots(figsize=(8, 8))
ax.imshow(labeled_image[0], cmap="nipy_spectral")

# Iterate over the detected objects
for region in regions:
    # Extract properties of each object
    minr, minc, maxr, maxc = region.bbox
    width = maxc - minc
    height = maxr - minr

    # Draw bounding box around the object
    rect = patches.Rectangle((minc, minr), width, height, edgecolor="red", facecolor="none")
    ax.add_patch(rect)

# Set axis labels and title
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_title("Object Detection")

# Dilate the labeled image to merge nearby fragments
dilated_image = dilation(labeled_image)

ids = set(np.unique(dilated_image)) - {0}
n_tissue_regions = len(ids)

for id in ids:
    # Upscale each detected region back to the full-resolution image
    region = dilated_image == id
    rescaled_region = resize(region, image.shape, anti_aliasing=False, preserve_range=True)
    rescaled_region = rescaled_region != 0
    rescaled_region = dilation(rescaled_region)

    regions = regionprops(rescaled_region[0].astype(np.uint8))

    if len(regions) > 1:
        raise ValueError("Multiple regions detected in the tissue mask.")

    # Extract the bounding box of the single region
    minr, minc, maxr, maxc = regions[0].bbox
    width = maxc - minc
    height = maxr - minr

    # Mask the full-resolution image and crop to the bounding box
    _subset_image = np.where(rescaled_region, image, 0)
    _subset_image = _subset_image[:, minr:maxr, minc:maxc]

    imwrite(f"/Users/sophia/Documents/Code/scPortrait/data/CODEX/doi_10_5061_dryad_brv15dvj1__v20240708/codex_tissue_areas/tissue_region_{id}.tif", _subset_image)

    plt.figure()
    plt.imshow(_subset_image[0], cmap="gray")

plt.figure()
plt.imshow(image[0], cmap="gray")

[Screenshots: intermediate and final outputs of the detection code on the CODEX dataset]

@LLehner
Collaborator

LLehner commented Oct 18, 2024

Hi @sophiamaedler,

before creating a PR, just a few more questions:

  1. Do we assume that the images we want to mask are in the default coordinate system, or did they already undergo a transformation (e.g. for alignment)? If the image was transformed beforehand, this requires quite a different approach compared to image and shape residing in a "default" coordinate system (in spatialdata usually called "global").
  2. For automatic tissue detection, do you have specific shapes to detect in mind?
  3. Object detection can vary quite a bit and there are many algorithms available, so perhaps it would be interesting to give the user the option to plug in models. What do you think?
  4. Should I already include tests?

@sophiamaedler
Collaborator Author

  1. Do we assume that the images we want to mask are in the default coordinate system, or did they already undergo a transformation (e.g. for alignment)? If the image was transformed beforehand, this requires quite a different approach compared to image and shape residing in a "default" coordinate system (in spatialdata usually called "global").

I would assume they are in the default coordinate system. I guess there might be some cases where they have been transformed but certainly not the main use case.

  2. For automatic tissue detection, do you have specific shapes to detect in mind?

I think the most common application would be circles, since TMA cores are circular, but I think a general thresholding approach that can work with many different shapes would be best.

  3. Object detection can vary quite a bit and there are many algorithms available, so perhaps it would be interesting to give the user the option to plug in models. What do you think?

Sounds like a good idea. I would implement one base method that works e.g. on the provided example, so that we have something to test and work with, but make the code modular in such a way that users can plug in other models as needed.

  4. Should I already include tests?

Would be great! But this has the lowest priority at the moment.
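The pluggable-detector idea discussed above could be implemented as a simple registry of callables: scPortrait ships one default detector, and users register their own under a name. A minimal sketch under that assumption (all names illustrative, not existing scPortrait API; the built-in detector is a trivial mean-threshold stand-in for the Otsu-based approach shown earlier):

```python
from typing import Callable

import numpy as np

# A detector is any callable mapping a (C, H, W) image to a boolean
# (H, W) foreground mask; users can register additional ones by name.
TissueDetector = Callable[[np.ndarray], np.ndarray]

DETECTORS: dict[str, TissueDetector] = {}

def register_detector(name: str):
    """Decorator that adds a detector to the registry under `name`."""
    def wrap(fn: TissueDetector) -> TissueDetector:
        DETECTORS[name] = fn
        return fn
    return wrap

@register_detector("mean_threshold")
def mean_threshold(image: np.ndarray) -> np.ndarray:
    # Trivial baseline: foreground = pixels brighter than the channel mean
    channel = image[0]
    return channel > channel.mean()

def detect_tissue(image: np.ndarray, method: str = "mean_threshold") -> np.ndarray:
    """Dispatch to the registered detector selected by `method`."""
    return DETECTORS[method](image)
```

A user-supplied model (e.g. a learned segmenter) would plug in via the same decorator without touching the dispatch code.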

@LLehner LLehner linked a pull request Nov 11, 2024 that will close this issue