tif inferred as pickle #423

robmarkcole · 2024-11-27T16:22:35Z

🐛 Bug

My dataset returns paths to tif files, but their data format is inferred as pickle:

# dataset yields paths 

test_dataset[0] == {'image_path': '/teamspace/studios/this_studio/dataset/test/T20MMT_20200505T142729_tile_17_11.tif',
 'mask_path': '/teamspace/studios/this_studio/dataset/test/T20MMT_20200505T142729_tile_17_11_mask.tif',
 'image_id': 'T20MMT_20200505T142729_tile_17_11'}

Logs

Rank 1 inferred the following `['str', 'pickle', 'pickle']` data format.

But I expected:

Rank 1 inferred the following `['str', 'tif', 'tif']` data format.

The script generates the same number of chunk files for both train and test, although train has significantly more tif files.

litdata==0.2.19

Reproducible example:

from litdata.streaming.serializers import _SERIALIZERS

image_path = '/teamspace/studios/this_studio/dataset/test/T10VDK_20200519T193911_tile_0_7.tif'

def evaluate_serializer(filename: str) -> str:
    """
    Evaluate which serializer would be used for a given filename.

    Args:
        filename: The name of the file to evaluate.

    Returns:
        The name of the selected serializer.
    """
    # Iterate through serializers in the order defined in _SERIALIZERS
    for serializer_name, serializer in _SERIALIZERS.items():
        if serializer.can_serialize(filename):
            return serializer_name

    # If no serializer can handle the file, raise an exception
    raise ValueError(f"No suitable serializer found for filename: {filename}")

assert evaluate_serializer(image_path) == 'pickle'
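
The lookup above returns the first serializer whose `can_serialize` accepts the value, so a catch-all entry at the end of the registry wins whenever every earlier entry declines. Below is a minimal, self-contained sketch of that behavior. It uses a toy registry rather than litdata's real `_SERIALIZERS`: the classes `ToyStr`, `ToyFile` and `ToyPickle`, and the rules they apply, are assumptions made for illustration and are not taken from the library.

```python
import os
import tempfile
from collections import OrderedDict


class ToyStr:
    """Hypothetical rule: accept strings that do not point at an existing file."""

    def can_serialize(self, item) -> bool:
        return isinstance(item, str) and not os.path.isfile(item)


class ToyFile:
    """Hypothetical rule: a deprecated file serializer that declines everything."""

    def can_serialize(self, item) -> bool:
        return False


class ToyPickle:
    """Hypothetical catch-all, registered last."""

    def can_serialize(self, item) -> bool:
        return True


# Order matters: the first serializer that accepts the item is selected.
TOY_SERIALIZERS = OrderedDict(str=ToyStr(), tif=ToyFile(), pickle=ToyPickle())


def first_match(item) -> str:
    """Return the name of the first toy serializer that accepts the item."""
    for name, serializer in TOY_SERIALIZERS.items():
        if serializer.can_serialize(item):
            return name
    raise ValueError(f"No suitable serializer found for: {item!r}")


def demo() -> list:
    """Infer formats for a sample holding an existing .tif path and a plain id."""
    with tempfile.NamedTemporaryFile(suffix=".tif") as handle:
        sample = {"image_path": handle.name, "image_id": "tile_0_7"}
        return [first_match(value) for value in sample.values()]
```

Under these assumed rules, a path to an existing file is claimed by neither the string entry nor the `tif` entry, so it falls through to the catch-all, while a plain identifier is still treated as a string.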


robmarkcole · 2024-11-27T17:30:37Z

OK, I see this comment in the code:

# FileSerializer will be removed in the future.

With this going away, it appears tif needs a new serializer? I prototyped a solution using tifffile, which supports multispectral data (pillow is limited).
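
For illustration only, here is a rough sketch of how a dedicated serializer could claim tif paths before a catch-all does. This is not the prototype mentioned above and does not follow litdata's serializer base class: `TifPathSerializer`, its method signatures and the `round_trip` helper are hypothetical names. It is kept to the standard library, so it stores raw file bytes and leaves decoding (for example with tifffile) out of scope.

```python
import os
import tempfile


class TifPathSerializer:
    """Hypothetical serializer that claims paths to existing .tif/.tiff files."""

    EXTENSIONS = (".tif", ".tiff")

    def can_serialize(self, item) -> bool:
        # Accept only strings that end in a tif extension and exist on disk.
        return (
            isinstance(item, str)
            and item.lower().endswith(self.EXTENSIONS)
            and os.path.isfile(item)
        )

    def serialize(self, path: str) -> bytes:
        # Store the raw file bytes unchanged.
        with open(path, "rb") as handle:
            return handle.read()

    def deserialize(self, data: bytes) -> bytes:
        # A fuller implementation would decode the bytes into an array here.
        return data


def round_trip(payload: bytes) -> bytes:
    """Write payload to a temporary .tif path, then serialize and deserialize it."""
    serializer = TifPathSerializer()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tile.tif")
        with open(path, "wb") as handle:
            handle.write(payload)
        if not serializer.can_serialize(path):
            raise ValueError(f"Path not recognized as tif: {path}")
        return serializer.deserialize(serializer.serialize(path))
```

Placed ahead of a catch-all in a first-match registry, a predicate like `can_serialize` would let tif paths be reported as `tif` rather than `pickle`.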

robmarkcole added the bug and help wanted labels on Nov 27, 2024

robmarkcole mentioned this issue Nov 28, 2024

POC: add tiffile serializer #425

Merged

tchaton closed this as completed in #425 Nov 28, 2024
