This repository has been archived by the owner on Oct 4, 2020. It is now read-only.

Cannot open a CR2 file #11

Open
achennu opened this issue May 8, 2015 · 13 comments

@achennu

achennu commented May 8, 2015

I thought to try this library to read the raw data from a CR2 file. After a pip install rawphoto, I tried:

from rawphoto import cr2
cr2.Cr2(filename='RAW_CANON_5D_ARGB.CR2')

Then I get an error thrown:

In [24]: cr2.Cr2(filename='RAW_CANON_5D_ARGB.CR2')

KeyError Traceback (most recent call last)
in ()
----> 1 cr2.Cr2(filename='RAW_CANON_5D_ARGB.CR2')

/home/arjun/Software/pypack/rawphoto/rawphoto/cr2.pyc in __init__(self, blob, file, filename)
109 self.seek(next_ifd_offset)
110 self.ifds.append(Ifd(self.endianness, file=self.fhandle, tags=tags,
--> 111 subdirs=subdirs))
112 next_ifd_offset = self.ifds[len(self.ifds) - 1].next_ifd_offset
113

/home/arjun/Software/pypack/rawphoto/rawphoto/tiff.pyc in __init__(self, endianness, file, blob, offset, subdirs, tags, tag_types)
135 for i in range(num_entries):
136 e = IfdEntry(endianness, blob=buf[(12 * i):(12 * (i + 1))],
--> 137 tags=tags)
138 self.entries[e.tag_name] = e
139 if e.tag_id in subdirs:

/home/arjun/Software/pypack/rawphoto/rawphoto/tiff.pyc in __new__(cls, endianness, file, blob, offset, tags, tag_types)
87 else:
88 tag_name = tag_id
---> 89 tag_type = tag_types[tag_type_key]
90 if struct.calcsize(tag_type) > 4 or tag_type == 's':
91 # If the value is a pointer to something small:

KeyError: 64

Seems like the module does not work after an initial install now.

@SamWhited
Member

Thanks for the report. Do you happen to have a CR2 file that exhibits this behavior uploaded somewhere that I could test with? Also, what sort of camera is it? Thanks.

@achennu
Author

achennu commented May 8, 2015

I've tried other files and get the same error. The file under test here was http://www.rawsamples.ch/raws/canon/5d/RAW_CANON_5D_ARGB.CR2

@SamWhited
Member

I've downloaded the file and reproduced the issue; I need to work on a test suite that tests against real data from different cameras.

Fair warning: This library is still very much alpha quality, and the plan is to change the APIs again before the 1.0 release. It may be that it goes entirely towards being a libraw wrapper and drops the pure Python decoders, but that's still undecided as I haven't had a lot of time to put towards it lately.

@SamWhited
Member

This library has now been superseded by our new library, rawkit. It's still very early too (it's a wrapper for libraw), but further development will occur there. We'll keep rawphoto around as an educational tool (and I may still get around to fixing this), but for now please start migrating to rawkit. We're working hard on documentation and the like, so it should get to a more or less usable level fairly quickly.

@achennu
Author

achennu commented May 18, 2015

Good to hear that there is progress, one way or the other.

I'm a Python dev myself, so maybe I can contribute if there's scope for that?

achennu closed this as completed May 18, 2015
SamWhited reopened this May 18, 2015
@SamWhited
Member

@achennu Feel free; PRs are always welcome.

I'll leave this issue open, as we still think there's value in having a pure Python implementation of a basic raw parser (even if it's just for educational purposes), and I may get around to figuring out where the parser gets off track one of these days.

@achennu
Author

achennu commented May 18, 2015

@SamWhited I definitely agree on that. The ability to read out the image data & exif data should come under the 'parsing' task. Raw image processing is only further down the pipeline. I have a big bunch of raw files to process for data analysis, and all I'm trying to get at is the image data, which I would then work with in numpy. Currently I have to batch-convert the files to TIFF & then read them in... which is inelegant and cumbersome.

If there's something I can do to get this basic parsing working in rawphoto, I'd try to help.

@SamWhited
Member

@achennu You should be able to do all that with rawkit instead. Currently it's got complete libraw bindings which you can use, and nicer higher-level APIs just for saving the image. We'll be adding nice high-level bindings for other things as we go (eventually the higher-level bindings will catch up to what rawphoto offers), but in the meantime you can still use the lower-level ctypes bindings.

If you really want a pure python alternative and want to take a stab at this issue, feel free. It's working on some CR2 files, and not on others, so I suspect there's a field type with a length we don't recognize somewhere that's present in that particular CR2 so we're getting off by a few bytes when we attempt to read whatever that field is.
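
To make the suspected failure mode concrete, here is a minimal sketch of defensive IFD entry parsing. It is not rawphoto's actual code: `parse_ifd_entries` and `TIFF_TYPES` are hypothetical names, and the table only covers the twelve standard TIFF 6.0 field types. It assumes each directory entry is 12 bytes (tag, type, count, value or offset) and collects unrecognized type codes such as 64 instead of raising a KeyError.

```python
import struct

# Standard TIFF 6.0 field types mapped to struct format characters.
# (Hypothetical table for illustration; rawphoto's own table may differ.)
TIFF_TYPES = {
    1: 'B',    # BYTE
    2: 's',    # ASCII
    3: 'H',    # SHORT
    4: 'I',    # LONG
    5: 'II',   # RATIONAL (numerator, denominator)
    6: 'b',    # SBYTE
    7: 'B',    # UNDEFINED (opaque bytes)
    8: 'h',    # SSHORT
    9: 'i',    # SLONG
    10: 'ii',  # SRATIONAL
    11: 'f',   # FLOAT
    12: 'd',   # DOUBLE
}


def parse_ifd_entries(buf, num_entries, endianness='<'):
    """Split buf into 12-byte IFD entries, recording unknown field types."""
    entries, unknown = [], []
    for i in range(num_entries):
        raw = buf[12 * i:12 * (i + 1)]
        tag_id, type_id, count, value = struct.unpack(endianness + 'HHI4s', raw)
        if type_id not in TIFF_TYPES:
            # Remember where parsing went wrong instead of raising KeyError.
            unknown.append((i, tag_id, type_id))
            continue
        entries.append((tag_id, TIFF_TYPES[type_id], count, value))
    return entries, unknown


# One valid entry (tag 0x0100, type SHORT) followed by one with type code 64.
good = struct.pack('<HHI4s', 0x0100, 3, 1, struct.pack('<H2x', 5184))
bad = struct.pack('<HHI4s', 0x1234, 64, 1, b'\x00' * 4)
entries, unknown = parse_ifd_entries(good + bad, 2)
print(entries)
print(unknown)
```

Logging the index and tag of each unknown entry would show whether the file really contains an unusual field type or whether the reader was already misaligned by the time it reached that entry.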

@campaul
Member

campaul commented May 18, 2015

Here's an example of using the libraw bindings in rawkit directly to access metadata:

from rawkit.libraw import libraw

raw = libraw.libraw_init(0)

# file_names is assumed to be an iterable of paths to raw files
for file_name in file_names:
    # populate 'raw' with metadata for the given file
    libraw.libraw_open_file(raw, bytes(file_name, 'utf-8'))

    # print some metadata (the sizes are integers, so pass them as
    # separate arguments rather than concatenating them to a string)
    print('width', raw.contents.sizes.width)
    print('height', raw.contents.sizes.height)

    # recycle raw so it can be used for the next image
    libraw.libraw_recycle(raw)

As long as you don't use libraw.libraw_unpack the image data is never loaded, so everything is fast. The raw variable is an instance of the libraw_data_t structure. See http://www.libraw.org/docs/API-C-eng.html and http://www.libraw.org/docs/API-datastruct-eng.html#libraw_data_t for more information, or you can read rawkit/libraw.py.

@achennu
Author

achennu commented May 18, 2015

Thanks for the heads up @campaul. I was already browsing through raw.py and libraw.py to see what I could do with it. The function pointers are not explicitly exported in the python file, so for example the libraw.libraw_dcraw_ppm_tiff_writer which is used in Raw.save is not available for pythonic introspection.

While this is not the rawkit repo, could I bother you to tell me how I could access the image data as a buffer to feed into numpy?

import numpy as np
libraw.libraw_unpack(raw)
imgdat = np.frombuffer(raw.contents.rawdata, dtype='uint8')
imgdat = np.frombuffer(raw.contents.rawdata.raw_image, dtype='uint8')
imgdat = np.frombuffer(raw.contents.rawdata.color3_image, dtype='uint8')

With the first option I get an array of size 131968, whereas `raw.contents.sizes` gives me w=4310, h=2868, so a total of 12361080 array elements. The other two options just return an array of shape (8,).

What am I missing? Also, how do I access the red, blue, green channels?
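
A likely explanation for the shape (8,) results, sketched without libraw: `np.frombuffer` on a ctypes pointer exposes the bytes of the pointer object itself (one address, 8 bytes on a 64-bit build), not the memory it points to. Similarly, 131968 is presumably the size of the `rawdata` struct rather than of the image. The example below uses a stand-in buffer; that libraw's `raw_image` is a pointer to unsigned 16-bit values, and which size fields give its dimensions, are assumptions to check against the libraw documentation.

```python
import ctypes

import numpy as np

height, width = 3, 4

# Stand-in for a sensor buffer owned by a C library (assumed uint16 samples).
backing = (ctypes.c_uint16 * (height * width))(*range(height * width))
raw_image = ctypes.cast(backing, ctypes.POINTER(ctypes.c_uint16))

# frombuffer sees only the pointer object: the bytes of a single address.
as_bytes = np.frombuffer(raw_image, dtype='uint8')
print(as_bytes.shape)

# as_array follows the pointer; a bare pointer carries no length,
# so the shape has to be supplied explicitly.
pixels = np.ctypeslib.as_array(raw_image, shape=(height, width))
print(pixels.shape, pixels.dtype)
```

The resulting array is a view onto the C buffer, so the owner of that memory has to stay alive (or the data has to be copied with `pixels.copy()`) for as long as the array is used.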

@campaul
Member

campaul commented May 18, 2015

That involves using another struct that I haven't defined yet, but that's the next feature I'm going to be adding. I'll probably do that tonight when I get off work.

Just so I can know exactly what to prioritize, are you looking to get the image data in its raw sensor data format, or in a processed RGB format? I'm pretty sure I can do both with libraw, but I've only thought about the RGB case so far.

EDIT: I just noticed you asked for RGB values, so I'm guessing you want the processed version of the image data.

@achennu
Author

achennu commented May 18, 2015

The interest is to get the RGB channels out, as well as read some of the metadata like timestamps, sizes, etc. The latter already works (good job!), so getting the RGB channels would be great!

I can write some python code, but I don't know much C. So if I can contribute in some way towards rawkit, I'll try to do that too.

@campaul
Member

campaul commented May 18, 2015

photoshell/rawkit#4 is the relevant issue.
