Error when indexing .ebsp file produced by AZtec 6.0 #16

Open
christiankwalters opened this issue Feb 16, 2023 · 21 comments

Comments

@christiankwalters

Hello,

I am encountering an issue when trying to index a .ebsp pattern file that was produced by Oxford AZtec, version 6.0.x. When I try to run IndexEBSD through the command line, I get the following message:

            • warning: some namelist parameters weren't used: patdset * * * * * *

couldn't determine pixel type for patterns in cc90b00e-a139-462d-8515-02d667ad026e.ebsp

When I try to use the EMSphInxEBSD graphical wizard instead of the command line, the program crashes as soon as I enter the .ebsp filename into the "Pattern File" field. The corresponding terminal output is

Unhandled unknown exception; terminating the application.

These issues occur on both a high-performance compute cluster running CentOS 7 and a laptop running Ubuntu Linux. In both cases, I am using the Debian precompiled binaries for EMSphInx version 0.2. Is the program having difficulty determining the bit depth from the pattern file? I am very interested in getting EMSphInx working with our Oxford EBSD system, so I would welcome any comments or suggestions.

I should note that I have had issues with these AZtec-generated files in EMsoft 5.0.2 as well. In that case, I was able to find a workaround by converting the AZtec .H5OINA file to a NORDIF binary file with kikuchipy. Are there any plans to add NORDIF .dat support to EMSphInx in the future, or any suggestions for getting EMSphInx working with this version of AZtec?

Thanks for your help.

@marcdegraef
Collaborator

marcdegraef commented Feb 16, 2023 via email

@hakonanes

hakonanes commented Feb 16, 2023

Hi Christian and Marc,

The issue here might be the same one I encountered with an Oxford binary .ebsp file of version 4 of that file format, which could not be read by kikuchipy (pyxem/kikuchipy#591). In our case, @drowenhorst-nrl figured out that there is one extra byte (1 uint8) between the file version and the pattern byte offsets in the particular file I struggled with. After skipping that byte, the file could be read in the same way as the earlier files I have access to. The relevant changes to our file reader are here.

To find out if this is actually the case, could you try importing your file with kikuchipy v0.8.0, Christian? After you install/update kikuchipy, you can try the following

import kikuchipy as kp
assert kp.__version__ == "0.8.0"
s = kp.load("cc90b00e-a139-462d-8515-02d667ad026e.ebsp", lazy=True)
s.plot()

If you can run this without an error occurring, it is likely that only a small update to EMsoft's reader is necessary to fix this.

@hakonanes

In that case, I was able to find a workaround by converting the AZtec .H5OINA file to a NORDIF binary file with kikuchipy. Are there any plans to add NORDIF .dat support to EMSphInx in the future, or any suggestions for getting EMSphInx working with this version of AZtec?

kikuchipy's h5ebsd file format can be read in EMsoft using its "EMEBSD" format. If you indeed can read your file with kikuchipy v0.8, you can write it to the h5ebsd format which seems to me to be readable by EMSphinx, based on the docs.

@christiankwalters
Author

Marc and Håkon, thanks for your quick replies!

I think Marc's suggestion about the compressed .ebsp file was exactly right. When I followed Håkon's kikuchipy method, I received the following output:

NotImplementedError: Cannot read compressed EBSD patterns from <filename>

indicating that the file had been compressed by AZtec. When I try with an older uncompressed file instead, I get the expected plot from kikuchipy, and EMSphInxEBSD no longer crashes when inputting the filename.

With the uncompressed file, I am now able to make it through the process of setting up the namelist file with the graphical wizard. Unfortunately, I am now getting a segfault when I try to run the indexing program. If I manage to find a solution, I will post it here so others might possibly benefit from it.

Many thanks,

Christian

@hakonanes

I think Marc's suggestion about the compressed .ebsp file was exactly right.

Indeed!

With the uncompressed file, I am now able to make it through the process of setting up the namelist file with the graphical wizard. Unfortunately, I am now getting a segfault when I try to run the indexing program. If I manage to find a solution, I will post it here so others might possibly benefit from it.

If you allow kikuchipy to print the log (by executing kp.set_log_level("DEBUG") before attempting to load the data), you should see which version of the .ebsp file format your file is in. If it is version 4, I believe the EMsoft reader needs the fix I mentioned above. You could try reading it in kikuchipy and writing to the h5ebsd format, and then read it into EMSphinx using the HDF5 reader. That might work.

@marcdegraef
Collaborator

marcdegraef commented Feb 17, 2023 via email

@hakonanes

I'm not sure where the version number is located in such a file...

In files of version > 0 it should be the first 8 bytes ("long long", "int64" in Python). In all files of version > 0 I have access to, the stored value is negative, with an absolute value in the range 1-4.

In version 4, what does the extra byte look like?

In the two files of version 4 I have access to (I sent these to you via email), the extra byte is zero. I do not know what it means...

@drowenhorst-nrl

As a point of clarification: read as a signed int, the 8-byte version number is negative unless the version number is 0! Thus (and I think FORTRAN does not support unsigned ints?), forgive my attempt at FORTRAN:

USE, INTRINSIC :: iso_fortran_env, ONLY: Int8, Int64
INTEGER(Int64) :: version
INTEGER(Int8) :: mysterybyte
! ... file open statements yada yada ...
READ(fileid) version
IF (version .GT. 0) THEN
    version = 0
ELSE
    version = -1 * version
END IF

IF (version .GE. 4) THEN
    READ(fileid) mysterybyte
END IF
! continue on reading as though it is a version 2 EBSP file (near as we can tell).
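
For readers more at home in Python, the same header logic can be sketched as below. This is only an illustration: the helper name `read_ebsp_version`, the little-endian byte order, and the rewind for version 0 files (which, going by the discussion above, carry no version field) are assumptions and not a description of any existing reader.

```python
import io
import struct
from typing import BinaryIO, Optional, Tuple


def read_ebsp_version(f: BinaryIO) -> Tuple[int, Optional[int]]:
    """Return (version, extra_byte); extra_byte is None below version 4."""
    (raw,) = struct.unpack("<q", f.read(8))  # signed 64-bit integer
    if raw >= 0:
        # Assumed: a version 0 file has no version field, so these 8 bytes
        # already belong to the pattern offsets. Rewind to the start.
        f.seek(0)
        return 0, None
    version = -raw
    extra = None
    if version >= 4:
        # The single unexplained uint8 seen in version 4 files.
        (extra,) = struct.unpack("<B", f.read(1))
    return version, extra


# Synthetic 9-byte header: version field -4 followed by an extra byte of 1.
demo = io.BytesIO(struct.pack("<qB", -4, 1) + b"offsets...")
print(read_ebsp_version(demo))  # (4, 1)
```

After the call, the stream is positioned where a version 2 reader would expect the pattern byte offsets to begin.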

@marcdegraef
Collaborator

marcdegraef commented Feb 17, 2023 via email

@christiankwalters
Author

Håkon, in answer to your questions: When I load the newer .ebsp file, I receive the following debug output from kikuchipy:

DEBUG:kikuchipy.io.plugins.oxford_binary:Reading Oxford binary file of version 4
DEBUG:kikuchipy.io.plugins.oxford_binary:Unknown byte (uint8) in file of version 4: 1

In this case, it appears that the file is version 4, and the extra byte is present. The older, uncompressed .ebsp file seems to be version 2, so the extra byte can't be causing the error in that case. When I try to use the version 2 file with IndexEBSD, I get the following message:

            • warning: some namelist parameters weren't used: patdset,scanname * * * * * *

malloc(): mismatching next->prev_size (unsorted)
Aborted (core dumped)

A quick Google search suggests this is a C memory error. I also tried your suggestion of importing the older .ebsp file with kikuchipy, saving it as an h5ebsd file, and using that as the input to IndexEBSD. I received the following error message after doing so:

            • warning: some namelist parameters weren't used: scanname * * * * * *

/home/christianwalters/EMsoft/EMSphInx/patterns.h5 doesn't have a Manufacturer string

Here, it seems EMSphInx is expecting a dataset named 'Manufacturer' which contains the name of the instrument vendor, for example "EDAX". I tried getting around this by adding a new dataset to the file with h5py:

mfg = f.create_dataset('/Manufacturer',(1,),data="Oxford", dtype="|S5")

and after doing this, I receive a different error message from IndexEBSD.

HDF5-DIAG: Error detected in HDF5 (1.8.20) thread 0:
#000: /home/will/Documents/emsphinxpublic/build/hdf5/src/hdf5/src/H5D.c line 363 in H5Dopen2(): not a dataset
major: Dataset
minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.20) thread 0:
#000: /home/will/Documents/emsphinxpublic/build/hdf5/src/hdf5/src/H5D.c line 363 in H5Dopen2(): not a dataset
major: Dataset
minor: Inappropriate type
H5 error attempting to read EBSD patterns:
file - /home/christianwalters/EMsoft/EMSphInx/patterns.h5
dset - /Scan 1/EBSD/Data/patterns
func - H5File::openDataSet
detailed message:
H5Dopen2 failed

Still, this is much closer than I have been able to get in the past. Thanks again for your help.

Regards, Christian

@hakonanes

hakonanes commented Feb 19, 2023

Thank you for doing this test, Christian.

Here, it seems EMSphInx is expecting a dataset named 'Manufacturer' which contains the name of the instrument vendor, for example "EDAX". I tried getting around this by adding a new dataset to the file with h5py:

The kikuchipy h5ebsd format (still evolving) has a manufacturer dataset named "manufacturer" in the top group, not "Manufacturer". It seems like EMSphinx is case sensitive when trying to read HDF datasets. I cannot explain the other error message from the message alone.
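
Two details of the h5py workaround quoted above may be worth checking before trying again. The sketch below uses only NumPy and plain Python, so it illustrates the string handling and not the actual behavior of EMSphInx or h5py; that h5py passes the string through the same fixed-width dtype is an assumption.

```python
import numpy as np

# A 5-byte fixed-width string cannot hold the 6 characters of "Oxford".
# NumPy truncates silently instead of raising an error.
truncated = np.array("Oxford", dtype="|S5")
print(truncated.item())  # the stored bytes, cut to 5 characters

# "|S6" (or wider) keeps the full name.
full = np.array("Oxford", dtype="|S6")
print(full.item())

# A case-sensitive lookup treats "Manufacturer" and "manufacturer"
# as two different names.
names_in_file = {"manufacturer"}
print("Manufacturer" in names_in_file)
```

If the reader compares the stored string against a list of known vendors, a truncated value could fail that comparison even when the dataset name itself is right.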

A final thought: EMSphinx also reads the EMsoft binary format (.data file). @marcdegraef, can you confirm that the format is just one long byte string starting with the first pixel in the first pattern and ending with the last pixel in the last pattern? If so, it's identical to the NORDIF binary .dat file format and converting any file with kikuchipy to a NORDIF .dat file and changing the file ending to .data should make it readable by EMSphinx. I haven't tested this myself, but will see if I can get to it.
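
If the format is indeed a headerless run of pixels as described, reading it back needs nothing more than the pattern shape and dtype, neither of which is stored in the file. A minimal sketch, assuming 8-bit patterns and a made-up file name and pattern shape:

```python
import os
import tempfile

import numpy as np

n_patterns, height, width = 6, 4, 5  # hypothetical; not stored in the file
patterns = np.arange(n_patterns * height * width, dtype=np.uint8).reshape(
    n_patterns, height, width
)

path = os.path.join(tempfile.mkdtemp(), "patterns_demo.dat")  # hypothetical name

# Write: first pixel of the first pattern first, last pixel of the last
# pattern last (C order), with no header.
patterns.tofile(path)

# Read: the caller has to supply the shape and dtype.
loaded = np.fromfile(path, dtype=np.uint8).reshape(n_patterns, height, width)

# Pattern k starts at byte offset k * height * width * itemsize.
k = 2
offset = k * height * width * patterns.itemsize
with open(path, "rb") as f:
    f.seek(offset)
    first_pixel_of_k = f.read(1)[0]
```

Whether EMSphInx expects 8-bit integers or another pixel type in a .data file is not settled by this sketch.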

@marcdegraef
Collaborator

marcdegraef commented Feb 19, 2023 via email

@marcdegraef
Collaborator

marcdegraef commented Feb 21, 2023 via email

@christiankwalters
Author

Marc,

Thanks very much for taking the time to do this. I am able to compile the updated code on Ubuntu, and the EMSphInxEBSD program is now able to handle my older (version 2) uncompressed .ebsp files with no issues. These weren't working previously, so this will be a big help to my research group. One of the EBSD systems at my institution is still on an older version of AZtec.

So far, I have not been successful at converting my compressed version 4 .ebsp file to an uncompressed file. It seems AZtec simply copies the compressed .ebsp files when using "Save As..." to make a copy of the project, even after turning off the compression option. I'll collect some fresh data when I have a chance, and I will post an update here once I am able to try the latest code with an uncompressed version 4 .ebsp. Hopefully, this will only take a day or two.

Many thanks,
Christian

@christiankwalters
Author

Hello everyone,

My apologies for the very long delay in responding to this thread.

Since my last post, I've gone back and collected EBSD data from an electropolished cubic material (point group m-3m), saving it as an uncompressed version 4 .ebsp file with patterns of 622x512 pixels. I'm happy to report that the updated code accepts the version 4 .ebsp file and runs to completion with no crashes. Marc, thanks again for your help!

Unfortunately, it seems that there is a new problem. If I input the .ebsp file directly into EMSphInx, the following IPF-Z map is the result:
[image: IPF-Z map obtained by indexing the .ebsp file directly]

For reference, here is the correct map. My first thought was that this is some sort of sample-detector geometry issue.
[image: reference IPF-Z map]

Something I discovered while experimenting with the tutorial Ni dataset is that EMSphInx expects the patterns in a binary .data file to be flipped relative to how they are stored in the EDAX .h5 file format. If I flip the binary patterns before indexing them (for example, with numpy.flip() in Python), EMSphInx recovers exactly the same result as the .h5 file.
[image: Ni tutorial dataset EMSphInx troubleshooting]

I tried to do something similar with my dataset, i.e., converting the patterns to 32-bit floats, inverting them with numpy.flip(), and storing them in row-major form as a .data file. The results are dramatically improved, but there seems to be a persistent mis-indexing issue: possibly still a sample-detector geometry problem?
[image: IPF-Z map after flipping the patterns]
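
The conversion described above can be sketched as follows. The stack shape, the output file name, and in particular the choice to flip only the row axis of each pattern are assumptions; calling `numpy.flip()` without an `axis` argument reverses every axis, including the pattern order, which is probably not what is wanted for a stack.

```python
import os
import tempfile

import numpy as np

# Hypothetical stack of patterns with shape (n_patterns, height, width).
rng = np.random.default_rng(0)
patterns = rng.integers(0, 256, size=(3, 512, 622), dtype=np.uint8)

# Flip each pattern upside down (axis 1 = rows), leaving the pattern order
# and the column order untouched.
flipped = np.flip(patterns, axis=1)

# Convert to 32-bit floats and store contiguously in row-major (C) order.
as_float = np.ascontiguousarray(flipped, dtype=np.float32)
out_path = os.path.join(tempfile.mkdtemp(), "patterns_flipped.data")
as_float.tofile(out_path)
```

Flipping with `axis=2` instead would mirror the patterns left to right, so it is worth trying each axis separately on a handful of patterns before converting a whole map.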

This persistent issue is what I am currently stuck on. I initially suspected that it could be a pattern center problem. Starting with the pattern center (x*, y*, z*) = (0.520, 0.551, 0.659) outputted by AZtec, I arrived at coordinates of (xpc, ypc, L) = (12.44, 86.722, 15552) for EMsoft version 4. My binned CCD pixel size is 37.94 μm. To further check my work, I used the EBSDDetector.pc_bruker() conversion tool in kikuchipy to convert these coordinates to Bruker convention, arriving at (0.52, 0.3306, 0.8006). The result was the same regardless of the convention I used in EMSphInx.

I know this code is not really maintained anymore, and I appreciate the help you've offered me so far. Are there any strategies I could try next to troubleshoot this issue?

Many thanks,
Christian

@drowenhorst-nrl

One suggestion I might add in terms of a processing pipeline is that you can convert your data to an EDAX up1 or up2 file ... there is a small amount of header information, but otherwise it would be basically the same as the binary .data file.

Regarding your PC calcs, they match my understanding of the conversion from Oxford to EMSoft. I will add that if you are unsure whether your PC values are correct, PyEBSDIndex is working on PC optimization routines that are fairly robust (they can often get a good refinement with virtually no knowledge of a starting point).

Question: are you using the .sht files from the EMSphInx database, or are you converting from an EMSoft-generated master pattern? I have found that I was getting different (worse) results when using the database, versus when I used a converted master pattern. Not sure what the reason (bug) is, but you might want to attempt making your own master pattern.
Finally, I would suggest that there is always dictionary indexing with refinement.

@hakonanes

Starting with the pattern center (x*, y*, z*) = (0.520, 0.551, 0.659) outputted by AZtec, I arrived at coordinates of (xpc, ypc, L) = (12.44, 86.722, 15552) for EMsoft version 4. My binned CCD pixel size is 37.94 μm.

If I assume your detector (Ny, Nx) = (512, 622) is binned by a factor of b = 2, and that the unbinned pixel size is delta = 37.94 / 2 = 18.97 um, then I arrive at an EMsoft v4 PC of (xpc, ypc, L) = (24.88, 173.444, 15551.53012) using:

xpc = -Nx * b * (0.5 - x*)
ypc = Ny * b * ((Nx / Ny) * y* - 0.5)
L = Nx * b * delta * z*

I arrived at these equations by going from Oxford -> Bruker -> EMsoft v5 with the conversions listed in kikuchipy (Oxford -> Bruker and Bruker -> EMsoft), and then negating xpc.

Could you try with these updated xpc and ypc values instead?

Verification using kikuchipy:

>>> import kikuchipy as kp
>>> det = kp.detectors.EBSDDetector(shape=(512, 622), pc=(0.52, 0.551, 0.659), px_size=37.94 / 2, binning=2, sample_tilt=70, convention="oxford")
>>> det
EBSDDetector (512, 622), px_size 18.97 um, binning 2, tilt 0, azimuthal 0, pc (0.52, 0.331, 0.801)
>>> det.pc_bruker()
array([[0.52      , 0.33062109, 0.80058203]])
>>> det.pc_emsoft(version=4)
array([[   24.88   ,   173.444  , 15551.53012]])
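
The three equations above also translate directly into a small function, which may be handy when kikuchipy is not available. This is a sketch of the conversion as written in this comment, not kikuchipy's implementation; the function and argument names are made up for illustration.

```python
def oxford_to_emsoft_v4(x_star, y_star, z_star, nx, ny, binning, px_size_um):
    """Oxford PC (x*, y*, z*) -> EMsoft v4 (xpc, ypc, L).

    nx and ny are the binned pattern width and height in pixels;
    px_size_um is the unbinned pixel size in micrometres.
    """
    xpc = -nx * binning * (0.5 - x_star)
    ypc = ny * binning * ((nx / ny) * y_star - 0.5)
    L = nx * binning * px_size_um * z_star
    return xpc, ypc, L


xpc, ypc, L = oxford_to_emsoft_v4(0.52, 0.551, 0.659, 622, 512, 2, 37.94 / 2)
print(round(xpc, 3), round(ypc, 3), round(L, 3))  # 24.88 173.444 15551.53
```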

Sidenote: May I ask which detector you have? The pixel size of about 20 um is quite small, right? Our NORDIF UF-1100 has a pixel size of about 70 um. I'm asking because I'd like to make available a table of pixel sizes for different detectors in kikuchipy. This will only be useful if the pixel size is the same for each detector model, though, and I don't know if this is the case. Any input here would be very much appreciated.

@drowenhorst-nrl

Ah! @hakonanes illustrates a good point. I was assuming that if you are providing the pixel and detector size in binned dimensions (e.g. 512, 622, 37.94µm) then the binning setting should be set to 1. Alternatively, you can use the calculations that Håkon used (note, they are simply 2*(xpc, ypc) of your original values) and use a binning value of 2. Marc can comment further, but I am pretty sure that when indexing, there is no difference between these two. However, if simulating patterns with estimated intensities, especially with added noise, EMSoft will provide different answers depending on which way you express the binning.

@hakonanes

I was assuming that if you are providing the pixel and detector size in binned dimensions (e.g. 512, 622, 37.94µm) then the binning setting should be set to 1

Right... The position of the PC is the same in these two cases. Thank you for pointing this out, @drowenhorst-nrl. @christiankwalters, the PC conversions shouldn't be the issue, then.
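
That the two descriptions place the PC at the same physical position can be checked numerically. The sketch below applies the xpc, ypc and L expressions quoted earlier in this thread to both ways of describing the detector and converts the pixel offsets to micrometres; the function and variable names are illustrative only.

```python
def emsoft_v4_pc(x_star, y_star, z_star, nx, ny, binning, px_size_um):
    """EMsoft v4 (xpc, ypc, L) from an Oxford PC, per the equations in this thread."""
    xpc = -nx * binning * (0.5 - x_star)
    ypc = ny * binning * ((nx / ny) * y_star - 0.5)
    L = nx * binning * px_size_um * z_star
    return xpc, ypc, L


oxford_pc = (0.52, 0.551, 0.659)

# (a) binned pattern size with the binned pixel size, binning = 1
xa, ya, La = emsoft_v4_pc(*oxford_pc, nx=622, ny=512, binning=1, px_size_um=37.94)
# (b) same pattern size with the unbinned pixel size, binning = 2
xb, yb, Lb = emsoft_v4_pc(*oxford_pc, nx=622, ny=512, binning=2, px_size_um=18.97)

# The formulas count xpc and ypc in pixels of size px_size_um, so the
# comparison has to be made in micrometres.
offset_a_um = (xa * 37.94, ya * 37.94)
offset_b_um = (xb * 18.97, yb * 18.97)
print(offset_a_um, offset_b_um, La, Lb)
```

The pixel counts differ by the factor of 2 noted above, while the offsets in micrometres and the detector distance L agree.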

@christiankwalters
Author

@hakonanes, I think your table of detector pixel sizes sounds like a helpful addition! I am using Oxford's first-generation "Symmetry" detector with a full pattern resolution of 1244 x 1024 pixels. My 18.97 μm pixel size refers to the full-resolution patterns; the binned pixel sizes would be 37.94 and 75.88 μm, respectively, for the (622x512) and (311x256) binned patterns.

As an aside: If any other Oxford users are curious about their detector pixel size, you can acquire an EBSP in the Point Analysis mode and look in the details in the data tree. This shows the sample to screen distance in mm alongside the corresponding z* value. This is how I arrived at δ = L / (Nx * b * z*) = (13710) / (622 * 2 * 0.581) = 18.97 μm. That gives a detector width around 23.6 mm, which is reasonable.
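
The arithmetic above can be reproduced in a few lines, which may help other Oxford users plug in their own numbers. The variable names are illustrative, and the sample-to-screen distance of 13710 is assumed to be in micrometres (13.710 mm as displayed by AZtec).

```python
# Values as quoted above from AZtec's Point Analysis details.
L_um = 13710.0     # sample-to-screen distance, assumed to be in micrometres
z_star = 0.581     # Oxford z* reported alongside that distance
nx_binned = 622    # pattern width in binned pixels
binning = 2

# delta = L / (Nx * b * z*)
px_size_um = L_um / (nx_binned * binning * z_star)

# Full detector width = number of unbinned pixels times the pixel size.
detector_width_mm = nx_binned * binning * px_size_um / 1000.0

print(f"{px_size_um:.2f} um, {detector_width_mm:.1f} mm")  # 18.97 um, 23.6 mm
```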

@drowenhorst-nrl I did generate my own EMsoft master pattern for this dataset, although that's good to know for the future. Thanks for the suggestion about the .up1/.up2 files. I will have to give that a closer look.

@drowenhorst-nrl

@christiankwalters they are pretty easy to write from Python. An extension of .up1 indicates 8-bit patterns, .up2 indicates 16-bit.
For version 3 up* files (which are compatible with EMSoft/SphInx), it is just:

import numpy as np

# patternW, patternH, extraPatterns, nCols, nRows, hexflag, xStep, yStep and patterns must already be defined.
with open('myfilename.up1', 'wb') as f:
  np.asarray([3], dtype=np.uint32).tofile(f) # version number
  np.asarray(patternW, dtype=np.uint32).tofile(f)
  np.asarray(patternH, dtype=np.uint32).tofile(f)
  np.asarray([42], dtype=np.uint32).tofile(f) # position in bytes from start of file to the first pattern (4*4 + 1 + 2*4 + 1 + 2*8 = 42)
  np.asarray(extraPatterns, dtype=np.uint8).tofile(f) # not really sure why this is there, but safe to set = 0
  np.asarray(nCols, dtype=np.uint32).tofile(f)
  np.asarray(nRows, dtype=np.uint32).tofile(f)
  np.asarray(hexflag, dtype=np.uint8).tofile(f) # are the patterns in a hexagonal scan? 0=False, 1=True
  np.asarray(xStep, dtype=np.float64).tofile(f)
  np.asarray(yStep, dtype=np.float64).tofile(f)
  np.asarray(patterns, dtype=np.uint8).tofile(f)
# assuming writing a up1 file, and patterns is a numpy array with shape (nCols*nRows, patternH, patternW).

I will leave any scaling from floats to 8/16-bit up to you.
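
For completeness, one simple way to do that scaling is a linear min-max stretch of each pattern onto the 8-bit range. This is only a sketch of one possible choice, and the helper name is made up; stretching each pattern separately discards intensity differences between patterns, and for a .up2 file the same idea would use 65535 and `np.uint16`.

```python
import numpy as np


def rescale_to_uint8(patterns):
    """Linearly map each pattern's own min..max range onto 0..255."""
    patterns = np.asarray(patterns, dtype=np.float64)
    lo = patterns.min(axis=(-2, -1), keepdims=True)
    hi = patterns.max(axis=(-2, -1), keepdims=True)
    # Avoid dividing by zero for completely flat patterns.
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = (patterns - lo) / span * 255.0
    return np.rint(scaled).astype(np.uint8)


# Tiny stack of two 2x2 "patterns": one with contrast, one flat.
demo = np.array([[[0.0, 0.5], [1.0, 0.25]], [[3.0, 3.0], [3.0, 3.0]]])
print(rescale_to_uint8(demo))
```

The returned array has the same shape as the input and can be passed as `patterns` when writing a .up1 file.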
