Panic while reading xls #481

Open
prokie opened this issue Nov 29, 2024 · 10 comments

Comments

@prokie

prokie commented Nov 29, 2024

Hi. When I try to open one of my xls files as a workbook, I get the following error.

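use calamine::{open_workbook, Xls};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Opening the workbook is enough to trigger the panic; no sheet is read.
    let _workbook: Xls<_> = open_workbook("bla.xls")?;
    Ok(())
}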
thread 'main' panicked at calamine-0.26.1/src/cfb.rs:362:9:
assertion `left == right` failed: i=3062, len=4334
  left: 0
 right: 3
stack backtrace:
   0:     0x7f7c0cfbab35 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h358afad87e02ca76
   1:     0x7f7c0cfef77b - core::fmt::write::hb19b5b269a2fe458
   2:     0x7f7c0cfb8c1f - std::io::Write::write_fmt::he5a92676a45ef09d
   3:     0x7f7c0cfbbc81 - std::panicking::default_hook::{{closure}}::h3bff550b24d93725
   4:     0x7f7c0cfbb95c - std::panicking::default_hook::hd53b1b06d2b99687
   5:     0x7f7c0cfbc251 - std::panicking::rust_panic_with_hook::h9fdd87cddb2763da
   6:     0x7f7c0cfbc147 - std::panicking::begin_panic_handler::{{closure}}::h089783ab6b5cba45
   7:     0x7f7c0cfbaff9 - std::sys::backtrace::__rust_end_short_backtrace::hed34776d77ef7922
   8:     0x7f7c0cfbbdd4 - rust_begin_unwind
   9:     0x7f7c0ced69d3 - core::panicking::panic_fmt::h300583f35f37447a
  10:     0x7f7c0ced6dcf - core::panicking::assert_failed_inner::hafb0e3d63cb01ba6
  11:     0x7f7c0ced3dcf - core::panicking::assert_failed::h5549a7e67ae6daf2
  12:     0x7f7c0cf7f447 - calamine::cfb::decompress_stream::he271a08c0ffe52cb
  13:     0x7f7c0cf0afe7 - <alloc::vec::into_iter::IntoIter<T,A> as core::iter::traits::iterator::Iterator>::try_fold::hdf21cd55dade2c87
  14:     0x7f7c0ceedac1 - alloc::vec::in_place_collect::from_iter_in_place::h7d46edc2f8f2af80
  15:     0x7f7c0ceea3b9 - <alloc::collections::btree::map::BTreeMap<K,V> as core::iter::traits::collect::FromIterator<(K,V)>>::from_iter::hc1b6dd616abfaa91
  16:     0x7f7c0cf03bf2 - calamine::vba::VbaProject::from_cfb::h275adceebcd448df
  17:     0x7f7c0cee6b0d - calamine::xls::Xls<RS>::new_with_options::hed3b834fcadb6634
  18:     0x7f7c0cee63ec - calamine::open_workbook::hfc42529ea9511aa1
  19:     0x7f7c0cf0461f - pontus::main::hb4eed906d3c6d221
  20:     0x7f7c0cf10023 - std::sys::backtrace::__rust_begin_short_backtrace::h998b547d7489787b
  21:     0x7f7c0cefbc8d - std::rt::lang_start::{{closure}}::h2e2caad5b5a6f960
  22:     0x7f7c0cfb3af7 - std::rt::lang_start_internal::h93b3b742566fb30c
  23:     0x7f7c0cf06255 - main

Does this mean the Excel file is broken, or is something else causing the crash?
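
Until the cause is known, a caller who needs to keep running can turn the panic into an ordinary error. The following is a minimal std-only sketch, not part of calamine: `catch_panic` is a hypothetical helper name, and it only works when the binary is built with the default `panic = "unwind"` strategy.

```rust
use std::panic::{self, UnwindSafe};

/// Runs `f` and converts a panic into an `Err` carrying the panic message.
/// Hypothetical helper for illustration; not a calamine API.
pub fn catch_panic<T>(f: impl FnOnce() -> T + UnwindSafe) -> Result<T, String> {
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually a `String` (formatted message)
        // or a `&'static str` (literal message).
        if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else {
            "unknown panic payload".to_string()
        }
    })
}
```

Wrapping the `open_workbook` call in the closure passed to `catch_panic` should then yield an `Err` containing the assertion message instead of aborting the thread. The default panic hook still prints the message to stderr.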

@prokie
Author

prokie commented Nov 30, 2024

It seems the VBA script was the issue: after I removed it, the xls file parses without problems.

@sftse
Contributor

sftse commented Dec 2, 2024

Can you provide a test case to investigate the issue?

@prokie
Author

prokie commented Dec 2, 2024

Yes, sure. I will see if I can remove everything sensitive from the xls and upload it here.

@prokie
Author

prokie commented Dec 9, 2024

I am trying to remove all the confidential information, but it is not easy. The issue does seem to be related to whitespace in the VBA script, though.

If I add a single newline to the VBA script, parsing succeeds every time, but if I add too many newlines I consistently get the error at line 362 in cfb.rs. There are two distinct failure modes: in one, left is always 2; in the other, left is always 0.
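
For readers interpreting the `left` and `right` values: assuming the assertion at cfb.rs:362 checks the 3-bit signature of a compressed-chunk header as laid out in the MS-OVBA specification (an assumption, since the thread does not show that line), `right: 3` would be the required signature `0b011` and `left` the value actually read. The sketch below is standalone and std-only; `ChunkHeader`, `parse_chunk_header`, and `first_bad_chunk` are hypothetical names, not calamine's API.

```rust
/// Fields of a 2-byte compressed-chunk header (layout per MS-OVBA).
#[derive(Debug, PartialEq, Eq)]
pub struct ChunkHeader {
    /// Raw 12-bit size field; the whole chunk, header included,
    /// occupies `size_field + 3` bytes.
    pub size_field: u16,
    /// 3-bit signature; a valid chunk carries 0b011.
    pub signature: u16,
    /// true = chunk holds compressed data, false = raw data.
    pub compressed: bool,
}

/// Splits a little-endian header word into its three bit fields.
pub fn parse_chunk_header(raw: u16) -> ChunkHeader {
    ChunkHeader {
        size_field: raw & 0x0FFF,
        signature: (raw & 0x7000) >> 12,
        compressed: raw & 0x8000 != 0,
    }
}

/// Walks the chunk headers of a compressed container and returns the byte
/// offset of the first header whose signature is not 0b011, if any.
pub fn first_bad_chunk(stream: &[u8]) -> Result<Option<usize>, String> {
    // A compressed container starts with the signature byte 0x01.
    match stream.first().copied() {
        Some(0x01) => {}
        other => return Err(format!("bad container signature: {:?}", other)),
    }
    let mut i = 1;
    while i + 2 <= stream.len() {
        let raw = u16::from_le_bytes([stream[i], stream[i + 1]]);
        let header = parse_chunk_header(raw);
        if header.signature != 0b011 {
            return Ok(Some(i));
        }
        // Jump over the header and the chunk data to the next header.
        i += header.size_field as usize + 3;
    }
    Ok(None)
}
```

Under that assumption, a `left` of 0 or 2 would mean the decoder landed on two bytes that are not a valid chunk header, which would be consistent with added newlines shifting where chunk boundaries fall. Running `first_bad_chunk` over the bytes of a module stream, starting at its compressed container rather than necessarily at the start of the stream, might show where the chunk chain stops lining up.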

@sftse
Contributor

sftse commented Dec 9, 2024

If the issue is solely in the VBA script, it should be feasible to extract it using the rust-cfb crate into a separate CFB file. If after doing that calamine fails at the same line, this might help narrow down the cause.

@prokie
Author

prokie commented Dec 9, 2024

Okay, thanks, I will try that. I will also keep working on an example xls without confidential information.

@prokie
Author

prokie commented Dec 10, 2024

@sftse Can you give an example of how to do that using rust-cfb?

@sftse
Contributor

sftse commented Dec 12, 2024

use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let in_path = "foo.xls";
    let out_path = "bar.xls";
    let mut original = cfb::open(in_path)?;
    let version = original.version();
    let out_file = File::create(out_path)?;
    let mut duplicate = cfb::CompoundFile::create_with_version(version, out_file)?;
    // Stream paths are collected first and copied after the walk ends.
    let mut stream_paths = Vec::<std::path::PathBuf>::new();
    for entry in original.walk() {
        // Keep only entries whose path mentions VBA.
        if entry.path().to_str().unwrap().contains("VBA") {
            if entry.is_storage() {
                if !entry.is_root() {
                    duplicate.create_storage(entry.path())?;
                }
                duplicate.set_storage_clsid(entry.path(), entry.clsid().clone())?;
            } else {
                stream_paths.push(entry.path().to_path_buf());
            }
        }
    }
    for path in stream_paths.iter() {
        std::io::copy(
            &mut original.open_stream(path)?,
            &mut duplicate.create_new_stream(path)?,
        )?;
    }
    Ok(())
}

@prokie
Author

prokie commented Dec 18, 2024

Thanks for the source code. I ran essentially the same program, with a calamine read of the resulting xls added at the end, and it still gives the same error:

use std::fs::File;

use calamine::{open_workbook, Xls};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    env_logger::init();
    let in_path = "foo.xls";
    let out_path = "bar.xls";
    let mut original = cfb::open(in_path)?;
    let version = original.version();
    let out_file = File::create(out_path).unwrap();
    let mut duplicate = cfb::CompoundFile::create_with_version(version, out_file)?;
    let mut stream_paths = Vec::<std::path::PathBuf>::new();
    for entry in original.walk() {
        if entry.path().to_str().unwrap().contains("VBA") {
            if entry.is_storage() {
                if !entry.is_root() {
                    duplicate.create_storage(entry.path())?;
                }
                duplicate.set_storage_clsid(entry.path(), entry.clsid().clone())?;
            } else {
                stream_paths.push(entry.path().to_path_buf());
            }
        }
    }
    for path in stream_paths.iter() {
        std::io::copy(
            &mut original.open_stream(path)?,
            &mut duplicate.create_new_stream(path)?,
        )?;
    }
    let _: Xls<_> = open_workbook("bar.xls")?;

    Ok(())
}
assertion `left == right` failed: i=3062, len=4338
  left: 0
 right: 3

@sftse
Contributor

sftse commented Dec 18, 2024

This reduced cfb file should have most of the sensitive parts removed. If you look through the remaining information and find it acceptable to publish, can you upload it as a test case?
