Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Current (unstable) glib::Regex::match_* API is unsound (use-after-free) #1223

Closed
zachs18 opened this issue Nov 11, 2023 · 5 comments · Fixed by #1225
Closed

[BUG] Current (unstable) glib::Regex::match_* API is unsound (use-after-free) #1223

zachs18 opened this issue Nov 11, 2023 · 5 comments · Fixed by #1225
Labels
bug Something isn't working

Comments

@zachs18
Copy link
Contributor

zachs18 commented Nov 11, 2023

Bug description

The GMatchInfo returned by g_regex_match references the string that was passed to g_regex_match, but in the Rust bindings, string is from IntoGStr::run_with_gstr, so if anything but a &GStr or &GString is passed, it may be a temporary GString that is dropped immediately, or a reference to stack memory that later gets reused. Even in the case of passing in a &GStr or &GString, the returned MatchInfo could be used after the underlying GStr/GString is modified/dropped.

fn main() {
    let r = glib::Regex::new("hello", glib::RegexCompileFlags::DEFAULT, glib::RegexMatchFlags::DEFAULT).expect("valid regex").expect("valid regex");

    let m = r.match_("hello", glib::RegexMatchFlags::DEFAULT).expect("should match");

    dbg!(m.fetch_all());
    dbg!(m.fetch_all());
}
Cargo.toml
[package]
name = "glib-regex"
version = "0.1.0"
edition = "2021"

[dependencies]
glib = { git = "https://github.com/gtk-rs/gtk-rs-core", version = "0.19.0", rev = "24ac2075336c7a1ccd5058fd3b732f6616f2f9da" }

Backtrace

Backtrace
$ RUST_BACKTRACE=full cargo run
    Skipping git submodule `https://github.com/gtk-rs/gir` due to update strategy in .gitmodules
    Skipping git submodule `https://github.com/gtk-rs/gir-files` due to update strategy in .gitmodules
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `/home/zachary/opt_mount/zachary/cargo-target/debug/glib-regex`
[src/main.rs:8] m.fetch_all() = [
    "hello",
]
thread 'main' panicked at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/gstring.rs:1921:9:
assertion failed: cstr.to_str().is_ok()
stack backtrace:
   0:     0x55771f85525c - std::backtrace_rs::backtrace::libunwind::trace::he43a6a3949163f8c
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x55771f85525c - std::backtrace_rs::backtrace::trace_unsynchronized::h50db52ca99f692e7
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x55771f85525c - std::sys_common::backtrace::_print_fmt::hd37d595f2ceb2d3c
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x55771f85525c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h678bbcf9da6d7d75
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x55771f874d7c - core::fmt::rt::Argument::fmt::h3a159adc080a6fc9
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/fmt/rt.rs:138:9
   5:     0x55771f874d7c - core::fmt::write::hb8eaf5a8e45a738e
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/fmt/mod.rs:1094:21
   6:     0x55771f852eae - std::io::Write::write_fmt::h9663fe36b2ee08f9
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/io/mod.rs:1714:15
   7:     0x55771f855044 - std::sys_common::backtrace::_print::hcd4834796ee88ad2
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x55771f855044 - std::sys_common::backtrace::print::h1360e9450e4f922a
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x55771f856603 - std::panicking::default_hook::{{closure}}::h2609fa95cd5ab1f4
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:270:22
  10:     0x55771f85631c - std::panicking::default_hook::h6d75f5747cab6e8d
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:290:9
  11:     0x55771f856b89 - std::panicking::rust_panic_with_hook::h57e78470c47c84de
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:707:13
  12:     0x55771f856a41 - std::panicking::begin_panic_handler::{{closure}}::h3dfd2453cf356ecb
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:597:13
  13:     0x55771f855786 - std::sys_common::backtrace::__rust_end_short_backtrace::hdb177d43678e4d7e
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:170:18
  14:     0x55771f8567d2 - rust_begin_unwind
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
  15:     0x55771f82c1b3 - core::panicking::panic_fmt::hd1e971d8d7c78e0e
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
  16:     0x55771f82c243 - core::panicking::panic::hd907d42d09324a32
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:117:5
  17:     0x55771f82fa4c - <glib::gstring::GString as glib::translate::FromGlibPtrFull<*mut u8>>::from_glib_full::hc0f54b200a06c1bf
                               at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/gstring.rs:1921:9
  18:     0x55771f8313c6 - glib::translate::from_glib_full::h100aa9698a3e7d50
                               at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/translate.rs:1631:5
  19:     0x55771f82fa67 - <glib::gstring::GString as glib::translate::FromGlibPtrFull<*mut i8>>::from_glib_full::hdbc169219363f076
                               at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/gstring.rs:1933:9
  20:     0x55771f831416 - glib::translate::from_glib_full::heb426ec8c2eaffef
                               at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/translate.rs:1631:5
  21:     0x55771f82fd16 - <glib::gstring::GString as glib::translate::FromGlibContainerAsVec<*mut i8,*mut *mut i8>>::from_glib_full_num_as_vec::h5dbbbd5fdfaa02f0
                               at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/translate.rs:2121:25
  22:     0x55771f82fd89 - <glib::gstring::GString as glib::translate::FromGlibPtrArrayContainerAsVec<*mut i8,*mut *mut i8>>::from_glib_full_as_vec::h87ace893d28fafdc
                               at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/translate.rs:2140:17
  23:     0x55771f832087 - <alloc::vec::Vec<T> as glib::translate::FromGlibPtrContainer<P,PP>>::from_glib_full::hf9dc6875251cb6d5
                               at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/translate.rs:2195:9
  24:     0x55771f8305fc - glib::auto::match_info::MatchInfo::fetch_all::h63e46969f2179ffe
                               at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/auto/match_info.rs:27:13
  25:     0x55771f82ea7b - glib_regex::main::h389405d5b91adbb3
                               at /home/zachary/Programming/rusttesting/glib-regex/src/main.rs:9:10
  26:     0x55771f82dcfb - core::ops::function::FnOnce::call_once::h2307b112f2b8e81d
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/ops/function.rs:250:5
  27:     0x55771f82d5ae - std::sys_common::backtrace::__rust_begin_short_backtrace::h8d3af7cc3e64616c
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:154:18
  28:     0x55771f82d581 - std::rt::lang_start::{{closure}}::heb777d9131b8d833
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/rt.rs:166:18
  29:     0x55771f850a8b - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::hbcc4f8a3f5ada78c
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/ops/function.rs:284:13
  30:     0x55771f850a8b - std::panicking::try::do_call::he5f117a9e13dadde
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:502:40
  31:     0x55771f850a8b - std::panicking::try::h2f3af9afce3a0443
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:466:19
  32:     0x55771f850a8b - std::panic::catch_unwind::h6d6c387f38ef05ea
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panic.rs:142:14
  33:     0x55771f850a8b - std::rt::lang_start_internal::{{closure}}::h6ca09d5905711415
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/rt.rs:148:48
  34:     0x55771f850a8b - std::panicking::try::do_call::ha9fd18ea06654a4b
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:502:40
  35:     0x55771f850a8b - std::panicking::try::hda5c2a4432362341
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:466:19
  36:     0x55771f850a8b - std::panic::catch_unwind::h440f731b142bc235
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panic.rs:142:14
  37:     0x55771f850a8b - std::rt::lang_start_internal::hc0b4e50f058f62ce
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/rt.rs:148:20
  38:     0x55771f82d55a - std::rt::lang_start::hacd555fa1bbfe3e2
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/rt.rs:165:17
  39:     0x55771f82ed2e - main
  40:     0x7ff57c829d90 - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  41:     0x7ff57c829e40 - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:392:3
  42:     0x55771f82c8a5 - _start
  43:                0x0 - <unknown>
Running under valgrind
==585572== Memcheck, a memory error detector
==585572== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==585572== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==585572== Command: /home/zachary/opt_mount/zachary/cargo-target/debug/glib-regex
==585572== 
==585572== Invalid read of size 1
==585572==    at 0x484EFFF: strncpy (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==585572==    by 0x4B2E82B: g_strndup (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7200.4)
==585572==    by 0x4B26076: g_match_info_fetch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7200.4)
==585572==    by 0x4B261B0: g_match_info_fetch_all (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7200.4)
==585572==    by 0x1155ED: glib::auto::match_info::MatchInfo::fetch_all (match_info.rs:27)
==585572==    by 0x11380A: glib_regex::main (main.rs:8)
==585572==    by 0x112CFA: core::ops::function::FnOnce::call_once (function.rs:250)
==585572==    by 0x1125AD: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:154)
==585572==    by 0x112580: std::rt::lang_start::{{closure}} (rt.rs:166)
==585572==    by 0x135A8A: std::rt::lang_start_internal (function.rs:284)
==585572==    by 0x112559: std::rt::lang_start (rt.rs:165)
==585572==    by 0x113D2D: main (in /home/zachary/opt_mount/zachary/cargo-target/debug/glib-regex)
==585572==  Address 0x1ffeffefb0 is on thread 1's stack
==585572==  608 bytes below stack pointer
==585572== 
==585572== Invalid read of size 1
==585572==    at 0x484F01D: strncpy (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==585572==    by 0x4B2E82B: g_strndup (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7200.4)
==585572==    by 0x4B26076: g_match_info_fetch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7200.4)
==585572==    by 0x4B261B0: g_match_info_fetch_all (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7200.4)
==585572==    by 0x1155ED: glib::auto::match_info::MatchInfo::fetch_all (match_info.rs:27)
==585572==    by 0x11380A: glib_regex::main (main.rs:8)
==585572==    by 0x112CFA: core::ops::function::FnOnce::call_once (function.rs:250)
==585572==    by 0x1125AD: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:154)
==585572==    by 0x112580: std::rt::lang_start::{{closure}} (rt.rs:166)
==585572==    by 0x135A8A: std::rt::lang_start_internal (function.rs:284)
==585572==    by 0x112559: std::rt::lang_start (rt.rs:165)
==585572==    by 0x113D2D: main (in /home/zachary/opt_mount/zachary/cargo-target/debug/glib-regex)
==585572==  Address 0x1ffeffefb1 is on thread 1's stack
==585572==  607 bytes below stack pointer
==585572== 
[src/main.rs:8] m.fetch_all() = [
    "hello",
]
thread 'main' panicked at /home/zachary/.cargo/git/checkouts/gtk-rs-core-7be42ca38bd6361c/24ac207/glib/src/gstring.rs:1921:9:
assertion failed: cstr.to_str().is_ok()
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
==585572== 
==585572== HEAP SUMMARY:
==585572==     in use at exit: 38,922 bytes in 258 blocks
==585572==   total heap usage: 340 allocs, 82 frees, 51,502 bytes allocated
==585572== 
==585572== LEAK SUMMARY:
==585572==    definitely lost: 16 bytes in 1 blocks
==585572==    indirectly lost: 6 bytes in 1 blocks
==585572==      possibly lost: 0 bytes in 0 blocks
==585572==    still reachable: 36,884 bytes in 235 blocks
==585572==         suppressed: 0 bytes in 0 blocks
==585572== Rerun with --leak-check=full to see details of leaked memory
==585572== 
==585572== For lists of detected and suppressed errors, rerun with: -s
==585572== ERROR SUMMARY: 10 errors from 2 contexts (suppressed: 0 from 0)

One possible API change to make this sound would be to make glib::Regex::match_ (and similar) take only &GStr (not impl IntoGStr), and have MatchInfo hold a lifetime parameter which expires with that &GStr borrow, e.g.

pub struct MatchInfo<'input> { 
  _lifetime: PhantomData<&'input GStr>,
  ...
};

impl Regex {
  pub fn match_<'input>(
    &self,
    string: &'input GStr, 
    match_options: RegexMatchFlags
  ) -> Option<MatchInfo<'input>>;
  // etc for match_* methods
}

impl<'input> MatchInfo<'input> {
  // g_match_info_get_string returns a reference to the borrowed string, so
  // this could return a borrowed `&'input GStr`, or it could copy to a `GString` like it currently does
  fn string(&self) -> &'input GStr;
  // the rest of `GMatchInfo`'s methods that return `*const gchar` or similar return ownership, so &'input GStr would not be correct there.
}

Another possible change would be to have glib::MatchInfo be more than a wrapper around GMatchInfo and have it hold an owned immutable buffer that the inner GMatchInfo references (i.e. all the Regex::match_* functions copy the passed string into a new heap-allocated buffer, then call the g_regex_match* function with that buffer, and return the GMatchInfo along with the buffer in the glib::MatchInfo).

@zachs18 zachs18 added the bug Something isn't working label Nov 11, 2023
@sdroege
Copy link
Member

sdroege commented Nov 12, 2023

One possible API change to make this sound would be to make glib::Regex::match_ (and similar) take only &GStr (not impl IntoGStr), and have MatchInfo hold a lifetime parameter which expires with that &GStr borrow

I think this is the best option here. Anything involving a new heap-allocated buffer internally in the bindings can easily lead to unexpected performance characteristics.

@sdroege
Copy link
Member

sdroege commented Nov 12, 2023

Do you want to provide a PR for that?

@zachs18
Copy link
Contributor Author

zachs18 commented Nov 13, 2023

I have a WIP branch with MatchInfo having a lifetime parameter. I have a few unresolved questions though:

  1. Some 'static lifetime bounds had to be relaxed to make it compile (removed 'static bound from trait ValueType: 'static; relaxed impl<'a, T: 'static, MM> ToGlibPtr<'a, T> for Shared<T, MM> where MM: 'static to where MM: 'a), and I don't know if those bounds were "load-bearing" with respect to soundness in other parts of the codebase. Actually thinking about it more, if MatchInfo<'_> implements ToValue and FromValue, then you can safely convert a MatchInfo<'short> to Value then Value::get back to MatchInfo<'long> and get a use-after-free, so I think at least the ValueType relaxation is probably not sound, and maybe MatchInfo just shouldn't implement ValueType? Or maybe it should implement ValueType but not ToValue/FromValue?
  2. I manually changed the autogenerated src/auto/match_info.rs instead of modifying Gir.toml, since I don't think gir can currently emit a type with a lifetime parameter. It might be better to just make MatchInfo not be autogenerated at all?
  3. I changed the glib::wrapper! and glib::shared::glib_shared_wrapper! macros for only Shared<_> inputs; but all of the wrapper! branches could probably accept lifetime generics (not strictly necessary for this issue). Or, those changes could be removed and everything implemented manually for MatchInfo?

@sdroege
Copy link
Member

sdroege commented Nov 13, 2023

and maybe MatchInfo just shouldn't implement ValueType

I'd go with that for now, yes.

It might be better to just make MatchInfo not be autogenerated at all?

Yes

but all of the wrapper! branches could probably accept lifetime generics

In theory that should already work

everything implemented manually for MatchInfo

That seems fine to me too. This type seems special enough that a manual implementation seems OK

@zachs18
Copy link
Contributor Author

zachs18 commented Nov 13, 2023

Okay, I opened a PR (#1225) with the suggested changes. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants