Crash because TRAILER is unexpectedly appended to the end of the PAYLOAD buffer #196
Have you tried the …

Oh yes, I use it, and I'm quite unhappy, because it still does not support evaluating containers. The thing is, most Rust data items are containers, like the C++ STL. For example, I may often wish to print a C++ std::vector, but it cannot be printed directly; it's not an array. GDB has special treatment of std::vector to print it, though. That's what we still badly miss in rust-gdb, and that's what ruins the debugging experience.
Thread 3 "stream" hit Breakpoint 2, cameleon_device::u3v::protocol::stream::Trailer::parse_prefix (cursor=0x7ffff71fd188)
at device/src/u3v/protocol/stream.rs:451
451 Err(Error::InvalidPacket("invalid prefix magic".into()))
(gdb) bt
#0 cameleon_device::u3v::protocol::stream::Trailer::parse_prefix (cursor=0x7ffff71fd188) at device/src/u3v/protocol/stream.rs:451
#1 0x000055555563095d in cameleon_device::u3v::protocol::stream::Trailer::parse<alloc::vec::Vec<u8, alloc::alloc::Global>> (buf=0x7ffff71fd7d8)
at device/src/u3v/protocol/stream.rs:397
#2 0x00005555556084b2 in cameleon::u3v::stream_handle::StreamingLoop::run (self=...) at cameleon/src/u3v/stream_handle.rs:261
#3 0x0000555555610d6c in cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure#1} () at cameleon/src/u3v/stream_handle.rs:113
#4 0x000055555565ab76 in std::sys::backtrace::__rust_begin_short_backtrace<cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure_env#1}, ()>
(f=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sys/backtrace.rs:154
#5 0x0000555555634c1d in std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure#0}<cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure_env#1}, ()> () at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/thread/mod.rs:522
#6 0x000055555566a0c4 in core::panic::unwind_safe::{impl#23}::call_once<(), std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}<cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure_env#1}, ()>> (self=...)
at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/panic/unwind_safe.rs:272
#7 0x000055555565af54 in std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}<cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure_env#1}, ()>>, ()> (data=0x7ffff71ffc30)
at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:554
#8 0x0000555555634ccb in __rust_try ()
#9 0x0000555555634819 in std::panicking::try<(), core::panic::unwind_safe::AssertUnwindSafe<std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}<cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure_env#1}, ()>>> (f=...)
at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:518
#10 std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}<cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure_env#1}, ()>>, ()> (f=...)
at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panic.rs:345
#11 std::thread::{impl#0}::spawn_unchecked_::{closure#1}<cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure_env#1}, ()> ()
at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/thread/mod.rs:521
#12 0x0000555555637aef in core::ops::function::FnOnce::call_once<std::thread::{impl#0}::spawn_unchecked_::{closure_env#1}<cameleon::u3v::stream_handle::{impl#1}::start_streaming_loop::{closure_env#1}, ()>, ()> () at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/ops/function.rs:250
#13 0x000055555577298b in alloc::boxed::{impl#48}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> ()
at alloc/src/boxed.rs:2231
#14 alloc::boxed::{impl#48}::call_once<(), alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>, alloc::alloc::Global> ()
at alloc/src/boxed.rs:2231
#15 std::sys::pal::unix::thread::{impl#2}::new::thread_start () at std/src/sys/pal/unix/thread.rs:105
#16 0x00007ffff7d47144 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#17 0x00007ffff7dc77dc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
This comment is wrong btw, it should be "Parse bytes as Trailer":

impl<'a> Trailer<'a> {
    const TRAILER_MAGIC: u32 = 0x5456_3355;
    /// Parse bytes as Leader.
But the main problem is that the leader magic is given instead of the expected trailer magic:

(gdb) li
446 fn parse_prefix(cursor: &mut Cursor<&[u8]>) -> Result<()> {
447 let magic: u32 = cursor.read_bytes_le()?;
448 if magic == Self::TRAILER_MAGIC {
449 Ok(())
450 } else {
451 Err(Error::InvalidPacket("invalid prefix magic".into()))
452 }
453 }
454 }
455
(gdb) p/x magic
$1 = 0x4c563355

Btw, here is an example of how rust-gdb handles Rust-aware debugging:
So we have to go to the source code and see that:

const TRAILER_MAGIC: u32 = 0x5456_3355;
const LEADER_MAGIC: u32 = 0x4C56_3355;
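The two magics are easier to tell apart once you notice they are just the ASCII strings "U3VL" and "U3VT" read little-endian. A quick sketch (`magic_to_ascii` is a hypothetical helper, not part of the cameleon codebase):

```rust
// Decode a U3V prefix magic into its ASCII representation by reading
// the u32 as little-endian bytes.
fn magic_to_ascii(magic: u32) -> String {
    magic.to_le_bytes().iter().map(|&b| b as char).collect()
}

fn main() {
    const LEADER_MAGIC: u32 = 0x4C56_3355;
    const TRAILER_MAGIC: u32 = 0x5456_3355;
    println!("{}", magic_to_ascii(LEADER_MAGIC));  // "U3VL"
    println!("{}", magic_to_ascii(TRAILER_MAGIC)); // "U3VT"
}
```

So the `0x4c563355` that gdb printed at the trailer position is the LEADER magic, "U3VL".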
The panic happens because the "trailer must follow the leader" logic is hardcoded in the Cameleon streaming loop:

(gdb) li 252
247 // We received the data from the bulk transfers, try to parse stuff now.
248 let leader = match u3v_stream::Leader::parse(&leader_buf)
249 .map_err(|e| StreamError::InvalidPayload(format!("{}", e).into()))
250 {
251 Ok(leader) => leader,
252 Err(err) => {
253 warn!(?err);
254 // Reuse `payload_buf`.
255 payload_buf_opt = Some(payload_buf);
256 self.sender.try_send(Err(err)).ok();
(gdb)
257 continue;
258 }
259 };
260
261 let trailer = match u3v_stream::Trailer::parse(&trailer_buf)
262 .map_err(|e| StreamError::InvalidPayload(format!("invalid trailer: {}", e).into()))
263 {
264 Ok(trailer) => trailer,
265 Err(err) => {
266 warn!(?err);
(gdb)
267 // Reuse `payload_buf`.
268 payload_buf_opt = Some(payload_buf);
269 self.sender.try_send(Err(err)).ok();
270 continue;
271 }
272 };

See: the only expected packet after LEADER is TRAILER.
I have no clue why it is incorrect, so I take Aravis as a reference. And it appears that in the Aravis camera streaming loop, LEADER is followed by DATA, not by TRAILER:
Cameleon actually has the same data reading sequence as Aravis:

if let Err(err) = read_leader(&mut async_pool, &self.params, &mut leader_buf) {
// Report and send error if the error is fatal.
if matches!(err, StreamError::Io(..) | StreamError::Disconnected) {
error!(?err);
self.sender.try_send(Err(err)).ok();
}
payload_buf_opt = Some(payload_buf);
continue;
};
if let Err(err) = read_payload(&mut async_pool, &self.params, &mut payload_buf) {
warn!(?err);
// Reuse `payload_buf`.
payload_buf_opt = Some(payload_buf);
self.sender.try_send(Err(err)).ok();
continue;
};
if let Err(err) = read_trailer(&mut async_pool, &self.params, &mut trailer_buf) {
warn!(?err);
// Reuse `payload_buf`.
payload_buf_opt = Some(payload_buf);
self.sender.try_send(Err(err)).ok();
continue;
};

This exactly corresponds to this sequence in Aravis:
So my guess is that |
Actually, it's even more wrong than my guess: it's all messed up, see the GDB output below:

Thread 3 "stream" hit Breakpoint 2, cameleon::u3v::stream_handle::StreamingLoop::run (self=...) at cameleon/src/u3v/stream_handle.rs:245
245 let payload_len = payload_len - last_buf_len.unwrap();
(gdb) p leader_buf
$4 = Vec(size=52) = {85, 51, 86, 84, 0, 0, 32, 0, 145, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 176, 4, 0, 0, 0, 0, 0, 224, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
(gdb) p payload_buf
$5 = Vec(size=614400) = {85, 51, 86, 76, 0, 0, 52, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 136, 102, 114, 231, 19, 3, 0, 0, 1, 0, 8, 1, 128, 2, 0, 0, 224, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0...}
(gdb) p trailer_buf
$6 = Vec(size=52) = {85, 51, 86, 84, 0, 0, 32, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 176, 4, 0, 0, 0, 0, 0, 224, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
(gdb) p 0x4C
$7 = 76
(gdb) p 0x56
$8 = 86
(gdb) p 0x33
$10 = 51
(gdb) p 0x55
$11 = 85
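Decoding the first four bytes of each buffer from the gdb dumps above makes the shuffle visible directly (the byte values are copied from the `$4`/`$5`/`$6` output):

```rust
fn main() {
    // First four bytes of each buffer, exactly as printed by gdb above.
    let heads: [(&str, [u8; 4]); 3] = [
        ("leader_buf", [85, 51, 86, 84]),  // from $4
        ("payload_buf", [85, 51, 86, 76]), // from $5
        ("trailer_buf", [85, 51, 86, 84]), // from $6
    ];
    for (name, head) in heads {
        let magic = u32::from_le_bytes(head);
        let ascii: String = head.iter().map(|&b| b as char).collect();
        println!("{name}: 0x{magic:08X} ({ascii})");
    }
}
```

This prints "U3VT" (the trailer magic, 0x54563355) for `leader_buf` and "U3VL" (the leader magic, 0x4C563355) at the start of `payload_buf`: the leader actually landed in the payload buffer, and a trailer landed in the leader buffer.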
It seems that I'm able to fix this issue with the following 2 changes:

@bschwind, in your a371076 I think you are trying to make the USB bulk transfers as async as possible, for good reason. That seems to work, but there is one side effect: the bulk transfers may come back shuffled: the LEADER buffer is not necessarily the first in the sequence of captured buffers, and so on. Perhaps you were lucky that this shuffling did not happen on your system, and the LEADER-DATA-TRAILER order was preserved in the async pool. I propose to revert your patch and re-apply it later with proper identification of the buffer kind after async polling is finished. What do you think?
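A minimal sketch of what "identify the buffer kind after polling" could look like, classifying each completed buffer by its prefix magic instead of relying on submission order (`BufferKind` and `classify` are hypothetical names, not cameleon API; real code would also cross-check the expected buffer sizes, since a payload could in principle start with the same four bytes):

```rust
#[derive(Debug, PartialEq)]
enum BufferKind {
    Leader,
    Trailer,
    Payload,
}

/// Classify a completed bulk-transfer buffer by its U3V prefix magic.
fn classify(buf: &[u8]) -> BufferKind {
    const LEADER_MAGIC: u32 = 0x4C56_3355;  // "U3VL" little-endian
    const TRAILER_MAGIC: u32 = 0x5456_3355; // "U3VT" little-endian
    match buf.get(..4).map(|b| u32::from_le_bytes(b.try_into().unwrap())) {
        Some(LEADER_MAGIC) => BufferKind::Leader,
        Some(TRAILER_MAGIC) => BufferKind::Trailer,
        // Payload blocks carry no prefix magic.
        _ => BufferKind::Payload,
    }
}

fn main() {
    assert_eq!(classify(&[0x55, 0x33, 0x56, 0x4C]), BufferKind::Leader);
    assert_eq!(classify(&[0x55, 0x33, 0x56, 0x54]), BufferKind::Trailer);
    assert_eq!(classify(&[0x00, 0x01, 0x02, 0x03]), BufferKind::Payload);
}
```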
- if params.payload_final2_size != 0 {
-     async_pool.submit(&mut buf[cursor..cursor + params.payload_final2_size])?;
+ //if params.payload_final2_size != 0 {
+ //    async_pool.submit(&mut buf[cursor..cursor + params.payload_final2_size])?;
+ //}

I have no idea what those are.
commit a371076
Author: Brian Schwind <[email protected]>
Date:   Thu May 11 22:49:45 2023 +0900

    Use async transfers for the leader and trailer buffers

Performing leader, payload and trailer bulk transfers in one common async pool loop may change the order of buffers. So instead of leader-payload-trailer we may get trailer-leader-payload, which is currently not handled by the buffer parsing code. This patch could be re-applied later, accompanied by proper order-independent handling.
Correction: the first point above about async bulk transfers has no negative effect. The second point alone, about the payload size, is enough to fix this issue!
I've drafted #197 to discuss a possible fix with you.
I'll give this a look - it's definitely important to have the bulk transfers pipelined as much as possible to attain the throughput you need for higher frame rates. I'll have to double-check the current code and its possibilities for shuffling or unnecessary transfers. It's also possible one of our cameras isn't following the standard, but yeah, give me some time to read up again on these things.
I'm testing a Chinese machine vision camera with the stream example. Cameleon expects strictly one LEADER packet, followed by TRAILER packets. The unrecoverable issue is that Cameleon receives a TRAILER packet instead of a LEADER packet from this camera, which is not supposed to happen.

What I know for sure is: the same camera streams with Aravis without any issue.

I'd like to collect more context, if you can guide me on how. I'm not a Rust programmer, and GDB's debugger support for Rust is just awful: it's easier to hack into the memory with bare hands than to figure out how to print anything meaningful from Rust variables...