crash: when an mv writes to a stream from a random stream. #735

Open
yokofly opened this issue May 17, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@yokofly
Collaborator

yokofly commented May 17, 2024

Describe what's wrong

2024.05.17 03:20:18.194438 [ 33 ] {cfacb0d4-a21d-4fa5-b0b7-f0d10d20ec7e} <Information> StorageMaterializedView (default.mv): Took 2 ms to wait for built background pipeline during matierialized view 'default.mv' startup
2024.05.17 03:20:18.194499 [ 333 ] {.inner.query-id.from-408480d3-2064-48e7-91fe-fcfe70397a61} <Information> PipelineExecutor: Using 20 threads to execute pipeline for query_id=.inner.query-id.from-408480d3-2064-48e7-91fe-fcfe70397a61
2024.05.17 03:20:18.194816 [ 333 ] {.inner.query-id.from-408480d3-2064-48e7-91fe-fcfe70397a61} <Information> LocalFileSystemCheckpoint: Took 1 ms to checkpoint to /var/lib/proton/checkpoint/.inner.query-id.from-408480d3-2064-48e7-91fe-fcfe70397a61/dag.ckpt, compressed_size=1144, uncompressed_size=925
2024.05.17 03:20:18.194878 [ 333 ] {.inner.query-id.from-408480d3-2064-48e7-91fe-fcfe70397a61} <Information> LocalFileSystemCheckpoint: Took 0 ms to checkpoint to /var/lib/proton/checkpoint/.inner.query-id.from-408480d3-2064-48e7-91fe-fcfe70397a61/query.ckpt, compressed_size=68, uncompressed_size=41
2024.05.17 03:20:18.194903 [ 333 ] {.inner.query-id.from-408480d3-2064-48e7-91fe-fcfe70397a61} <Information> CheckpointCoordinator: Register query=.inner.query-id.from-408480d3-2064-48e7-91fe-fcfe70397a61 with 900 seconds checkpoint interval, source_node_descriptions=0-Random, ack_node_descriptions=5-EmptySink
^C2024.05.17 03:20:22.071332 [ 32 ] {} <Information> Application: Received termination signal (Interrupt)
2024.05.17 03:20:22.198153 [ 340 ] {} <Information> default.v (3ecab0be-ae17-4676-ae6d-c54eed230692): Committed sn=399 for shard=0 to local file system
2024.05.17 03:20:22.881160 [ 320 ] {} <Information> system.metric_log (7d372865-d0b9-42ae-8025-589ff5051941): Found 0 parts for disk 'default' to load
2024.05.17 03:20:22.881208 [ 312 ] {} <Information> system.query_log (967fa75b-c7d4-41b3-a980-0d1213a07348): Found 0 parts for disk 'default' to load
2024.05.17 03:20:22.881255 [ 316 ] {} <Information> system.trace_log (e6f6d424-4670-4bcf-a4e1-374861335755): Found 0 parts for disk 'default' to load
2024.05.17 03:20:22.881330 [ 313 ] {} <Information> system.part_log (2dc4f295-87fd-4cee-9db3-d313afef3309): Found 0 parts for disk 'default' to load
2024.05.17 03:20:22.881683 [ 314 ] {} <Information> system.query_thread_log (b3d80e57-2e37-41ae-823e-759f78391946): Found 0 parts for disk 'default' to load
2024.05.17 03:20:22.940554 [ 1 ] {} <Information> Application: Closed all listening sockets. Waiting for 1 outstanding connections.
2024.05.17 03:20:22.940586 [ 1 ] {} <Information> CheckpointCoordinator: Trigger last checkpoint and flush begin
2024.05.17 03:20:22.940894 [ 382 ] {} <Fatal> BaseDaemon: ########## Short fault info ############
2024.05.17 03:20:22.940927 [ 382 ] {} <Fatal> BaseDaemon: (version 1.5.8, build id: E8B09E6E8FB8A2EEEB1DFA6F88518F1D7A4B9E96, git hash: 26b4810034decd3fcb91508d672856b04ff536b1) (from thread 1) Received signal 11
2024.05.17 03:20:22.940934 [ 382 ] {} <Fatal> BaseDaemon: Signal description: Segmentation fault
2024.05.17 03:20:22.940941 [ 382 ] {} <Fatal> BaseDaemon: Address: 0x1b8 Access: read. Address not mapped to object.
2024.05.17 03:20:22.940952 [ 382 ] {} <Fatal> BaseDaemon: Stack trace: 0x00000000168f1625 0x000000001a5e100e 0x000000001a5e34d5 0x00000000101874b8 0x0000000010181c2c 0x000000001a93cb46 0x0000000010173c79 0x000000001a94fe33 0x00000000101719fa 0x000000000afcaadd 0x00007f607b3f0083 0x000000000afca02e
2024.05.17 03:20:22.940957 [ 382 ] {} <Fatal> BaseDaemon: ########################################
2024.05.17 03:20:22.940962 [ 382 ] {} <Fatal> BaseDaemon: (version 1.5.8, build id: E8B09E6E8FB8A2EEEB1DFA6F88518F1D7A4B9E96, git hash: 26b4810034decd3fcb91508d672856b04ff536b1) (from thread 1) (no query) Received signal Segmentation fault (11)
2024.05.17 03:20:22.940966 [ 382 ] {} <Fatal> BaseDaemon: Address: 0x1b8 Access: read. Address not mapped to object.
2024.05.17 03:20:22.940970 [ 382 ] {} <Fatal> BaseDaemon: Stack trace: 0x00000000168f1625 0x000000001a5e100e 0x000000001a5e34d5 0x00000000101874b8 0x0000000010181c2c 0x000000001a93cb46 0x0000000010173c79 0x000000001a94fe33 0x00000000101719fa 0x000000000afcaadd 0x00007f607b3f0083 0x000000000afca02e
2024.05.17 03:20:22.940999 [ 382 ] {} <Fatal> BaseDaemon: 3. DB::ExecutingGraph::hasProcessedNewDataSinceLastCheckpoint() const @ 0x00000000168f1625 in /usr/bin/proton
2024.05.17 03:20:22.941010 [ 382 ] {} <Fatal> BaseDaemon: 4. DB::CheckpointCoordinator::doTriggerCheckpoint(std::__1::weak_ptr<DB::PipelineExecutor> const&, std::__1::shared_ptr<DB::CheckpointContext const>) @ 0x000000001a5e100e in /usr/bin/proton
2024.05.17 03:20:22.941017 [ 382 ] {} <Fatal> BaseDaemon: 5. DB::CheckpointCoordinator::triggerLastCheckpointAndFlush() @ 0x000000001a5e34d5 in /usr/bin/proton
2024.05.17 03:20:22.941029 [ 382 ] {} <Fatal> BaseDaemon: 6. BasicScopeGuard<DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)::$_7>::~BasicScopeGuard() @ 0x00000000101874b8 in /usr/bin/proton
2024.05.17 03:20:22.941038 [ 382 ] {} <Fatal> BaseDaemon: 7. DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&) @ 0x0000000010181c2c in /usr/bin/proton
2024.05.17 03:20:22.941052 [ 382 ] {} <Fatal> BaseDaemon: 8. Poco::Util::Application::run() @ 0x000000001a93cb46 in /usr/bin/proton
2024.05.17 03:20:22.941057 [ 382 ] {} <Fatal> BaseDaemon: 9. DB::Server::run() @ 0x0000000010173c79 in /usr/bin/proton
2024.05.17 03:20:22.941064 [ 382 ] {} <Fatal> BaseDaemon: 10. Poco::Util::ServerApplication::run(int, char**) @ 0x000000001a94fe33 in /usr/bin/proton
2024.05.17 03:20:22.941070 [ 382 ] {} <Fatal> BaseDaemon: 11. mainServer(int, char**) @ 0x00000000101719fa in /usr/bin/proton
2024.05.17 03:20:22.941079 [ 382 ] {} <Fatal> BaseDaemon: 12. main @ 0x000000000afcaadd in /usr/bin/proton
2024.05.17 03:20:22.941086 [ 382 ] {} <Fatal> BaseDaemon: 13. __libc_start_main @ 0x00007f607b3f0083 in ?
2024.05.17 03:20:22.941091 [ 382 ] {} <Fatal> BaseDaemon: 14. _start @ 0x000000000afca02e in /usr/bin/proton
2024.05.17 03:20:22.941097 [ 382 ] {} <Fatal> BaseDaemon: Integrity check of the executable skipped because the reference checksum could not be read.
2024.05.17 03:20:23.380744 [ 319 ] {} <Information> system.crash_log (2018f2aa-b7b8-4c9c-8a3b-dfab4207fdc0): Found 0 parts for disk 'default' to load

How to reproduce

create stream v(id int);
create random stream v_rand(id int default rand()%100);
create materialized view mv into v as select * from v_rand;

Then shut down proton (in the log above, via an interrupt signal, i.e. Ctrl+C).

Error message and/or stacktrace

Additional context

@yokofly added the bug label on May 17, 2024
@yokofly
Collaborator Author

yokofly commented May 17, 2024

As a temporary workaround:
without the mv, inserting directly works as expected.

@yokofly
Collaborator Author

yokofly commented May 17, 2024

Note that this crash only happens when proton exits, because exiting triggers the last checkpoint.
To avoid it, we can drop the mv after ingesting the necessary data, before shutting proton down.
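
For reference, the fault address in the trace (0x1b8, "Address not mapped to object") is a small offset, which is typical of reading a member through a null pointer. The faulting frame is `DB::ExecutingGraph::hasProcessedNewDataSinceLastCheckpoint()`, called from `CheckpointCoordinator::doTriggerCheckpoint`, which receives a `weak_ptr<PipelineExecutor>`. The issue does not confirm which object is null, so the following is only a minimal sketch of a defensive guard, assuming the executor or its graph may be missing at shutdown. `Graph`, `Executor` and `hasNewDataSafe` are hypothetical stand-ins, not Proton's actual classes or API.

```cpp
#include <memory>
#include <optional>

// Hypothetical stand-ins for illustration only; NOT Proton's real classes.
struct Graph
{
    bool processed_new_data = false;
    bool hasProcessedNewDataSinceLastCheckpoint() const { return processed_new_data; }
};

struct Executor
{
    // Assumption: the graph may be null if the pipeline was never fully
    // built or has already been torn down when shutdown begins.
    std::unique_ptr<Graph> graph;
};

// Returns std::nullopt when there is nothing safe to inspect, so the caller
// can skip the final checkpoint instead of dereferencing a null pointer.
std::optional<bool> hasNewDataSafe(const std::weak_ptr<Executor> & weak_executor)
{
    auto executor = weak_executor.lock(); // empty if the executor is already destroyed
    if (!executor || !executor->graph)
        return std::nullopt;
    return executor->graph->hasProcessedNewDataSinceLastCheckpoint();
}
```

With a guard of this shape, a coordinator could treat `std::nullopt` as "skip this query" during the last checkpoint. Whether that is the right fix for this issue depends on the actual root cause, which is not established here.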
