branch-3.0: [fix](arrow-flight-sql) Fix Doris NULL column conversion to arrow batch #43929 #44231

Merged
merged 2 commits into from
Dec 3, 2024


github-actions[bot]
Contributor

Cherry-picked from #43929

…ch (#43929)

### What problem does this PR solve?

Problem Summary:

Doris represents NULL columns with a special type,
`DataTypeNull<DataTypeNumber::Uint8>`. When a column is serialized into
an Arrow batch, `Uint8` data is written through `arrow::BooleanBuilder`,
which does not match the expected `arrow::NullBuilder`.
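
The mismatch can be illustrated with a minimal, self-contained sketch. The classes below are hypothetical stand-ins, not the real Arrow or Doris types, and the function name `write_uint8_column` and the `column_is_null_type` flag are assumptions made for illustration. This is not the code changed by this PR; it only shows why the builder type has to be checked against the column's logical type before writing.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the Arrow builder classes.
struct ArrayBuilder {
    virtual ~ArrayBuilder() = default;
};

// Stand-in for a builder that only counts NULL entries.
struct NullBuilder : ArrayBuilder {
    int64_t null_count = 0;
    void AppendNulls(int64_t n) { null_count += n; }
};

// Stand-in for a builder that stores boolean values.
struct BooleanBuilder : ArrayBuilder {
    std::vector<bool> values;
    void AppendValues(const uint8_t* data, int64_t n) {
        for (int64_t i = 0; i < n; ++i) values.push_back(data[i] != 0);
    }
};

// Writes a UInt8-backed column into `builder`.
// `column_is_null_type` (an assumption of this sketch) marks a column whose
// logical type is NULL even though its storage is UInt8.
// Returns false and fills `error` when the builder does not match the column,
// instead of writing through a builder of the wrong type.
bool write_uint8_column(const std::vector<uint8_t>& data, bool column_is_null_type,
                        ArrayBuilder* builder, std::string* error) {
    if (column_is_null_type) {
        auto* null_builder = dynamic_cast<NullBuilder*>(builder);
        if (null_builder == nullptr) {
            *error = "NULL-typed column requires a NullBuilder";
            return false;
        }
        null_builder->AppendNulls(static_cast<int64_t>(data.size()));
        return true;
    }
    auto* bool_builder = dynamic_cast<BooleanBuilder*>(builder);
    if (bool_builder == nullptr) {
        *error = "UInt8 column requires a BooleanBuilder";
        return false;
    }
    bool_builder->AppendValues(data.data(), static_cast<int64_t>(data.size()));
    return true;
}
```

In this sketch, passing a `BooleanBuilder` for a NULL-typed column returns `false` with an error message rather than writing through the wrong builder type.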

Fixes the following crash:

```
*** Query id: fd32741526804c1e-bc016473fd8f3aa3 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1731327262 (unix time) try "date -d @1731327262" if you are using GNU date ***
*** Current BE git commitID: 653e315 ***
*** SIGSEGV address not mapped to object (@0x100000024) received by PID 1442863 (TID 1443456 OR 0x7f8b8cdea700) from PID 36; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk2/liyifan/doris/doris_2.1/doris/be/src/common/signal_handler.h:421
 1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /mnt/disk2/liyifan/doris/jdk-17.0.2/lib/server/libjvm.so
 2# JVM_handle_linux_signal in /mnt/disk2/liyifan/doris/jdk-17.0.2/lib/server/libjvm.so
 3# 0x00007F8CA1F38B50 in /lib64/libc.so.6
 4# 0x000055FC45E5B2D3 in /mnt/disk2/liyifan/doris/doris_2.1/doris/output_run/be/lib/doris_be
 5# arrow::BooleanBuilder::AppendValues(unsigned char const*, long, unsigned char const*) in /mnt/disk2/liyifan/doris/doris_2.1/doris/output_run/be/lib/doris_be
 6# doris::vectorized::DataTypeNumberSerDe<unsigned char>::write_column_to_arrow(doris::vectorized::IColumn const&, doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 15ul, 16ul> const*, arrow::ArrayBuilder*, int, int, cctz::time_zone const&) const at /mnt/disk2/liyifan/doris/doris_2.1/doris/be/src/vec/data_types/serde/data_type_number_serde.cpp:86
 7# doris::FromBlockConverter::convert(std::shared_ptr<arrow::RecordBatch>*) at /mnt/disk2/liyifan/doris/doris_2.1/doris/be/src/util/arrow/block_convertor.cpp:390
 8# doris::convert_to_arrow_batch(doris::vectorized::Block const&, std::shared_ptr<arrow::Schema> const&, arrow::MemoryPool*, std::shared_ptr<arrow::RecordBatch>*, cctz::time_zone const&) in /mnt/disk2/liyifan/doris/doris_2.1/doris/output_run/be/lib/doris_be
 9# doris::vectorized::VArrowFlightResultWriter::write(doris::vectorized::Block&) at /mnt/disk2/liyifan/doris/doris_2.1/doris/be/src/vec/sink/varrow_flight_result_writer.cpp:76
10# doris::vectorized::VResultSink::send(doris::RuntimeState*, doris::vectorized::Block*, bool) at /mnt/disk2/liyifan/doris/doris_2.1/doris/be/src/vec/sink/vresult_sink.cpp:149
11# doris::PlanFragmentExecutor::open_vectorized_internal() at /mnt/disk2/liyifan/doris/doris_2.1/doris/be/src/runtime/plan_fragment_executor.cpp:341
12# doris::PlanFragmentExecutor::open() at /mnt/disk2/liyifan/doris/doris_2.1/doris/be/src/runtime/plan_fragment_executor.cpp:273
```
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Nov 19, 2024
@doris-robot

run buildall

Contributor Author

clang-tidy review says "All clean, LGTM! 👍"

@xinyiZzz xinyiZzz closed this Nov 19, 2024
@xinyiZzz xinyiZzz reopened this Dec 2, 2024
@xinyiZzz
Contributor

xinyiZzz commented Dec 2, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 40351 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dd40fd68c84287b055708a588959248ff5d4465d, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17572	7355	7287	7287
q2	2047	158	171	158
q3	10725	1036	1127	1036
q4	10528	734	705	705
q5	7739	2791	2784	2784
q6	233	144	148	144
q7	953	598	593	593
q8	9570	1916	1971	1916
q9	7065	6414	6428	6414
q10	6985	2264	2383	2264
q11	452	255	256	255
q12	397	213	210	210
q13	17780	2935	3039	2935
q14	231	203	205	203
q15	556	520	506	506
q16	668	601	581	581
q17	958	588	554	554
q18	7143	6671	6587	6587
q19	1391	1005	1028	1005
q20	490	195	190	190
q21	3935	3143	3053	3053
q22	1091	997	971	971
Total cold run time: 108509 ms
Total hot run time: 40351 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	7280	7266	7206	7206
q2	316	229	224	224
q3	2839	2847	2848	2847
q4	1968	1836	1727	1727
q5	5633	5663	5649	5649
q6	215	135	140	135
q7	2166	1714	1783	1714
q8	3289	3467	3463	3463
q9	8778	8905	8761	8761
q10	3533	3506	3495	3495
q11	589	495	486	486
q12	772	577	625	577
q13	16559	3158	3149	3149
q14	295	261	274	261
q15	564	518	513	513
q16	693	655	670	655
q17	1828	1626	1567	1567
q18	8226	7701	7499	7499
q19	4147	1613	1638	1613
q20	2032	1823	1860	1823
q21	5233	5186	5265	5186
q22	1123	1041	1034	1034
Total cold run time: 78078 ms
Total hot run time: 59584 ms

@doris-robot

TPC-DS: Total hot run time: 194636 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dd40fd68c84287b055708a588959248ff5d4465d, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1234	916	900	900
query2	6233	2097	2045	2045
query3	10874	3990	3926	3926
query4	67609	27790	23384	23384
query5	5517	448	441	441
query6	465	174	174	174
query7	6193	302	304	302
query8	332	229	224	224
query9	9395	2651	2655	2651
query10	509	279	251	251
query11	18044	15037	15680	15037
query12	171	109	105	105
query13	1567	454	408	408
query14	11465	6875	7398	6875
query15	212	172	174	172
query16	7486	450	496	450
query17	1282	579	569	569
query18	1907	321	310	310
query19	197	156	145	145
query20	113	111	106	106
query21	218	104	98	98
query22	4925	4584	4445	4445
query23	34855	33889	34219	33889
query24	5956	2935	2975	2935
query25	532	416	420	416
query26	682	175	169	169
query27	1804	303	297	297
query28	4495	2562	2534	2534
query29	728	439	419	419
query30	236	162	155	155
query31	985	810	825	810
query32	60	53	54	53
query33	477	273	270	270
query34	904	505	516	505
query35	863	714	711	711
query36	1045	943	944	943
query37	110	66	72	66
query38	4119	4024	4000	4000
query39	1511	1457	1473	1457
query40	198	95	102	95
query41	50	46	46	46
query42	110	101	105	101
query43	532	477	475	475
query44	1158	799	802	799
query45	190	162	166	162
query46	1140	739	741	739
query47	1996	1885	1901	1885
query48	467	371	366	366
query49	744	370	369	369
query50	815	411	417	411
query51	7283	7035	7054	7035
query52	92	88	85	85
query53	250	183	180	180
query54	561	436	435	435
query55	75	74	72	72
query56	250	224	239	224
query57	1207	1115	1087	1087
query58	213	213	199	199
query59	3005	2881	2933	2881
query60	264	245	243	243
query61	128	101	100	100
query62	773	641	651	641
query63	205	192	186	186
query64	1630	646	638	638
query65	3221	3147	3320	3147
query66	716	298	297	297
query67	15994	15538	15270	15270
query68	4598	549	553	549
query69	441	255	261	255
query70	1162	1114	1054	1054
query71	435	250	253	250
query72	6461	3960	3951	3951
query73	756	339	344	339
query74	10399	8815	8780	8780
query75	3356	2592	2638	2592
query76	2426	1030	1148	1030
query77	488	269	261	261
query78	10855	9777	9423	9423
query79	10132	594	596	594
query80	2089	410	411	410
query81	563	235	231	231
query82	1449	114	115	114
query83	303	135	136	135
query84	287	77	87	77
query85	1798	299	288	288
query86	463	293	304	293
query87	4463	4190	4313	4190
query88	5451	2397	2422	2397
query89	563	288	287	287
query90	2115	178	182	178
query91	171	139	140	139
query92	62	46	45	45
query93	7038	541	540	540
query94	844	290	291	290
query95	342	246	245	245
query96	657	283	276	276
query97	3330	3107	3162	3107
query98	212	203	200	200
query99	1621	1311	1301	1301
Total cold run time: 343907 ms
Total hot run time: 194636 ms

@xinyiZzz xinyiZzz merged commit 3879940 into branch-3.0 Dec 3, 2024
22 of 26 checks passed
@github-actions github-actions bot deleted the auto-pick-43929-branch-3.0 branch December 3, 2024 09:40