branch-3.0: [enhancement](tablet-meta) Avoid be coredump due to potential race condition when updating tablet cumu point #45643 #45785

Merged
merged 1 commit into from
Dec 25, 2024

Conversation

github-actions[bot]
Contributor

Cherry-picked from #45643

…ndition when updating tablet cumu point (#45643)

Currently, when setting a tablet's cumu point, an assertion fails if the
new point is less than the local value, resulting in a BE coredump.

This can happen under the following race condition:
1. Thread A tries to sync rowsets.
2. Thread A fetches the cumu point from ms.
3. Thread B updates the cumu point (e.g. sc/compaction), commits it to ms
after step 2, and sets the BE tablet's cumu point before step 4.
4. Thread A tries to set the cumu point it saw earlier, hits the assertion,
and coredumps.
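
The race above can be made harmless by treating a smaller incoming value as stale and ignoring it rather than asserting. The following is a minimal sketch of that idea, not the code merged in this PR: `TabletSketch`, `set_cumulative_point`, and the boolean return value are hypothetical names chosen for illustration, and the sketch assumes the cumu point should only ever move forward.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a tablet that tracks a cumulative point.
// This is an illustration only, not the Doris tablet class.
class TabletSketch {
public:
    explicit TabletSketch(int64_t initial_point) : _cumulative_point(initial_point) {}

    // Applies new_point only if it does not move the point backwards.
    // Returns true if applied, false if the value was stale and ignored.
    // A stale value is expected when another thread advanced the point
    // after this caller fetched its value, so it is not treated as fatal.
    bool set_cumulative_point(int64_t new_point) {
        int64_t current = _cumulative_point.load();
        while (new_point >= current) {
            // On failure, compare_exchange_weak reloads `current`,
            // and the loop re-checks the condition against the fresh value.
            if (_cumulative_point.compare_exchange_weak(current, new_point)) {
                return true;
            }
        }
        return false;
    }

    int64_t cumulative_point() const { return _cumulative_point.load(); }

private:
    std::atomic<int64_t> _cumulative_point;
};
```

In the scenario from the description, thread B's update succeeds and thread A's later call with the older value returns `false`, leaving the newer point in place. A caller could log a warning in that case instead of crashing.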
@Thearas
Contributor

Thearas commented Dec 23, 2024

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Dec 23, 2024
@Thearas
Contributor

Thearas commented Dec 23, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 40597 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 99bd8d6f70b17546cfb56c846b16103a16344525, data reload: false

------ Round 1 ----------------------------------
q1	17761	7481	7269	7269
q2	2050	172	188	172
q3	10603	1114	1175	1114
q4	10380	770	739	739
q5	7776	2795	2820	2795
q6	234	146	143	143
q7	979	613	595	595
q8	9354	1940	2027	1940
q9	6653	6406	6334	6334
q10	7002	2291	2287	2287
q11	476	272	266	266
q12	411	217	208	208
q13	17786	2961	2986	2961
q14	240	208	220	208
q15	553	524	526	524
q16	688	612	599	599
q17	973	569	555	555
q18	7337	6663	6652	6652
q19	1371	965	1070	965
q20	474	200	197	197
q21	3916	3112	3170	3112
q22	1103	977	962	962
Total cold run time: 108120 ms
Total hot run time: 40597 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7202	7190	7187	7187
q2	326	232	232	232
q3	2846	2711	2722	2711
q4	1945	1646	1629	1629
q5	5399	5408	5393	5393
q6	215	137	137	137
q7	2057	1682	1664	1664
q8	3180	3401	3371	3371
q9	8494	8498	8529	8498
q10	3450	3378	3365	3365
q11	601	496	508	496
q12	760	540	585	540
q13	16882	2985	2980	2980
q14	287	256	260	256
q15	569	509	501	501
q16	692	658	671	658
q17	1793	1579	1558	1558
q18	7808	7532	7583	7532
q19	5071	1550	1572	1550
q20	1990	1799	1769	1769
q21	5216	5064	5094	5064
q22	1091	982	1002	982
Total cold run time: 77874 ms
Total hot run time: 58073 ms

@doris-robot

TPC-DS: Total hot run time: 189053 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 99bd8d6f70b17546cfb56c846b16103a16344525, data reload: false

query1	1005	370	370	370
query2	6517	2129	2068	2068
query3	6704	217	227	217
query4	34233	23572	23355	23355
query5	4367	452	429	429
query6	268	169	171	169
query7	4634	307	313	307
query8	284	220	224	220
query9	9549	2696	2677	2677
query10	459	264	255	255
query11	18315	15397	15035	15035
query12	149	97	98	97
query13	1648	421	392	392
query14	9877	7101	6518	6518
query15	208	183	181	181
query16	7698	470	457	457
query17	1614	578	553	553
query18	1959	306	314	306
query19	209	157	157	157
query20	118	111	115	111
query21	63	47	47	47
query22	4383	4153	4097	4097
query23	34506	33634	33641	33634
query24	12155	2824	2883	2824
query25	689	376	380	376
query26	1842	166	161	161
query27	2971	293	288	288
query28	8127	2459	2473	2459
query29	1088	434	416	416
query30	337	175	162	162
query31	1013	772	797	772
query32	91	57	57	57
query33	773	285	265	265
query34	1022	477	504	477
query35	921	727	712	712
query36	1073	947	922	922
query37	256	70	68	68
query38	3920	3808	3786	3786
query39	1465	1410	1435	1410
query40	222	84	83	83
query41	52	49	50	49
query42	106	102	95	95
query43	540	511	506	506
query44	1276	772	784	772
query45	181	167	167	167
query46	1114	698	714	698
query47	1896	1814	1789	1789
query48	464	358	363	358
query49	1294	374	379	374
query50	797	394	402	394
query51	7242	7059	7056	7056
query52	100	94	92	92
query53	254	183	186	183
query54	1305	450	442	442
query55	75	74	77	74
query56	251	228	243	228
query57	1179	1093	1063	1063
query58	236	201	205	201
query59	3182	3137	2937	2937
query60	279	255	249	249
query61	113	107	111	107
query62	851	673	670	670
query63	210	180	182	180
query64	5326	629	654	629
query65	3313	3215	3184	3184
query66	1405	306	321	306
query67	15921	15287	15147	15147
query68	3986	552	567	552
query69	428	250	258	250
query70	1166	1123	1108	1108
query71	328	258	264	258
query72	6193	4028	4069	4028
query73	740	343	342	342
query74	10042	8901	9008	8901
query75	3355	2611	2647	2611
query76	2691	1017	1088	1017
query77	398	284	276	276
query78	10568	9698	9559	9559
query79	3439	605	586	586
query80	2069	445	445	445
query81	579	253	243	243
query82	826	126	115	115
query83	276	152	152	152
query84	296	90	87	87
query85	2181	305	285	285
query86	490	299	305	299
query87	4428	4316	4173	4173
query88	5064	2338	2441	2338
query89	422	285	293	285
query90	2042	181	180	180
query91	180	147	141	141
query92	66	53	52	52
query93	4944	542	527	527
query94	970	284	278	278
query95	341	249	243	243
query96	610	273	275	273
query97	3317	3120	3190	3120
query98	216	202	201	201
query99	1597	1288	1288	1288
Total cold run time: 309942 ms
Total hot run time: 189053 ms

@dataroaring dataroaring merged commit e3cae82 into branch-3.0 Dec 25, 2024
18 of 21 checks passed
@github-actions github-actions bot deleted the auto-pick-45643-branch-3.0 branch December 25, 2024 01:46