<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>plantegg</title>
<subtitle>java tcp mysql performance network docker Linux</subtitle>
<link href="/atom.xml" rel="self"/>
<link href="https://plantegg.github.io/"/>
<updated>2024-11-25T12:25:32.496Z</updated>
<id>https://plantegg.github.io/</id>
<author>
<name>twitter @plantegg</name>
</author>
<generator uri="http://hexo.io/">Hexo</generator>
<entry>
<title>关于本博</title>
<link href="https://plantegg.github.io/2117/06/07/%E5%85%B3%E4%BA%8E%E6%9C%AC%E5%8D%9A/"/>
<id>https://plantegg.github.io/2117/06/07/关于本博/</id>
<published>2117-06-07T10:30:03.000Z</published>
<updated>2024-11-25T12:25:32.496Z</updated>
<content type="html"><![CDATA[<h2 id="关于本博"><a href="#关于本博" class="headerlink" title="关于本博"></a>关于本博</h2><p>find me on twitter: <a href="https://twitter.com/plantegg" target="_blank" rel="noopener">@plantegg</a></p><p>Github: <a href="https://github.com/plantegg/programmer_case" target="_blank" rel="noopener">欢迎star</a> </p><p>知识星球:<a href="https://t.zsxq.com/0cSFEUh2J" target="_blank" rel="noopener">https://t.zsxq.com/0cSFEUh2J</a></p><p>关注基础知识,一次把问题搞清楚,从案例出发深挖相关知识。</p><p>以前觉得自己一看就懂,实际是一问就打鼓,一用就糊涂。所以现在开始记录并总结再联系案例,一般是先把零散知识记录下来(看到过),慢慢地相关知识积累更多,直到碰到实践案例或是有点领悟到于是发现这块知识可以整理成一篇系统些的文章(基本快懂了)。</p><p>“技术变化太快,容易过时”,我的看法是网络知识、操作系统、计算机原理等核心概念知识的寿命会比你的职业生涯还长。这些都是40岁之后还会还会很有用</p><p><a href="https://plantegg.github.io/2018/05/23/%E5%A6%82%E4%BD%95%E5%9C%A8%E5%B7%A5%E4%BD%9C%E4%B8%AD%E5%AD%A6%E4%B9%A0/">如何在工作中学习</a> 所有方法我都记录在这篇文章中了,希望对你能有所帮助。</p><p>所有新文章从<a href="https://plantegg.github.io/archives">这里可以看到</a>,即使再简单的一篇总结我可以持续总结三五年,有新的发现、感悟都是直接在原文上增减,不会发表新的文章。</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20220421102225491.png" alt="image-20220421102225491"></p><p>为什么写博客而不是公众号,我见过20年前的互联网,深度依赖搜索引擎,所以还是喜欢博客。另外技术类文章更适合电脑阅读(随时摘录、实验)</p><h2 id="精华文章推荐(2021年前)"><a href="#精华文章推荐(2021年前)" class="headerlink" title="精华文章推荐(2021年前)"></a>精华文章推荐(2021年前)</h2><h4 id="在2010年前后MySQL、PG、Oracle数据库在使用NUMA的时候碰到了性能问题,流传最广的这篇-MySQL-–-The-MySQL-“swap-insanity”-problem-and-the-effects-of-the-NUMA-architecture-http-blog-jcole-us-2010-09-28-mysql-swap-insanity-and-the-numa-architecture-文章描述了性能问题的原因-文章中把原因找错了-以及解决方案:关闭NUMA。-实际这个原因是kernel实现的一个低级bug,这个Bug在2014年修复了https-github-com-torvalds-linux-commit-4f9b16a64753d0bb607454347036dc997fd03b82,但是修复这么多年后仍然以讹传讹,这篇文章希望正本清源、扭转错误的认识。"><a href="#在2010年前后MySQL、PG、Oracle数据库在使用NUMA的时候碰到了性能问题,流传最广的这篇-MySQL-–-The-MySQL-“swap-insanity”-problem-and-the-effects-of-the-NUMA-architecture-http-blog-jcole-us-2010-09-28-mysql-swap-insanity-and-the-numa-architecture-文章描述了性能问题的原因-文章中把原因找错了-以及解决方案:关闭NUMA。-实际这个原因是kernel实现的一个低级bug,这个Bug在2014年修复了https-github-com-torvalds-linux-commit-4f9b16a64753d0bb607454347036dc997fd03b82,但是修复这么多年后仍然以讹传讹,这篇文章希望正本清源、扭转错误的认识。" class="headerlink" title="在2010年前后MySQL、PG、Oracle数据库在使用NUMA的时候碰到了性能问题,流传最广的这篇 MySQL – The MySQL “swap insanity” problem and the effects of the NUMA architecture http://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/ 文章描述了性能问题的原因(文章中把原因找错了)以及解决方案:关闭NUMA。 实际这个原因是kernel实现的一个低级bug,这个Bug在2014年修复了https://github.com/torvalds/linux/commit/4f9b16a64753d0bb607454347036dc997fd03b82,但是修复这么多年后仍然以讹传讹,这篇文章希望正本清源、扭转错误的认识。"></a><a href="https://plantegg.github.io/2021/05/14/%E5%8D%81%E5%B9%B4%E5%90%8E%E6%95%B0%E6%8D%AE%E5%BA%93%E8%BF%98%E6%98%AF%E4%B8%8D%E6%95%A2%E6%8B%A5%E6%8A%B1NUMA/">在2010年前后MySQL、PG、Oracle数据库在使用NUMA的时候碰到了性能问题,流传最广的这篇 MySQL – The MySQL “swap insanity” problem and the effects of the NUMA architecture http://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/ 文章描述了性能问题的原因(文章中把原因找错了)以及解决方案:关闭NUMA。 实际这个原因是kernel实现的一个低级bug,这个Bug在2014年修复了https://github.com/torvalds/linux/commit/4f9b16a64753d0bb607454347036dc997fd03b82,但是修复这么多年后仍然以讹传讹,这篇文章希望正本清源、扭转错误的认识。</a></h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20210517082233798.png" alt="image-20210517082233798"></p><h4 id="CPU的制造和概念-从最底层的沙子开始用8篇文章来回答关于CPU的各种疑问以及大量的实验对比案例和测试数据来展示了CPU的各种原理,比如多核、超线程、NUMA、睿频、功耗、GPU、大小核再到分支预测、cache-line失效、加锁代价、IPC等各种指标(都有对应的代码和测试数据)。"><a 
href="#CPU的制造和概念-从最底层的沙子开始用8篇文章来回答关于CPU的各种疑问以及大量的实验对比案例和测试数据来展示了CPU的各种原理,比如多核、超线程、NUMA、睿频、功耗、GPU、大小核再到分支预测、cache-line失效、加锁代价、IPC等各种指标(都有对应的代码和测试数据)。" class="headerlink" title="CPU的制造和概念 从最底层的沙子开始用8篇文章来回答关于CPU的各种疑问以及大量的实验对比案例和测试数据来展示了CPU的各种原理,比如多核、超线程、NUMA、睿频、功耗、GPU、大小核再到分支预测、cache_line失效、加锁代价、IPC等各种指标(都有对应的代码和测试数据)。"></a><a href="https://plantegg.github.io/2021/06/01/CPU%E7%9A%84%E5%88%B6%E9%80%A0%E5%92%8C%E6%A6%82%E5%BF%B5/">CPU的制造和概念</a> 从最底层的沙子开始用8篇文章来回答关于CPU的各种疑问以及大量的实验对比案例和测试数据来展示了CPU的各种原理,比如多核、超线程、NUMA、睿频、功耗、GPU、大小核再到分支预测、cache_line失效、加锁代价、IPC等各种指标(都有对应的代码和测试数据)。</h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20210802161410524-1011377.png" alt="image-20210802161410524"> </p><h4 id="《Intel-PAUSE指令变化是如何影响自旋锁以及MySQL的性能的》-从一个参数引起的rt抖动定位到OS锁等待再到CPU-Pause指令,以及不同CPU型号对Pause使用cycles不同的影响,最终反馈到应用层面的rt全过程。在MySQL内核开发的时候考虑了Pause,但是没有考虑不同的CPU型号,所以换了CPU型号后性能差异比较大"><a href="#《Intel-PAUSE指令变化是如何影响自旋锁以及MySQL的性能的》-从一个参数引起的rt抖动定位到OS锁等待再到CPU-Pause指令,以及不同CPU型号对Pause使用cycles不同的影响,最终反馈到应用层面的rt全过程。在MySQL内核开发的时候考虑了Pause,但是没有考虑不同的CPU型号,所以换了CPU型号后性能差异比较大" class="headerlink" title="《Intel PAUSE指令变化是如何影响自旋锁以及MySQL的性能的》 从一个参数引起的rt抖动定位到OS锁等待再到CPU Pause指令,以及不同CPU型号对Pause使用cycles不同的影响,最终反馈到应用层面的rt全过程。在MySQL内核开发的时候考虑了Pause,但是没有考虑不同的CPU型号,所以换了CPU型号后性能差异比较大"></a><a href="https://plantegg.github.io/2019/12/16/Intel%20PAUSE%E6%8C%87%E4%BB%A4%E5%8F%98%E5%8C%96%E6%98%AF%E5%A6%82%E4%BD%95%E5%BD%B1%E5%93%8D%E8%87%AA%E6%97%8B%E9%94%81%E4%BB%A5%E5%8F%8AMySQL%E7%9A%84%E6%80%A7%E8%83%BD%E7%9A%84/">《Intel PAUSE指令变化是如何影响自旋锁以及MySQL的性能的》 从一个参数引起的rt抖动定位到OS锁等待再到CPU Pause指令,以及不同CPU型号对Pause使用cycles不同的影响,最终反馈到应用层面的rt全过程。在MySQL内核开发的时候考虑了Pause,但是没有考虑不同的CPU型号,所以换了CPU型号后性能差异比较大</a></h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/oss/d567449fe52725a9d0b9d4ec9baa372c.png" alt="image.png"></p><h4 id="10倍性能提升全过程-在双11的紧张流程下,将系统tps从500优化到5500,从网络到snat、再到Spring和StackTrace,一次全栈性能优化过程的详细记录和分析。"><a href="#10倍性能提升全过程-在双11的紧张流程下,将系统tps从500优化到5500,从网络到snat、再到Spring和StackTrace,一次全栈性能优化过程的详细记录和分析。" class="headerlink" title="10倍性能提升全过程 在双11的紧张流程下,将系统tps从500优化到5500,从网络到snat、再到Spring和StackTrace,一次全栈性能优化过程的详细记录和分析。"></a><a href="https://plantegg.github.io/2018/01/23/10+%E5%80%8D%E6%80%A7%E8%83%BD%E6%8F%90%E5%8D%87%E5%85%A8%E8%BF%87%E7%A8%8B/">10倍性能提升全过程</a> 在双11的紧张流程下,将系统tps从500优化到5500,从网络到snat、再到Spring和StackTrace,一次全栈性能优化过程的详细记录和分析。</h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/oss/05703c168e63e96821ea9f921d83712b.png" alt="image.png"></p><h4 id="就是要你懂TCP–半连接队列和全连接队列:偶发性的连接reset异常、重启服务后短时间的连接异常,通过一篇文章阐明TCP连接的半连接队列和全连接队大小是怎么影响连接创建的,以及用什么工具来观察队列有没有溢出、连接为什么会RESET"><a href="#就是要你懂TCP–半连接队列和全连接队列:偶发性的连接reset异常、重启服务后短时间的连接异常,通过一篇文章阐明TCP连接的半连接队列和全连接队大小是怎么影响连接创建的,以及用什么工具来观察队列有没有溢出、连接为什么会RESET" class="headerlink" title="就是要你懂TCP–半连接队列和全连接队列:偶发性的连接reset异常、重启服务后短时间的连接异常,通过一篇文章阐明TCP连接的半连接队列和全连接队大小是怎么影响连接创建的,以及用什么工具来观察队列有没有溢出、连接为什么会RESET"></a><a href="https://plantegg.github.io/2017/06/07/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E5%8D%8A%E8%BF%9E%E6%8E%A5%E9%98%9F%E5%88%97%E5%92%8C%E5%85%A8%E8%BF%9E%E6%8E%A5%E9%98%9F%E5%88%97/">就是要你懂TCP–半连接队列和全连接队列:偶发性的连接reset异常、重启服务后短时间的连接异常,通过一篇文章阐明TCP连接的半连接队列和全连接队大小是怎么影响连接创建的,以及用什么工具来观察队列有没有溢出、连接为什么会RESET</a></h4><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/oss/1579241362064-807d8378-6c54-4a2c-a888-ff2337df817c.png" alt="image.png" style="zoom:80%;"><h4 id="就是要你懂TCP–性能和发送接收Buffer的关系:发送窗口大小-Buffer-、接收窗口大小-Buffer-对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响。BDP、RT、带宽对传输速度又是怎么影响的"><a 
href="#就是要你懂TCP–性能和发送接收Buffer的关系:发送窗口大小-Buffer-、接收窗口大小-Buffer-对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响。BDP、RT、带宽对传输速度又是怎么影响的" class="headerlink" title="就是要你懂TCP–性能和发送接收Buffer的关系:发送窗口大小(Buffer)、接收窗口大小(Buffer)对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响。BDP、RT、带宽对传输速度又是怎么影响的"></a><a href="https://plantegg.github.io/2019/09/28/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E6%80%A7%E8%83%BD%E5%92%8C%E5%8F%91%E9%80%81%E6%8E%A5%E6%94%B6Buffer%E7%9A%84%E5%85%B3%E7%B3%BB/">就是要你懂TCP–性能和发送接收Buffer的关系:发送窗口大小(Buffer)、接收窗口大小(Buffer)对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响。BDP、RT、带宽对传输速度又是怎么影响的</a></h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/oss/e177d59ecb886daef5905ed80a84dfd2.png"></p><h4 id="就是要你懂网络–一个网络包的旅程:教科书式地阐述书本中的路由、网关、子网、Mac地址、IP地址是如何一起协作让网络包最终传输到目标机器上。-同时可以跟讲这块的RFC1180比较一下,RFC1180-写的确实很好,清晰简洁,图文并茂,结构逻辑合理,但是对于90-的程序员没有什么卵用,看完几周后就忘得差不多,因为他不是从实践的角度来阐述问题,中间没有很多为什么,所以一般资质的程序员看完当时感觉很好,实际还是不会灵活运用"><a href="#就是要你懂网络–一个网络包的旅程:教科书式地阐述书本中的路由、网关、子网、Mac地址、IP地址是如何一起协作让网络包最终传输到目标机器上。-同时可以跟讲这块的RFC1180比较一下,RFC1180-写的确实很好,清晰简洁,图文并茂,结构逻辑合理,但是对于90-的程序员没有什么卵用,看完几周后就忘得差不多,因为他不是从实践的角度来阐述问题,中间没有很多为什么,所以一般资质的程序员看完当时感觉很好,实际还是不会灵活运用" class="headerlink" title="就是要你懂网络–一个网络包的旅程:教科书式地阐述书本中的路由、网关、子网、Mac地址、IP地址是如何一起协作让网络包最终传输到目标机器上。 同时可以跟讲这块的RFC1180比较一下,RFC1180 写的确实很好,清晰简洁,图文并茂,结构逻辑合理,但是对于90%的程序员没有什么卵用,看完几周后就忘得差不多,因为他不是从实践的角度来阐述问题,中间没有很多为什么,所以一般资质的程序员看完当时感觉很好,实际还是不会灵活运用"></a><a href="https://plantegg.github.io/2019/05/15/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E7%BD%91%E7%BB%9C--%E4%B8%80%E4%B8%AA%E7%BD%91%E7%BB%9C%E5%8C%85%E7%9A%84%E6%97%85%E7%A8%8B/">就是要你懂网络–一个网络包的旅程:教科书式地阐述书本中的路由、网关、子网、Mac地址、IP地址是如何一起协作让网络包最终传输到目标机器上。</a> 同时可以跟讲这块的<a href="https://tools.ietf.org/html/rfc1180" target="_blank" rel="noopener">RFC1180</a>比较一下,RFC1180 写的确实很好,清晰简洁,图文并茂,结构逻辑合理,但是对于90%的程序员没有什么卵用,看完几周后就忘得差不多,因为他不是从实践的角度来阐述问题,中间没有很多为什么,所以一般资质的程序员看完当时感觉很好,实际还是不会灵活运用</h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/oss/8f5d8518c1d92ed68d23218028e3cd11.png"></p><h4 id="国产CPU和Intel、AMD性能PK-从Intel、AMD、海光、鲲鹏920、飞腾2500-等CPU在TPCC、sysbench下的性能对比来分析他们的性能差距,同时分析内存延迟对性能的影响"><a href="#国产CPU和Intel、AMD性能PK-从Intel、AMD、海光、鲲鹏920、飞腾2500-等CPU在TPCC、sysbench下的性能对比来分析他们的性能差距,同时分析内存延迟对性能的影响" class="headerlink" title="国产CPU和Intel、AMD性能PK 从Intel、AMD、海光、鲲鹏920、飞腾2500 等CPU在TPCC、sysbench下的性能对比来分析他们的性能差距,同时分析内存延迟对性能的影响"></a><a href="https://plantegg.github.io/2022/01/13/%E4%B8%8D%E5%90%8CCPU%E6%80%A7%E8%83%BD%E5%A4%A7PK/">国产CPU和Intel、AMD性能PK</a> 从Intel、AMD、海光、鲲鹏920、飞腾2500 等CPU在TPCC、sysbench下的性能对比来分析他们的性能差距,同时分析内存延迟对性能的影响</h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20220319115644219.png" alt="image-20220319115644219"></p><h4 id="从网络路由连通性的原理上来看负载均衡lvs的DR、NAT、FullNAT到底搞了些什么鬼,以及为什么要这么搞,和带来的优缺点:《就是要你懂负载均衡–lvs和转发模式》"><a href="#从网络路由连通性的原理上来看负载均衡lvs的DR、NAT、FullNAT到底搞了些什么鬼,以及为什么要这么搞,和带来的优缺点:《就是要你懂负载均衡–lvs和转发模式》" class="headerlink" title="从网络路由连通性的原理上来看负载均衡lvs的DR、NAT、FullNAT到底搞了些什么鬼,以及为什么要这么搞,和带来的优缺点:《就是要你懂负载均衡–lvs和转发模式》"></a><a href="/2019/06/20/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1--lvs%E5%92%8C%E8%BD%AC%E5%8F%91%E6%A8%A1%E5%BC%8F/">从网络路由连通性的原理上来看负载均衡lvs的DR、NAT、FullNAT到底搞了些什么鬼,以及为什么要这么搞,和带来的优缺点:《就是要你懂负载均衡–lvs和转发模式》</a></h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/oss/94d55b926b5bb1573c4cab8353428712.png"></p><h4 id="LVS-20倍的负载不均衡,原来是内核的这个Bug,这个内核bug现在还在,可以稳定重现,有兴趣的话去重现一下,然后对照源代码以及抓包分析一下就清楚了。"><a href="#LVS-20倍的负载不均衡,原来是内核的这个Bug,这个内核bug现在还在,可以稳定重现,有兴趣的话去重现一下,然后对照源代码以及抓包分析一下就清楚了。" class="headerlink" title="LVS 
20倍的负载不均衡,原来是内核的这个Bug,这个内核bug现在还在,可以稳定重现,有兴趣的话去重现一下,然后对照源代码以及抓包分析一下就清楚了。"></a><a href="/2019/07/19/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1--%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1%E8%B0%83%E5%BA%A6%E7%AE%97%E6%B3%95%E5%92%8C%E4%B8%BA%E4%BB%80%E4%B9%88%E4%B8%8D%E5%9D%87%E8%A1%A1/">LVS 20倍的负载不均衡,原来是内核的这个Bug</a>,这个内核bug现在还在,可以稳定重现,有兴趣的话去重现一下,然后对照源代码以及抓包分析一下就清楚了。</h4><h4 id="就是要你懂TCP–握手和挥手,不是你想象中三次握手、四次挥手就理解了TCP,本文从握手的本质–握手都做了什么事情、连接的本质是什么等来阐述握手、挥手的原理"><a href="#就是要你懂TCP–握手和挥手,不是你想象中三次握手、四次挥手就理解了TCP,本文从握手的本质–握手都做了什么事情、连接的本质是什么等来阐述握手、挥手的原理" class="headerlink" title="就是要你懂TCP–握手和挥手,不是你想象中三次握手、四次挥手就理解了TCP,本文从握手的本质–握手都做了什么事情、连接的本质是什么等来阐述握手、挥手的原理"></a><a href="/2017/06/02/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E8%BF%9E%E6%8E%A5%E5%92%8C%E6%8F%A1%E6%89%8B/">就是要你懂TCP–握手和挥手,不是你想象中三次握手、四次挥手就理解了TCP,本文从握手的本质–握手都做了什么事情、连接的本质是什么等来阐述握手、挥手的原理</a></h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/oss/6d66dadecb72e11e3e5ab765c6c3ea2e.png"></p><h4 id="nslookup-OK-but-ping-fail–看看老司机是如何解决问题的,解决问题的方法肯定比知识点重要多了,同时透过一个问题怎么样通篇来理解一大块知识,让这块原理真正在你的只是提示中扎根下来"><a href="#nslookup-OK-but-ping-fail–看看老司机是如何解决问题的,解决问题的方法肯定比知识点重要多了,同时透过一个问题怎么样通篇来理解一大块知识,让这块原理真正在你的只是提示中扎根下来" class="headerlink" title="nslookup OK but ping fail–看看老司机是如何解决问题的,解决问题的方法肯定比知识点重要多了,同时透过一个问题怎么样通篇来理解一大块知识,让这块原理真正在你的只是提示中扎根下来"></a><a href="/2019/01/09/nslookup-OK-but-ping-fail/">nslookup OK but ping fail–看看老司机是如何解决问题的,解决问题的方法肯定比知识点重要多了,同时透过一个问题怎么样通篇来理解一大块知识,让这块原理真正在你的只是提示中扎根下来</a></h4><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/oss/ca466bb6430f1149958ceb41b9ffe591.png"></p><h4 id="如何在工作中学习-一篇很土但是很务实可以复制的方法论文章。不要讲举一反三、触类旁通,谁都知道要举一反三、触类旁通,但是为什么我总是不能够举一反三、触类旁通?"><a href="#如何在工作中学习-一篇很土但是很务实可以复制的方法论文章。不要讲举一反三、触类旁通,谁都知道要举一反三、触类旁通,但是为什么我总是不能够举一反三、触类旁通?" 
class="headerlink" title="如何在工作中学习 一篇很土但是很务实可以复制的方法论文章。不要讲举一反三、触类旁通,谁都知道要举一反三、触类旁通,但是为什么我总是不能够举一反三、触类旁通?"></a><a href="/2018/05/23/%E5%A6%82%E4%BD%95%E5%9C%A8%E5%B7%A5%E4%BD%9C%E4%B8%AD%E5%AD%A6%E4%B9%A0/">如何在工作中学习</a> 一篇很土但是很务实可以复制的方法论文章。不要讲举一反三、触类旁通,谁都知道要举一反三、触类旁通,但是为什么我总是不能够举一反三、触类旁通?</h4><h4 id="举三反一–从理论知识到实际问题的推导-坚决不让思路跑偏,如何从一个理论知识点推断可能的问题"><a href="#举三反一–从理论知识到实际问题的推导-坚决不让思路跑偏,如何从一个理论知识点推断可能的问题" class="headerlink" title="举三反一–从理论知识到实际问题的推导 坚决不让思路跑偏,如何从一个理论知识点推断可能的问题"></a><a href="/2020/11/02/%E4%B8%BE%E4%B8%89%E5%8F%8D%E4%B8%80--%E4%BB%8E%E7%90%86%E8%AE%BA%E7%9F%A5%E8%AF%86%E5%88%B0%E5%AE%9E%E9%99%85%E9%97%AE%E9%A2%98%E7%9A%84%E6%8E%A8%E5%AF%BC/">举三反一–从理论知识到实际问题的推导</a> 坚决不让思路跑偏,如何从一个理论知识点推断可能的问题</h4><h2 id="性能相关(2015-2018年)"><a href="#性能相关(2015-2018年)" class="headerlink" title="性能相关(2015-2018年)"></a>性能相关(2015-2018年)</h2><h4 id="就是要你懂TCP–半连接队列和全连接队列-偶发性的连接reset异常、重启服务后短时间的连接异常"><a href="#就是要你懂TCP–半连接队列和全连接队列-偶发性的连接reset异常、重启服务后短时间的连接异常" class="headerlink" title="就是要你懂TCP–半连接队列和全连接队列 偶发性的连接reset异常、重启服务后短时间的连接异常"></a><a href="/2017/06/07/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E5%8D%8A%E8%BF%9E%E6%8E%A5%E9%98%9F%E5%88%97%E5%92%8C%E5%85%A8%E8%BF%9E%E6%8E%A5%E9%98%9F%E5%88%97/">就是要你懂TCP–半连接队列和全连接队列</a> 偶发性的连接reset异常、重启服务后短时间的连接异常</h4><h4 id="就是要你懂TCP–性能和发送接收Buffer的关系:发送窗口大小-Buffer-、接收窗口大小-Buffer-对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响。BDP、RT、带宽对传输速度又是怎么影响的-发送窗口大小-Buffer-、接收窗口大小-Buffer-对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响"><a href="#就是要你懂TCP–性能和发送接收Buffer的关系:发送窗口大小-Buffer-、接收窗口大小-Buffer-对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响。BDP、RT、带宽对传输速度又是怎么影响的-发送窗口大小-Buffer-、接收窗口大小-Buffer-对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响" class="headerlink" title="就是要你懂TCP–性能和发送接收Buffer的关系:发送窗口大小(Buffer)、接收窗口大小(Buffer)对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响。BDP、RT、带宽对传输速度又是怎么影响的 发送窗口大小(Buffer)、接收窗口大小(Buffer)对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响"></a><a href="/2019/09/28/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E6%80%A7%E8%83%BD%E5%92%8C%E5%8F%91%E9%80%81%E6%8E%A5%E6%94%B6Buffer%E7%9A%84%E5%85%B3%E7%B3%BB/">就是要你懂TCP–性能和发送接收Buffer的关系:发送窗口大小(Buffer)、接收窗口大小(Buffer)对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响。BDP、RT、带宽对传输速度又是怎么影响的</a> 发送窗口大小(Buffer)、接收窗口大小(Buffer)对TCP传输速度的影响,以及怎么观察窗口对传输速度的影响</h4><h4 id="就是要你懂TCP–性能优化大全"><a href="#就是要你懂TCP–性能优化大全" class="headerlink" title="就是要你懂TCP–性能优化大全"></a><a href="/2019/06/21/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E6%80%A7%E8%83%BD%E4%BC%98%E5%8C%96%E5%A4%A7%E5%85%A8/">就是要你懂TCP–性能优化大全</a></h4><h4 id="就是要你懂TCP–TCP性能问题-Nagle算法和delay-ack"><a href="#就是要你懂TCP–TCP性能问题-Nagle算法和delay-ack" class="headerlink" title="就是要你懂TCP–TCP性能问题 Nagle算法和delay ack"></a><a href="/2018/06/14/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E6%9C%80%E7%BB%8F%E5%85%B8%E7%9A%84TCP%E6%80%A7%E8%83%BD%E9%97%AE%E9%A2%98/">就是要你懂TCP–TCP性能问题</a> Nagle算法和delay ack</h4><h4 id="10倍性能提升全过程-在双11的紧张流程下,将系统tps从500优化到5500,从网络到snat、再到Spring和StackTrace,看看一个性能全栈工程师如何在各种工具加持下发现各种问题的。"><a href="#10倍性能提升全过程-在双11的紧张流程下,将系统tps从500优化到5500,从网络到snat、再到Spring和StackTrace,看看一个性能全栈工程师如何在各种工具加持下发现各种问题的。" class="headerlink" title="10倍性能提升全过程 在双11的紧张流程下,将系统tps从500优化到5500,从网络到snat、再到Spring和StackTrace,看看一个性能全栈工程师如何在各种工具加持下发现各种问题的。"></a><a href="/2018/01/23/10+%E5%80%8D%E6%80%A7%E8%83%BD%E6%8F%90%E5%8D%87%E5%85%A8%E8%BF%87%E7%A8%8B/">10倍性能提升全过程</a> 在双11的紧张流程下,将系统tps从500优化到5500,从网络到snat、再到Spring和StackTrace,看看一个性能全栈工程师如何在各种工具加持下发现各种问题的。</h4><h2 id="CPU系列文章(2021年完成)"><a href="#CPU系列文章(2021年完成)" class="headerlink" title="CPU系列文章(2021年完成)"></a>CPU系列文章(2021年完成)</h2><h4 id="CPU的制造和概念"><a href="#CPU的制造和概念" class="headerlink" title="CPU的制造和概念"></a><a 
href="/2021/06/01/CPU%E7%9A%84%E5%88%B6%E9%80%A0%E5%92%8C%E6%A6%82%E5%BF%B5/">CPU的制造和概念</a></h4><h4 id="十年后数据库还是不敢拥抱NUMA?"><a href="#十年后数据库还是不敢拥抱NUMA?" class="headerlink" title="十年后数据库还是不敢拥抱NUMA?"></a><a href="/2021/05/14/%E5%8D%81%E5%B9%B4%E5%90%8E%E6%95%B0%E6%8D%AE%E5%BA%93%E8%BF%98%E6%98%AF%E4%B8%8D%E6%95%A2%E6%8B%A5%E6%8A%B1NUMA/">十年后数据库还是不敢拥抱NUMA?</a></h4><h4 id="Intel-PAUSE指令变化是如何影响自旋锁以及MySQL的性能的-x2F-2019-x2F-12-x2F-16-x2F-Intel-PAUSE指令变化是如何影响自旋锁以及MySQL的性能的-x2F"><a href="#Intel-PAUSE指令变化是如何影响自旋锁以及MySQL的性能的-x2F-2019-x2F-12-x2F-16-x2F-Intel-PAUSE指令变化是如何影响自旋锁以及MySQL的性能的-x2F" class="headerlink" title="[Intel PAUSE指令变化是如何影响自旋锁以及MySQL的性能的](/2019/12/16/Intel PAUSE指令变化是如何影响自旋锁以及MySQL的性能的/)"></a>[Intel PAUSE指令变化是如何影响自旋锁以及MySQL的性能的](/2019/12/16/Intel PAUSE指令变化是如何影响自旋锁以及MySQL的性能的/)</h4><h4 id="Perf-IPC以及CPU性能-x2F-2021-x2F-05-x2F-16-x2F-Perf-IPC以及CPU利用率-x2F"><a href="#Perf-IPC以及CPU性能-x2F-2021-x2F-05-x2F-16-x2F-Perf-IPC以及CPU利用率-x2F" class="headerlink" title="[Perf IPC以及CPU性能](/2021/05/16/Perf IPC以及CPU利用率/)"></a>[Perf IPC以及CPU性能](/2021/05/16/Perf IPC以及CPU利用率/)</h4><h4 id="CPU性能和CACHE"><a href="#CPU性能和CACHE" class="headerlink" title="CPU性能和CACHE"></a><a href="https://plantegg.github.io/2021/07/19/CPU%E6%80%A7%E8%83%BD%E5%92%8CCACHE/">CPU性能和CACHE</a></h4><h4 id="CPU-性能和Cache-Line-x2F-2021-x2F-05-x2F-16-x2F-CPU-Cache-Line-和性能-x2F"><a href="#CPU-性能和Cache-Line-x2F-2021-x2F-05-x2F-16-x2F-CPU-Cache-Line-和性能-x2F" class="headerlink" title="[CPU 性能和Cache Line](/2021/05/16/CPU Cache Line 和性能/)"></a>[CPU 性能和Cache Line](/2021/05/16/CPU Cache Line 和性能/)</h4><h4 id="AMD-Zen-CPU-架构-以及-AMD、海光、Intel、鲲鹏的性能对比"><a href="#AMD-Zen-CPU-架构-以及-AMD、海光、Intel、鲲鹏的性能对比" class="headerlink" title="AMD Zen CPU 架构 以及 AMD、海光、Intel、鲲鹏的性能对比"></a><a href="/2021/08/13/AMD_Zen_CPU%E6%9E%B6%E6%9E%84/">AMD Zen CPU 架构 以及 AMD、海光、Intel、鲲鹏的性能对比</a></h4><h4 id="Intel、海光、鲲鹏920、飞腾2500-CPU性能对比"><a href="#Intel、海光、鲲鹏920、飞腾2500-CPU性能对比" class="headerlink" title="Intel、海光、鲲鹏920、飞腾2500 CPU性能对比"></a><a href="/2021/06/18/%E5%87%A0%E6%AC%BECPU%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94/">Intel、海光、鲲鹏920、飞腾2500 CPU性能对比</a></h4><h2 id="网络相关基础知识(2017年完成)"><a href="#网络相关基础知识(2017年完成)" class="headerlink" title="网络相关基础知识(2017年完成)"></a>网络相关基础知识(2017年完成)</h2><h4 id="就是要你懂网络–一个网络包的旅程"><a href="#就是要你懂网络–一个网络包的旅程" class="headerlink" title="就是要你懂网络–一个网络包的旅程"></a><a href="/2019/05/15/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E7%BD%91%E7%BB%9C--%E4%B8%80%E4%B8%AA%E7%BD%91%E7%BB%9C%E5%8C%85%E7%9A%84%E6%97%85%E7%A8%8B/">就是要你懂网络–一个网络包的旅程</a></h4><h4 id="通过案例来理解MSS、MTU等相关TCP概念"><a href="#通过案例来理解MSS、MTU等相关TCP概念" class="headerlink" title="通过案例来理解MSS、MTU等相关TCP概念"></a><a href="/2018/05/07/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E9%80%9A%E8%BF%87%E6%A1%88%E4%BE%8B%E6%9D%A5%E5%AD%A6%E4%B9%A0MSS%E3%80%81MTU/">通过案例来理解MSS、MTU等相关TCP概念</a></h4><h4 id="就是要你懂TCP–握手和挥手"><a href="#就是要你懂TCP–握手和挥手" class="headerlink" title="就是要你懂TCP–握手和挥手"></a><a href="/2017/06/02/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--%E8%BF%9E%E6%8E%A5%E5%92%8C%E6%8F%A1%E6%89%8B/">就是要你懂TCP–握手和挥手</a></h4><h4 id="wireshark-dup-ack-issue-and-keepalive"><a href="#wireshark-dup-ack-issue-and-keepalive" class="headerlink" title="wireshark-dup-ack-issue and keepalive"></a><a href="/2017/06/02/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82TCP--wireshark-dup-ack-issue/">wireshark-dup-ack-issue and keepalive</a></h4><h4 id="一个没有遵守tcp规则导致的问题"><a href="#一个没有遵守tcp规则导致的问题" class="headerlink" title="一个没有遵守tcp规则导致的问题"></a><a 
href="/2018/11/26/%E4%B8%80%E4%B8%AA%E6%B2%A1%E6%9C%89%E9%81%B5%E5%AE%88tcp%E8%A7%84%E5%88%99%E5%AF%BC%E8%87%B4%E7%9A%84%E9%97%AE%E9%A2%98/">一个没有遵守tcp规则导致的问题</a></h4><h4 id="kubernetes-service-和-kube-proxy详解-x2F-2020-x2F-09-x2F-22-x2F-kubernetes-service-和-kube-proxy详解-x2F"><a href="#kubernetes-service-和-kube-proxy详解-x2F-2020-x2F-09-x2F-22-x2F-kubernetes-service-和-kube-proxy详解-x2F" class="headerlink" title="[kubernetes service 和 kube-proxy详解](/2020/09/22/kubernetes service 和 kube-proxy详解/)"></a>[kubernetes service 和 kube-proxy详解](/2020/09/22/kubernetes service 和 kube-proxy详解/)</h4><h2 id="DNS相关"><a href="#DNS相关" class="headerlink" title="DNS相关"></a>DNS相关</h2><h4 id="就是要你懂DNS–一文搞懂域名解析相关问题"><a href="#就是要你懂DNS–一文搞懂域名解析相关问题" class="headerlink" title="就是要你懂DNS–一文搞懂域名解析相关问题"></a><a href="/2019/06/09/%E4%B8%80%E6%96%87%E6%90%9E%E6%87%82%E5%9F%9F%E5%90%8D%E8%A7%A3%E6%9E%90%E7%9B%B8%E5%85%B3%E9%97%AE%E9%A2%98/">就是要你懂DNS–一文搞懂域名解析相关问题</a></h4><h4 id="nslookup-OK-but-ping-fail"><a href="#nslookup-OK-but-ping-fail" class="headerlink" title="nslookup OK but ping fail"></a><a href="/2019/01/09/nslookup-OK-but-ping-fail/">nslookup OK but ping fail</a></h4><h4 id="Docker中的DNS解析过程"><a href="#Docker中的DNS解析过程" class="headerlink" title="Docker中的DNS解析过程"></a><a href="/2019/01/12/Docker%E4%B8%AD%E7%9A%84DNS%E8%A7%A3%E6%9E%90%E8%BF%87%E7%A8%8B/">Docker中的DNS解析过程</a></h4><h4 id="windows7的wifi总是报DNS域名异常无法上网"><a href="#windows7的wifi总是报DNS域名异常无法上网" class="headerlink" title="windows7的wifi总是报DNS域名异常无法上网"></a><a href="/2019/01/10/windows7%E7%9A%84wifi%E6%80%BB%E6%98%AF%E6%8A%A5DNS%E5%9F%9F%E5%90%8D%E5%BC%82%E5%B8%B8%E6%97%A0%E6%B3%95%E4%B8%8A%E7%BD%91/">windows7的wifi总是报DNS域名异常无法上网</a></h4><h2 id="LVS-负载均衡"><a href="#LVS-负载均衡" class="headerlink" title="LVS 负载均衡"></a>LVS 负载均衡</h2><h4 id="就是要你懂负载均衡–lvs和转发模式"><a href="#就是要你懂负载均衡–lvs和转发模式" class="headerlink" title="就是要你懂负载均衡–lvs和转发模式"></a><a href="/2019/06/20/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1--lvs%E5%92%8C%E8%BD%AC%E5%8F%91%E6%A8%A1%E5%BC%8F/">就是要你懂负载均衡–lvs和转发模式</a></h4><h4 id="就是要你懂负载均衡–负载均衡调度算法和为什么不均衡"><a href="#就是要你懂负载均衡–负载均衡调度算法和为什么不均衡" class="headerlink" title="就是要你懂负载均衡–负载均衡调度算法和为什么不均衡"></a><a href="/2019/07/19/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1--%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1%E8%B0%83%E5%BA%A6%E7%AE%97%E6%B3%95%E5%92%8C%E4%B8%BA%E4%BB%80%E4%B9%88%E4%B8%8D%E5%9D%87%E8%A1%A1/">就是要你懂负载均衡–负载均衡调度算法和为什么不均衡</a></h4><h2 id="网络工具"><a href="#网络工具" class="headerlink" title="网络工具"></a>网络工具</h2><h4 id="就是要你懂Unix-Socket-进行抓包解析"><a href="#就是要你懂Unix-Socket-进行抓包解析" class="headerlink" title="就是要你懂Unix Socket 进行抓包解析"></a><a href="/2018/01/01/%E9%80%9A%E8%BF%87tcpdump%E5%AF%B9Unix%20Socket%20%E8%BF%9B%E8%A1%8C%E6%8A%93%E5%8C%85%E8%A7%A3%E6%9E%90/">就是要你懂Unix Socket 进行抓包解析</a></h4><h4 id="就是要你懂网络监控–ss用法大全"><a href="#就是要你懂网络监控–ss用法大全" class="headerlink" title="就是要你懂网络监控–ss用法大全"></a><a href="/2016/10/12/ss%E7%94%A8%E6%B3%95%E5%A4%A7%E5%85%A8/">就是要你懂网络监控–ss用法大全</a></h4><h4 id="就是要你懂抓包–WireShark之命令行版tshark"><a href="#就是要你懂抓包–WireShark之命令行版tshark" class="headerlink" title="就是要你懂抓包–WireShark之命令行版tshark"></a><a href="/2019/06/21/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E6%8A%93%E5%8C%85--WireShark%E4%B9%8B%E5%91%BD%E4%BB%A4%E8%A1%8C%E7%89%88tshark/">就是要你懂抓包–WireShark之命令行版tshark</a></h4><h4 id="netstat-timer-keepalive-explain"><a href="#netstat-timer-keepalive-explain" class="headerlink" title="netstat timer keepalive explain"></a><a 
href="/2017/08/28/netstat%20%E7%AD%89%E7%BD%91%E7%BB%9C%E5%B7%A5%E5%85%B7/">netstat timer keepalive explain</a></h4><h4 id="Git-HTTP-Proxy-and-SSH-Proxy"><a href="#Git-HTTP-Proxy-and-SSH-Proxy" class="headerlink" title="Git HTTP Proxy and SSH Proxy"></a><a href="/2018/03/14/%E5%A6%82%E4%BD%95%E8%AE%BE%E7%BD%AEgit%20Proxy/">Git HTTP Proxy and SSH Proxy</a></h4>]]></content>
<summary type="html">
<h2 id="关于本博"><a href="#关于本博" class="headerlink" title="关于本博"></a>关于本博</h2><p>find me on twitter: <a href="https://twitter.com/plantegg" tar
</summary>
<category term="others" scheme="https://plantegg.github.io/categories/others/"/>
<category term="performance" scheme="https://plantegg.github.io/tags/performance/"/>
<category term="LVS" scheme="https://plantegg.github.io/tags/LVS/"/>
<category term="network" scheme="https://plantegg.github.io/tags/network/"/>
<category term="tcpdump" scheme="https://plantegg.github.io/tags/tcpdump/"/>
<category term="TCP queue" scheme="https://plantegg.github.io/tags/TCP-queue/"/>
</entry>
<entry>
<title>为什么你的 SYN 包被丢 net.ipv4.tcp_tw_recycle</title>
<link href="https://plantegg.github.io/2024/12/29/net.ipv4.tcp_tw_recycle/"/>
<id>https://plantegg.github.io/2024/12/29/net.ipv4.tcp_tw_recycle/</id>
<published>2024-12-29T09:30:03.000Z</published>
<updated>2024-12-30T02:31:18.776Z</updated>
<content type="html"><![CDATA[<h1 id="为什么你的-SYN-包被丢-net-ipv4-tcp-tw-recycle"><a href="#为什么你的-SYN-包被丢-net-ipv4-tcp-tw-recycle" class="headerlink" title="为什么你的 SYN 包被丢 net.ipv4.tcp_tw_recycle"></a>为什么你的 SYN 包被丢 net.ipv4.tcp_tw_recycle</h1><p>本来这是我计划在知识星球里要写的<a href="https://articles.zsxq.com/id_1fdbevh4fzf0.html" target="_blank" rel="noopener">连续剧</a>,我打算好好多写几篇的(每篇都计划重现一个场景/坑点),后来没看到任何一个同学参与,这样的话写了你们看完也没有体感,所以我直接公布答案吧,还能节省点你们的时间,记住干货就好:<strong>不要开 net.ipv4.tcp_tw_recycle</strong></p><p>作为全网最权威/最全面的 net.ipv4.tcp_tw_recycle 问题分析还是从知识星球分享出来,希望更多的人避免踩坑</p><h2 id="答案"><a href="#答案" class="headerlink" title="答案"></a>答案</h2><p>首先不通了是因为服务端开启了 net.ipv4.tcp_tw_recycle,需要判断握手包的时间得保持递增(T2 - T1 >1)</p><p>tcpping 一直是通的,因为服务端没有记录到 T1,T1 是每次 FIN 断开时记录,T2 是每个 SYN 包中携带。当 curl 然后断开时走了 FIN 服务端记录下 T1,下次 tcpping 就可以比较了,所以有一半概率不通,直到 1 分钟后 T1 一直没有跟新,超过 60 秒的 T1 失效,后面连接正常</p><h2 id="为什么要有-net-ipv4-tcp-tw-recycle?"><a href="#为什么要有-net-ipv4-tcp-tw-recycle?" class="headerlink" title="为什么要有 net.ipv4.tcp_tw_recycle?"></a>为什么要有 net.ipv4.tcp_tw_recycle?</h2><p>net.ipv4.tcp_tw_recycle 是一个 Linux 内核参数,用于控制 TCP 连接的 TIME_WAIT 状态的处理方式。这个参数的主要作用是加速 TIME_WAIT 套接字的回收。</p><p>参考:<a href="https://vincent.bernat.ch/en/blog/2014-tcp-time-wait-state-linux" target="_blank" rel="noopener">Coping with the TCP TIME-WAIT state on busy Linux servers</a> </p><h3 id="PAWS-Protection-Against-Wrapped-Sequences"><a href="#PAWS-Protection-Against-Wrapped-Sequences" class="headerlink" title="PAWS(Protection Against Wrapped Sequences)"></a>PAWS(Protection Against Wrapped Sequences)</h3><p>TCP 包的 seq 是有限的(4字节 32bit),会在达到最大值后回绕到零,这种情况称为”seq回绕”,seq 回绕后怎么判断这个 seq 是重复的(丢弃) 还是可以接受的?</p><p>引入 <a href="https://perthcharles.github.io/2015/08/27/timestamp-intro/" target="_blank" rel="noopener">PAWS</a> 的目的是确保即使seq 回绕发生,也能正确地处理序列号,除了 seq 外额外在 TCP options 里面增加了 timestamp 来作为维护数据包的seq 正确的判断。时间戳随每个数据包发送,并且单调增加,因此即使序列号回绕,接收方也可以使用时间戳来确定数据包的真实顺序,这就是 PAWS</p><p><a href="https://perthcharles.github.io/2015/08/27/timestamp-NAT/" target="_blank" rel="noopener">PAWS会检查syn 网络包的 timestamps</a> ,来判断这个syn包的发送时间是否早于上一次同 ip/stream(3.10 是 per ip/4.10 是 per stream) 的 fin包,如果早就扔掉,这也是导致syn 握手失败的一个高发原因,尤其是在NAT场景下。原本 PAWS 是每个连接的维度,但同时开启tcp_timestamp和tcp_tw_recycle之后,PAWS就变成per host粒度了</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">timestamp为TCP/IP协议栈提供了两个功能: </span><br><span class="line"> a. 更加准确的RTT测量数据,尤其是有丢包时 -- RTTM </span><br><span class="line"> b. 保证了在极端情况下,TCP的可靠性 -- PAWS</span><br></pre></td></tr></table></figure><p>不同 OS 内核版本因为 timestamp 生成不一样导致 PAWS 行为还不一样,通过参数来控制:net.ipv4.tcp_timestamps</p><h2 id="服务端如何通过判断时间戳来丢包?"><a href="#服务端如何通过判断时间戳来丢包?" 
class="headerlink" title="服务端如何通过判断时间戳来丢包?"></a>服务端如何通过判断时间戳来丢包?</h2><p>对同一个 src-ip 记录最后一次 FIN 包的时间戳为 T1,当这个 src-ip 有 SYN 包时取 SYN 包中的时间戳为 T2</p><p>如果 T2-T1 小于 1 就扔掉这个 SYN 包</p><p>一旦发生这种 SYN 包被丢弃,对应的监控指标(LINUX_MIB_PAWSPASSIVEREJECTED):</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">//第二个指标包含第一个,passive connections rejected 了也一定会是 SYN dropped</span><br><span class="line">#netstat -s |egrep "SYNs to LISTEN sockets dropped|passive connections rejected because"</span><br><span class="line"> 960055 passive connections rejected because of time stamp</span><br><span class="line"> 1049368 SYNs to LISTEN sockets dropped</span><br><span class="line"></span><br><span class="line">#netstat -s |egrep "SYNs to LISTEN sockets dropped|passive connections rejected because"</span><br><span class="line"> 960535 passive connections rejected because of time stamp</span><br><span class="line"> 1049848 SYNs to LISTEN sockets dropped</span><br><span class="line"></span><br><span class="line">#netstat -s |egrep "SYNs to LISTEN sockets dropped|passive connections rejected because"</span><br><span class="line"> 961015 passive connections rejected because of time stamp</span><br><span class="line"> 1050328 SYNs to LISTEN sockets dropped</span><br></pre></td></tr></table></figure><p>这个指标也很重要,我喜欢这种</p><h3 id="服务端丢包条件更多细节"><a href="#服务端丢包条件更多细节" class="headerlink" title="服务端丢包条件更多细节"></a>服务端丢包条件更多细节</h3><p>服务端设置 net.ipv4.tcp_tw_recycle 为 1 是必要条件,然后同时满足了这两个条件:</p><ol><li><code>(u32)get_seconds() - tm->tcpm_ts_stamp < TCP_PAWS_MSL(=60)</code>:容易满足,几乎总是满足。<strong>对比的是本地时间</strong>。收到syn的<strong>本地时间</strong>相比上次收包记录的<strong>本地时间</strong>,小于60s</li><li><code>(s32)(tm->tcpm_ts - req->ts_recent) > TCP_PAWS_WINDOW(=1)</code>:对比的是tcp时间戳,上次更新的tcp时间戳 - 这次syn的tcp时间戳,大于1(并且小于231)。也就是这次syn的tcp时间戳,如果<strong>小于</strong>上次记录到的时间戳(ms级),就会被丢掉。</li></ol><p>这里tm和req对应什么?一个四元组,还是ip地址,还是其他?3.10<strong>对应的是ip地址</strong>(不同内核版本不一样)</p><p>上次记录的时间戳是什么?注意这里对比的都是tm时间,是在连接关闭相关阶段,通过<code>tcp_remember_stamp</code>或<code>tcp_tw_remember_stamp</code>函数记录的,具体情况比较多。</p><h4 id="服务端将客户端的时间戳保存在哪里?"><a href="#服务端将客户端的时间戳保存在哪里?" 
class="headerlink" title="服务端将客户端的时间戳保存在哪里?"></a>服务端将客户端的时间戳保存在哪里?</h4><p>6u(2.6.32)代码:</p><p>由于inet_timewait_sock在连接进入tw状态会被释放掉,其中记录最近一次接收报文的timestamp信息会丢失;VJ 的思路,把此tcp stamp信息放入路由cache表的rtable中struct inet_peer中,rtable中只保srcIP,dstIP的PATH信息,没有端口号信息,也就是同src-dstIP(即使端口不同)的所有连接受同一个timestamp限制。</p><p>7u(3.10.0)代码:</p><p>3.5版本以后的内核版本不再使用rtable记录,tcp stamp信息改为存放在目标地址出接口net中存放的tcp_metrics_block,timestamp判断逻辑跟6u比增加了“如果之前有记录timestamp且在一个MSL内,而本次连接无timestamp时,请求被丢弃”的逻辑,这么修改的原因参见:</p><p><a href="https://patchwork.ozlabs.org/patch/380021/" target="_blank" rel="noopener">https://patchwork.ozlabs.org/patch/380021/</a></p><p><a href="https://patchwork.ozlabs.org/patch/379163/" target="_blank" rel="noopener">https://patchwork.ozlabs.org/patch/379163/</a></p><p>2017 年的这个讨论<a href="https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/" target="_blank" rel="noopener">https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/</a> 要去掉这个全局存放,改成可以按客户端 port 来记录</p><h2 id="客户端如何生成时间戳?"><a href="#客户端如何生成时间戳?" class="headerlink" title="客户端如何生成时间戳?"></a>客户端如何生成时间戳?</h2><ul><li>3.10 内核是按 <strong>客户端 ip</strong> 来生成 timestamp,也就是不管跟谁通信都是全局单调递增</li><li>4.19(4.12)是按 <strong>ip 对</strong>(per-destination timestamp<strong>)</strong>来生 timestamp ,也就是一对 ip 之间保证单调递增;</li><li>4.10之前是 per-client 生成递增 timestamp ,4.10 改成 per-connection 生成递增 timestamp(导致了兼容 net.ipv4.tcp_tw_recycle问题严重),4.11 改成 per-destination-host 生成递增 timestamp(<strong>downgrade to per-host timestamp offsets</strong>);4.12 去掉 net.ipv4.tcp_tw_recycle 参数永远解决问题</li></ul><h2 id="有哪些场景会触发-net-ipv4-tcp-tw-recycle-丢包"><a href="#有哪些场景会触发-net-ipv4-tcp-tw-recycle-丢包" class="headerlink" title="有哪些场景会触发 net.ipv4.tcp_tw_recycle 丢包"></a>有哪些场景会触发 net.ipv4.tcp_tw_recycle 丢包</h2><p>服务端的内核参数 net.ipv4.tcp_tw_recycle(<a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4396e46187ca5070219b81773c4e65088dac50cc" target="_blank" rel="noopener">4.12内核 </a> 中删除这个参数了) 和 net.ipv4.tcp_timestamps 的值都为 1时,服务器会检查每一个 SYN报文中的时间戳(Timestamp,跟同一ip下最近一次 FIN包时间对比),若 <a href="https://vincent.bernat.ch/en/blog/2014-tcp-time-wait-state-linux" target="_blank" rel="noopener">Timestamp 不是递增的关系</a>,就扔掉这个SYN包(<strong>诊断</strong>:netstat -s | grep “ passive connections rejected because of time stamp”),常见触发时间戳非递增场景:</p><ol><li><a href="https://lwn.net/Articles/708021/" target="_blank" rel="noopener">4.10 内核</a>,一直必现大概率性丢包。<a href="https://github.com/torvalds/linux/commit/95a22caee396cef0bb2ca8fafdd82966a49367bb" target="_blank" rel="noopener">4.11 改成了</a> per-destination host的算法 //内核改来改去也是坑点</li><li>tcpping 这种时间戳按连接随机的,必现大概率持续丢包</li><li><strong>同一个客户端通过直连或者 NAT 后两条链路到同一个服务端</strong>,客户端生成时间戳是 by dst ip,导致大概率持续丢包</li><li>经过NAT/LVS 后多个客户端被当成一个客户端,小概率偶尔出现——通过 tc qdisc 可以来构造丢包重现该场景</li><li>网路链路复杂/链路长容易导致包乱序,进而出发丢包,取决于网络会小概率出现</li><li>客户端修改 net.ipv4.tcp_timestamps <ul><li>1->0,触发持续60秒大概率必现的丢包,60秒后恢复</li><li>0->1 持续大概率一直丢包60秒; 60秒过后如果网络延时略高且客户端并发大一直有上一次 FIN 时间戳大于后续SYN 会一直概率性丢包持续下去;如果停掉所有流量,重启客户端流量,恢复正常</li><li>2->1 丢包,情况同2</li><li>1->2 不触发丢包</li></ul></li></ol><p>其它 SYN 连不上的场景延伸阅读:<a href="https://plantegg.github.io/2020/05/24/%E7%A8%8B%E5%BA%8F%E5%91%98%E5%A6%82%E4%BD%95%E5%AD%A6%E4%B9%A0%E5%92%8C%E6%9E%84%E5%BB%BA%E7%BD%91%E7%BB%9C%E7%9F%A5%E8%AF%86%E4%BD%93%E7%B3%BB/">程序员如何学习和构建网络知识体系</a> </p><h3 id="一些特殊场景"><a href="#一些特殊场景" class="headerlink" title="一些特殊场景"></a>一些特殊场景</h3><p>这些特殊场景很可怕,不知不觉会产生 T2 不大于 T1 的情况,导致连接异常</p><h4 id="DNAT-x2F-ENAT"><a href="#DNAT-x2F-ENAT" class="headerlink" title="DNAT/ENAT"></a>DNAT/ENAT</h4><p>请求经过 DNAT 后 Server 端看到的 src-ip 是 
client 的 IP,客户端同时通过直连(绿色)和走 LVS(黑色)两条链路就会大概率不通:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog//image-20240822161109563.png" alt="image-20240822161109563"></p><h4 id="没有挥手断开场景"><a href="#没有挥手断开场景" class="headerlink" title="没有挥手断开场景"></a>没有挥手断开场景</h4><p>有些 HA 探测都是握手/select 1/ RESET 连接,不走 FIN 四次挥手(比如 Jedis,见小作业应用断开连接的时候如何让 OS 走 RST 流程:<a href="https://articles.zsxq.com/id_v0mhaadx3cx5.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_v0mhaadx3cx5.html</a> ),Server 端没有机会记录 T1,也就永远不会触发丢包,看着一切正常,直到某天来了个用户 curl 一下系统就崩了</p><p>比如 Jedis 就是直接 RST 断开连接,从不走 FIN 四次挥手</p><h2 id="延伸"><a href="#延伸" class="headerlink" title="延伸"></a>延伸</h2><p>如果服务端所用<a href="https://developer.aliyun.com/article/1262180" target="_blank" rel="noopener">端口是 time_wait 状态</a>,这时新连接 SYN 握手包刚好和 time_wait 的5元组重复,这个时候服务端不会回复 SYN+ACK 而是回复 time_wait 前的ack </p><h2 id="其它"><a href="#其它" class="headerlink" title="其它"></a>其它</h2><p>Server 在握手的第三阶段(TCP_NEW_SYN_RECV),等待对端进行握手的第三步回 ACK时候,如果收到RST 内核会对报文进行PAWS校验,如果 RST 带的 timestamp(TVal) 不递增就会因为通不过 PAWS 校验而被扔掉</p><p><a href="https://github.com/torvalds/linux/commit/7faee5c0d514162853a343d93e4a0b6bb8bfec21" target="_blank" rel="noopener">https://github.com/torvalds/linux/commit/7faee5c0d514162853a343d93e4a0b6bb8bfec21</a> 这个 commit 去掉了TCP_SKB_CB(skb)->when = tcp_time_stamp,导致 3.18 的内核版本linger close主动发送的 RST 中 ts_val为0,而<a href="https://github.com/torvalds/linux/commit/675ee231d960af2af3606b4480324e26797eb010" target="_blank" rel="noopener">修复的commit在 675ee231d960af2af3606b4480324e26797eb010</a>,直到 4.10 才合并进内核</p><h2 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>参考资料</h2><p>per-connection random offset:<a href="https://lwn.net/Articles/708021/" target="_blank" rel="noopener">https://lwn.net/Articles/708021/</a></p><h2 id="如果你觉得看完对你很有帮助可以通过如下方式找到我"><a href="#如果你觉得看完对你很有帮助可以通过如下方式找到我" class="headerlink" title="如果你觉得看完对你很有帮助可以通过如下方式找到我"></a>如果你觉得看完对你很有帮助可以通过如下方式找到我</h2><p>find me on twitter: <a href="https://twitter.com/plantegg" target="_blank" rel="noopener">@plantegg</a></p><p>知识星球:<a href="https://t.zsxq.com/0cSFEUh2J" target="_blank" rel="noopener">https://t.zsxq.com/0cSFEUh2J</a></p><p>开了一个星球,在里面讲解一些案例、知识、学习方法,肯定没法让大家称为顶尖程序员(我自己都不是),只是希望用我的方法、知识、经验、案例作为你的垫脚石,帮助你快速、早日成为一个基本合格的程序员。</p><p>争取在星球内:</p><ul><li>养成基本动手能力</li><li>拥有起码的分析推理能力–按我接触的程序员,大多都是没有逻辑的</li><li>知识上教会你几个关键的知识点</li></ul><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240324161113874-5525702.png" alt="image-20240324161113874" style="zoom:50%;">]]></content>
<summary type="html">
<h1 id="为什么你的-SYN-包被丢-net-ipv4-tcp-tw-recycle"><a href="#为什么你的-SYN-包被丢-net-ipv4-tcp-tw-recycle" class="headerlink" title="为什么你的 SYN 包被丢 net.
</summary>
<category term="tcp" scheme="https://plantegg.github.io/categories/tcp/"/>
<category term="tcp" scheme="https://plantegg.github.io/tags/tcp/"/>
<category term="tcp_tw_recycle" scheme="https://plantegg.github.io/tags/tcp-tw-recycle/"/>
<category term="tcp_timestamp" scheme="https://plantegg.github.io/tags/tcp-timestamp/"/>
<category term="PAWS" scheme="https://plantegg.github.io/tags/PAWS/"/>
</entry>
<entry>
<title>一次 Sysbench opening tables 卡慢的分析过程</title>
<link href="https://plantegg.github.io/2024/12/29/%E4%B8%80%E6%AC%A1%20Sysbench%20opening%20tables%20%E5%8D%A1%E6%85%A2%E7%9A%84%E5%88%86%E6%9E%90%E8%BF%87%E7%A8%8B/"/>
<id>https://plantegg.github.io/2024/12/29/一次 Sysbench opening tables 卡慢的分析过程/</id>
<published>2024-12-29T09:30:03.000Z</published>
<updated>2024-12-30T02:31:21.043Z</updated>
<content type="html"><![CDATA[<h1 id="一次-Sysbench-opening-tables-卡慢的分析过程"><a href="#一次-Sysbench-opening-tables-卡慢的分析过程" class="headerlink" title="一次 Sysbench opening tables 卡慢的分析过程"></a>一次 Sysbench opening tables 卡慢的分析过程</h1><h2 id="背景"><a href="#背景" class="headerlink" title="背景"></a>背景</h2><p>用 Sysbench 随便跑个压力,然后我用如下命令起压力,只达到了我预期的性能的 10%</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sysbench --mysql-user=root --mysql-password=123 --mysql-db=sbtest --mysql-host=e237 --mysql-port=3306 --tables=64 --threads=256 --table-size=2000000 --range-size=5 --db-ps-mode=disable --skip-trx=on --mysql-ignore-errors=all --time=1200 --report-interval=1 --histogram=off oltp_point_select run</span><br></pre></td></tr></table></figure><h2 id="分析"><a href="#分析" class="headerlink" title="分析"></a>分析</h2><p>看了下 MySQL 的进程状态,<strong>CPU 消耗很低</strong>,再看 processlist 都是 Opening tables,这问题我熟啊,table_open_cache 设置太小,直接干大 10 倍,悲催的是性能依然没有任何变化看了下 MySQL 的进程状态,CPU 消耗很低,再看 processlist 都是 Opening tables,这问题我熟啊,table_open_cache 设置太小,直接干大 10 倍,悲催的是性能依然没有任何变化</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20241022175210162.png" alt="image-20241022175210162"></p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20241023104502834.png" alt="image-20241023104502834"></p><p>难道还有别的地方限制了?我去查了下 status 发现 Table_open_cache_overflows 一直是 0,从状态来看 table_open_cache 肯定够了:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line">#mysql -he237 -P3306 -uroot -p123 -e "show global status like '%open%' "</span><br><span class="line">mysql: [Warning] Using a password on the command line interface can be insecure.</span><br><span class="line">+----------------------------+---------+</span><br><span class="line">| Variable_name | Value |</span><br><span class="line">+----------------------------+---------+</span><br><span class="line">| Com_ha_open | 0 |</span><br><span class="line">| Com_show_open_tables | 0 |</span><br><span class="line">| Innodb_num_open_files | 48 |</span><br><span class="line">| Open_files | 14 |</span><br><span class="line">| Open_streams | 0 
|</span><br><span class="line">| Open_table_definitions | 159 |</span><br><span class="line">| Open_tables | 1161 |</span><br><span class="line">| Opened_files | 173 |</span><br><span class="line">| Opened_table_definitions | 138 |</span><br><span class="line">| Opened_tables | 1168 |</span><br><span class="line">| Slave_open_temp_tables | 0 |</span><br><span class="line">| Table_open_cache_hits | 8125315 |</span><br><span class="line">| Table_open_cache_misses | 1168 |</span><br><span class="line">| Table_open_cache_overflows | 0 |</span><br><span class="line">+----------------------------+---------+</span><br><span class="line"></span><br><span class="line">#mysql -he237 -P3306 -uroot -p123 -e "show global status like '%Table_open%' "</span><br><span class="line">mysql: [Warning] Using a password on the command line interface can be insecure.</span><br><span class="line">+----------------------------+---------+</span><br><span class="line">| Variable_name | Value |</span><br><span class="line">+----------------------------+---------+</span><br><span class="line">| Table_open_cache_hits | 9039467 |</span><br><span class="line">| Table_open_cache_misses | 1170 |</span><br><span class="line">| Table_open_cache_overflows | 0 |</span><br><span class="line">+----------------------------+---------+</span><br><span class="line"></span><br><span class="line">#mysql -he237 -P3306 -uroot -p123 -e "show global variables like '%Table_open%' "</span><br><span class="line">mysql: [Warning] Using a password on the command line interface can be insecure.</span><br><span class="line">+----------------------------+-------+</span><br><span class="line">| Variable_name | Value |</span><br><span class="line">+----------------------------+-------+</span><br><span class="line">| table_open_cache | 8192 |</span><br><span class="line">| table_open_cache_instances | 16 |</span><br><span class="line">+----------------------------+-------+</span><br></pre></td></tr></table></figure><p>这些有点难绷,因为我用的别人的 sysbench, 于是自己编译了一个重压性能一下就正常了,于是我开始 dump 别人的 sysbench 完整参数,最后发现是我使用的时候配置错误将:–tables=32 设置成了 –tables=64 也就是我的 database 总共只有 32 张表,而我压测的时候写成了 64 张,还有 32 张表不存在导致。</p><p>而别人的 sysbench 默认添加了:–mysql-ignore-errors=all 也就是把报错信息都忽略了,导致控制台看不到异常信息</p><h3 id="碰到这种问题怎么办?"><a href="#碰到这种问题怎么办?" 
class="headerlink" title="碰到这种问题怎么办?"></a>碰到这种问题怎么办?</h3><p>我们经常碰到业务代码把报错信息吃掉了(类似设置了 –mysql-ignore-errors=all ),同时 SQL 里面拼错了表明或者写错了 Database 名也导致表不存在</p><p>所以这里的必杀技(银弹) 抓包(或者堆栈热点分析):</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog//image-20241023101835801.png" alt="image-20241023101835801"></p><p>上图中只要不是 1146 的都是表明正确的请求,可以看到 RT 是 0.1-0.2 毫秒之间;但是 response Error 1146 报错的 RT 就很大了,同时抓包里 1146 也给出了错误原因</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">+-------+------+---------------------------------------+</span><br><span class="line">| Level | Code | Message |</span><br><span class="line">+-------+------+---------------------------------------+</span><br><span class="line">| Error | 1146 | Table 'sbtest.sbtest42' doesn't exist |</span><br><span class="line">+-------+------+---------------------------------------+</span><br></pre></td></tr></table></figure><p>正常时 50 万 QPS 的 RT:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"> timeportavg_rtsvc_rt up_rt QPS droprtt</span><br><span class="line">2024-10-23 10:14:57 P3306 227 228 0 532688 0 34</span><br><span class="line">2024-10-23 10:14:58 P3306 227 228 0 533439 0 34</span><br></pre></td></tr></table></figure><p>异常时 5 万 QPS 的 RT:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"> timeportavg_rtsvc_rt up_rt QPS droprtt</span><br><span class="line">2024-10-23 10:13:56 P3306 2201 2201 0 58910 0 34</span><br><span class="line">2024-10-23 10:13:57 P3306 2195 2195 0 59141 0 34</span><br><span class="line">2024-10-23 10:13:58 P3306 2203 2203 0 58923 0 34</span><br><span class="line">2024-10-23 10:13:59 P3306 2190 2191 0 59266 0 34</span><br><span class="line">2024-10-23 10:14:00 P3306 2198 2198 0 59018 0 34</span><br><span class="line">2024-10-23 10:14:01 P3306 2242 2242 0 57926 0 34</span><br></pre></td></tr></table></figure><p>从 RT 确实可以看出来是 3306 端口返回/响应慢了,我在 MySQLD 的日志里也搜索了,应该是没有记录这种 1146 错误</p><p>如果多看几次 processlist 的话还会发现 Opening table 的 SQL 对应的表明都是大于 31 的,表名小的 SQL 就不会出现 Opening table </p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>这个问题我第一时间没有想到抓包,显示根据经验 Opening tables 就是打开表慢了,然后调大 cache 参数,还不好用就觉得超出我的理解有点慌!</p><p>然后想到去比较参数/版本的差异,运气好发现了参数的差异;如果运气不好我重新编译然后复制白屏的命令参数估计还是发现不了。</p><p>所以我在想有什么更好的办法能识别这种问题,最后的结论居然还是抓个包看看,并且真管用,正好和这篇方法论呼应一下:<a href="https://articles.zsxq.com/id_mnp5z56gl0wi.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_mnp5z56gl0wi.html</a> </p><h2 id="延伸"><a href="#延伸" class="headerlink" title="延伸"></a>延伸</h2><p>很多时候开发很坑人,把业务异常堆栈吃了不输出,就拿这个例子来说也有业务写错表名,然后报错又不输出就会出现和问题一样的问题,导致分析问题的时候发现很奇怪好好的系统就是慢,这个时候除了抓包还可以通过 perf/jstack 去看看堆栈,抓下热点</p><p>推上也有一些讨论,可以参考下别人的思路:<a href="https://x.com/plantegg/status/1851066206163521712" target="_blank" rel="noopener">https://x.com/plantegg/status/1851066206163521712</a> </p><h2 id="如果你觉得看完对你很有帮助可以通过如下方式找到我"><a href="#如果你觉得看完对你很有帮助可以通过如下方式找到我" 
class="headerlink" title="如果你觉得看完对你很有帮助可以通过如下方式找到我"></a>如果你觉得看完对你很有帮助可以通过如下方式找到我</h2><p>find me on twitter: <a href="https://twitter.com/plantegg" target="_blank" rel="noopener">@plantegg</a></p><p>知识星球:<a href="https://t.zsxq.com/0cSFEUh2J" target="_blank" rel="noopener">https://t.zsxq.com/0cSFEUh2J</a></p><p>开了一个星球,在里面讲解一些案例、知识、学习方法,肯定没法让大家称为顶尖程序员(我自己都不是),只是希望用我的方法、知识、经验、案例作为你的垫脚石,帮助你快速、早日成为一个基本合格的程序员。</p><p>争取在星球内:</p><ul><li>养成基本动手能力</li><li>拥有起码的分析推理能力–按我接触的程序员,大多都是没有逻辑的</li><li>知识上教会你几个关键的知识点</li></ul><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240324161113874-5525694.png" alt="image-20240324161113874" style="zoom:50%;">]]></content>
<summary type="html">
<h1 id="一次-Sysbench-opening-tables-卡慢的分析过程"><a href="#一次-Sysbench-opening-tables-卡慢的分析过程" class="headerlink" title="一次 Sysbench opening tabl
</summary>
<category term="MySQL" scheme="https://plantegg.github.io/categories/MySQL/"/>
<category term="performance" scheme="https://plantegg.github.io/tags/performance/"/>
<category term="opening tables" scheme="https://plantegg.github.io/tags/opening-tables/"/>
<category term="trubleshooting" scheme="https://plantegg.github.io/tags/trubleshooting/"/>
</entry>
<entry>
<title>一次网络连接残留的分析</title>
<link href="https://plantegg.github.io/2024/12/09/%E4%B8%80%E6%AC%A1%E7%BD%91%E7%BB%9C%E8%BF%9E%E6%8E%A5%E6%AE%8B%E7%95%99%E7%9A%84%E5%88%86%E6%9E%90/"/>
<id>https://plantegg.github.io/2024/12/09/一次网络连接残留的分析/</id>
<published>2024-12-09T09:30:03.000Z</published>
<updated>2024-12-30T02:31:19.116Z</updated>
<content type="html"><![CDATA[<h1 id="一次网络连接残留的分析"><a href="#一次网络连接残留的分析" class="headerlink" title="一次网络连接残留的分析"></a>一次网络连接残留的分析</h1><p>本来放在知识星球的收费文章,也网络直播给星球成员讲解过这个问题以及这篇文章的内容,作删减和调整后也发博客吧</p><h2 id="问题描述"><a href="#问题描述" class="headerlink" title="问题描述"></a>问题描述</h2><p>LVS TCP 探活一般是 3 次握手(验证服务节点还在)后立即发送一个 RST packet 来断开连接(效率高,不需要走四次挥手),但是在我们的LVS 后面的 RS 上发现有大量的探活连接残留,需要分析为什么?</p><p>一通分析下来发现是 RST 包 和第三次握手的 ack 到达对端乱序了,导致 RST 被drop 掉了。但是还需要进一步分析 drop 的时候和 RST 包里面带的 timestamp 有没有关系?</p><p>可以用 Scapy 来实验验证如下 4 个场景:</p><ol><li>正常三次握手,然后发送 RST 看看是否被 drop —— 期望 RST 不被 drop,连接正常释放,作为对比项</li><li>正常 2 次握手,然后立即发送 RST(正常带 timestamp),再发送 ack(制造乱序),看看 RST 会不会被 drop,如果 RST drop 后连接还能正常握手成功并残留吗?</li><li>正常 2 次握手,然后立即发送 RST(不带 timestamp),再发送 ack(制造乱序),看看 RST 会不会被 drop</li><li>正常 2 次握手,然后立即发送 RST(带 timestamp,但是 timestamp 为 0),再发送 ack(制造乱序),看看 RST 会不会被 drop</li></ol><p>重现场景构造如下:通过客户端+服务端来尝试重现,客户端用 scapy 来构造任意网络包,服务端通过 python 起一个 WEB 服务</p><h3 id="客户端"><a href="#客户端" class="headerlink" title="客户端"></a>客户端</h3><p>因为最新的 scapy 需要 python3.7 ,可以搞一个内核版本较高的 Linux 来测试(星球统一 99 块的实验 ECS 就符合要求),安装命令大概是这样:yum install python3-scapy</p><p>用 scapy 脚本构造如上 3 个场景的网络包,代码和使用帮助我放到这里了:<a href="https://github.com/plantegg/programmer_case/commit/e71ade38050c48170c7d6fb5922f78188a96435b#diff-3d18b8aa76586e6c59227e020ba22ef1ef8c5416764d0a923b198ad824996eda" target="_blank" rel="noopener">https://github.com/plantegg/programmer_case/commit/e71ade38050c48170c7d6fb5922f78188a96435b#diff-3d18b8aa76586e6c59227e020ba22ef1ef8c5416764d0a923b198ad824996eda</a></p><p>如果需要构造带 timestamp 的RST 用如下代码段,乱序通过调整 ack和 RST 的顺序来实现</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 构造 ACK 包</span></span><br><span class="line">ack = TCP(sport=source_port,</span><br><span class="line"> dport=target_port,</span><br><span class="line"> flags=<span class="string">'A'</span>,</span><br><span class="line"> seq=syn_ack.ack,</span><br><span class="line"> ack=syn_ack.seq + <span class="number">1</span>,</span><br><span class="line"> options=[(<span class="string">'NOP'</span>, <span class="literal">None</span>), (<span class="string">'NOP'</span>, <span class="literal">None</span>),</span><br><span class="line"> (<span class="string">'Timestamp'</span>, (int(time.time()), <span class="number">0</span>))]) //重点调整这里的时间戳,以及 rst 和 ack 包的顺序</span><br><span class="line"></span><br><span class="line"><span class="comment"># 发送 ACK</span></span><br><span class="line">send(ip/ack)</span><br></pre></td></tr></table></figure><p>在scapy 机器上drop 掉OS 自动发送的 RST(因为连接是 scapy 伪造的,OS 收到 syn+ack 后会 OS系统会发 RST(这个 RST不带 timestamp))</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">iptables -A OUTPUT -p tcp --dport 8000 --tcp-flags RST RST ! --tcp-option 8 -j DROP</span><br><span class="line"></span><br><span class="line">//清理</span><br><span class="line">iptables -D OUTPUT -p tcp --dport 8000 --tcp-flags RST RST ! 
--tcp-option 8 -j DROP</span><br></pre></td></tr></table></figure><p>scapy 构造的包流程,可以看到不走内核 tcp 协议栈,也不走 nf_hook(防火墙),不受上面的 iptables 规则限制,所以能发送到服务端:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">***************** c7d8ea00 ***************</span><br><span class="line">[100167.011693] [__dev_queue_xmit ] TCP: 172.26.137.131:8146 -> 172.26.137.130:8000 seq:12346, ack:0, flags:R</span><br><span class="line">[100167.011702] [dev_hard_start_xmit ] TCP: 172.26.137.131:8146 -> 172.26.137.130:8000 seq:12346, ack:0, flags:R *skb is successfully sent to the NIC driver*</span><br><span class="line">[100167.011714] [consume_skb ] TCP: 172.26.137.131:8146 -> 172.26.137.130:8000 seq:12346, ack:0, flags:R *packet is freed (normally)*</span><br><span class="line"></span><br><span class="line">***************** c700e300 ***************</span><br><span class="line">[100167.024811] [__dev_queue_xmit ] TCP: 172.26.137.131:8146 -> 172.26.137.130:8000 seq:12346, ack:2680597246, flags:A</span><br><span class="line">[100167.024821] [dev_hard_start_xmit ] TCP: 172.26.137.131:8146 -> 172.26.137.130:8000 seq:12346, ack:2680597246, flags:A *skb is successfully sent to the NIC driver*</span><br><span class="line">[100167.024891] [consume_skb ] TCP: 172.26.137.131:8146 -> 172.26.137.130:8000 seq:12346, ack:2680597246, flags:A *packet is freed (normally)*</span><br></pre></td></tr></table></figure><h3 id="Server-端"><a href="#Server-端" class="headerlink" title="Server 端"></a>Server 端</h3><p>先记住一个知识点,后面看内核调用堆栈会用得上确认是否被丢包</p><blockquote><p>一个网络包正常处理流程最后调 consume_skb 来释放,如果网络包需要 Drop 就调 <code>kfree_skb</code> 来丢包</p></blockquote><p>server端 安装 netstrace来监控包是否被drop,并通过 python 拉起一个端口:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">python -m http.server 8000</span><br></pre></td></tr></table></figure><h4 id="tcpdump-确认-8000-端口收到的包"><a href="#tcpdump-确认-8000-端口收到的包" class="headerlink" title="tcpdump 确认 8000 端口收到的包"></a>tcpdump 确认 8000 端口收到的包</h4><p>在 8000端口机器上执行抓包验证收到的包顺序和所携带的 timestamp,包含 3 个场景的包:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line">#tcpdump -i eth0 port 8000 -nn</span><br><span class="line">//场景 2:正常 2 次握手,然后立即发送 RST(带 timestamp)</span><br><span class="line">13:56:26.614701 IP 172.26.137.131.54321 > 172.26.137.130.8000: Flags [S], seq 2754757912, win 8192, options [mss 1460,nop,nop,TS val 1732514186 ecr 0], length 0</span><br><span 
class="line">13:56:26.614815 IP 172.26.137.130.8000 > 172.26.137.131.54321: Flags [S.], seq 1579697129, ack 2754757913, win 65160, options [mss 1460,nop,nop,TS val 2888180099 ecr 1732514186], length 0</span><br><span class="line">13:56:26.633997 IP 172.26.137.131.54321 > 172.26.137.130.8000: Flags [R], seq 2754757913, win 8192, options [mss 1460,nop,nop,TS val 1732514186 ecr 0], length 0 //留意端口号 54321 和 seq 2754757913 跟 nettrace 对应</span><br><span class="line">13:56:26.654954 IP 172.26.137.131.54321 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [nop,nop,TS val 1732514186 ecr 0], length 0</span><br><span class="line">13:56:26.655042 IP 172.26.137.130.8000 > 172.26.137.131.54321: Flags [R], seq 1579697130, win 0, length 0</span><br><span class="line"></span><br><span class="line">//场景 3:正常 2 次握手,然后立即发送 RST(不带 timestamp), 注意这里的 tcp options 是 null</span><br><span class="line">13:56:28.993723 IP 172.26.137.131.12345 > 172.26.137.130.8000: Flags [S], seq 54243194, win 8192, options [mss 1460,nop,nop,TS val 1732514188 ecr 0], length 0</span><br><span class="line">13:56:28.993809 IP 172.26.137.130.8000 > 172.26.137.131.12345: Flags [S.], seq 1983242893, ack 54243195, win 65160, options [mss 1460,nop,nop,TS val 2888182478 ecr 1732514188], length 0</span><br><span class="line">13:56:29.012982 IP 172.26.137.131.12345 > 172.26.137.130.8000: Flags [R], seq 54243195, win 8192, length 0 //留意端口号 12345 和 seq 54243195 跟 nettrace 对应</span><br><span class="line">13:56:29.029886 IP 172.26.137.131.12345 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [nop,nop,TS val 1732514189 ecr 0], length 0</span><br><span class="line">13:56:29.029983 IP 172.26.137.130.8000 > 172.26.137.131.12345: Flags [R], seq 1983242894, win 0, length 0 //OS 触发</span><br><span class="line">13:56:29.050888 IP 172.26.137.131.12345 > 172.26.137.130.8000: Flags [R], seq 54243195, win 8192, length 0</span><br><span class="line"></span><br><span class="line">//场景 1:正常握手,然后 RST</span><br><span class="line">13:56:30.399672 IP 172.26.137.131.22345 > 172.26.137.130.8000: Flags [S], seq 1038081714, win 8192, options [mss 1460,nop,nop,TS val 1732514190 ecr 0], length 0</span><br><span class="line">13:56:30.399770 IP 172.26.137.130.8000 > 172.26.137.131.22345: Flags [S.], seq 3263478059, ack 1038081715, win 65160, options [mss 1460,nop,nop,TS val 2888183884 ecr 1732514190], length 0</span><br><span class="line">13:56:30.426005 IP 172.26.137.131.22345 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [nop,nop,TS val 1732514190 ecr 0], length 0</span><br><span class="line">13:56:30.448876 IP 172.26.137.131.22345 > 172.26.137.130.8000: Flags [R], seq 1038081715, win 8192, length 0</span><br></pre></td></tr></table></figure><h4 id="场景-1:正常三次握手后再-RST,作为对比"><a href="#场景-1:正常三次握手后再-RST,作为对比" class="headerlink" title="场景 1:正常三次握手后再 RST,作为对比"></a>场景 1:正常三次握手后再 RST,作为对比</h4><p>netstrace 命令和结果</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br></pre></td><td class="code"><pre><span class="line">#netstat -P 8000</span><br><span class="line">***************** c22c8c00,c22c8000 ***************</span><br><span class="line">[4912187.018483] [__ip_local_out ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA</span><br><span class="line">[4912187.018485] [nf_hook_slow ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA *ipv4 in chain: OUTPUT*</span><br><span class="line">[4912187.018487] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA *iptables table:, chain:OUTPUT*</span><br><span class="line">[4912187.018489] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA *iptables table:, chain:OUTPUT*</span><br><span class="line">[4912187.018493] [ip_output ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA</span><br><span class="line">[4912187.018495] [nf_hook_slow ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA *ipv4 in chain: POST_ROUTING*</span><br><span class="line">[4912187.018496] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA *iptables table:, chain:POSTROU*</span><br><span class="line">[4912187.018499] [ip_finish_output ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA</span><br><span class="line">[4912187.018502] [ip_finish_output2 ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA</span><br><span class="line">[4912187.018506] [__dev_queue_xmit ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA</span><br><span class="line">[4912187.018510] [dev_hard_start_xmit ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA 
*skb is successfully sent to the NIC driver*</span><br><span class="line">[4912187.018512] [skb_clone ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA</span><br><span class="line">[4912187.018516] [tpacket_rcv ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA</span><br><span class="line">[4912187.018519] [consume_skb ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA *packet is freed (normally)*</span><br><span class="line">[4912187.018533] [consume_skb ] TCP: 172.26.137.130:8000 -> 172.26.137.131:22345 seq:3263478059, ack:1038081715, flags:SA *packet is freed (normally)*</span><br><span class="line"></span><br><span class="line">***************** c22c8a00,c22c8f00 ***************</span><br><span class="line">[4912187.044742] [napi_gro_receive_entry] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044749] [dev_gro_receive ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044751] [__netif_receive_skb_core] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044753] [tpacket_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044758] [ip_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044760] [ip_rcv_core ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044762] [skb_clone ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044766] [nf_hook_slow ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A *ipv4 in chain: PRE_ROUTING*</span><br><span class="line">[4912187.044769] [nft_do_chain ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A *iptables table:, chain:PREROUT*</span><br><span class="line">[4912187.044772] [ip_rcv_finish ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044776] [ip_route_input_slow ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044781] [fib_validate_source ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044785] [ip_local_deliver ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044786] [nf_hook_slow ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A *ipv4 in chain: INPUT*</span><br><span class="line">[4912187.044787] [nft_do_chain ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A *iptables table:, chain:INPUT*</span><br><span class="line">[4912187.044789] [nft_do_chain ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A *iptables table:, chain:INPUT*</span><br><span class="line">[4912187.044791] [ip_local_deliver_finish] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, 
ack:3263478060, flags:A</span><br><span class="line">[4912187.044794] [tcp_v4_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044806] [tcp_child_process ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044810] [tcp_rcv_state_process] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A *TCP socket state has changed*</span><br><span class="line">[4912187.044813] [tcp_ack ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044818] [__kfree_skb ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044825] [packet_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A</span><br><span class="line">[4912187.044827] [consume_skb ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:A *packet is freed (normally)*</span><br><span class="line"></span><br><span class="line">***************** c22c8900,c22c8a00 ***************</span><br><span class="line">[4912187.067611] [napi_gro_receive_entry] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067617] [dev_gro_receive ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067622] [__netif_receive_skb_core] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067624] [tpacket_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067628] [ip_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067630] [ip_rcv_core ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067631] [skb_clone ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067634] [nf_hook_slow ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R *ipv4 in chain: PRE_ROUTING*</span><br><span class="line">[4912187.067636] [nft_do_chain ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R *iptables table:, chain:PREROUT*</span><br><span class="line">[4912187.067639] [ip_rcv_finish ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067640] [ip_local_deliver ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067642] [nf_hook_slow ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R *ipv4 in chain: INPUT*</span><br><span class="line">[4912187.067643] [nft_do_chain ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R *iptables table:, chain:INPUT*</span><br><span class="line">[4912187.067644] [nft_do_chain ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R *iptables table:, chain:INPUT*</span><br><span 
class="line">[4912187.067646] [ip_local_deliver_finish] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067648] [tcp_v4_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067650] [tcp_filter ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067651] [tcp_v4_do_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067653] [tcp_rcv_established ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067659] [__kfree_skb ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067685] [packet_rcv ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R</span><br><span class="line">[4912187.067687] [consume_skb ] TCP: 172.26.137.131:22345 -> 172.26.137.130:8000 seq:1038081715, ack:3263478060, flags:R *packet is freed (normally)* //RST packet 被正常处理,没有发生 drop</span><br></pre></td></tr></table></figure><h4 id="场景-2:正常-2-次握手,然后立即发送-RST-带-timestamp"><a href="#场景-2:正常-2-次握手,然后立即发送-RST-带-timestamp" class="headerlink" title="场景 2:正常 2 次握手,然后立即发送 RST(带 timestamp)"></a>场景 2:正常 2 次握手,然后立即发送 RST(带 timestamp)</h4><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span 
class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br></pre></td><td class="code"><pre><span class="line">#netstat -P 8000</span><br><span class="line">//场景 2:正常 2 次握手,然后立即发送 RST(带 timestamp)—— RST 被 drop 了</span><br><span class="line">***************** c22c8900,c22c8300 *************** //8000 端口回复的 syn+ack</span><br><span class="line">[4912183.233533] [__ip_local_out ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA</span><br><span class="line">[4912183.233535] [nf_hook_slow ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA *ipv4 in chain: OUTPUT*</span><br><span class="line">[4912183.233537] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA *iptables table:, chain:OUTPUT*</span><br><span class="line">[4912183.233538] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA *iptables table:, chain:OUTPUT*</span><br><span class="line">[4912183.233541] [ip_output ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA</span><br><span class="line">[4912183.233542] [nf_hook_slow ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA *ipv4 in chain: POST_ROUTING*</span><br><span class="line">[4912183.233543] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA *iptables table:, chain:POSTROU*</span><br><span class="line">[4912183.233546] [ip_finish_output ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA</span><br><span class="line">[4912183.233549] [ip_finish_output2 ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA</span><br><span class="line">[4912183.233552] [__dev_queue_xmit ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA</span><br><span class="line">[4912183.233555] [dev_hard_start_xmit ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA *skb is successfully sent to the NIC driver*</span><br><span class="line">[4912183.233557] [skb_clone ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA</span><br><span class="line">[4912183.233561] [tpacket_rcv ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA</span><br><span class="line">[4912183.233565] [consume_skb ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA *packet is freed (normally)*</span><br><span class="line">[4912183.233581] [consume_skb ] TCP: 172.26.137.130:8000 -> 172.26.137.131:54321 seq:1579697129, ack:2754757913, flags:SA *packet is freed (normally)*</span><br><span class="line"></span><br><span class="line">***************** c22c8000,c22c8c00 ***************//客户端发送的 RST 比 ack 先到</span><br><span class="line">[4912183.252733] [napi_gro_receive_entry] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252741] [dev_gro_receive ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252743] 
[__netif_receive_skb_core] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252745] [tpacket_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252749] [ip_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252750] [ip_rcv_core ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252752] [skb_clone ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252757] [nf_hook_slow ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R *ipv4 in chain: PRE_ROUTING*</span><br><span class="line">[4912183.252759] [nft_do_chain ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R *iptables table:, chain:PREROUT*</span><br><span class="line">[4912183.252761] [ip_rcv_finish ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252765] [ip_route_input_slow ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252771] [fib_validate_source ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252773] [ip_local_deliver ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252775] [nf_hook_slow ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R *ipv4 in chain: INPUT*</span><br><span class="line">[4912183.252777] [nft_do_chain ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R *iptables table:, chain:INPUT*</span><br><span class="line">[4912183.252779] [nft_do_chain ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R *iptables table:, chain:INPUT*</span><br><span class="line">[4912183.252782] [ip_local_deliver_finish] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252783] [tcp_v4_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252789] [kfree_skb ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R *tcp_v4_rcv+0x65* *packet is dropped by kernel* //被 drop 了</span><br><span class="line">[4912183.252792] [packet_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R</span><br><span class="line">[4912183.252794] [consume_skb ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:R *packet is freed (normally)*</span><br><span class="line"></span><br><span class="line">***************** c22c8900,c22c8200 ***************</span><br><span class="line">[4912183.273690] [napi_gro_receive_entry] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273697] [dev_gro_receive ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273700] 
[__netif_receive_skb_core] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273701] [tpacket_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273705] [ip_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273707] [ip_rcv_core ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273708] [skb_clone ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273711] [nf_hook_slow ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A *ipv4 in chain: PRE_ROUTING*</span><br><span class="line">[4912183.273714] [nft_do_chain ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A *iptables table:, chain:PREROUT*</span><br><span class="line">[4912183.273716] [ip_rcv_finish ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273719] [ip_route_input_slow ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273724] [fib_validate_source ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273726] [ip_local_deliver ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273728] [nf_hook_slow ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A *ipv4 in chain: INPUT*</span><br><span class="line">[4912183.273733] [nft_do_chain ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A *iptables table:, chain:INPUT*</span><br><span class="line">[4912183.273735] [nft_do_chain ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A *iptables table:, chain:INPUT*</span><br><span class="line">[4912183.273737] [ip_local_deliver_finish] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273738] [tcp_v4_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273742] [__inet_lookup_listener] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273744] [tcp_filter ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273746] [tcp_v4_do_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273750] [tcp_rcv_state_process] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A *TCP socket state has changed*</span><br><span class="line">[4912183.273754] [tcp_v4_send_reset ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273798] [kfree_skb ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A *tcp_v4_do_rcv+0x6c* *packet is dropped by 
kernel*</span><br><span class="line">[4912183.273801] [packet_rcv ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A</span><br><span class="line">[4912183.273803] [consume_skb ] TCP: 172.26.137.131:54321 -> 172.26.137.130:8000 seq:2754757913, ack:1579697130, flags:A *packet is freed (normally)*</span><br></pre></td></tr></table></figure><h4 id="场景-3:正常-2-次握手,然后立即发送-RST(不带-timestamp)"><a href="#场景-3:正常-2-次握手,然后立即发送-RST(不带-timestamp)" class="headerlink" title="场景 3:正常 2 次握手,然后立即发送 RST(不带 timestamp)"></a>场景 3:正常 2 次握手,然后立即发送 RST(不带 timestamp)</h4><p>可以看到 RST 被 drop 然后 握手失败</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br></pre></td><td class="code"><pre><span class="line">***************** c22c8900,c22c8f00 ***************</span><br><span class="line">[4912185.612533] [__ip_local_out ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA</span><br><span class="line">[4912185.612535] [nf_hook_slow ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA *ipv4 in chain: OUTPUT*</span><br><span class="line">[4912185.612536] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA *iptables table:, chain:OUTPUT*</span><br><span class="line">[4912185.612538] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 
seq:1983242893, ack:54243195, flags:SA *iptables table:, chain:OUTPUT*</span><br><span class="line">[4912185.612539] [ip_output ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA</span><br><span class="line">[4912185.612541] [nf_hook_slow ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA *ipv4 in chain: POST_ROUTING*</span><br><span class="line">[4912185.612542] [nft_do_chain ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA *iptables table:, chain:POSTROU*</span><br><span class="line">[4912185.612544] [ip_finish_output ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA</span><br><span class="line">[4912185.612546] [ip_finish_output2 ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA</span><br><span class="line">[4912185.612547] [__dev_queue_xmit ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA</span><br><span class="line">[4912185.612550] [dev_hard_start_xmit ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA *skb is successfully sent to the NIC driver*</span><br><span class="line">[4912185.612552] [skb_clone ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA</span><br><span class="line">[4912185.612555] [tpacket_rcv ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA</span><br><span class="line">[4912185.612558] [consume_skb ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA *packet is freed (normally)*</span><br><span class="line">[4912185.612573] [consume_skb ] TCP: 172.26.137.130:8000 -> 172.26.137.131:12345 seq:1983242893, ack:54243195, flags:SA *packet is freed (normally)*</span><br><span class="line"></span><br><span class="line">***************** c22c8f00,c22c8800 ***************</span><br><span class="line">[4912185.631719] [napi_gro_receive_entry] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631726] [dev_gro_receive ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631728] [__netif_receive_skb_core] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631730] [tpacket_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631734] [ip_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631736] [ip_rcv_core ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631737] [skb_clone ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631744] [nf_hook_slow ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R *ipv4 in chain: PRE_ROUTING*</span><br><span class="line">[4912185.631746] [nft_do_chain ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R *iptables table:, chain:PREROUT*</span><br><span class="line">[4912185.631748] [ip_rcv_finish ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, 
ack:1983242894, flags:R</span><br><span class="line">[4912185.631754] [ip_route_input_slow ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631759] [fib_validate_source ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631762] [ip_local_deliver ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631763] [nf_hook_slow ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R *ipv4 in chain: INPUT*</span><br><span class="line">[4912185.631765] [nft_do_chain ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R *iptables table:, chain:INPUT*</span><br><span class="line">[4912185.631767] [nft_do_chain ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R *iptables table:, chain:INPUT*</span><br><span class="line">[4912185.631770] [ip_local_deliver_finish] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631772] [tcp_v4_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631777] [kfree_skb ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R *tcp_v4_rcv+0x65* *packet is dropped by kernel*</span><br><span class="line">[4912185.631780] [packet_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R</span><br><span class="line">[4912185.631783] [consume_skb ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:R *packet is freed (normally)*</span><br><span class="line"></span><br><span class="line">***************** c22c8600,c22c8100 ***************</span><br><span class="line">[4912185.648623] [napi_gro_receive_entry] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648630] [dev_gro_receive ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648632] [__netif_receive_skb_core] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648633] [tpacket_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648637] [ip_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648639] [ip_rcv_core ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648640] [skb_clone ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648643] [nf_hook_slow ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A *ipv4 in chain: PRE_ROUTING*</span><br><span class="line">[4912185.648645] [nft_do_chain ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A *iptables table:, chain:PREROUT*</span><br><span class="line">[4912185.648647] [ip_rcv_finish ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span 
class="line">[4912185.648650] [ip_route_input_slow ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648656] [fib_validate_source ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648659] [ip_local_deliver ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648660] [nf_hook_slow ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A *ipv4 in chain: INPUT*</span><br><span class="line">[4912185.648662] [nft_do_chain ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A *iptables table:, chain:INPUT*</span><br><span class="line">[4912185.648664] [nft_do_chain ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A *iptables table:, chain:INPUT*</span><br><span class="line">[4912185.648667] [ip_local_deliver_finish] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648672] [tcp_v4_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648677] [__inet_lookup_listener] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648679] [tcp_filter ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648681] [tcp_v4_do_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648685] [tcp_rcv_state_process] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A *TCP socket state has changed*</span><br><span class="line">[4912185.648689] [tcp_v4_send_reset ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648739] [kfree_skb ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A *tcp_v4_do_rcv+0x6c* *packet is dropped by kernel*</span><br><span class="line">[4912185.648741] [packet_rcv ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A</span><br><span class="line">[4912185.648743] [consume_skb ] TCP: 172.26.137.131:12345 -> 172.26.137.130:8000 seq:54243195, ack:1983242894, flags:A *packet is freed (normally)*</span><br></pre></td></tr></table></figure><p>上面三个场景都没能重现问题,所以继续构造场景 4</p><h4 id="场景-4-timestamp-不递增"><a href="#场景-4-timestamp-不递增" class="headerlink" title="场景 4 timestamp 不递增"></a>场景 4 timestamp 不递增</h4><p>保证 tcp options 里面有 timestamp,且不递增,这时终于重现了连接残留:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span 
class="line">//这表示有 tcp 连接残留在 8000 端口上,而实际上期望连接要因为有 RST 而被释放</span><br><span class="line">#netstat -ant |grep 8000</span><br><span class="line">tcp 4 0 0.0.0.0:8000 0.0.0.0:* LISTEN</span><br><span class="line">tcp 0 0 172.26.137.130:8000 172.26.137.131:19723 ESTABLISHED</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">//此时对应的抓包,注意这里 Server 端也没有回复 RST,前面 3 个场景 Server 端 8000 都会回 RST,从而不会残留</span><br><span class="line">//连接残留:ts 为 0,RST 被忽略,导致连接残留</span><br><span class="line">16:09:55.669693 IP 172.26.137.131.19723 > 172.26.137.130.8000: Flags [S], seq 12345, win 8192, options [TS val 1732608595 ecr 0,eol], length 0</span><br><span class="line">16:09:55.669708 IP 172.26.137.130.8000 > 172.26.137.131.19723: Flags [S.], seq 3736478060, ack 12346, win 65160, options [mss 1460,nop,nop,TS val 2982589154 ecr 1732608595], length 0</span><br><span class="line">16:09:55.687943 IP 172.26.137.131.19723 > 172.26.137.130.8000: Flags [R], seq 12346, win 8192, options [TS val 0 ecr 2982589154,eol], length 0</span><br><span class="line">16:09:55.703896 IP 172.26.137.131.19723 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [TS val 1732608595 ecr 0,eol], length 0</span><br><span class="line"></span><br><span class="line">//连接残留:ts 没递增,RST 被忽略,导致连接残留</span><br><span class="line">17:18:26.739344 IP 172.26.137.131.59541 > 172.26.137.130.8000: Flags [S], seq 12345, win 8192, options [TS val 1732612706 ecr 0,eol], length 0</span><br><span class="line">17:18:26.739358 IP 172.26.137.130.8000 > 172.26.137.131.59541: Flags [S.], seq 3510510105, ack 12346, win 65160, options [mss 1460,nop,nop,TS val 2986700224 ecr 1732612706], length 0</span><br><span class="line">17:18:26.756574 IP 172.26.137.131.59541 > 172.26.137.130.8000: Flags [R], seq 12346, win 8192, options [mss 1460,TS val 1732611916 ecr 0,eol], length 0</span><br><span class="line">17:18:26.870569 IP 172.26.137.131.59541 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [TS val 1732612706 ecr 0,eol], length 0</span><br></pre></td></tr></table></figure><p>不会导致连接残留的情况:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">//连接不残留, ts 递增</span><br><span class="line">16:24:33.516952 IP 172.26.137.131.19544 > 172.26.137.130.8000: Flags [S], seq 12345, win 8192, options [TS val 1732609473 ecr 0,eol], length 0</span><br><span class="line">16:24:33.516967 IP 172.26.137.130.8000 > 172.26.137.131.19544: Flags [S.], seq 1834771950, ack 12346, win 65160, options [mss 1460,nop,nop,TS val 2983467001 ecr 1732609473], length 0</span><br><span class="line">16:24:33.539178 IP 
172.26.137.131.19544 > 172.26.137.130.8000: Flags [R], seq 12346, win 8192, options [TS val 1732609473 ecr 0,eol], length 0</span><br><span class="line">16:24:33.556153 IP 172.26.137.131.19544 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [TS val 1732609473 ecr 0,eol], length 0</span><br><span class="line">16:24:33.556164 IP 172.26.137.130.8000 > 172.26.137.131.19544: Flags [R], seq 1834771951, win 0, length 0</span><br><span class="line"></span><br><span class="line">//连接不残留, 有 options 但是没有 ts</span><br><span class="line">17:05:16.217333 IP 172.26.137.131.22567 > 172.26.137.130.8000: Flags [S], seq 12345, win 8192, options [TS val 1732611916 ecr 0,eol], length 0</span><br><span class="line">17:05:16.217351 IP 172.26.137.130.8000 > 172.26.137.131.22567: Flags [S.], seq 3503286934, ack 12346, win 65160, options [mss 1460,nop,nop,TS val 2985909702 ecr 1732611916], length 0</span><br><span class="line">17:05:16.229589 IP 172.26.137.131.22567 > 172.26.137.130.8000: Flags [R], seq 12346, win 8192, options [mss 1460], length 0</span><br><span class="line">17:05:16.346564 IP 172.26.137.131.22567 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [TS val 1732611916 ecr 0,eol], length 0</span><br><span class="line">17:05:16.346578 IP 172.26.137.130.8000 > 172.26.137.131.22567: Flags [R], seq 3503286935, win 0, length 0</span><br><span class="line"></span><br><span class="line">//连接不残留,options 为 null</span><br><span class="line">16:29:38.618811 IP 172.26.137.131.33190 > 172.26.137.130.8000: Flags [S], seq 12345, win 8192, options [TS val 1732609778 ecr 0,eol], length 0</span><br><span class="line">16:29:38.618824 IP 172.26.137.130.8000 > 172.26.137.131.33190: Flags [S.], seq 1867663284, ack 12346, win 65160, options [mss 1460,nop,nop,TS val 2983772103 ecr 1732609778], length 0</span><br><span class="line">16:29:38.647039 IP 172.26.137.131.33190 > 172.26.137.130.8000: Flags [R], seq 12346, win 8192, length 0</span><br><span class="line">16:29:38.670061 IP 172.26.137.131.33190 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [TS val 1732609778 ecr 0,eol], length 0</span><br><span class="line">16:29:38.670073 IP 172.26.137.130.8000 > 172.26.137.131.33190: Flags [R], seq 1867663285, win 0, length 0</span><br><span class="line"></span><br><span class="line">//连接不残留, 有 options ,但 ts 为 nop</span><br><span class="line">17:37:37.476343 IP 172.26.137.131.12345 > 172.26.137.130.8000: Flags [S], seq 2331525453, win 8192, options [mss 1460,nop,nop,TS val 1732613857 ecr 0], length 0</span><br><span class="line">17:37:37.476460 IP 172.26.137.130.8000 > 172.26.137.131.12345: Flags [S.], seq 230155727, ack 2331525454, win 65160, options [mss 1460,nop,nop,TS val 2987850961 ecr 1732613857], length 0</span><br><span class="line">17:37:37.494579 IP 172.26.137.131.12345 > 172.26.137.130.8000: Flags [R], seq 2331525454, win 8192, options [nop,nop,eol], length 0</span><br><span class="line">17:37:37.511431 IP 172.26.137.131.12345 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [nop,nop,TS val 1732613857 ecr 0], length 0</span><br><span class="line">17:37:37.511546 IP 172.26.137.130.8000 > 172.26.137.131.12345: Flags [R], seq 230155728, win 0, length 0</span><br><span class="line">17:37:37.526369 IP 172.26.137.131.12345 > 172.26.137.130.8000: Flags [R], seq 2331525454, win 8192, length 0</span><br></pre></td></tr></table></figure><h3 id="结论"><a href="#结论" class="headerlink" title="结论"></a>结论</h3><p>最终重现的必要条件:<strong>内核在三次握手阶段(TCP_NEW_SYN_RECV),收到的RST 包里有 timestamp 且不递增</strong> 就会丢弃 
RST</p><p>注意:</p><ul><li>如果 RST 的 seq 不递增也会导致连接残留,这属于 seq 回绕了 // /proc/net/netstat 中没找到 有哪个指标对应的监控</li><li>要区分 timestamp 没有和 timestamp 为 0 的情况,为 0 表示有,大概率回绕了//场景 1-3 都忽略了这个问题</li><li>options=[(‘NOP’, None), (‘NOP’, None)]) 表示没有 timestamp,也不能重现问题</li><li>以上案例 2/3/4 场景下 nettrace 看到的 RST 都被 drop 了,但是不妨碍连接的释放 //这个还需要分析为什么连接 RST 起作用了但是还是会 drop RST 包</li><li>如果出现连接残留,也会导致全连接队列增大直到溢出</li><li>三次握手成功后的通信阶段(established),此时只校验 RST 的 seq 有没有回绕,不校验 timestamp,这样连接能正确释放</li></ul><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">//三次握手成功,如果 RST 带的 timestamp 不递增也会正确触发释放连接,也就是 ESTABLISHED 时只校验 RST 的 seq 有没有回绕,不校验 timestamp</span><br><span class="line">//如下抓包的连接被正确释放了,所以 LVS 会用这个逻辑来释放连接,但是一旦乱序就嗝屁了</span><br><span class="line">12:19:58.588218 IP 172.26.137.131.1406 > 172.26.137.130.8000: Flags [S], seq 2800159571, win 8192, options [mss 1460,nop,nop,TS val 1732681198 ecr 0], length 0</span><br><span class="line">12:19:58.588233 IP 172.26.137.130.8000 > 172.26.137.131.1406: Flags [S.], seq 3011503126, ack 2800159572, win 65160, options [mss 1460,nop,nop,TS val 3055192072 ecr 1732681198], length 0</span><br><span class="line">12:19:58.606594 IP 172.26.137.131.1406 > 172.26.137.130.8000: Flags [.], ack 1, win 8192, options [nop,nop,TS val 1732681198 ecr 0], length 0</span><br><span class="line">12:19:58.624392 IP 172.26.137.131.1406 > 172.26.137.130.8000: Flags [R], seq 2800159572, win 8192, options [nop,nop,TS val 0 ecr 0], length 0</span><br></pre></td></tr></table></figure><h4 id="对应的内核-commit"><a href="#对应的内核-commit" class="headerlink" title="对应的内核 commit"></a>对应的内核 commit</h4><p>Server 在握手的第三阶段(TCP_NEW_SYN_RECV),等待对端进行握手的第三步回 ACK时候,如果收到RST 内核会对报文进行PAWS校验,如果 RST 带的 timestamp(TVal) 不递增就会因为通不过 PAWS 校验而被扔掉</p><p>问题引入:<a href="https://github.com/torvalds/linux/commit/7faee5c0d514162853a343d93e4a0b6bb8bfec21" target="_blank" rel="noopener">https://github.com/torvalds/linux/commit/7faee5c0d514162853a343d93e4a0b6bb8bfec21</a> 这个 commit 去掉了TCP_SKB_CB(skb)->when = tcp_time_stamp,导致 3.18 的内核版本linger close主动发送的 RST 中 ts_val为0</p><p>问题修复:<a href="https://github.com/torvalds/linux/commit/675ee231d960af2af3606b4480324e26797eb010" target="_blank" rel="noopener">修复的commit在 675ee231d960af2af3606b4480324e26797eb010</a>,直到 4.10 才合并进内核</p><h4 id="监控"><a href="#监控" class="headerlink" title="监控"></a>监控</h4><p>对应这种握手阶段连接建立如何监控呢?</p><p>从内核代码 net/ipv4/tcp_minisocks.c/tcp_check_req 函数会对报文调用 tcp_paws_reject 函数进行 paws_reject 检测,tcp_paws_reject 如果返回值为true,则 tcp_check_req 返回NULL,并且记录 LINUX_MIB_PAWSESTABREJECTED 计数</p><p>可以观察 /proc/net/netstat 中的监控指标:PAWSEstab</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">//内核中的指标</span><br><span class="line">SNMP_MIB_ITEM("PAWSEstab", LINUX_MIB_PAWSESTABREJECTED)</span><br><span class="line"></span><br><span class="line">//尝试了 5 次 RST的 timestamp 不递增导致的残留,监控到这个值每次变化累加 1</span><br><span class="line">TcpExt:PAWSEstab 1 -> 1 -> 1 -> 1 -> 1</span><br></pre></td></tr></table></figure><p>虽然三次握手没有完成,但是在服务端连接已经是 ESTABLISHED,所以这里的统计指标还是 PAWSEstab,可以通过 netstat -s 来查看:</p><figure class="highlight plain"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">#netstat -s |grep -E -i "timestamp|paws"</span><br><span class="line"> 71 packets rejected in established connections because of timestamp //无论是三次握手阶段的 RST 还是握手成功后的请求只要 timestamp 不递增就会 drop</span><br></pre></td></tr></table></figure><p>这个指标对应在 netstat 源码(net-tools) 中的解释:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">{"PAWSEstab", N_("%llu packets rejected in established connections because of timestamp"), opt_number},</span><br><span class="line"> {"PAWSPassive", N_("%llu passive connections rejected because of time stamp"), opt_number},</span><br></pre></td></tr></table></figure><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>星球里之前也写过 scapy 的入门以及使用案例: <a href="https://articles.zsxq.com/id_6r1xkzwdb8zp.html" target="_blank" rel="noopener">scapy 重现网络问题真香</a></p><p>就像学英语的时候要精读,分析 case 也需要深挖,可以挖上一到两周,不要每天假学习(似乎啥都看了,当时啥都懂,过几个月啥都不懂)</p><p>掌握技能比掌握知识点和问题的原因更重要</p><p>nettrace 也真的很好用/很好玩,可以帮你学到很多内核知识</p><h2 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>参考资料</h2><p><a href="https://cloud.tencent.com/developer/article/2210423" target="_blank" rel="noopener">https://cloud.tencent.com/developer/article/2210423</a></p><p><a href="https://articles.zsxq.com/id_52ha2j6r5gow.html" target="_blank" rel="noopener">为什么你的 SYN 包被丢 net.ipv4.tcp_tw_recycle</a></p><p><a href="https://articles.zsxq.com/id_6r1xkzwdb8zp.html" target="_blank" rel="noopener">从一个fin 卡顿问题到 scapy 的使用</a></p><h2 id="如果你觉得看完对你很有帮助可以通过如下方式找到我"><a href="#如果你觉得看完对你很有帮助可以通过如下方式找到我" class="headerlink" title="如果你觉得看完对你很有帮助可以通过如下方式找到我"></a>如果你觉得看完对你很有帮助可以通过如下方式找到我</h2><p>find me on twitter: <a href="https://twitter.com/plantegg" target="_blank" rel="noopener">@plantegg</a></p><p>知识星球:<a href="https://t.zsxq.com/0cSFEUh2J" target="_blank" rel="noopener">https://t.zsxq.com/0cSFEUh2J</a></p><p>开了一个星球,在里面讲解一些案例、知识、学习方法,肯定没法让大家称为顶尖程序员(我自己都不是),只是希望用我的方法、知识、经验、案例作为你的垫脚石,帮助你快速、早日成为一个基本合格的程序员。</p><p>争取在星球内:</p><ul><li>养成基本动手能力</li><li>拥有起码的分析推理能力–按我接触的程序员,大多都是没有逻辑的</li><li>知识上教会你几个关键的知识点</li></ul><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240324161113874-5525714.png" alt="image-20240324161113874" style="zoom:50%;">]]></content>
<summary type="html">
<h1 id="一次网络连接残留的分析"><a href="#一次网络连接残留的分析" class="headerlink" title="一次网络连接残留的分析"></a>一次网络连接残留的分析</h1><p>本来放在知识星球的收费文章,也网络直播给星球成员讲解过这个问题以及这
</summary>
<category term="tcp" scheme="https://plantegg.github.io/categories/tcp/"/>
<category term="scapy" scheme="https://plantegg.github.io/tags/scapy/"/>
<category term="tcp" scheme="https://plantegg.github.io/tags/tcp/"/>
<category term="debug" scheme="https://plantegg.github.io/tags/debug/"/>
<category term="nettrace" scheme="https://plantegg.github.io/tags/nettrace/"/>
</entry>
<entry>
<title>tcp会偶尔3秒timeout的分析以及如何用php规避这个问题</title>
<link href="https://plantegg.github.io/2024/11/02/tcp%E4%BC%9A%E5%81%B6%E5%B0%943%E7%A7%92timeout/"/>
<id>https://plantegg.github.io/2024/11/02/tcp会偶尔3秒timeout/</id>
<published>2024-11-02T09:30:03.000Z</published>
<updated>2024-11-20T07:08:06.252Z</updated>
<content type="html"><![CDATA[<h1 id="tcp会偶尔3秒timeout的分析以及如何用php规避这个问题"><a href="#tcp会偶尔3秒timeout的分析以及如何用php规避这个问题" class="headerlink" title="tcp会偶尔3秒timeout的分析以及如何用php规避这个问题"></a><a href="https://web.archive.org/web/20170317084941/http://mogu.io/tcp-three-second-timeout-with-php-3" target="_blank" rel="noopener">tcp会偶尔3秒timeout的分析以及如何用php规避这个问题</a></h1><blockquote><p>这是一篇好文章,随着蘑菇街的完蛋,蘑菇街技术博客也没了,所以特意备份一下这篇</p></blockquote><ul><li><p>作者:蚩尤 </p></li><li><p>时间:May 27, 2014</p></li></ul><p>2年前做一个cache中间件调用的时候,发现很多通过php的curl调用一个的服务会出现偶尔的connect_time超时, 表现为get_curlinfo的connect_time在3秒左右, 本来没怎么注意, 因为客户端的curl_timeout设置的就是3秒, 某天, 我把这个timeout改到了5秒后, 发现了一个奇怪的现象, 很多慢请求依旧表现为connect_time在3秒左右..看来这个3秒并不是因为客户端设置的timeout引起的.于是开始查找这个原因.</p><hr><p>首先, 凭借经验调整了linux内核关于tcp的几个参数</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">net.core.netdev_max_backlog = 862144</span><br><span class="line">net.core.somaxconn = 262144</span><br></pre></td></tr></table></figure><p>经过观察发现依旧会有3秒超时, 而且数量并没有减少.</p><p>第二步, 排除是大并发导致的问题, 在一台空闲机器上也部署同样的服务, 仅让线上一台机器跑空闲机器的服务, 结果发现依旧会有报错.排除并发导致的问题.</p><p>最后, 通过查了大量的资料才发现并不是我们才遇到过这个问题, 而且这个问题并不是curl的问题, 它影响到所有tcp的调用, 网上各种说法, 但结论都指向linux内核对于tcp的实现.(某些版本会出现这些问题), 有兴趣的可以看下下面这两个资料.<br><a href="https://web.archive.org/web/20170317084941/http://www.spinics.net/lists/linux-net/msg17545.html" target="_blank" rel="noopener">资料1</a><br><a href="https://web.archive.org/web/20170317084941/http://marc.info/?t=120655182600018&r=1&w=2" target="_blank" rel="noopener">资料2</a></p><p>一看深入到linux内核..不管怎样修改的成本一定很大..于是乎, 发挥我们手中的php来规避这个问题的时间到了.</p><p>原本的代码, 简单实现,常规curl调用:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">function curl_call($p1, $p2 ...) 
{</span><br><span class="line"> $ch = curl_init();</span><br><span class="line"> curl_setopt($ch, CURLOPT_TIMEOUT, 5);</span><br><span class="line"> curl_setopt($ch, CURLOPT_URL, 'http://demon.at');</span><br><span class="line"> $res = curl_exec($ch);</span><br><span class="line"> if (false === $res) {</span><br><span class="line"> //失败..抛异常..</span><br><span class="line"> }</span><br><span class="line"> return $res;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>可以看出, 如果用上面的代码, 无法避免3秒connect_time的问题..这种实现对curl版本会有要求(CURLOPT_CONNECTTIMEOUT_MS),主要的思路是,通过对链接时间进行毫秒级的控制(因为超时往往发生在connect的时候),加上失败重试机制,来最大限度保证调用的正确性。所以,下面的代码就诞生了:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line">function curl_call($p1, $p2, $times = 1) {</span><br><span class="line"> $ch = curl_init();</span><br><span class="line"> curl_setopt($ch, CURLOPT_TIMEOUT, 5);</span><br><span class="line"> curl_setopt($ch, CURLOPT_URL, 'http://demon.at');</span><br><span class="line"> $curl_version = curl_version();</span><br><span class="line"> if ($curl_version['version_number'] >= 462850) {</span><br><span class="line"> curl_setopt($ch, CURLOPT_CONNECTTIMEOUT_MS, 20);</span><br><span class="line"> curl_setopt($ch, CURLOPT_NOSIGNAL, 1);</span><br><span class="line"> } else {</span><br><span class="line"> throw new Exception('this curl version is too low, version_num : ' </span><br><span class="line"> . 
$curl_version['version']);</span><br><span class="line"> }</span><br><span class="line"> $res = curl_exec($ch);</span><br><span class="line"> curl_close($ch);</span><br><span class="line"> if (false === $res) {</span><br><span class="line"> if (curl_errno($ch) == CURLE_OPERATION_TIMEOUTED</span><br><span class="line"> and $times != 最大重试阀值 ) {</span><br><span class="line"> $times += 1;</span><br><span class="line"> return curl_call($p1, $p2, $times);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> return $res;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>上面这段代码只是一个规避的简单实例, 一些小细节并没有可以完善..比如抛出异常常以后curl资源的手动释放等等..这里不做讨论..当然还漏了一点要说的是,对重试次数最好加上限制 :)</p><p>说明一下上面几个数字值的含义:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">462850 //因为php的CURLOPT_CONNECTTIMEOUT_MS需要 curl_version 7.16.2,这个值就是这个版本的数字版本号,还需要注意的是, php版本要大于5.2.3</span><br><span class="line">20 //连接超时的时间, 单位:ms</span><br></pre></td></tr></table></figure><hr><p>这样这个问题就这样通过php的代码来规避开了.<br>如果有对这个问题有更好的解决方法,欢迎指教.</p><hr><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p><a href="https://mp.weixin.qq.com/s/-pRA12sLJktbXa-srWn02w" target="_blank" rel="noopener">tcp connect 的流程是这样的</a>:<br>1、tcp发出SYN建链报文后,报文到ip层需要进行路由查询<br>2、路由查询完成后,报文到arp层查询下一跳mac地址<br>3、如果本地没有对应网关的arp缓存,就需要缓存住这个报文,发起arp请求<br>4、arp层收到arp回应报文之后,从缓存中取出SYN报文,完成mac头填写并发送给驱动。</p><p>问题在于,arp层缓存队列长度默认为3。如果你运气不好,刚好赶上缓存已满,这个报文就会被丢弃。</p><p>TCP层发现SYN报文发出去3s(默认值)还没有回应,就会重发一个SYN。这就是为什么少数连接会3s后才能建链。</p><p>幸运的是,arp层缓存队列长度是可配置的,用 sysctl -a | grep unres_qlen 就能看到,默认值为3。</p>]]></content>
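<p>原文用 PHP 的 CURLOPT_CONNECTTIMEOUT_MS 加重试来规避 3 秒的 SYN 重传,这个思路和语言无关。下面是一个 Python 版的最小示意(host、端口、20ms 连接超时、重试次数都是演示用的假设值),只表达"连接阶段用毫秒级超时 + 失败立即重试"这一个点,不是完整的生产实现:</p>
<pre><code class="python">import socket

def connect_with_retry(host, port, connect_timeout=0.02, max_retries=3):
    """连接阶段用很短的超时(20ms),超时就立刻重试,避免在内核 3 秒的 SYN 重传上干等"""
    last_err = None
    for _ in range(max_retries):
        try:
            # create_connection 只负责三次握手,timeout 作用在 connect 阶段
            sock = socket.create_connection((host, port), timeout=connect_timeout)
            sock.settimeout(5)      # 握手成功后,读写超时放宽到 5 秒
            return sock
        except OSError as e:        # 包含 socket.timeout
            last_err = e            # 连接超时/失败,马上进入下一次重试
    raise last_err

# sock = connect_with_retry("demon.at", 80)   # 域名沿用原文示例,仅作演示
</code></pre>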
<summary type="html">
<h1 id="tcp会偶尔3秒timeout的分析以及如何用php规避这个问题"><a href="#tcp会偶尔3秒timeout的分析以及如何用php规避这个问题" class="headerlink" title="tcp会偶尔3秒timeout的分析以及如何用php规避
</summary>
<category term="TCP" scheme="https://plantegg.github.io/categories/TCP/"/>
<category term="TCP" scheme="https://plantegg.github.io/tags/TCP/"/>
<category term="TCP connection" scheme="https://plantegg.github.io/tags/TCP-connection/"/>
<category term="unres_qlen" scheme="https://plantegg.github.io/tags/unres-qlen/"/>
<category term="arp" scheme="https://plantegg.github.io/tags/arp/"/>
</entry>
<entry>
<title>tcpdump 抓包卡顿分析</title>
<link href="https://plantegg.github.io/2024/10/13/tcpdump%E6%8A%93%E5%8C%85%E5%8D%A1%E9%A1%BF%E5%88%86%E6%9E%90/"/>
<id>https://plantegg.github.io/2024/10/13/tcpdump抓包卡顿分析/</id>
<published>2024-10-13T09:30:03.000Z</published>
<updated>2024-11-20T10:00:54.801Z</updated>
<content type="html"><![CDATA[<h2 id="tcpdump-抓包卡顿分析"><a href="#tcpdump-抓包卡顿分析" class="headerlink" title="tcpdump 抓包卡顿分析"></a>tcpdump 抓包卡顿分析</h2><h3 id="背景"><a href="#背景" class="headerlink" title="背景"></a>背景</h3><p>从 192.168.104.1 上执行 ping 192.168.104.4 -c 1 ping 命令很快通了, 同时在ubuntu 机(192.168.104.4) 上抓包</p><p>在192.168.104.4 上的 tcpdump 要卡很久(几十秒)后才输出几十秒前抓到的包 :(,最一开始以为是自己通过 lima 虚拟化的 ubuntu 机器慢 or tcpdump 初始化慢导致的,但是发现等了几十秒后能看到几十秒前抓到的包,感觉有点诡异,所以分析了一下原因。</p><p>既然几十秒后能看到几十秒前的包,说明抓包正常,只是哪里卡了,所以用 strace 看看卡在了哪里。</p><p>下文用到的主要的 Debug 命令:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">//-r 打印相对时间</span><br><span class="line">//-s 256 表示--string-limit,设置 limit 为 256,可以显示 sendto(下图黄底) 系统调用完整的 DNS 查询字符串(下图绿线)</span><br><span class="line">strace -r -s 256 tcpdump -i eth0 icmp</span><br></pre></td></tr></table></figure><p>分析步骤如下:</p><h3 id="步骤-1"><a href="#步骤-1" class="headerlink" title="步骤 1"></a>步骤 1</h3><p>如下图是 strace -r -s 256 tcpdump -i eth0 icmp 命令的输出 ,发现抓到包后对 IP 192.168.104.4 去做了 DNS 解析,而这个解析发给 127.0.0.53 后长时间没有响应,5 秒超时后并重试(下图红框),导致多次 5 秒超时卡顿:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20241008144023596.png" alt="image-20241008144023596"></p><p>于是在 /etc/hosts 添加 192.168.104.4 localhost 后不再对 192.168.104.4 进行解析,但是仍然会对对端的 IP 192.168.104.1 进行解析:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20241008144145663.png" alt="image-20241008144145663"></p><p>上图说明:</p><ul><li>上图最后一个绿线表示 tcpdump 抓到了 ping 包(ICMP 协议包)</li><li>\0011\003104\003168\003192 表示:192.168.104.1 ,\0011 前面的 \001 表示 1 位,1 表示 ip 地址值的最后一个 //把整个双引号内容丢给 GPT 会给你一个很好的解释</li></ul><h3 id="步骤-2"><a href="#步骤-2" class="headerlink" title="步骤 2"></a>步骤 2</h3><p>从上面两个图中的 connect 内核函数可以看到每次都把 ip 丢给了 127.0.0.53 这个特殊 IP 来解析,下面是 GPT 给出的解释,我试了下将 DNSStubListener=no(修改配置文件:/etc/systemd/resolved.conf 后执行 systemctl restart systemd-resolved) 后 tcpdump 完全不卡了:</p><p>systemd-resolved:</p><ol><li>systemd-resolved 是一个系统服务,负责为本地应用程序提供网络名称解析。</li><li>它作为一个本地 DNS 解析器和缓存,可以提高 DNS 查询的效率。</li><li>systemd-resolved 支持多种 DNS 协议,如 DNSSEC、DNS over TLS 等。</li><li>它可以管理多个网络接口的 DNS 设置,适合复杂的网络环境。</li></ol><p>DNSStubListener 参数:</p><ol><li>DNSStubListener 是 systemd-resolved 的一个功能,默认情况下是启用的(yes)。</li><li>当启用时,systemd-resolved 会在本地 127.0.0.53 地址上运行一个 DNS 存根监听器。</li><li>这个存根监听器会接收本地应用程序的 DNS 查询请求,然后转发给实际的 DNS 服务器。</li><li>当设置 DNSStubListener=no 时:<ul><li>存根监听器被禁用。</li><li>本地应用程序的 DNS 查询将直接发送到配置的 DNS 服务器,而不经过 systemd-resolved</li></ul></li></ol><p>现在 tcpdump 虽然不卡了,但是抓包的时候通过 strace 看到还是会走 DNS 解析流程,这个时候的 DNS 解析都发给了 192.168.104.2:53 (配置在 /etc/resolv.conf 中),也就是 systemd-resolved 的 127.0.0.53:53 udp 端口虽然在监听,但是不响应任何查询导致了超时,而 192.168.104.2:53 服务正常</p><p>这个时候的 strace 日志:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"> 0.000308 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 5 //SOCK_DGRAM UDP 
模式</span><br><span class="line"> 0.000134 setsockopt(5, SOL_IP, IP_RECVERR, [1], 4) = 0</span><br><span class="line"> 0.000414 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.104.2")}, 16) = 0 //目标主机 192.168.104.2</span><br><span class="line"> 0.000373 ppoll([{fd=5, events=POLLOUT}], 1, {tv_sec=0, tv_nsec=0}, NULL, 0) = 1 ([{fd=5, revents=POLLOUT}], left {tv_sec=0, tv_nsec=0})</span><br><span class="line"> 0.000348 sendto(5, "e\323\1\0\0\1\0\0\0\0\0\0\0014\003104\003168\003192\7in-addr\4arpa\0\0\f\0\1", 44, MSG_NOSIGNAL, NULL, 0) = 44 //发送 DNS 查询,这里可能会超时等待</span><br><span class="line"> 0.000610 ppoll([{fd=5, events=POLLIN}], 1, {tv_sec=5, tv_nsec=0}, NULL, 0) = 1 ([{fd=5, revents=POLLIN}], left {tv_sec=4, tv_nsec=999999042})</span><br><span class="line"> 0.000203 ioctl(5, FIONREAD, [44]) = 0</span><br><span class="line"> //这次 0.000136 秒后收到了响应</span><br><span class="line"> 0.000136 recvfrom(5, "e\323\201\200\0\1\0\0\0\0\0\0\0014\003104\003168\003192\7in-addr\4arpa\0\0\f\0\1", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.104.2")}, [28 => 16]) = 44</span><br><span class="line"> 0.000462 close(5) = 0</span><br><span class="line"> 0.000249 write(1, "17:01:20.316738 IP 192.168.104.1 > 192.168.104.4: ICMP echo request, id 31, seq 1, length 64\n", 9317:01:20.316738 IP 192.168.104.1 > 192.168.104.4: ICMP echo request, id 31, seq 1, length 64</span><br><span class="line">) = 93</span><br><span class="line"> 0.000306 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=561, ...}, 0) = 0</span><br><span class="line"> 0.000269 write(1, "17:01:20.316795 IP 192.168.104.4 > 192.168.104.1: ICMP echo reply, id 31, seq 1, length 64\n", 9117:01:20.316795 IP 192.168.104.4 > 192.168.104.1: ICMP echo reply, id 31, seq 1, length 64</span><br></pre></td></tr></table></figure><h3 id="步骤-3"><a href="#步骤-3" class="headerlink" title="步骤 3"></a>步骤 3</h3><p>到这里大概理解这是 tcpdump 引入的 DNS 反查,看了下 tcpdump 帮助完全可以用 -n 参数彻底关闭 DNS 反查 IP:</p><blockquote><p>tcpdump 命令可以关闭 DNS 反查功能。要禁用 DNS 反查,你可以使用 <code>-n</code> 选项;// 我用 tcpdump -n 这么久真没留意这个 -n 具体干啥的,每次都是条件反射写上去的 :( </p></blockquote><h3 id="小结"><a href="#小结" class="headerlink" title="小结"></a>小结</h3><p>其实很多应用中会偶尔卡顿,网络操作超时就是典型的导致这种卡顿的原因,从 CPU 资源使用率上还发现不了。比如<a href="https://plantegg.github.io/2019/06/02/%E5%8F%B2%E4%B8%8A%E6%9C%80%E5%85%A8_SSH_%E6%9A%97%E9%BB%91%E6%8A%80%E5%B7%A7%E8%AF%A6%E8%A7%A3--%E6%94%B6%E8%97%8F%E4%BF%9D%E5%B9%B3%E5%AE%89/#%E4%B8%BA%E4%BB%80%E4%B9%88%E6%9C%89%E6%97%B6%E5%80%99ssh-%E6%AF%94%E8%BE%83%E6%85%A2%EF%BC%8C%E6%AF%94%E5%A6%82%E6%80%BB%E6%98%AF%E9%9C%80%E8%A6%8130%E7%A7%92%E9%92%9F%E5%90%8E%E6%89%8D%E8%83%BD%E6%AD%A3%E5%B8%B8%E7%99%BB%E5%BD%95">日常 ssh 连服务器有时候就会卡 30 秒</a></p><p>关于 GSSAPIAuthentication 解释如下,一看也是需要走网络进行授权认证,如果没有配置 kerberos 服务就会卡在网络等待上:</p><blockquote><p>[!TIP]</p><p>SSH 中的 GSSAPIAuthentication(Generic Security Services Application Program Interface Authentication)是一种身份验证机制,主要用于实现单点登录(Single Sign-On, SSO)功能。它允许用户在已经通过 Kerberos 认证的环境中,无需再次输入密码就可以登录到支持 GSSAPI 的 SSH 服务器。</p></blockquote><p>类似的网络卡顿/DNS 解析卡顿是很常见的,大家掌握好 Debug 手段。</p><p>实际生产中可能没这么好重现也不太好分析,比如我就碰到过 Java 程序都卡在 DNS 解析的问题,Java 中这个 DNS 解析是串行的,所以一般可以通过 jstack 看看堆栈,多个锁窜行等待肯定不正常;多次抓到 DNS 解析肯定也不正常</p><p>比如下面这个 jstack 堆栈正常是不应该出现的,如果频繁出现就说明在走 DNS 查机器名啥的</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br></pre></td><td class="code"><pre><span class="line">"Diagnose@diagnose-2-61" #616 daemon prio=5 os_prio=0 tid=0x00007f7668ba6000 nid=0x2fc runnable [0x00007f75dbea8000]</span><br><span class="line"> java.lang.Thread.State: RUNNABLE</span><br><span class="line"> at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)</span><br><span class="line"> at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:870)</span><br><span class="line"> at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1312)</span><br><span class="line"> at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:818)</span><br><span class="line"> - locked <0x0000000500340c10> (a java.net.InetAddress$NameServiceAddresses)</span><br><span class="line"> at java.net.InetAddress.getAllByName0(InetAddress.java:1301)</span><br><span class="line"> at java.net.InetAddress.getAllByName0(InetAddress.java:1221)</span><br><span class="line"> at java.net.InetAddress.getHostFromNameService(InetAddress.java:640)</span><br><span class="line"> at java.net.InetAddress.getHostName(InetAddress.java:565)</span><br><span class="line"> at java.net.InetAddress.getHostName(InetAddress.java:537)</span><br><span class="line"> at java.net.InetSocketAddress$InetSocketAddressHolder.getHostName(InetSocketAddress.java:82)</span><br><span class="line"> at java.net.InetSocketAddress$InetSocketAddressHolder.access$600(InetSocketAddress.java:56)</span><br><span class="line"> at java.net.InetSocketAddress.getHostName(InetSocketAddress.java:345)</span><br><span class="line"> at io.grpc.internal.ProxyDetectorImpl.detectProxy(ProxyDetectorImpl.java:127)</span><br><span class="line"> at io.grpc.internal.ProxyDetectorImpl.proxyFor(ProxyDetectorImpl.java:118)</span><br><span class="line"> at io.grpc.internal.InternalSubchannel.startNewTransport(InternalSubchannel.java:207)</span><br><span class="line"> at io.grpc.internal.InternalSubchannel.obtainActiveTransport(InternalSubchannel.java:188)</span><br><span class="line"> - locked <0x0000000500344d38> (a java.lang.Object)</span><br><span 
class="line"> at io.grpc.internal.ManagedChannelImpl$SubchannelImpl.requestConnection(ManagedChannelImpl.java:1130)</span><br><span class="line"> at io.grpc.PickFirstBalancerFactory$PickFirstBalancer.handleResolvedAddressGroups(PickFirstBalancerFactory.java:79)</span><br><span class="line"> at io.grpc.internal.ManagedChannelImpl$NameResolverListenerImpl$1NamesResolved.run(ManagedChannelImpl.java:1032)</span><br><span class="line"> at io.grpc.internal.ChannelExecutor.drain(ChannelExecutor.java:73)</span><br><span class="line"> at io.grpc.internal.ManagedChannelImpl$4.get(ManagedChannelImpl.java:403)</span><br><span class="line"> at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:238)</span><br><span class="line"></span><br><span class="line">"Check@diagnose-1-107" #849 daemon prio=5 os_prio=0 tid=0x00007f600ee44200 nid=0x3e5 runnable [0x00007f5f12545000]</span><br><span class="line"> java.lang.Thread.State: RUNNABLE</span><br><span class="line"> at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)</span><br><span class="line"> at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:870)</span><br><span class="line"> at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1312)</span><br><span class="line"> at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:818)</span><br><span class="line"> - locked <0x000000063ee00098> (a java.net.InetAddress$NameServiceAddresses)</span><br><span class="line"> at java.net.InetAddress.getAllByName0(InetAddress.java:1301)</span><br><span class="line"> at java.net.InetAddress.getAllByName(InetAddress.java:1154)</span><br><span class="line"> at java.net.InetAddress.getAllByName(InetAddress.java:1075)</span><br><span class="line"> at java.net.InetAddress.getByName(InetAddress.java:1025)</span><br><span class="line"> at *.*.*.*.*.check.Utils.isIPv6(Utils.java:59)</span><br><span class="line"> at *.*.*.*.*.check.checker.AbstractCustinsChecker.getVipCheckPoint(AbstractCustinsChecker.java:189)</span><br><span class="line"> at *.*.*.*.*.*.*.MySQLCustinsChecker.getVipCheckPoint(MySQLCustinsChecker.java:160)</span><br><span class="line"> at *.*.*.*.*.*.*.MySQLCustinsChecker.getCheckPoints(MySQLCustinsChecker.java:133)</span><br><span class="line"> at *.*.*.*.*.check.checker.AbstractCustinsChecker.checkNormal(AbstractCustinsChecker.java:314)</span><br><span class="line"> at *.*.*.*.*.check.checker.CheckExecutorImpl.check(CheckExecutorImpl.java:186)</span><br><span class="line"> at *.*.*.*.*.check.checker.CheckExecutorImpl.lambda$0(CheckExecutorImpl.java:118)</span><br><span class="line"> at *.*.*.*.*.check.checker.CheckExecutorImpl$$Lambda$302/130696248.call(Unknown Source)</span><br><span class="line"> at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)</span><br><span class="line"> at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)</span><br><span class="line"> at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)</span><br><span class="line"> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)</span><br><span class="line"> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)</span><br><span class="line"> at java.lang.Thread.run(Thread.java:879)</span><br></pre></td></tr></table></figure><p>这里以后可以加更多的 DNS 解析卡顿/网络卡顿导致的问题案例……</p>]]></content>
<summary type="html">
<h2 id="tcpdump-抓包卡顿分析"><a href="#tcpdump-抓包卡顿分析" class="headerlink" title="tcpdump 抓包卡顿分析"></a>tcpdump 抓包卡顿分析</h2><h3 id="背景"><a href="#背景"
</summary>
<category term="tcpdump" scheme="https://plantegg.github.io/categories/tcpdump/"/>
<category term="performance" scheme="https://plantegg.github.io/tags/performance/"/>
<category term="Linux" scheme="https://plantegg.github.io/tags/Linux/"/>
<category term="tcpdump" scheme="https://plantegg.github.io/tags/tcpdump/"/>
<category term="strace" scheme="https://plantegg.github.io/tags/strace/"/>
</entry>
<entry>
<title>教科书级的根因推导——必做题</title>
<link href="https://plantegg.github.io/2024/10/12/%E6%95%99%E7%A7%91%E4%B9%A6%E7%BA%A7%E7%9A%84%E6%A0%B9%E5%9B%A0%E6%8E%A8%E5%AF%BC%E2%80%94%E2%80%94%E5%BF%85%E5%81%9A%E9%A2%98/"/>
<id>https://plantegg.github.io/2024/10/12/教科书级的根因推导——必做题/</id>
<published>2024-10-12T09:30:03.000Z</published>
<updated>2024-11-20T10:00:55.574Z</updated>
<content type="html"><![CDATA[<h1 id="教科书级的根因推导——必做题"><a href="#教科书级的根因推导——必做题" class="headerlink" title="教科书级的根因推导——必做题"></a>教科书级的根因推导——必做题</h1><h2 id="问题描述"><a href="#问题描述" class="headerlink" title="问题描述"></a>问题描述</h2><p>A服务访问 B 服务,突然在某个时间点有个访问毛刺,RT 从50 ms飙到了80 ms,如下图</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240607210416189.png" alt="image-20240607210416189"></p><p>这个时候发现网络连接数也从10000 涨到了 11000</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240607210602281.png" alt="image-20240607210602281"></p><p>当时的QPS 一直是 2万,没有任何明显变化,任何其它指标都没有变化</p><h2 id="请回答问题"><a href="#请回答问题" class="headerlink" title="请回答问题"></a>请回答问题</h2><ol><li>到底是 B服务慢了所以 RT 上涨,RT 上涨后触发了新建连接,还是突然大量新建导致 B服务慢了,请写出你的详细推导</li><li>你如何在A 端来验证这个问题;你又如何在 B段来证明这个问题</li></ol><h2 id="我的分析"><a href="#我的分析" class="headerlink" title="我的分析"></a>我的分析</h2><p>首先是所有其他指标都正常,查下来看到的变化就是RT、总连接数同时抖了,所以以下分析都是基于在这个情形下,这两个指标到底谁是因、谁是果</p><p>分析的基本原则就是星球里最重要的概念:<a href="https://wx.zsxq.com/dweb2/index/topic_detail/814282542228452" target="_blank" rel="noopener">QPS、并发、RT 的关系</a></p><h3 id="为什么说连接数上涨是根因?"><a href="#为什么说连接数上涨是根因?" class="headerlink" title="为什么说连接数上涨是根因?"></a><strong>为什么说连接数上涨是根因?</strong></h3><p>抖动前 rt 50ms,QPS 2万,计算下来一个连接能扛的 QPS 是20( 1000ms/50ms =20 QPS 1秒等于1000ms)</p><p>1000个活跃连接就可以扛住这 2万的QPS,而总连接数在抖动前是10000,也就是连接数的水位只需要10% 就够了。按照抖动时的rt 80ms 则这10000个连接是可以扛 12.5万QPS 才会触发连接数不够创建新连接(理想值,也就是在QPS 到12.5万的80% 之前触发连接数不够的概率极小极小)</p><p>一个很关键的点:新建连接是业务端的行为,除非服务端太慢导致连接不够才会触发客户端新建,否则都是业务端的锅</p><p>几个注意的地方:</p><ul><li>另外一个注意下抖动的时候也没有触发业务端有超时报错(80ms 只是平均值),如果真有超时报错可能会丢掉老连接,创建或者取新连接重试</li><li>实际上连接有总连接数、活跃连接数,总连接就是我们这里说的1万,活跃连接对应的就是 1000——也就是你随机去看业务状态,有1000个连接在忙着做业务处理/查询,还有9000个连接在睡大觉</li></ul><h3 id="如何验证?"><a href="#如何验证?" class="headerlink" title="如何验证?"></a><strong>如何验证?</strong></h3><ol><li>让客户建1000-2000 个新连接看看——应该会触发RT 飚一下,但不一定是充分条件,实际在同一个客户的其他实例上也有抖动的场景里没有触发新建连接——相当于间接验证</li><li>或者让客户在他们的网卡上加 30ms模拟抖动从50ms加到80ms,看会不会触发新建几百个连接,如果没有触发新建说明RT 这个幅度的上涨不会触发新建连接</li></ol><p>不知道我解释清楚了没有</p>]]></content>
<summary type="html">
<h1 id="教科书级的根因推导——必做题"><a href="#教科书级的根因推导——必做题" class="headerlink" title="教科书级的根因推导——必做题"></a>教科书级的根因推导——必做题</h1><h2 id="问题描述"><a href="#问
</summary>
<category term="performance" scheme="https://plantegg.github.io/categories/performance/"/>
<category term="performance" scheme="https://plantegg.github.io/tags/performance/"/>
<category term="network" scheme="https://plantegg.github.io/tags/network/"/>
</entry>
<entry>
<title>为什么你的连接不均衡了?</title>
<link href="https://plantegg.github.io/2024/10/11/%E4%B8%BA%E4%BB%80%E4%B9%88%E4%BD%A0%E7%9A%84%E8%BF%9E%E6%8E%A5%E4%B8%8D%E5%9D%87%E8%A1%A1%E4%BA%86/"/>
<id>https://plantegg.github.io/2024/10/11/为什么你的连接不均衡了/</id>
<published>2024-10-11T09:30:03.000Z</published>
<updated>2024-11-20T10:00:55.418Z</updated>
<content type="html"><![CDATA[<h1 id="为什么你的连接不均衡了?"><a href="#为什么你的连接不均衡了?" class="headerlink" title="为什么你的连接不均衡了?"></a>为什么你的连接不均衡了?</h1><h2 id="场景"><a href="#场景" class="headerlink" title="场景"></a>场景</h2><p>假如你有两个Redis 服务,挂载在一个LVS 下,然后客户端使用的Jedis,Jedis 配置的最大连接池是200个连接,最小是100个(也就是超过100个,当闲置一段时间后就释放掉)。然后过一阵假设来了一个访问高峰,把连接数打到200,过一会高峰过去连接就会释放到100,客户端每次取连接然后随便 get 以下就归还连接</p><p><strong>场景构造小提示</strong>:</p><ol><li>用Jedis;</li><li>构造流量一波一波,就是有流量高峰(触发新建连接)、有流量低峰(触发连接释放),如此反复</li><li>不需要太大流量把Redis 节点打到出现瓶颈</li></ol><p>如下图:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240618202012463.png" alt="image-20240618202012463"></p><p>期待场景:在这个过程中,Jedis 每次取一个连接随便get 一个key 就行了,无论怎么折腾两个Redis Service 的连接数基本是均衡的,实际也确实是这样</p><p>比如可以这样设置Jedis 参数(你也可以随便改),也可以用你们生产环境</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">JedisPoolConfig config = new JedisPoolConfig();</span><br><span class="line">config.setMaxIdle(100);</span><br><span class="line">config.setMaxTotal(200);</span><br><span class="line">config.setMinEvictableIdleTimeMillis(3000);</span><br><span class="line">config.setTimeBetweenEvictionRunsMillis(1000);</span><br><span class="line">config.setTestOnBorrow(false);</span><br><span class="line">config.setTestOnReturn(false);</span><br><span class="line">config.setTestWhileIdle(false);</span><br><span class="line">config.setTestOnCreate(false);</span><br></pre></td></tr></table></figure><p>验证代码</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span 
class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br></pre></td><td class="code"><pre><span class="line">import com.taobao.eagleeye.redis.clients.jedis.Jedis;</span><br><span class="line">import com.taobao.eagleeye.redis.clients.jedis.JedisPool;</span><br><span class="line">import com.taobao.eagleeye.redis.clients.jedis.JedisPoolConfig;</span><br><span class="line"></span><br><span class="line">public class JedisPoolTest {</span><br><span class="line"> // 初始化连接超时时间</span><br><span class="line"> private static final int DEFAULT_CONNECTION_TIMEOUT = 5000;</span><br><span class="line"> // 查询超时时间</span><br><span class="line"> private static final int DEFAULT_SO_TIMEOUT = 2000;</span><br><span class="line"> private static final JedisPoolConfig config = new JedisPoolConfig();</span><br><span class="line"> private static JedisPool jedisPool = null;</span><br><span class="line"></span><br><span class="line"> public static void main(String args[]) {</span><br><span class="line"> // 代理连接地址,用控制台上的"代理地址"。</span><br><span class="line"> String host = "redis";</span><br><span class="line"> int port = 6379;</span><br><span class="line"> //String password = "1234";</span><br><span class="line"></span><br><span class="line"> // 设置参考上面</span><br><span class="line"> config.setMaxTotal(xx);</span><br><span class="line"> config.setMaxIdle(xx);</span><br><span class="line"> config.setMinIdle(xx);</span><br><span class="line"> </span><br><span class="line"></span><br><span class="line"> // 只需要初始化一次</span><br><span class="line"> try {</span><br><span class="line"> jedisPool = new JedisPool(config, host, port, </span><br><span class="line"> DEFAULT_CONNECTION_TIMEOUT, DEFAULT_SO_TIMEOUT, password, 0, null);</span><br><span class="line"> try (Jedis jedis = jedisPool.getResource()) {</span><br><span class="line"> if (!"PONG".equals(jedis.ping())) {</span><br><span class="line"> throw new RuntimeException("Init Failed");</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> } catch (Exception e) {</span><br><span class="line"> // 如果有exception,说明初始化失败。</span><br><span class="line"> e.printStackTrace();</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> // 每次 API 查询都像下面这么写</span><br><span class="line"> Jedis jedis = null;</span><br><span class="line"> try {</span><br><span class="line"> jedis = jedisPool.getResource(); // 查询前获取一个连接</span><br><span class="line"> String ret = jedis.set("key", "value");</span><br><span class="line"> if ("OK".equals(ret)) {</span><br><span class="line"> System.out.println(ret);</span><br><span class="line"> // SET success</span><br><span class="line"> }</span><br><span class="line"> } catch (Exception e) {</span><br><span class="line"> e.printStackTrace();</span><br><span class="line"> // 连接错误,超时等情况</span><br><span class="line"> } finally {</span><br><span class="line"> if (jedis != null) {</span><br><span class="line"> // 查询结束后还回连接池,不是销毁连接</span><br><span class="line"> // 必须尽快还回,否则会导致连接池资源不够</span><br><span class="line"> jedis.close(); </span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> // 只需要最后程序退出时调用一次,不需要每次查询完之后都调用</span><br><span class="line"> jedisPool.close();</span><br><span 
class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>运行如上代码,应该看到一个负载均衡正常环境——符合预期</p><h2 id="不均衡重现"><a href="#不均衡重现" class="headerlink" title="不均衡重现"></a>不均衡重现</h2><p>背景里描述的是完全符合预期的,假设实际使用中两个 Redis中的一个节点的CPU有一个降频了/争抢/温度高 等种种原因,导致这个节点处理更慢了</p><p>如何模拟其中一个节点突然慢了(这些手段在之前的星球案例重现里都反复使用过了)</p><ol><li>你可以把Redis 进程绑到一个核上,然后在这这个核上跑一个死循环故意让;</li><li>或者,也可以在这个节点上给网络延迟加200ms 进去</li></ol><p>这个时候你再重新跑背景描述里的代码,一段时间后你会看到下图中红线对应的 Redis 节点上的连接数越来越高,QPS 越来越高(别用太大的压力,导致这个节点的访问超时哈)</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240618203121532.png" alt="image-20240618203121532"></p><p>到这里就算是问题重现出来了</p><p><strong>重现确认注意:</strong></p><p>如果只是看到瞬间连接数不均衡这应该没有重现出来,因为节点慢了所以 active 要变高才会维系住同样的QPS,这是符合预期的。</p><p>期望的是长期运行后慢的节点上统计意义上的<strong>连接数越来越多、QPS 越来越大</strong></p><p>比如下图是重现过程中的连接数监控,可以看到橙色线对应的Redis 节点上的连接越来越多:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/FqdkiFCrWvrfNTY3CtmRSZNpa9Ju.jpeg" alt="img"></p><p>下图是对应的QPS 监控,问题Redis 节点(黄色线)的QPS 比另外一个节点大很多,长期下去会导致问题节点成为瓶颈:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/Fk0WyAcGQeTrlhcgZzlF9wJP9Ria.jpeg" alt="img"></p><h2 id="重现脚本和代码"><a href="#重现脚本和代码" class="headerlink" title="重现脚本和代码"></a>重现脚本和代码</h2><p>以下涉及的脚本、代码提交到 github,这些脚本、手段在我们之前的实验、案例都反复出现过了,我就不给了</p><p>参考星球里扒老师的操作(不含客户端Java代码):<a href="https://malleable-elbow-b9f.notion.site/redis-f7dfcecb7f7441e1ba96f4da3ca8aee8?pvs=4" target="_blank" rel="noopener">https://malleable-elbow-b9f.notion.site/redis-f7dfcecb7f7441e1ba96f4da3ca8aee8?pvs=4</a> </p><p>星球里橘橘球用python 3.8 实现了一个python 版本的:<a href="https://github.com/gongyisheng/playground/blob/dev/network/lvs_case/readme.md" target="_blank" rel="noopener">https://github.com/gongyisheng/playground/blob/dev/network/lvs_case/readme.md</a> </p><p>好奇同学用Java/Jedis 和Go两个版本(Go 版本是没有Jedis,也能重现问题)的实现代码:<a href="https://github.com/haoqixu/case-reproduction-240618" target="_blank" rel="noopener">https://github.com/haoqixu/case-reproduction-240618</a> </p><h3 id="docker"><a href="#docker" class="headerlink" title="docker"></a>docker</h3><p>用 docker起两个Redis 节点</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">//这里提供Redis docker run脚本</span><br></pre></td></tr></table></figure><h3 id="ipvsadm"><a href="#ipvsadm" class="headerlink" title="ipvsadm"></a>ipvsadm</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">//创建一个 LVS,将上面的两个Redis 加入到负载均衡里面</span><br></pre></td></tr></table></figure><h3 id="Java-客户端代码"><a href="#Java-客户端代码" class="headerlink" title="Java 客户端代码"></a>Java 客户端代码</h3><p>完整代码应该很简单,就是一个Java + Jedis 的HelloWorld 上传到 github,别人下载代码后,自己配置一个 LVS + Redis 的负载均衡环境就能重现以上问题</p><h3 id="tc-qdisc"><a href="#tc-qdisc" class="headerlink" title="tc qdisc"></a>tc qdisc</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">//给其中的一个 节点构造 200ms 的延时</span><br></pre></td></tr></table></figure><p>也可以跑死循环抢 CPU </p><h2 id="分析"><a href="#分析" class="headerlink" title="分析"></a>分析</h2><p>原因:Jedis 连接池使用的是 <a href="https://github.com/apache/commons-pool" target="_blank" rel="noopener">apache commons-pool</a> 这个组件,默认从连接池取连接使用的是 LIFO(last in first out) 
,如果两个节点负载正常两个节点上的连接基本能保持在队列里交叉均衡;如果连接闲置久了释放的时候就是均衡释放的</p><p>但如果有一个节点处理慢了,那么这个节点的连接被取出来使用的时候必然需要更多的时间在连接池外面处理请求,用完归还的时候就会更高概率出现在队列的顶部,导致下次首先被取出来使用,长期下去就会出现快的节点上的连接慢慢被释放,慢的节点的连接越来越多,进而慢的节点的QPS 越来越高,最后这个节点崩了</p><h3 id="泛化问题"><a href="#泛化问题" class="headerlink" title="泛化问题"></a>泛化问题</h3><p>针对这个问题就一定是Jedis 和 Redis 才有吗?本质是我们没法期望所有节点一样快,导致连接归还一定有慢的,进而只要是取连接用 LIFO(last in first out) 就会有这个问题,Jedis/Lettuce/MySQL dbcp 都用了 <a href="https://github.com/apache/commons-pool" target="_blank" rel="noopener">apache commons-pool</a> 这个组件来实现连接池功能,而 apache commons-pool 默认就是 LIFO ,所以这些组件全部中枪。应该是用的 LinkedBlockingDeque 队列,它有有 FIFO 和 FILO 两种策略</p><p>那么没有用 apache-commons-pools 的就安全吗?也不一定,得看取连接的逻辑,一般都是 LIFO,比如 Druid 连接池的实现用的 stack ,也就是 stack 顶部的几个连接被反复使用,可能底部连接完全用不到的情况。 且Druid 还不提供接口去设置是不是 stack/queue(LIFO/FIFO)</p><p>你们的微服务只要是用连接池大概率也会有同样的问题</p><p>那么有什么好办法来解决类似的问题吗?<a href="https://github.com/alibaba/druid/wiki/DruidDataSource%E9%85%8D%E7%BD%AE%E5%B1%9E%E6%80%A7%E5%88%97%E8%A1%A8" target="_blank" rel="noopener">Druid 有个设置</a> phyTimeoutMillis 和 phyMaxUseCount (就是一个长连接用多久、或者执行了多少次SQL ) 来将长连接主动断开,这就有概率修复这个问题;</p><p>另外如果LVS 用的 WLC 均衡算法也可以fix 这个问题,见参考资料。</p><p>php听说有个功能,进程跑一段时间后自行销毁重建;担心内存泄漏啥的 —— 是不是很像遇到问题就重启,又不是不work,不优雅但是管用,有点像通信基站半夜重启</p><p>你看虽然是一次 Jedis 客户端在某些条件下导致的问题,只要你去通用化问题的本质就可以发现很容易地跳出来看到各个不同场景下同样会引起的问题,无招胜有招啊</p><h2 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>参考资料</h2><p><a href="https://plantegg.github.io/categories/LVS">https://plantegg.github.io/categories/LVS/</a> 强调下这次的不均衡和我这个链接里的两篇文章描述的毫无关系,只是接着这个机会可以重温一下导致不均衡的其它原因,做个汇总</p>]]></content>
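<p>LIFO 取连接导致倾斜的机制可以用十几行 Python 做一个确定性的示意(不是 commons-pool 的真实实现,连接名、数量都是假设):快节点的连接先归还、慢节点的连接后归还,于是慢节点的连接落在栈顶,低峰期被反复借走,快节点的连接沉底闲置、最终被驱逐:</p>
<pre><code class="python">class LifoPool:
    """极简的 LIFO 连接池示意:借、还都发生在栈顶(apache commons-pool 的默认策略)"""
    def __init__(self, conns):
        self._stack = list(conns)
    def borrow(self):
        return self._stack.pop()
    def give_back(self, conn):
        self._stack.append(conn)

# 4 个连接,fast-* 指向快节点,slow-* 指向慢节点(名字是演示用的假设)
pool = LifoPool(["fast-1", "slow-1", "fast-2", "slow-2"])

# 一波高峰:4 个连接全部被借走
borrowed = [pool.borrow() for _ in range(4)]

# 快节点的请求先处理完、连接先归还;慢节点的连接后归还,落在栈顶
for conn in ["fast-1", "fast-2", "slow-1", "slow-2"]:
    pool.give_back(conn)

# 低峰期只需要 2 个并发:每次取到的都是慢节点的连接
print([pool.borrow() for _ in range(2)])   # ['slow-2', 'slow-1']
# fast-1 / fast-2 长期沉在栈底,闲置超过 minEvictableIdleTime 后被释放;
# 下次高峰再新建连接,慢节点上保留的连接和 QPS 就这样越积越多
</code></pre>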
<summary type="html">
<h1 id="为什么你的连接不均衡了?"><a href="#为什么你的连接不均衡了?" class="headerlink" title="为什么你的连接不均衡了?"></a>为什么你的连接不均衡了?</h1><h2 id="场景"><a href="#场景" class="
</summary>
<category term="LVS" scheme="https://plantegg.github.io/categories/LVS/"/>
<category term="Linux" scheme="https://plantegg.github.io/tags/Linux/"/>
<category term="LVS" scheme="https://plantegg.github.io/tags/LVS/"/>
<category term="network" scheme="https://plantegg.github.io/tags/network/"/>
</entry>
<entry>
<title>一次抓包分析过程——Wireshark 新手上车</title>
<link href="https://plantegg.github.io/2024/10/10/%E4%B8%80%E6%AC%A1%E6%8A%93%E5%8C%85%E5%88%86%E6%9E%90%E8%BF%87%E7%A8%8B/"/>
<id>https://plantegg.github.io/2024/10/10/一次抓包分析过程/</id>
<published>2024-10-10T02:30:03.000Z</published>
<updated>2024-11-20T10:00:55.449Z</updated>
<content type="html"><![CDATA[<h1 id="一次抓包分析过程——Wireshark-新手上车"><a href="#一次抓包分析过程——Wireshark-新手上车" class="headerlink" title="一次抓包分析过程——Wireshark 新手上车"></a>一次抓包分析过程——Wireshark 新手上车</h1><h2 id="问题"><a href="#问题" class="headerlink" title="问题"></a>问题</h2><p>网友尝试做星球第一个必做实验的时候,什么内核参数都没改,发现请求经常会停滞 100ms,这种要怎么判断是局域网的网络问题还是应用问题呢? 服务是 python3 -m http.server 启动的,看上去没有出现什么重传、窗口也没看到什么问题</p><p>因为不能提供环境给我,我尝试对这个抓包进行了分析,因为只有客户端抓包,所以分析结果是没有结论的,但分析过程比较适合入门 Wireshark,适合刚加入星球的、没分析过网络包的同学可以参考,熟手请忽略</p><h2 id="分析"><a href="#分析" class="headerlink" title="分析"></a>分析</h2><p>整个抓包 28MB,跨度 600 毫秒,看得出带宽很大、RTT 极小(到Wireshark 里看看前几个包的交互 RT 就知道了)</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715093847359.png" alt="image-20240715093847359"></p><h3 id="详细分析"><a href="#详细分析" class="headerlink" title="详细分析"></a>详细分析</h3><p>看第一次卡 100ms 之前的抓包,在100ms 以前客户端ack 了所有Server 发出来的的tcp包(红框),也就是说每一个发给客户端的包客户端都ack 完毕,证明客户端处理足够快,但是 8089端口不继续发包而是等了100ms再继续发,如下图:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715094218182.png" alt="image-20240715094218182"></p><p>到这里的结论:</p><p>不是因为发送buffer、接收buffer太小导致的卡;也不是因为拥塞窗口导致的,就是Server 端没有发包。大概率是Server 进程卡了,或者Server 进程读取物理文件往OS buffer 写这些环节卡了(可以在服务端通过 strace -tt 看看进程在这 100 毫秒有没有往内核怼数据)</p><p>所以要继续在 Server 端来分析这个问题</p><p>怎么快速定位到红框、红线这里的包?</p><blockquote><p>到 Time Sequence 图上点平台两边的点都可以自动跳转到这里,每个点代表一个网络包,横坐标代表时间</p></blockquote><h2 id="其它分析"><a href="#其它分析" class="headerlink" title="其它分析"></a>其它分析</h2><p>将如下 Time Sequence 图使劲放大,从第一个包开始看,可以观察到教科书所说的慢启动</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715095134352.png" alt="image-20240715095134352"></p><p>整体看的话,慢启动几乎可以忽略,毕竟这个抓包是下载一个巨大的文件,如果是一个小文件这个慢启动还是影响很大的,如下图,红框部分看起来微不足道</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715095506381.png" alt="image-20240715095506381"></p><p>把时间范围放大,继续看,在卡之前红色箭头很长的,代表带宽、buffer有能力一次发送很多网络包,但是后面每次只发一点点网络包(绿色箭头长度)就卡了</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715095647702.png" alt="image-20240715095647702"></p><h2 id="重现"><a href="#重现" class="headerlink" title="重现"></a>重现</h2><p>我用 python3 当服务端未能重现这个卡100ms 的现象,拉取都很丝滑</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715101505977.png" alt="image-20240715101505977"></p><p>非常细节地去分析的话,也是能看到一些小问题的,比如1.9ms的卡顿、比如zero_window</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715103928266.png" alt="image-20240715103928266"></p><p>重现的时候,有1.9ms 这样的卡顿,但是不算有规律,因为这么小在整个传输过程中影响不大</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715103708750.png" alt="image-20240715103708750"></p><p>我重现的时候正好抓到了 seq 回绕,seq 是个 32位的无符号整数,到了最大值就从0又开始:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715115500312.png" alt="image-20240715115500312"></p><p>此时的 Time Sequence: </p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240715115655516.png" alt="image-20240715115655516"></p><h2 id="建议"><a href="#建议" class="headerlink" title="建议"></a>建议</h2><p>可以用实验1里面的一些手段debug 一下Server 为什么卡了,除了 strace -tt 还可以用 ebpf 试试看看 Server 的调度上哪里顿了 100ms</p><p>新手如何通过Wireshark 来看抓包?</p><p>首先不要纯粹为了学习去看,而是要问你的问题是什么?如果网络传输速度慢,我们就看 Time 
Sequence(斜率越陡速度越快),去看为什么发送端不发包了</p><ul><li>如正文里的卡顿平台,在250ms内差不多要卡240ms 不发包,速度自然不行</li><li>我重现抓包中的zero Windows</li><li>达到网络BDP 瓶颈了,去看拥塞窗口在最大值的时候会丢包,触发降速</li></ul><p>里面可以看、要看的东西太多,所以我也说不上要看什么,而是要问你的问题是什么</p>]]></content>
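<p>除了在 Wireshark 里看 Time Sequence 图,也可以用脚本把抓包里的"空窗"直接列出来。下面是一个用 scapy 的示意(文件名、阈值都是假设值;28MB 这种大包建议先用 tshark 按流过滤再跑会快很多),思路就是找相邻两个包之间超过阈值的时间间隔,对应正文里那种 100ms 不发包的平台:</p>
<pre><code class="python">from scapy.all import rdpcap

def find_stalls(pcap_file, threshold_ms=80):
    """列出 pcap 里相邻两个包之间超过 threshold_ms 的间隔,快速定位卡顿平台"""
    pkts = rdpcap(pcap_file)
    prev = None
    for i, pkt in enumerate(pkts):
        if prev is not None:
            gap_ms = (float(pkt.time) - float(prev.time)) * 1000
            if gap_ms >= threshold_ms:
                print(f"包 #{i}: 距上一个包 {gap_ms:.1f} ms")
        prev = pkt

# find_stalls("client.pcap", threshold_ms=80)   # 文件名为假设值
</code></pre>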
<summary type="html">
<h1 id="一次抓包分析过程——Wireshark-新手上车"><a href="#一次抓包分析过程——Wireshark-新手上车" class="headerlink" title="一次抓包分析过程——Wireshark 新手上车"></a>一次抓包分析过程——Wire
</summary>
<category term="tcpdump" scheme="https://plantegg.github.io/categories/tcpdump/"/>
<category term="tcpdump" scheme="https://plantegg.github.io/tags/tcpdump/"/>
<category term="wireshark" scheme="https://plantegg.github.io/tags/wireshark/"/>
</entry>
<entry>
<title>一次故障的诊断过程</title>
<link href="https://plantegg.github.io/2024/10/03/%E4%B8%80%E6%AC%A1%E6%95%85%E9%9A%9C%E7%9A%84%E8%AF%8A%E6%96%AD%E8%BF%87%E7%A8%8B--Sysbench%20%E9%87%8D%E8%BF%9E/"/>
<id>https://plantegg.github.io/2024/10/03/一次故障的诊断过程--Sysbench 重连/</id>
<published>2024-10-03T09:30:03.000Z</published>
<updated>2024-12-30T02:31:19.637Z</updated>
<content type="html"><![CDATA[<h1 id="一次故障的诊断过程–Sysbench"><a href="#一次故障的诊断过程–Sysbench" class="headerlink" title="一次故障的诊断过程–Sysbench"></a>一次故障的诊断过程–Sysbench</h1><h2 id="背景"><a href="#背景" class="headerlink" title="背景"></a>背景</h2><p>我们的数据库需要做在线升级丝滑的验证,所以构造了一个测试环境,客户端Sysbench 用长连接一直打压力,Server 端的数据库做在线升级,这个在线升级会让 MySQL Server进程重启,毫无疑问连接会断开重连,所以期望升级的时候 Sysbench端 QPS 跌0几秒钟然后快速恢复</p><p>但是每次升级都是 Sysbench端 QPS 永久跌0,再也不能恢复,所以需要分析为什么,问题出在哪里?有人说是服务端的问题因为只有服务端做了变更</p><p>整个测试过程中 Sysbench 是配置的1-2个连接去压 MySQL Server</p><h2 id="Sysbench-介绍"><a href="#Sysbench-介绍" class="headerlink" title="Sysbench 介绍"></a>Sysbench 介绍</h2><p>以下介绍来自 ChatGPT-4,用过Sysbench的同学可以跳过这节:</p><p>Sysbench 是一个适用于多个系统的多线程基准测试工具,被广泛用于评估不同系统服务的性能,包括数据库系统(如 MySQL、PostgreSQL)、文件I/O、CPU性能以及线程调度。</p><p>对于MySQL数据库,Sysbench 可以执行包括但不限于以下类型的测试:</p><ul><li><strong>OLTP (Online Transaction Processing) 测试</strong>: 这是最常见的数据库基准测试类型,模拟在线事务处理工作负载,包括事务性的Insert、Update、Delete和Select操作。</li><li><strong>点查找测试</strong>: 测试数据库针对特定索引的单行查找性能。</li><li><strong>简单写测试</strong>: 测试数据库进行插入操作的性能。</li><li><strong>复杂的选择查询测试</strong>: 运行复杂的Select查询,包含多个表和多个条件,测试数据库的读取性能。</li><li><strong>非事务性查询测试</strong>: 类似于事务查询测试,但不在事务框架内进行。</li></ul><p>Sysbench使用Lua脚本语言进行测试案例的开发,它预置了一些标准的测试模板如<code>oltp_read_only</code>、<code>oltp_read_write</code>、<code>oltp_write_only</code>等,这些可以针对数据库执行标准的过程以及自定义的工作负载。</p><p>进行Sysbench压力测试的基本步骤包括:</p><ol><li>安装Sysbench。</li><li>准备测试数据集,这通常涉及Sysbench创建数据库及表,然后填充数据。</li><li>执行测试,Sysbench以定义的并发线程数向数据库发送请求。</li><li>收集并分析结果,例如吞吐量(每秒事务数)、延迟以及一致性。</li></ol><p>一个简单的Sysbench测试命令可以是这样:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">/usr/local/bin/sysbench --debug=on --mysql-user='root' --mysql-password='123' --mysql-db='test' --mysql-host='127.0.0.1' --mysql-port='3307' --tables='16' --table-size='10000' --range-size='5' --db-ps-mode='disable' --skip-trx='on' --mysql-ignore-errors='all' --time='11080' --report-interval='1' --histogram='on' --threads=2 oltp_read_write prepare</span><br><span class="line"></span><br><span class="line">/usr/local/bin/sysbench --debug=on --mysql-user='root' --mysql-password='123' --mysql-db='test' --mysql-host='127.0.0.1' --mysql-port='3307' --tables='16' --table-size='10000' --range-size='5' --db-ps-mode='disable' --skip-trx='on' --mysql-ignore-errors='all' --time='11080' --report-interval='1' --histogram='on' --threads=2 oltp_read_write run</span><br><span class="line"></span><br><span class="line">sysbench oltp_read_write --table-size=100000 --mysql-db=testdb --mysql-user=root --mysql-password=password cleanup</span><br></pre></td></tr></table></figure><p>这个命令序列分别准备数据、运行测试和清理环境。运行测试部分变量<code>--threads=4</code>表示使用4个线程,<code>--time=60</code>表示测试持续时间60秒。</p><p>使用Sysbench时,请确保执行的测试与你的用例相关,并考虑到可能的性能差异。例如,如果目标是测试Web应用程序的数据库后端,确保测试的查询和事务能够反映真实的使用案例。</p><p>Sysbench的使用可以<a href="https://www.alibabacloud.com/help/zh/polardb/polardb-for-xscale/sysbench-test" target="_blank" rel="noopener">参考这个链接</a></p><h3 id="Sysbench-编译"><a href="#Sysbench-编译" class="headerlink" title="Sysbench 编译"></a>Sysbench 编译</h3><p>从 <a href="https://github.com/akopytov/sysbench.git" target="_blank" rel="noopener">github 下载源代码</a></p><p>以5.10(ALinux3/CentOS8) 为例</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">yum install libtool -y //configure.ac:61: error: possibly undefined macro: AC_PROG_LIBTOOL</span><br><span class="line">yum install mysql-devel -y </span><br><span class="line"></span><br><span class="line">然后:</span><br><span class="line">./autogen.sh ; ./configure ; make ; make install </span><br><span class="line"></span><br><span class="line">起压力重现命令:</span><br><span class="line">sysbench --mysql-user='root' --mysql-password='123' --mysql-db='test' --mysql-host='127.0.0.1' --mysql-port='3306' --tables='16' --table-size='10000' --range-size='5' --db-ps-mode='disable' --skip-trx='on' --mysql-ignore-errors='all' --time='1180' --report-interval='1' --histogram='on' --threads=1 oltp_read_only run --verbosity=5</span><br></pre></td></tr></table></figure><h2 id="分析"><a href="#分析" class="headerlink" title="分析"></a>分析</h2><p>研发人员第一反应重启了Sysbench 所在的ECS 然后恢复了,但是也没有了现场,我告诉他们等有了现场通知我,不要重启。今天终于再次重现了,我连上ECS 速度看了几个指标,通过 top 看到Sysbench 进程占用CPU 400%(整个 ECS 是4核),如图:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240320173723799.png" alt="image-20240320173723799"></p><p>再进一步看看 sys 都在干什么,用 perf top -p 16329 可以看到:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/FiI8oMUdJ4hVcAAJp_TgYA_pRcUh.png" alt="img"></p><p>确实是内核态在网络里面有网络方面的函数占比很高,且 spin_lock 严重,所以速度用 ss -s 和 netstat -anto 看看网络连接情况:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"># ss -s</span><br><span class="line">Total: 41682</span><br><span class="line">TCP: 41498 (estab 5241, closed 1, orphaned 0, timewait 0)</span><br><span class="line"></span><br><span class="line">Transport Total IP IPv6</span><br><span class="line">RAW 0 0 0</span><br><span class="line">UDP 8 5 3</span><br><span class="line">TCP 41497 41495 2 //用了几万个连接了,这不正常</span><br><span class="line">INET 41505 41500 5</span><br><span class="line">FRAG 0 0 0</span><br><span class="line"></span><br><span class="line"># netstat -anto | head -30 //确实可以看到几万个连接,几乎都是 CLOSE_WAIT 状态</span><br><span class="line">Active Internet connections (servers and established)</span><br><span class="line"> 注意第二列一直都是——79,Recv-Q的意思是3306端发给Sysbench的内容79字节,但这79自己还在 OS 的tcp buffer 中,等待Sysbench 读走</span><br><span class="line">Proto Recv-Q Send-Q Local Address Foreign Address State Timer</span><br><span class="line">tcp 79 0 192.168.0.1:48743 
192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:32747 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:40838 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:40190 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:58337 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:23976 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:41687 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:57214 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:30464 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:2015 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:16032 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:47188 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:3054 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br><span class="line">tcp 79 0 192.168.0.1:46344 192.168.20.220:3306 CLOSE_WAIT off (0.00/0/0)</span><br></pre></td></tr></table></figure><p>延伸:Recv-Q 和 netstat <a href="https://plantegg.github.io/2019/04/21/netstat%E5%AE%9A%E4%BD%8D%E6%80%A7%E8%83%BD%E6%A1%88%E4%BE%8B/">定位性能案例可以看这篇</a></p><h3 id="内核代码"><a href="#内核代码" class="headerlink" title="内核代码"></a>内核代码</h3><p>前文通过 perf top 可以看到 __inet_check_established 这个函数占用非常高</p><p>不符合正常逻辑,<a href="https://github.com/plantegg/linux/blob/3157b476f8216d2655c1c85bad53c975190689ba/net/ipv4/inet_hashtables.c#L447" target="_blank" rel="noopener">github 内核源码地址</a>(我只加了注释)</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span 
class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line">//connect()时进行随机端口四元组可用性的判断</span><br><span class="line">//如果本地地址和目标地址组成的元组之前已经存在了,则返回错误码EADDRNOTAVAIL: Cannot assign requested address</span><br><span class="line">//这个时候即使设置了REUSEADDR也要报错</span><br><span class="line">/* called with local bh disabled */</span><br><span class="line">static int __inet_check_established(struct inet_timewait_death_row *death_row,</span><br><span class="line"> struct sock *sk, __u16 lport,</span><br><span class="line"> struct inet_timewait_sock **twp)</span><br><span class="line">{</span><br><span class="line">struct inet_hashinfo *hinfo = death_row->hashinfo;</span><br><span class="line">struct inet_sock *inet = inet_sk(sk);</span><br><span class="line">__be32 daddr = inet->inet_rcv_saddr;</span><br><span class="line">__be32 saddr = inet->inet_daddr;</span><br><span class="line">int dif = sk->sk_bound_dev_if;</span><br><span class="line">struct net *net = sock_net(sk);</span><br><span class="line">int sdif = l3mdev_master_ifindex_by_index(net, dif);</span><br><span class="line">INET_ADDR_COOKIE(acookie, saddr, daddr);</span><br><span class="line">const __portpair ports = INET_COMBINED_PORTS(inet->inet_dport, lport);</span><br><span class="line">unsigned int hash = inet_ehashfn(net, daddr, lport,</span><br><span class="line"> saddr, inet->inet_dport);</span><br><span class="line">//inet_ehash_bucket存放ESTABLISHED状态的socket 哈希表</span><br><span class="line">struct inet_ehash_bucket *head = inet_ehash_bucket(hinfo, hash);</span><br><span class="line">spinlock_t *lock = inet_ehash_lockp(hinfo, hash);</span><br><span class="line">struct sock *sk2;</span><br><span class="line">const struct hlist_nulls_node *node;</span><br><span class="line">struct inet_timewait_sock *tw = NULL;</span><br><span class="line"></span><br><span class="line">spin_lock(lock);</span><br><span class="line">//遍历检查四元组是否冲突</span><br><span class="line">sk_nulls_for_each(sk2, node, &head->chain) {</span><br><span class="line">if (sk2->sk_hash != hash)</span><br><span class="line">continue;</span><br><span class="line">//INET_MATCH 执行四元组比较</span><br><span class="line">if (likely(INET_MATCH(sk2, net, acookie,</span><br><span class="line"> saddr, daddr, ports, dif, sdif))) {</span><br><span class="line">if (sk2->sk_state == TCP_TIME_WAIT) {</span><br><span class="line">tw = inet_twsk(sk2);</span><br><span class="line">if (twsk_unique(sk, sk2, twp))</span><br><span class="line">break;</span><br><span class="line">}</span><br><span class="line">goto not_unique;</span><br><span class="line">}</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">……</span><br><span class="line"> </span><br><span class="line">not_unique:</span><br><span class="line">spin_unlock(lock);</span><br><span class="line">return -EADDRNOTAVAIL; //Cannot assign requested address错误,在510行看到了下一节 telnet/strace 中的错误信息</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">// #defineEADDRNOTAVAIL227/* Cannot assign requested address */</span><br></pre></td></tr></table></figure><p>到这里可以很清楚说明问题在客户端而不是服务端,但是要回答:</p><ol><li>为什么CPU这么高,CPU都在忙什么</li><li>什么原因会导致 CLOSE_WAIT 状态</li><li>为什么Sysbench 要疯狂创建4万多个连接;</li></ol><p>所以接下来我们就来分别回答这三个问题</p><h3 id="为什么CPU这么高,CPU都在忙什么"><a href="#为什么CPU这么高,CPU都在忙什么" class="headerlink" title="为什么CPU这么高,CPU都在忙什么"></a>为什么CPU这么高,CPU都在忙什么</h3><p>首先用 strace -p Sysbench-pid 看看 Sysbench 进程都在忙什么,下图最上面是 Sysbench 在疯狂不断地 connect:</p><p><img 
src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240320174806093.png" alt="image-20240320174806093"></p><p>从上图最上面的Strace 来看 Sysbench在疯狂创建连接,但是在Connect 的时候报错:<strong>无法指定被请求的地址</strong></p><p>那接下来我就要ping 一下 192.168.20.220 这个IP 是OK的,再然后telnet 192.168.20.220 22 发现没报错但是也没有 SSH 让我输密码,于是看了下 cat /proc/sys/net/ipv4/ip_local_port_range 是4万个Local Port 可用,这个时候可以去看看我<a href="https://plantegg.github.io/2020/11/30/%E4%B8%80%E5%8F%B0%E6%9C%BA%E5%99%A8%E4%B8%8A%E6%9C%80%E5%A4%9A%E8%83%BD%E5%88%9B%E5%BB%BA%E5%A4%9A%E5%B0%91%E4%B8%AATCP%E8%BF%9E%E6%8E%A5/">这篇关于可用端口的经典文章</a> </p><p>于是我改了下Port Range范围多加了1万Port 上去,然后很快看到如图 ss -s 就有5万连接了,说明你给多少Port 都不够用</p><p>同时我也用 telnet 192.168.20.220 3306 报错是:<strong>Cannot assign requested address</strong> —— 这个报错和 <strong>无法指定被请求的地址</strong> 很像了,到这里可以看到做一个基本结论:</p><ol><li>之所以内核 sys CPU 跑高到 100%,是因为当Local Port 用完,又要新建连接的时候内核会用死循环去找可用端口,导致CPU 跑高(这也是为什么telnet 22端口不会报错,也不会正常出来SSH login——因为抢不到CPU 资源去走选端口的流程) </li><li><strong>Cannot assign requested address</strong> 和 <strong>无法指定被请求的地址</strong> 报错是找不到可用端口导致的,还没有走到三次握手,也就是和服务端无关</li></ol><p>继续验证:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240320180024067.png" alt="image-20240320180024067"></p><p>上图是先把local port 增多,然后立即 telnet 3306 发现成功了!这更是证明了上面的结论2</p><p>到这里分析清楚了为什么CPU 高—— Sysbench疯狂建连接导致端口用完,内核要用死循环不断去找可用端口导致了CPU使用率高,因为是内核态的行为所以表现出来就是 sys CPU 100%</p><p>而telnet 22端口不报这个错,是因为 22端口的可用端口几万个没有被使用掉,但是22端口也没让我输密码,这里应该是telnet 22时抢不到CPU 造成TCP 三次握手缓慢,但绝对不会报 Cannot assign requested address 错误</p><h3 id="什么原因会导致-CLOSE-WAIT-状态"><a href="#什么原因会导致-CLOSE-WAIT-状态" class="headerlink" title="什么原因会导致 CLOSE_WAIT 状态"></a>什么原因会导致 CLOSE_WAIT 状态</h3><p>在将这个问题前还是请先去看看 CLOSE_WAIT 代表了什么含义: <a href="https://plantegg.github.io/2021/04/06/%E4%B8%BA%E4%BB%80%E4%B9%88%E8%BF%99%E4%B9%88%E5%A4%9ACLOSE_WAIT/">为什么这么多CLOSE_WAIT</a></p><p>当同事们看到几万个连接的时候第一反应就是能不能改改这两 Linux 的系统参数:tcp_tw_reuse, tcp_tw_recycle 让端口/连接快速回收?</p><p>有没有你们都是这种同事,看到一个现象条件反射得出这个结论,这都是<strong>略知皮毛的经验太多了导致</strong>的</p><p>我在 《为什么这么多CLOSE_WAIT》一文中反复提到这张图,以及学霸是怎么从这张图推断原因的:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/b3d075782450b0c8d2615c5d2b75d923.png" alt="image.png"></p><p>看完上面这个图和我的 《为什么这么多CLOSE_WAIT》就应该知道 <strong>CLOSE_WAIT 就是 Sysbench 没有调 Socket.close 导致的</strong> 和内核没有关系,所以改啥内核参数也没有用,因为在这次问题中很多研发同学看到 CLOSE_WAIT 第一反应是去改这些参数:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">net.ipv4.tcp_tw_recycle = 0</span><br><span class="line">net.ipv4.tcp_tw_reuse = 0</span><br></pre></td></tr></table></figure><p>如何进一步证明是Sysbench的问题呢?可以抓包看看:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240320181258965.png" alt="image-20240320181258965">上图是在 Sysbench 所在ECS 上抓包可以看到所有连接都是这样,注意第四个包是 Server端在3次握手成功后发了 Server Greeting 给客户端 Sysbench,此时Sysbench 应该发自己的账号密码来 Login但是抓包永远卡在这里,也就是Sysbench 建立完连接后跑了,不搭理服务端发了什么,这也是为什么最前面的 netstat -anto 看到 Recv-Q 这列总是79,这79长度的内容就是 Server 发给Sysbench 的 Server Greeting 内容,本该Sysbench 去读走 Server Greeting 然后按照MySQL 协议发账号密码,但是不,此时Sysbench 颠了,不管这个连接了,又去创建新连接于是重复上面的过程;直到本地端口用完,sys CPU 干到 100%</p><p>其实上面这个抓包的连接状态是 ESTABLISHED 状态,为什么最终看到的是 CLOSE_WAIT 呢,因为 Server发了 Server Greeting 后有一个超时时间,迟迟等不到Sysbench Client的账号密码就会发 FIN 给Client 端请求断开这个连接,导致Client断的连接状态从 ESTABLISHED 进入 CLOSE_WAIT ,这从上面的 TCP 状态图完全可以推导出来,扩大抓包时间的话会抓到 Server 发过来的 FIN 
包</p><p>你要看不懂这个抓包,可以找个正常的MySQL-client 连 Server抓一次包,有个正常的对比会很幸福,我丢一个正常的给大家对比参考,上面错在 Sysbench 没有发如下红框的包:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/FqnYYzopMqEd98ZTEfhNgCfKJIoI.png" alt="img"></p><p>Server 一重启就去看 netstat 的话确实都是 ESTABLISHED:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"># netstat -anto | head -30 |grep -E "State|:3306 "</span><br><span class="line">Proto Recv-Q Send-Q Local Address Foreign Address State Timer</span><br><span class="line">tcp 78 0 192.168.0.1:46344 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:44592 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:45908 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:44166 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:59484 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:60720 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:53436 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:58690 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:35932 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:53944 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:59758 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:53676 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:59304 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:41848 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:44312 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:56654 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:3516 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:39316 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:55074 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:59476 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br><span class="line">tcp 78 0 192.168.0.1:48854 192.168.20.220:3306 ESTABLISHED off (0.00/0/0)</span><br></pre></td></tr></table></figure><p>此时端口还够的时候去 strace 看到Sysbench 确实在疯狂 connect 
建连接,也不像端口不够的时候会报错:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240320184307103.png" alt="image-20240320184307103"></p><p>到这里就可以回答:什么原因会导致 CLOSE_WAIT 状态?因为Sysbench 没有去正常 Login MySQL,也没有调用 Socket.close 导致的</p><h3 id="为什么Sysbench-要疯狂创建4万多个连接"><a href="#为什么Sysbench-要疯狂创建4万多个连接" class="headerlink" title="为什么Sysbench 要疯狂创建4万多个连接"></a>为什么Sysbench 要疯狂创建4万多个连接</h3><p>为什么Sysbench 要疯狂创建4万多个连接,且还在不停地创建,这就要涉及到 Sysbench 具体代码逻辑(这个版本的 Sysbench 被我厂同事魔改过) ,在一猛子扎进去看代码逻辑前,我换了个开源的 Sysbench 版本(Update 20240325 其实是换了个压测环境,用了不同的Sysbench而已),问题就消失了 —— 有时候猛干不去取巧</p><p>到此可以说明问题的原因就是:<strong>这个 Sysbench 版本在连接异常断开(Server升级主动断开连接)后,新建连接逻辑错误,疯狂建连接引起的</strong></p><h4 id="Update-20240327"><a href="#Update-20240327" class="headerlink" title="Update 20240327"></a><a href="https://malleable-elbow-b9f.notion.site/sysbench-kill-9eeaf1bf51b44510a7204de14b953705" target="_blank" rel="noopener">Update 20240327</a></h4><p>后来经过网友 <a href="https://malleable-elbow-b9f.notion.site/sysbench-centos-0445e571d34d40788a237507de34b371" target="_blank" rel="noopener">扒皮哥和 haoqixu的耐心分析</a>,发现这个问题不完是 Sysbench 本身代码的问题,Sysbench 依赖 libmysqlclient.so 包去连MySQL-Server 和处理MySQL 协议等,而这个 Bug 存在 libmysqlclient.so 中,准确来说是MariaDB的 libmysqlclient 中(和版本没关系,最新版还有这个问题),如果换成MySQL Community的 libmysqlclient 就不存在这个问题了。划重点:无论你怎么更换 Sysbench 版本这个问题也无法解决</p><p>另外 MySQL 社区和 MariaDB 的 libmysqlclient 只是接口一样,实现完全可以不同,<a href="https://mariadb.com/kb/en/mysql_real_connect/" target="_blank" rel="noopener">MariaDB 要求连接重连的时候先 close 再init 后才能使用</a>,而MySQL 社区版本没有这个要求,所以改 Sysbench 重连的代码也可以 fix 这个问题</p><p>这个问题也有人怀疑过OS 的问题,比如换个OS 就好了,但我始终坚持是 Sysbench的问题,因为建连接后不读走TCP buffer里的内容都是业务层面的逻辑(相对于OS Sysbench和libmysqlclient 都是业务层),所以这个错误肯定不在OS</p><p>但是很多人换了 OS 就正常了,其实这里不是你只换了 OS,而是换 OS 的时候你顺便把 libmysqlclient 也换了你自己都不知道,这就是我们常说的瞎蒙蒙对了,但是这种经验却是错误的</p><p>更换 libmysqlclient 来验证:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">yum remove mariadb-devel -y //删掉 mariadb-devel 所带的 libmysqlclient 18</span><br><span class="line">rpm -i https://dev.mysql.com/get/mysql80-community-release-el8-9.noarch.rpm</span><br><span class="line">yum install mysql-community-devel -y //安装 mysql-community-deve ,会带上 libmysqlclient 21</span><br><span class="line">./configure</span><br><span class="line">make install</span><br><span class="line"></span><br><span class="line"><span class="meta">#</span> 查看依赖</span><br><span class="line">objdump -x /usr/local/bin/sysbench |grep libmysqlclient</span><br><span class="line"></span><br><span class="line"> NEEDED libmysqlclient.so.21</span><br><span class="line"> required from libmysqlclient.so.21:</span><br><span class="line"> 0x03532d60 0x00 08 libmysqlclient_21.0</span><br><span class="line">0000000000000000 F *UND* 0000000000000000 mysql_stmt_next_result@@libmysqlclient_21.0</span><br><span class="line">0000000000000000 F *UND* 0000000000000000 mysql_errno@@libmysqlclient_21.0</span><br><span class="line">... 
省略若干</span><br><span class="line">0000000000000000 F *UND* 0000000000000000 mysql_server_end@@libmysqlclient_21.0</span><br></pre></td></tr></table></figure><h4 id="自己编译-libmariadb-so"><a href="#自己编译-libmariadb-so" class="headerlink" title="自己编译 libmariadb.so"></a>自己编译 libmariadb.so</h4><p>通过下载 mariadb-connector-c-3.3 源码,自己独立编译,新生成的 so 包不再导致CPU飙高,但是TPS 永远跌零,而通过yum 安装的是mariadb-connector-c-3.2.6</p><p>此时抓包,可以看到3.3 收到Server Greeting 后也不发送账号密码,但是直接 RST 了连接,这样使得连接被释放,占用端口被释放,CPU不会飙高,但是连接永远无法创建成功,TPS 永远跌零</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">mariadb-connector-c-3.3 抓包:</span><br><span class="line">15100Mar 27, 2024 17:07:53.422009421 CST0.001052729495483306049548 → 3306 [SYN] Seq=0 Win=65495 Len=0 MSS=65495 SACK_PERM=1 TSval=1021766452 TSecr=0 WS=128</span><br><span class="line">15101Mar 27, 2024 17:07:53.422013946 CST0.00000452533064954803306 → 49548 [SYN, ACK] Seq=0 Ack=1 Win=65483 Len=0 MSS=65495 SACK_PERM=1 TSval=1021766452 TSecr=1021766452 WS=128</span><br><span class="line">15102Mar 27, 2024 17:07:53.422018506 CST0.000004560495483306049548 → 3306 [ACK] Seq=1 Ack=1 Win=65536 Len=0 TSval=1021766452 TSecr=1021766452</span><br><span class="line">15103Mar 27, 2024 17:07:53.422116125 CST0.000097619495483306049548 → 3306 [FIN, ACK] Seq=1 Ack=1 Win=65536 Len=0 TSval=1021766452 TSecr=1021766452</span><br><span class="line">15104Mar 27, 2024 17:07:53.422146191 CST0.00003006633064954877Server Greeting proto=10 version=8.2.0 //Server Greeting</span><br><span class="line">15105Mar 27, 2024 17:07:53.422155229 CST0.000009038495483306049548 → 3306 [RST] Seq=2 Win=0 Len=0</span><br></pre></td></tr></table></figure><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240327171929387.png" alt="image-20240327171929387"></p><p>此时:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line">#ldd /usr/local/bin/sysbench</span><br><span class="line">linux-vdso.so.1 (0x00007ffc3a736000)</span><br><span class="line">libmariadb.so.3 => /lib64/libmariadb.so.3 (0x00007f3ce6352000) </span><br><span class="line">libdl.so.2 => 
/lib64/libdl.so.2 (0x00007f3ce634b000)</span><br><span class="line">libm.so.6 => /lib64/libm.so.6 (0x00007f3ce6205000)</span><br><span class="line">libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f3ce61ea000)</span><br><span class="line">libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f3ce61c8000)</span><br><span class="line">libc.so.6 => /lib64/libc.so.6 (0x00007f3ce5fec000)</span><br><span class="line">libssl.so.1.1 => /lib64/libssl.so.1.1 (0x00007f3ce5f53000)</span><br><span class="line">libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f3ce5c58000)</span><br><span class="line">/lib64/ld-linux-x86-64.so.2 (0x00007f3ce63f1000)</span><br><span class="line">libz.so.1 => /lib64/libz.so.1 (0x00007f3ce5c3e000)</span><br><span class="line"></span><br><span class="line">#objdump -x /usr/local/bin/sysbench |grep libmysqlclient</span><br><span class="line"> 0x0f735338 0x00 04 libmysqlclient_18 </span><br><span class="line"> </span><br><span class="line">#rpm -q --whatprovides /lib64/libmariadb.so.3</span><br><span class="line">mariadb-connector-c-3.2.6-1.al8.x86_64</span><br><span class="line"></span><br><span class="line">#yum info mariadb-connector-c-3.2.6-1.al8.x86_64</span><br><span class="line">上次元数据过期检查:0:23:09 前,执行于 2024年03月27日 星期三 16时53分17秒。</span><br><span class="line">已安装的软件包</span><br><span class="line">名称 : mariadb-connector-c</span><br><span class="line">版本 : 3.2.6</span><br><span class="line">发布 : 1.al8</span><br><span class="line">架构 : x86_64</span><br><span class="line">大小 : 545 k</span><br><span class="line">源 : mariadb-connector-c-3.2.6-1.al8.src.rpm</span><br><span class="line">仓库 : @System</span><br><span class="line">来自仓库 : alinux3-updates</span><br><span class="line">概况 : The MariaDB Native Client library (C driver)</span><br><span class="line">URL : http://mariadb.org/</span><br><span class="line">协议 : LGPLv2+</span><br><span class="line">描述 : The MariaDB Native Client library (C driver) is used to connect</span><br><span class="line"> : applications developed in C/C++ to MariaDB and MySQL databases.</span><br></pre></td></tr></table></figure><p>Sysbench 建连接堆栈,当端口不够的时候很容易抓到 connect 函数,因为connect 需要lookup 可用端口:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">#pstack 1448113</span><br><span class="line">Thread 3 (Thread 0x7f9a0b23c640 (LWP 1448115)):</span><br><span class="line">#0 0x00007f9a0b9722bb in connect () from /lib64/libpthread.so.0</span><br><span class="line">#1 0x00007f9a0bb02b00 in pvio_socket_internal_connect (pvio=0x7f99fa5db270, name=0x7f99fc0247c0, namelen=16) at /root/mariadb-connector-c-3.2.6/plugins/pvio/pvio_socket.c:642</span><br><span class="line">#2 0x00007f9a0bb02d76 in pvio_socket_connect_sync_or_async (pvio=0x7f99fa5db270, name=0x7f99fc0247c0, namelen=16) at /root/mariadb-connector-c-3.2.6/plugins/pvio/pvio_socket.c:750</span><br><span class="line">#3 0x00007f9a0bb03499 in pvio_socket_connect (pvio=0x7f99fa5db270, cinfo=0x7f9a0b23b3d0) at 
/root/mariadb-connector-c-3.2.6/plugins/pvio/pvio_socket.c:919</span><br><span class="line">#4 0x00007f9a0bb15277 in ma_pvio_connect (pvio=0x7f99fa5db270, cinfo=0x7f9a0b23b3d0) at /root/mariadb-connector-c-3.2.6/libmariadb/ma_pvio.c:484</span><br><span class="line">#5 0x00007f9a0bb0b59c in mthd_my_real_connect (mysql=0x7f99fc01ff50, host=0x14e4110 "127.0.0.1", user=0x14e27c0 "root", passwd=0x14e40c0 "123", db=0x14e28f0 "test", port=3306, unix_socket=0x0, client_flag=65536) at /root/mariadb-connector-c-3.2.6/libmariadb/mariadb_lib.c:1462</span><br><span class="line">#6 0x00007f9a0bb0affb in mysql_real_connect (mysql=0x7f99fc01ff50, host=0x14e4110 "127.0.0.1", user=0x14e27c0 "root", passwd=0x14e40c0 "123", db=0x14e28f0 "test", port=3306, unix_socket=0x0, client_flag=65536) at /root/mariadb-connector-c-3.2.6/libmariadb/mariadb_lib.c:1301</span><br><span class="line">#7 0x000000000041b5d0 in mysql_drv_real_connect (db_mysql_con=0x7f99fc01fbf0) at drv_mysql.c:405</span><br><span class="line">#8 0x000000000041cc6c in mysql_drv_reconnect (sb_con=0x0) at drv_mysql.c:815</span><br><span class="line">#9 check_error (sb_con=sb_con@entry=0x7f99fc0210a0, func=func@entry=0x486637 "mysql_drv_query()", query=query@entry=0x7f99fc0207f0 "SELECT c FROM sbtest16 WHERE id=5031", counter=counter@entry=0x7f99fc0210c8) at drv_mysql.c:894</span><br><span class="line">#10 0x000000000041d1d1 in mysql_drv_query (rs=0x7f99fc0210c8, len=<optimized out>, query=0x7f99fc0207f0 "SELECT c FROM sbtest16 WHERE id=5031", sb_conn=<optimized out>) at drv_mysql.c:1071</span><br><span class="line">#11 mysql_drv_query (rs=0x7f99fc0210c8, len=<optimized out>, query=0x7f99fc0207f0 "SELECT c FROM sbtest16 WHERE id=5031", sb_conn=<optimized out>) at drv_mysql.c:1051</span><br><span class="line">#12 mysql_drv_execute (stmt=<optimized out>, rs=<optimized out>) at drv_mysql.c:1040</span><br><span class="line">#13 0x000000000040f32a in db_execute (stmt=0x7f99fc021270) at db_driver.c:517</span><br></pre></td></tr></table></figure><h3 id="修复"><a href="#修复" class="headerlink" title="修复"></a>修复</h3><p>改下Sysbench 代码 <a href="https://github.com/akopytov/sysbench/blob/master/src/drivers/mysql/drv_mysql.c" target="_blank" rel="noopener">./src/drivers/mysql/drv_mysql.c </a>加一行就可以解决这个问题:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">static</span> <span class="keyword">int</span> <span class="title">mysql_drv_reconnect</span><span class="params">(<span class="keyword">db_conn_t</span> *sb_con)</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">db_mysql_conn_t</span> *db_mysql_con = (<span 
class="keyword">db_mysql_conn_t</span> *) sb_con->ptr;</span><br><span class="line"> MYSQL *con = db_mysql_con->mysql;</span><br><span class="line"></span><br><span class="line"> log_text(LOG_DEBUG, <span class="string">"Reconnecting zhejian"</span>);</span><br><span class="line"></span><br><span class="line"> DEBUG(<span class="string">"mysql_close(%p)"</span>, con);</span><br><span class="line"> mysql_close(con);</span><br><span class="line"> mysql_init(con); <span class="comment">//add by ren</span></span><br><span class="line"></span><br><span class="line"><span class="comment">//这个死循环在反复创建新连接用光 port range</span></span><br><span class="line"> <span class="keyword">while</span> (mysql_drv_real_connect(db_mysql_con))</span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">if</span> (sb_globals.error)</span><br><span class="line"> <span class="keyword">return</span> DB_ERROR_FATAL;</span><br><span class="line"></span><br><span class="line"> usleep(<span class="number">1000</span>); </span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> log_text(LOG_DEBUG, <span class="string">"Reconnected"</span>);</span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> DB_ERROR_IGNORABLE;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="重现"><a href="#重现" class="headerlink" title="重现"></a>重现</h2><p>只有sysbench 编译时依赖 libmariadb.so 才会有问题</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">#ldd /usr/local/bin/sysbench</span><br><span class="line">linux-vdso.so.1 (0x00007ffee0f93000)</span><br><span class="line">libmariadb.so.3 => /lib64/libmariadb.so.3 (0x00007fecb3f9d000)</span><br><span class="line"></span><br><span class="line">yum install mariadb-devel</span><br></pre></td></tr></table></figure><p>卸载掉 mysql-devel 重新安装 mariadb-devel, 再编译 sysbench</p><p>sysbench 源码下载:<a href="https://github.com/akopytov/sysbench" target="_blank" rel="noopener">https://github.com/akopytov/sysbench</a></p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>问题结论:mariadb 的 connect lib库对 libmysql 接口实现得有问题,当连接异常断开后会进入死循环疯狂创建连接导致了这个问题</p><p>这里涉及很多技巧:top/ss -s/strace/netstat/telnet 以及很多基础知识 local port range/ CLOSE_WAIT ,会折腾很重要,折腾的前提是会解锁各种姿势</p><p>难的是如何恰到好处地应用这些技巧和正确应用这些知识,剩下的分析推进就很符合逻辑了</p><p>另外我之前强调的在一个错误面前反复折腾不断缩小范围的能力也很重要,比如换个版本确认总比你看代码快吧</p><p>从这篇文章可以看出我真是个好教练,一次故障诊断涉及2-3个知识点,3-5个小命令,2-4次逻辑推断,最后完美定位问题,把各个知识的解读、各种命令的灵活使用展现的淋漓尽致。</p><p>接下来就是如何在我们的统一实验 ECS 上重现这个问题并保证让大家跟着实际操作</p><p>其实开始的时候问题没有这么清晰,每次升级才能稳定重现,后来想要定位问题就必须降低重现难度,考虑到重启客户端ECS 就能恢复,于是:</p><ul><li>不再重启ECS,只重启 Sysbench —— 能恢复</li><li>不真正升级只重启Server —— 问题能稳定重现,重现容易很多了</li><li>不重启 Server,只是kill掉Sysbench 的一条连接 —— 能重现</li><li>将Sysbench 连接数从最开始100个,改成2个压 Server,然后 kill 掉 Sysbench 的一条连接 —— 能重现</li></ul><p>到最后稳定重现方案就是 Sysbench 用两个连接压 Server,然后到 Server 上随便kill 掉其中一条连接,这个问题能稳定重现;重现后重启Sysbench 就能稳定恢复</p><h2 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>参考资料</h2><p>一个<a href="https://bugs.mysql.com/bug.php?id=94435" target="_blank" rel="noopener">类似的bug</a> 和 <a href="https://bugs.mysql.com/bug.php?id=88428" target="_blank" rel="noopener">https://bugs.mysql.com/bug.php?id=88428</a></p><h2 id="内核笔记"><a href="#内核笔记" class="headerlink" 
title="内核笔记"></a>内核笔记</h2><p>分人分析得再好也是别人的,自己积累的一点点终究是自己的;端口不够的时候CPU 拉高我之前碰到过,所以在内核的代码里写了点笔记,这次Sysbench 问题又碰到了,所以正好看到我上次的笔记:<a href="https://github.com/plantegg/linux/blob/3157b476f8216d2655c1c85bad53c975190689ba/net/ipv4/inet_hashtables.c#L447" target="_blank" rel="noopener">https://github.com/plantegg/linux/blob/3157b476f8216d2655c1c85bad53c975190689ba/net/ipv4/inet_hashtables.c#L447</a> </p><p>我的意思是可以拉个Linux 内核较新的代码分支,自己随便哪天学到点啥在上面注释一下,commit,时间久了慢慢就串起来了,如下图错误码和strace 看到的错误信息就是一致的</p><p>直播总结:</p><p>关于可用端口一文,搞懂这个概念(到底有多少可用端口)只是开始;还需要借助案例去理解;天杀的Google 把端口分为奇偶数两部分,简直是神助攻,给了我们无穷Case 来加深端口不够时候系统什么表现的映像;今天直播的案例是端口全不够了,这种非常明显的异常更好发现;如果端口还够只是偶数用完了,但每次都要扫描一遍偶数,发现偶数没有可用端口再去扫描奇数就能找到可用端口,这导致的是每次可能有点卡顿,但是又不报错,因为过于隐晦这在业务层面带来的危害更加大</p><p>同样是学TCP状态流转(ESTABLISHED TIME_WAIT CLOSE_WAIT ),有人看一次就能推理,我们都是普通人,看过了还瞎猜,不管啥都想的是 tcp_reuse/tcp_recycle,还处在使劲蒙的状态</p><p>Ping/telnet/strace/tcpdump 几乎都会用,但是如何恰到好处地去用,报错是什么状态、没任何输出是什么状态</p><p>然后就是过程分析中的一些推理。先从最根本的现象QPS 跌0开始撸</p><p>三个版本:</p><table><thead><tr><th>不同的 libmysqlclient 版本</th><th>现象</th></tr></thead><tbody><tr><td>yum install mariadb-devel (libmariadb.so 3.2.6)</td><td>永远跌零,耗尽端口、CPU高</td></tr><tr><td>手工编译 mariadb-connector-c-3.3(libmariadb.so 3.3)</td><td>永远跌零,但是不费端口、不耗CPU</td></tr><tr><td>yum install MySQL-devel(libmysqlclient 哪个版本都行)</td><td>正常</td></tr></tbody></table><p>手工编译 libmariadb.so 后 CPU 不飙高,但是TPS 一直跌0,也会疯狂重建连接(每1ms 去建一次连接), 还是没处理对,不过会reset 连接释放端口</p><p>如果一个分析推理要求很高的逻辑能力(or 智商),那复制性就不强,没有太大的学习价值(主要是学不会),我们尽量多去学1+1=2这样的逻辑推理,时间久了你就会了1+2=3</p><p>真正的高手肯定不只是流于表象:</p><ol><li>啊,连不上了</li><li>啊,服务器有问题</li><li>啊,CPU高</li><li>啊,too many Connection</li></ol><p>天翼云老哥一年前也发现了这个问题并给了解决办法,但是阅读量只有55 <a href="https://www.ctyun.cn/developer/article/405333884604485" target="_blank" rel="noopener">https://www.ctyun.cn/developer/article/405333884604485</a> ——文章过于简单,如果去学习的话只能看到一个结论</p><h2 id="进一步总结"><a href="#进一步总结" class="headerlink" title="进一步总结"></a>进一步总结</h2><h3 id="如果3306-端口被防火墙drop,那么:"><a href="#如果3306-端口被防火墙drop,那么:" class="headerlink" title="如果3306 端口被防火墙drop,那么:"></a>如果3306 端口被防火墙drop,那么:</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">[root@plantegg 11:25 /root]</span><br><span class="line">#mysql -h127.0.0.1 --ssl-mode=DISABLED -uroot -p123 test</span><br><span class="line">mysql: [Warning] Using a password on the command line interface can be insecure.</span><br><span class="line"></span><br><span class="line">…… 卡着,不报错</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">strace 看到是这样:</span><br><span class="line">futex(0x55b99f129518, FUTEX_WAKE_PRIVATE, 2147483647) = 0</span><br><span class="line">getpid() = 1511313</span><br><span class="line">socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3</span><br><span class="line">connect(3, {sa_family=AF_INET, sin_port=htons(3306), sin_addr=inet_addr("127.0.0.1")}, 16</span><br><span class="line"></span><br><span class="line">一直卡在这里</span><br></pre></td></tr></table></figure><h3 id="Too-many-connections"><a href="#Too-many-connections" class="headerlink" title="Too many 
connections"></a>Too many connections</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line">getpid() = 1515762</span><br><span class="line">socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3</span><br><span class="line">connect(3, {sa_family=AF_INET, sin_port=htons(3306), sin_addr=inet_addr("127.0.0.1")}, 16) = 0</span><br><span class="line">setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0</span><br><span class="line">setsockopt(3, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0</span><br><span class="line">recvfrom(3, "\27\0\0\0\377\20\4Too many connections", 16384, 0, NULL, NULL) = 27</span><br><span class="line">shutdown(3, SHUT_RDWR) = 0</span><br><span class="line">close(3) = 0</span><br><span class="line">fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0</span><br><span class="line">write(2, "ERROR 1040 (HY000): ", 20ERROR 1040 (HY000): ) = 20</span><br><span class="line">write(2, "Too many connections", 20Too many connections) = 20</span><br><span class="line"></span><br><span class="line">#mysql -h127.0.0.1 --ssl-mode=DISABLED -uroot -p123 test</span><br><span class="line">mysql: [Warning] Using a password on the command line interface can be insecure.</span><br><span class="line">ERROR 1040 (HY000): Too many connections</span><br><span class="line"></span><br><span class="line">用完端口后:</span><br><span class="line">getpid() = 1515928</span><br><span class="line">socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3</span><br><span class="line">connect(3, {sa_family=AF_INET, sin_port=htons(3306), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)</span><br><span class="line">shutdown(3, SHUT_RDWR) = -1 ENOTCONN (Transport endpoint is not connected)</span><br><span class="line">close(3) = 0</span><br><span class="line">fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0</span><br><span class="line">write(2, "ERROR 2003 (HY000): ", 20ERROR 2003 (HY000): ) = 20</span><br><span class="line">write(2, "Can't connect to MySQL server on"..., 54Can't connect to MySQL server on '127.0.0.1:3306' (99)) = 54</span><br><span class="line">write(2, "\n", 1</span><br><span class="line">) = 1</span><br><span class="line">write(1, "\7", 1) = 1</span><br><span class="line">#mysql -h127.0.0.1 --ssl-mode=DISABLED -uroot -p123 test</span><br><span class="line">mysql: [Warning] Using a password on the command line interface can be insecure.</span><br><span class="line">ERROR 2003 (HY000): Can't connect to MySQL server on 
'127.0.0.1:3306' (99)</span><br></pre></td></tr></table></figure><h3 id="如果是服务端3306-没起:"><a href="#如果是服务端3306-没起:" class="headerlink" title="如果是服务端3306 没起:"></a>如果是服务端3306 没起:</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3</span><br><span class="line">connect(3, {sa_family=AF_INET, sin_port=htons(3306), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)</span><br><span class="line">shutdown(3, SHUT_RDWR) = -1 ENOTCONN (Transport endpoint is not connected)</span><br><span class="line">close(3) = 0</span><br><span class="line">fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0</span><br><span class="line">write(2, "ERROR 2003 (HY000): ", 20ERROR 2003 (HY000): ) = 20</span><br><span class="line">write(2, "Can't connect to MySQL server on"..., 55Can't connect to MySQL server on '127.0.0.1:3306' (111)) = 55</span><br><span class="line">write(2, "\n", 1</span><br><span class="line">) = 1</span><br><span class="line">write(1, "\7", 1) = 1</span><br><span class="line">rt_sigaction(SIGQUIT, {sa_handler=SIG_IGN, sa_mask=[QUIT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f20ab9f0a60}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0</span><br><span class="line">rt_sigaction(SIGINT, {sa_handler=SIG_IGN, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f20ab9f0a60}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0</span><br><span class="line">rt_sigaction(SIGHUP, {sa_handler=SIG_IGN, sa_mask=[HUP], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f20ab9f0a60}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0</span><br><span class="line">exit_group(1) = ?</span><br><span class="line">+++ exited with 1 +++</span><br><span class="line"></span><br><span class="line">[root@plantegg 11:54 /root]</span><br><span class="line">#mysql --show-warnings=FALSE -h127.0.0.1 --ssl-mode=DISABLED -uroot -p123 test</span><br><span class="line">mysql: [Warning] Using a password on the command line interface can be insecure.</span><br><span class="line">ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1:3306' (111)</span><br></pre></td></tr></table></figure><h3 id="账号密码权限错误"><a href="#账号密码权限错误" class="headerlink" title="账号密码权限错误"></a>账号密码权限错误</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">#mysql --show-warnings=FALSE -h127.0.0.1 --ssl-mode=DISABLED -uroot -p1234 test</span><br><span class="line">ERROR 1045 (28000): Access denied for user 'root'@'127.0.0.1' (using password: YES)</span><br></pre></td></tr></table></figure><h3 id="telnet"><a href="#telnet" class="headerlink" title="telnet"></a>telnet</h3><figure class="highlight plain"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">//正常telnet ,能看到 Greeting以及输密码信息</span><br><span class="line">#telnet 127.0.0.1 3306</span><br><span class="line">Trying 127.0.0.1...</span><br><span class="line">Connected to 127.0.0.1.</span><br><span class="line">Escape character is '^]'.</span><br><span class="line">I</span><br><span class="line">8.2.0�#[6Y</span><br><span class="line"> @+5=,mi?%#caching_sha2_password^]</span><br><span class="line"></span><br><span class="line">//当MySQL-Server 的连接数不够了时</span><br><span class="line">#telnet 127.0.0.1 3306</span><br><span class="line">Trying 127.0.0.1...</span><br><span class="line">Connected to 127.0.0.1.</span><br><span class="line">Escape character is '^]'.</span><br><span class="line">Too many connectionsConnection closed by foreign host.</span><br><span class="line"></span><br><span class="line">//当客户端本地可用端口不够,三次握手还没开始</span><br><span class="line">#telnet 127.0.0.1 3306</span><br><span class="line">Trying 127.0.0.1...</span><br><span class="line">telnet: connect to address 127.0.0.1: Cannot assign requested address</span><br></pre></td></tr></table></figure><h3 id="99-VS-111"><a href="#99-VS-111" class="headerlink" title="99 VS 111"></a>99 VS 111</h3><p>在MySQL错误信息中,<code>ERROR 2003 (HY000)</code>是一个通用的连接失败错误。错误之后的括号中的数字代表的是系统级别的错误码,与MySQL本身的错误代码不同,它们来自于操作系统,表示尝试建立网络连接时遇到了错误。错误码<code>99</code> 和 <code>111</code> 具体代表以下含义:</p><ul><li>**<code>(99)</code>**:这个错误码通常与网络配置相关。在大多数情况下,这个错误产生于Linux系统,并对应于<code>EADDRNOTAVAIL</code>错误,意义是”Cannot assign requested address”。当尝试绑定到无法分配的本地地址时,就会遇到这个错误。在尝试连接到MySQL服务器时,如果客户端使用了一个不存在的网络接口,例如,错误配置的TCP端口或地址,就有可能产生这个错误。</li><li>**<code>(111)</code>**:这个错误码同样在Linux系统中更常见,对应于<code>ECONNREFUSED</code>错误,意义是”Connection refused”。当连接请求被远程主机或中间网络设施(如防火墙)明确拒绝时,就会遇到这个错误。在MySQL的上下文中,<code>(111)</code>错误可能表明MySQL服务没有在指定地址或端口上运行,或是防火墙设置阻止了连接。这也可能表明MySQL配置中的<code>bind-address</code>参数错误,设置为了仅允许本地连接。</li></ul><p>解决错误<code>99</code>通常需要确保客户端是在向正确配置的地址发起连接,而解决错误<code>111</code>则可能需要检查MySQL服务是否运行、防火墙的设置以及<code>my.cnf</code>或<code>my.ini</code>中<code>bind-address</code>参数的配置。</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240328122311991.png" alt="image-20240328122311991"></p><h3 id="抓包解读"><a href="#抓包解读" class="headerlink" title="抓包解读"></a>抓包解读</h3><h4 id="服务端主动断开"><a href="#服务端主动断开" class="headerlink" title="服务端主动断开"></a>服务端主动断开</h4><p>如下图是出问题的其中一次抓包,我们可以通过这个抓包来详细解析问题出在哪里。这是Sysbench(38692端口) 主动连MySQL Server(3306 端口),3次TCP 握手正常后Server 发送了 Server Greeting(截图中第四个包),然后Sysbench所在的Linux OS 38692端口回复了ack(这个ack 动作不需要Sysbench参与,完全由OS 来处理),这个时候 Sysbench应该读走这个 Server Greeting包并按MySQL 协议发送客户端账号密码,但是没有,过了10秒钟后(图中绿框) Server 再次发送了 1159 错误也就是图中红框,1159表示 Server等了10秒钟也没等到Client的账号密码,于是超时报错</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240407103809182.png" alt="image-20240407103809182"></p><h4 
id="客户端非正常主动断开"><a href="#客户端非正常主动断开" class="headerlink" title="客户端非正常主动断开"></a>客户端非正常主动断开</h4><p>如下图,Server 端回复了 Greeting,本该JDBC Client 发起 login 流程,但是因为这里Server 是 8.0,但是 JDBC Driver 用的5.7 导致兼容性问题,Client 主动断开了</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240704084229482.png" alt="image-20240704084229482"></p><p>这是 Server 认为通信异常,于是返回:Error message: Got an error reading communication packets 即1158 报错</p><h2 id="延伸"><a href="#延伸" class="headerlink" title="延伸"></a>延伸</h2><p>类似的分析手段,解决其他问题</p><blockquote><p>体验抓包分析,的确……很快就找到了问题点。local_infile=0 时,libmariadb 会在 login 包中设置标志位为 0,但是 libmysqlclient 仍然是 1,这是诡异点1。Server DB产品 在标志位为 0 时会报登录信息错误,这是诡异点2。</p></blockquote><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240403144037002.png" alt="image-20240403144037002"></p><p>这条研发人员根据重现的抓包很快定位到了是Server DB产品的Bug,简单来说 Server DB产品对MySQL 协议实现得不好,如图的flag设置为0的话就会被当成 MySQL ping 协议来处理,感叹下还是抓包好使,要不还得去看账号权限啥的</p><p>换个MySQL Client 就糊弄过去了;</p><p>但是如果去分析就能发现就那一个bit的差异,一定是Server 导致了问题,Server研发在铁的证据面前快速定位是产品bug,但是你如果不会抓包分析,一看报错是账号、权限错误就寄了——程序员对别人说的一个字都不要信,只信自己看到的</p><p>再回想想我们平时放弃的那些问题、那些撕逼撕不清楚的锅等等</p><h2 id="参考资料-1"><a href="#参考资料-1" class="headerlink" title="参考资料"></a>参考资料</h2><p>sysbench benchmark MySQL 时候,为什么 kill 连接后,sysbench 没有重连 <a href="https://exfly.github.io/why-sysbench-dont-reconnect-after-kill/" target="_blank" rel="noopener">https://exfly.github.io/why-sysbench-dont-reconnect-after-kill/</a></p><p>LittleFatz:<a href="https://www.wolai.com/3cXBGNzWkW7oqKJQ5VKM3X" target="_blank" rel="noopener">https://www.wolai.com/3cXBGNzWkW7oqKJQ5VKM3X</a> </p><p>WindSoilde*:<a href="https://articles.zsxq.com/id_9a8b4f87zv6l.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_9a8b4f87zv6l.html</a> </p><p>@邹扒皮.com:<a href="https://articles.zsxq.com/id_d27pgzmhuq08.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_d27pgzmhuq08.html</a> 最佳作业</p><p>最终问题的修复也有星球成员提交给社区并被合并了:<a href="https://github.com/akopytov/sysbench/pull/528/files" target="_blank" rel="noopener">https://github.com/akopytov/sysbench/pull/528/files</a> </p><h2 id="如果你觉得看完对你很有帮助可以通过如下方式找到我"><a href="#如果你觉得看完对你很有帮助可以通过如下方式找到我" class="headerlink" title="如果你觉得看完对你很有帮助可以通过如下方式找到我"></a>如果你觉得看完对你很有帮助可以通过如下方式找到我</h2><p>find me on twitter: <a href="https://twitter.com/plantegg" target="_blank" rel="noopener">@plantegg</a></p><p>知识星球:<a href="https://t.zsxq.com/0cSFEUh2J" target="_blank" rel="noopener">https://t.zsxq.com/0cSFEUh2J</a></p><p>开了一个星球,在里面讲解一些案例、知识、学习方法,肯定没法让大家称为顶尖程序员(我自己都不是),只是希望用我的方法、知识、经验、案例作为你的垫脚石,帮助你快速、早日成为一个基本合格的程序员。</p><p>争取在星球内:</p><ul><li>养成基本动手能力</li><li>拥有起码的分析推理能力–按我接触的程序员,大多都是没有逻辑的</li><li>知识上教会你几个关键的知识点</li></ul><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240324161113874-5525682.png" alt="image-20240324161113874" style="zoom:50%;">]]></content>
<summary type="html">
<h1 id="一次故障的诊断过程–Sysbench"><a href="#一次故障的诊断过程–Sysbench" class="headerlink" title="一次故障的诊断过程–Sysbench"></a>一次故障的诊断过程–Sysbench</h1><h2 id="背
</summary>
<category term="MySQL" scheme="https://plantegg.github.io/categories/MySQL/"/>
<category term="MySQL" scheme="https://plantegg.github.io/tags/MySQL/"/>
<category term="sysbench" scheme="https://plantegg.github.io/tags/sysbench/"/>
<category term="tcp" scheme="https://plantegg.github.io/tags/tcp/"/>
<category term="debug" scheme="https://plantegg.github.io/tags/debug/"/>
</entry>
<entry>
<title>历时5年的net_write_timeout 报错分析</title>
<link href="https://plantegg.github.io/2024/09/25/%E4%B8%80%E4%B8%AA%E5%8E%86%E6%97%B65%E5%B9%B4%E7%9A%84%E9%97%AE%E9%A2%98%E5%88%86%E6%9E%90/"/>
<id>https://plantegg.github.io/2024/09/25/一个历时5年的问题分析/</id>
<published>2024-09-25T09:30:03.000Z</published>
<updated>2024-12-30T02:31:20.362Z</updated>
<content type="html"><![CDATA[<h1 id="历时5年的net-write-timeout-报错分析"><a href="#历时5年的net-write-timeout-报错分析" class="headerlink" title="历时5年的net_write_timeout 报错分析"></a>历时5年的net_write_timeout 报错分析</h1><p>全网关于 JDBC 报错:net_write_timeout 的最好/最全总结</p><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>上一次为了讲如何分析几百万个抓包,所以把这个问题中的一部分简化写了这篇抓包篇:<a href="https://articles.zsxq.com/id_lznw3w4zieuc.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_lznw3w4zieuc.html</a> 建议你先去看看把场景简化下,然后本篇中的分析涉及抓包部分就不再啰嗦讲解,请看抓包篇</p><h2 id="问题描述"><a href="#问题描述" class="headerlink" title="问题描述"></a>问题描述</h2><p>用户为了做数据分析需要把160个DB中的数据迁移到另外一个只读库中,有专门的迁移工具,但是这个迁移工具跑一阵后总是报错,报错堆栈显示是Tomcat 到DB之间的连接出了异常:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Application was streaming results when the connection failed. Consider raising value of 'net_write_timeout' on the server.</span><br><span class="line"> at sun.reflect.GeneratedConstructorAccessor150.newInstance(Unknown Source)</span><br><span class="line"> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)</span><br><span class="line"> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)</span><br><span class="line"> at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)</span><br><span class="line"> at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3749)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3649)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4090)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:972)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:2123)</span><br><span class="line"> at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:374)</span><br><span class="line"> at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:354)</span><br><span class="line"> at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:155)</span><br><span class="line"> at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:6726)</span><br><span class="line"> at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:865)</span><br><span class="line"> at com.alibaba.druid.pool.DruidPooledResultSet.close(DruidPooledResultSet.java:86)</span><br></pre></td></tr></table></figure><p>这个异常堆栈告诉我们Tomcat 到Database之间的连接异常了,似乎是 net_write_timeout 超时导致的</p><p>对应业务结构:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20230706210452742.png" alt="image-20230706210452742"></p><h2 id="net-write-timeout-原理简介"><a href="#net-write-timeout-原理简介" class="headerlink" title="net_write_timeout 原理简介"></a>net_write_timeout 原理简介</h2><p>先看下 
<a href="https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_net_write_timeout" target="_blank" rel="noopener"><code>net_write_timeout</code></a>的解释:</p><blockquote><p>The number of seconds to wait for a block to be written to a connection before aborting the write. 只针对执行查询中的等待超时,网络不好,tcp buffer满了(应用迟迟不读走数据)等容易导致mysql server端报net_write_timeout错误,指的是mysql server hang在那里长时间无法发送查询结果。</p></blockquote><p>报这个错就是DB 等了net_write_timeout这么久没写数据,可能是Tomcat 端卡死没有读走数据。</p><p>但是根据我多年来和这个报错打交道的经验告诉我:这个报错不只是因为net_write_timeout 超时导致的,任何Tomcat 到 DB间的连接断开了,都报这个错误,原因是JDBC 驱动搞不清楚断开的具体原因,统统当 net_write_timeout 了</p><p>一定要记住这个原理。如果这里不理解可以进一步阅读:<a href="https://wx.zsxq.com/dweb2/index/topic_detail/412251415855228" target="_blank" rel="noopener">https://wx.zsxq.com/dweb2/index/topic_detail/412251415855228</a> </p><h2 id="分析"><a href="#分析" class="headerlink" title="分析"></a>分析</h2><p>首先把Tomcat 集群从负载均衡上摘一个下来,这样没有业务流量干扰更利于测试和分析日志</p><p>然后让迁移数据工具直接连这个没有流量的节点,问题仍然稳定重现。</p><p>进一步提取迁移工具的SQL,然后走API手工提交给Tomcat 执行,问题仍然稳定重现,现在重现越来越简单了,效率高多了。</p><h3 id="Tomcat-上抓包"><a href="#Tomcat-上抓包" class="headerlink" title="Tomcat 上抓包"></a>Tomcat 上抓包</h3><p>因为没有业务流量干扰,抓包很干净,但是因为DB 节点太多,所以量还是很大的,分析如抓包篇:<a href="https://articles.zsxq.com/id_lznw3w4zieuc.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_lznw3w4zieuc.html</a> </p><p>如下图红框所示的地方可以看到MySQL Server 传着传着居然带了个 fin 包在里面,表示MySQL Server要断开连接了,无奈Client只能也发送quit 断开连接。红框告诉我们一个无比有力的证据MySQL Server 在不应该断开的地方断开了连接,问题在 MySQL Server 端</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20230620141017987.png" alt="image-20230620141017987"></p><p>看起来是Database 主动端开了连接,因为这个过程Tomcat 不需要发任何东西给 Database。这个现象5年前在其它用户场景下就抓到过了,最后问题也不了了之,这次希望搞清楚</p><h3 id="Database-分析"><a href="#Database-分析" class="headerlink" title="Database 分析"></a>Database 分析</h3><p>打开 DB 日志,捞取全量日志可以看到 DB 断开的原因是收到了kill Query!</p><p>有这个结果记住上面抓包图,以后类似这样莫名起来 DB 主动断开大概率就是 kill Query 导致的(经验攒得不容易!)</p><h3 id="Database-抓包"><a href="#Database-抓包" class="headerlink" title="Database 抓包"></a>Database 抓包</h3><p>确实能抓到kill,而且从用户账号来看就是从 Tomcat 发过去的!</p><h3 id="继续分析Tomcat-抓包"><a href="#继续分析Tomcat-抓包" class="headerlink" title="继续分析Tomcat 抓包"></a>继续分析Tomcat 抓包</h3><p>从 DB 分析来看还是有人主动 kill 导致的,所以继续分析Tomcat的抓包看是不是因为代码bug导致Tomcat 发了kill 给DB</p><p>大海捞针,搜 kill,找Tomcat 发给DB的tcp length 长度是16-20的(刚好容纳kill id) 总的来说就是找不到,很神奇</p><p>由于 DB上记录的 Tomcat IP、port 都被中间链路转换过几次了,根本没办法一一对应搞清楚是哪个Tomcat 节点发出来的</p><h3 id="继续尝试重现"><a href="#继续尝试重现" class="headerlink" title="继续尝试重现"></a>继续尝试重现</h3><p>分析完Tomcat 业务代码后感觉业务不会去kill,于是灵机一动在没有流量的Tomcat上跑了一个Sleep 600秒,不用任何数据,神奇的问题也稳定重现了,这下大概知道什么原因了,肯定是客户自己加了慢查询监控逻辑,一旦发现慢查询就 kill</p><p>于是问客户是不是有这种监控,果然有,停掉后反复重试不再有问题!</p><p>测试环境手工触发kill,然后能抓到下发的kill Query 给Database</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20230707150658392.png" alt="image-20230707150658392"></p><h2 id="未解谜题"><a href="#未解谜题" class="headerlink" title="未解谜题"></a>未解谜题</h2><p>为什么没在Tomcat 抓到发给Database的 kill ?</p><p>我反复去重现了,如果是我手工触发Tomcat kill是可以清晰地抓到Tomcat 会发160个kill 给Database,但是我任其自然等待用户监控来杀就一定抓不到kill 下发给DB</p><p>我猜和 Tomcat 集群有关,首先用户监控是走的LVS,通过其中一个Tomcat 可以查询到所有 Tomcat 上的请求,然后发起 kill</p><p>但因为节点太多无法证实!当然业务监控也可以监控DB 然后直接发kill,但是和抓包看到的发起kill的用户不对,发起 kill 的用户是Tomcat独一无二的。</p><h2 id="JDBC驱动报错-net-write-timeout-结论"><a href="#JDBC驱动报错-net-write-timeout-结论" class="headerlink" title="JDBC驱动报错 net_write_timeout 结论"></a>JDBC驱动报错 net_write_timeout 结论</h2><blockquote><p>Application was streaming results when the connection failed. 
Consider raising value of ‘net_write_timeout’ on the server. - com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Application was streaming results when the connection failed. Consider raising value of ‘net_write_timeout’ on the server.</p></blockquote><p>这个报错不一定是 <code>net_write_timeout</code> 设置过小导致的,<strong>JDBC 在 streaming 流模式下只要连接异常就会报如上错误</strong>,比如:</p><ul><li>连接被 TCP reset</li><li>RDS 前端自带的Proxy 主动断开连接</li><li>连接因为某种原因(比如 QueryTimeOut) 触发 kill Query导致连接中断</li><li>RDS <a href="https://aone.alibaba-inc.com/v2/project/687880/bug/50491193" target="_blank" rel="noopener">端因为</a>kill 主动断开连接 //比如用户监控RDS、DRDS脚本杀掉慢查询</li></ul><p>net_write_timeout:表示这么长时间RDS/DN 无法写数据到网络层发给DRDS/CN,原因是DRDS/CN 长时间没将数据读走</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>首先一个错误现象对应多个完全不一样的错误原因是非常头疼的,这个问题反反复复在多个场景下出现,当然原因各异,但是这个传数据途中 DB 主动 fin连接还是第一次碰到并搞清楚,同样主动 fin 不一定是kill,但是我们要依照证据推进问题,既然是DB fin就有必要先从DB 来看原因。</p><p>从这个问题你可以先从什么是JDBC 流模式出发(mysql –quick 就是流模式,你可以快速查一个大数据试试;然后去掉–quick 对比一下),结合网络buffer 来了解流模式:<a href="https://plantegg.github.io/2020/07/03/MySQL%20JDBC%20StreamResult%20%E5%92%8C%20net_write_timeout/">https://plantegg.github.io/2020/07/03/MySQL%20JDBC%20StreamResult%20%E5%92%8C%20net_write_timeout/</a></p><p>然后从流模式来学习MySQL 的 net_write_timeout,假如你的代码报了 net_write_timeout 你会分析吗?</p><p>最后从连接断开去总结,比如网络不好、比如内核bug、比如DB crash、比如 kill、比如……都会导致连接断开,但这一切对业务来说只有 net_write_timeout 一个现象</p><p>这个问题分享出来是因为非常综合,我惊抱怨 socketTimeout、Communication failure等异常,这些异常也挺常见导致的原因多种,但是和 net_write_timeout 比起来还是不如 net_write_timeout 更综合,所以分享给大家,建议这几篇一起阅读效果最好!</p><h2 id="实验模拟-Consider-raising-value-of-‘net-write-timeout’"><a href="#实验模拟-Consider-raising-value-of-‘net-write-timeout’" class="headerlink" title="实验模拟 Consider raising value of ‘net_write_timeout’"></a>实验模拟 Consider raising value of ‘net_write_timeout’</h2><p>使用 Java MySQL JDBC Driver 的同学经常碰到如下错误堆栈,到底这个错误是 net_write_timeout 设置太小还是别的原因也会导致这个问题?需要我们用实验验证一下: </p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line">com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Application was streaming results when the connection failed. 
Consider raising value of 'net_write_timeout' on the server.</span><br><span class="line">at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)</span><br><span class="line">at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)</span><br><span class="line">at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)</span><br><span class="line">at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)</span><br><span class="line">at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)</span><br><span class="line">at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)</span><br><span class="line">at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3559)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3459)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3900)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:873)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1996)</span><br><span class="line">at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:374)</span><br><span class="line">at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:354)</span><br><span class="line">at com.mysql.jdbc.ResultSetImpl.next(ResultSetImpl.java:6312)</span><br><span class="line">at Test.main(Test.java:38)</span><br><span class="line">Caused by: java.io.EOFException: Can not read response from server. Expected to read 8 bytes, read 3 bytes before connection was unexpectedly lost.</span><br><span class="line">at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3011)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3519)</span><br><span class="line">... 8 more</span><br></pre></td></tr></table></figure><p>JDBC 驱动对这个错误有如下提示(坑人):</p><blockquote><p>Application was streaming results when the connection failed. Consider raising value of ‘net_write_timeout’ on the server. - com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Application was streaming results when the connection failed. 
Consider raising value of ‘net_write_timeout’ on the server.</p></blockquote><p>实验中的一些说明:</p><ol><li>netTimeoutForStreamingResults=1 表示设置 net_write_timeout 为 1 秒,客户端会发送 set net_write_timeout=1 给数据库</li><li>conn.setAutoCommit(false); //流式读取必须要 关闭自动提交</li><li>stmt.setFetchSize(Integer.MIN_VALUE);</li></ol><p>以上 2/3是触发流式读取的必要条件,第一条不设置默认是 600 秒,比较难等 :) </p><p>如果确实是 net_write_timeout 太小超时了, RDS 直接发 fin(但是 fin 前面还有一堆 response 包也在排队),然后 RDS 日志先报错:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">2024-11-28T14:33:03.447397Z 12 [Note] Aborted connection 12 to db: 'test' user: 'root' host: '172.26.137.130' (Got timeout writing communication packets)</span><br></pre></td></tr></table></figure><p>此时客户端还慢悠悠地读,RDS 没有回任何错误信息给客户端,客户端读完所有 Response 然后直接读到连接断开就报 Consider raising value of ‘net_write_timeout’ on the server 了,如果客户端读的慢,比如要 10 分钟实际连接在 RDS 上 10 分钟前就进入 fin 了,但是 10 分钟后客户端才报错</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">#netstat -anto | grep "3307"</span><br><span class="line">tcp6 0 0 :::3307 :::* LISTEN off (0.00/0/0)</span><br><span class="line">tcp6 0 140192 172.26.137.120:3307 172.26.137.130:51216 ESTABLISHED probe (0.04/0/0)</span><br><span class="line">2024年 11月 28日 星期四 15:01:43 CST</span><br><span class="line"></span><br><span class="line">//1秒中后此时 数据库感知到超时于是调用 close 断开连接,触发发送 fin给客户端,但是 fin 也需要排队,所以 140192增加了 1 变成140193</span><br><span class="line">tcp6 0 0 :::3307 :::* LISTEN off (0.00/0/0)</span><br><span class="line">tcp6 0 140193 172.26.137.120:3307 172.26.137.130:51216 FIN_WAIT1 probe (0.58/0/1)</span><br><span class="line">2024年 11月 28日 星期四 15:01:44 CST</span><br></pre></td></tr></table></figure><p>重现代码,数据库上构造一个大表,比如 10万行就行,能堆满默认的 tcp buffer size:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span 
class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br></pre></td><td class="code"><pre><span class="line">import java.sql.Connection;</span><br><span class="line">import java.sql.DriverManager;</span><br><span class="line">import java.sql.ResultSet;</span><br><span class="line">import java.sql.SQLException;</span><br><span class="line">import java.sql.Statement;</span><br><span class="line">import java.sql.PreparedStatement;</span><br><span class="line"></span><br><span class="line">/*</span><br><span class="line"> * 编译:</span><br><span class="line"> * javac -cp /root/java/*:. Test.java</span><br><span class="line"> * 运行:</span><br><span class="line"> * java -cp .:./mysql-connector-java-5.1.45.jar Test "jdbc:mysql://gf1:3307/test?useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&connectTimeout=500&socketTimeout=1700&netTimeoutForStreamingResults=1" root 123 "select *, id from streaming " 3000</span><br><span class="line"> * netTimeoutForStreamingResults=1 表示RDS 等超过 1 秒都因为 tcp buffer 满无法继续发送数据就断开连接</span><br><span class="line"> * */</span><br><span class="line">public class Test {</span><br><span class="line"> private static String url;</span><br><span class="line"> private Str name;</span><br><span class="line"></span><br><span class="line"> public static void main(String args[]) throws NumberFormatException, InterruptedException, ClassNotFoundException {</span><br><span class="line"> Class.forName("com.mysql.jdbc.Driver");</span><br><span class="line"> url = args[0];</span><br><span class="line"> String user = args[1];</span><br><span class="line"> String pass = args[2];</span><br><span class="line"> String sql = args[3];</span><br><span class="line"> String interval = args[4];</span><br><span class="line"> try {</span><br><span class="line"> Connection conn = DriverManager.getConnection(url, user, pass);</span><br><span class="line"> while (true) {</span><br><span class="line"></span><br><span class="line">conn.setAutoCommit(false);</span><br><span class="line">Statement stmt = conn.createStatement();</span><br><span class="line">stmt.setFetchSize(Integer.MIN_VALUE);</span><br><span class="line"></span><br><span class="line"> long start = System.currentTimeMillis();</span><br><span class="line"> ResultSet rs = stmt.executeQuery(sql);</span><br><span class="line"> int count=0;</span><br><span class="line"> while (rs.next()) {</span><br><span class="line"> System.out.println("id:"+rs.getInt("id")+" count:"+count);</span><br><span class="line"> count++;</span><br><span class="line"> if(count<3) //1 秒后数据库端连接就已经关闭了,但是因为客户端读得慢,需要不 sleep 后才能读到 fin 然后报错,所以报错可以比实际晚很久</span><br><span class="line"> Thread.sleep(1500);</span><br><span class="line">}</span><br><span class="line"> rs.close();</span><br><span class="line"> stmt.close();</span><br><span class="line"> Thread.sleep(Long.valueOf(interval));</span><br><span class="line">break;</span><br><span class="line"> }</span><br><span class="line"> conn.close();</span><br><span class="line"> } catch (SQLException e) {</span><br><span class="line"> e.printStackTrace();</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Consider raising value of ‘net_write_timeout’ 
这个报错数据库端不会返回任何错误码给客户端,只是发 fin 断开连接,对客户端来说这条连接是 net_write_timeout 超时了 还是 被kill(或者其他原因) 是没法区分的,所以不管什么原因,只要连接异常 MySQL JDBC Driver 就抛 net_write_timeout 错误</p><p>如图,3 秒钟后 fin 包混在数据库 response 就被发到了客户端,实际2 秒前数据库已经报错了,也就是客户端和数据库端报错时间会差 2 秒(具体差几秒取决于重现代码里 sleep 多久然后-1)</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog//image-20241128151324385.png" alt="image-20241128151324385"></p><h3 id="实验总结"><a href="#实验总结" class="headerlink" title="实验总结"></a>实验总结</h3><p>这个报错不一定是 <code>net_write_timeout</code> 设置过小导致的,<strong>JDBC 在 streaming 流模式下只要连接异常就会报如上错误</strong>,比如:</p><ul><li>连接被 TCP reset</li><li>RDS 前端自带的Proxy 主动断开连接</li><li>连接因为某种原因(比如 QueryTimeOut) 触发 kill Query导致连接中断</li><li>RDS 端因为kill 主动断开连接 //比如用户监控RDS 脚本杀掉慢查询</li><li>开启流式读取后,只要客户端在读取查询结果没有结束就读到了 fin 包就会报这个错误</li></ul><p>可以将 netTimeoutForStreamingResults 设为 0 或者 100,然后在中途 kill 掉 MySQL 上的 SQL,你也会在客户端看到同样的错误, kill SQL 是在 MySQL 的报错日志中都是同样的:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">2024-11-28T07:33:12.967012Z 23 [Note] Aborted connection 23 to db: 'test' user: 'root' host: '172.26.137.130' (Got an error writing communication packets)</span><br></pre></td></tr></table></figure><p>所以你看一旦客户端出现这个异常堆栈,除了抓包似乎没什么好办法,其实抓包也只能抓到数据库主动发了 fin 什么原因还是不知道,我恨这个没有错误码一统江湖的报错</p><p>net_write_timeout 后 RDS 直接发 fin(有时 fin 前面还有一堆 response 包也在排队),然后 rds 日志先报错:2024-11-28T06:33:03.447397Z 12 [Note] Aborted connection 12 to db: ‘test’ user: ‘root’ host: ‘172.26.137.130’ (Got timeout writing communication packets)</p><p>客户端慢悠悠地读,RDS 没有传任何错误信息给客户端,客户端读完所有 response 然后直接读到连接断开就报 Consider raising value of ‘net_write_timeout’ on the server 了,如果客户端读的慢,比如要 10 分钟实际连接在 RDS 上 10 分钟前就进入 fin 了,但是 10 分钟后客户端才报错</p><p>进阶阅读:<a href="https://plantegg.github.io/2024/09/25/%E4%B8%80%E4%B8%AA%E5%8E%86%E6%97%B65%E5%B9%B4%E7%9A%84%E9%97%AE%E9%A2%98%E5%88%86%E6%9E%90/">https://plantegg.github.io/2024/09/25/%E4%B8%80%E4%B8%AA%E5%8E%86%E6%97%B65%E5%B9%B4%E7%9A%84%E9%97%AE%E9%A2%98%E5%88%86%E6%9E%90/</a> 和 <a href="https://x.com/plantegg/status/1867535551337050153" target="_blank" rel="noopener">https://x.com/plantegg/status/1867535551337050153</a></p>]]></content>
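<p>补充一个排查小抄(仅是示例:端口、日志路径、账号都是假设值,请按实际实例调整)。碰到这个报错时可以先在数据库端确认 net_write_timeout 的当前值和错误日志里的 Aborted connection 记录,再在客户端抓包确认 fin 是谁先发的、和客户端报错差了多久:</p><pre><code>## 服务端:确认 net_write_timeout(开启流式读取时驱动会用 netTimeoutForStreamingResults 覆盖会话值)
mysql -uroot -p -e "SHOW VARIABLES LIKE 'net_write_timeout';"

## 服务端:错误日志里找对应时间点的 Aborted connection(日志路径是假设,按实际部署调整)
grep -i "Aborted connection" /var/log/mysql/error.log | tail

## 客户端:抓包确认数据库何时发的 fin(端口沿用实验中的 3307,按实际调整)
tcpdump -i any port 3307 -w streaming.pcap</code></pre><p>如果抓包里 fin 的时间点远早于客户端抛异常的时间点,就对应上文说的"客户端读得慢,报错时间比数据库断开时间晚很多"。</p>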
<summary type="html">
<h1 id="历时5年的net-write-timeout-报错分析"><a href="#历时5年的net-write-timeout-报错分析" class="headerlink" title="历时5年的net_write_timeout 报错分析"></a>历时5年的
</summary>
<category term="MySQL" scheme="https://plantegg.github.io/categories/MySQL/"/>
<category term="MySQL" scheme="https://plantegg.github.io/tags/MySQL/"/>
<category term="JDBC" scheme="https://plantegg.github.io/tags/JDBC/"/>
<category term="kill" scheme="https://plantegg.github.io/tags/kill/"/>
<category term="net_write_timeout" scheme="https://plantegg.github.io/tags/net-write-timeout/"/>
<category term="timeout" scheme="https://plantegg.github.io/tags/timeout/"/>
</entry>
<entry>
<title>长连接黑洞重现和分析</title>
<link href="https://plantegg.github.io/2024/05/05/%E9%95%BF%E8%BF%9E%E6%8E%A5%E9%BB%91%E6%B4%9E%E9%87%8D%E7%8E%B0%E5%92%8C%E5%88%86%E6%9E%90-public/"/>
<id>https://plantegg.github.io/2024/05/05/长连接黑洞重现和分析-public/</id>
<published>2024-05-05T00:30:03.000Z</published>
<updated>2024-11-20T10:00:55.322Z</updated>
<content type="html"><![CDATA[<h1 id="长连接黑洞重现和分析"><a href="#长连接黑洞重现和分析" class="headerlink" title="长连接黑洞重现和分析"></a>长连接黑洞重现和分析</h1><p>这是一个存在多年,遍及各个不同的业务又反反复复地在集团内部出现的一个问题,本文先通过重现展示这个问题,然后从业务、数据库、OS等不同的角度来分析如何解决它,这个问题值得每一位研发同学重视起来,避免再次踩到</p><h2 id="背景"><a href="#背景" class="headerlink" title="背景"></a>背景</h2><p>为了高效率应对故障,本文尝试回答如下一些问题:</p><ul><li>为什么数据库crash 重启恢复后,业务还长时间不能恢复?</li><li>我依赖的业务做了高可用切换,但是我的业务长时间报错</li><li>我依赖的服务下掉了一个节点,为什么我的业务长时间报错 </li><li>客户做变配,升级云服务节点规格,为什么会导致客户业务长时间报错</li></ul><p>目的:希望通过这篇文章尽可能地减少故障时长、让业务快速从故障中恢复</p><h2 id="重现"><a href="#重现" class="headerlink" title="重现"></a>重现</h2><p>空说无凭,先也通过一次真实的重现来展示这个问题</p><h3 id="LVS-MySQL-高可用切换"><a href="#LVS-MySQL-高可用切换" class="headerlink" title="LVS+MySQL 高可用切换"></a>LVS+MySQL 高可用切换</h3><p>OS 默认配置参数</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">#sysctl -a |grep -E "tcp_retries|keepalive"</span><br><span class="line">net.ipv4.tcp_keepalive_intvl = 30</span><br><span class="line">net.ipv4.tcp_keepalive_probes = 5</span><br><span class="line">net.ipv4.tcp_keepalive_time = 10</span><br><span class="line">net.ipv4.tcp_retries1 = 3</span><br><span class="line">net.ipv4.tcp_retries2 = 15 //主要是这个参数,默认以及alios 几乎都是15</span><br></pre></td></tr></table></figure><p>LVS 对外服务端口是3001, 后面挂的是 3307,假设3307是当前的Master,Slave是 3306,当检测到3307异常后会从LVS 上摘掉 3307挂上 3306做高可用切换</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/1713838496899-274cdfbd-aa6e-4f1f-9fcc-16725593c25e.png" alt="undefined"></p><p>切换前的 LVS 状态</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">#ipvsadm -L --timeout</span><br><span class="line">Timeout (tcp tcpfin udp): 900 120 300</span><br><span class="line">#ipvsadm -L -n</span><br><span class="line">IP Virtual Server version 1.2.1 (size=4096)</span><br><span class="line">Prot LocalAddress:Port Scheduler Flags</span><br><span class="line"> -> RemoteAddress:Port Forward Weight ActiveConn InActConn</span><br><span class="line">TCP 127.0.0.1:3001 rr</span><br><span class="line"> -> 127.0.0.1:3307 Masq 1 0 0</span><br></pre></td></tr></table></figure><p>Sysbench启动压力模拟用户访问,在 31秒的时候模拟管控检测到 3307的Master无法访问,所以管控执行切主把 3306的Slave 提升为新的 Master,同时到 LVS 摘掉 3307,挂上3306,此时管控端着冰可乐、翘着二郎腿,得意地说,你就看吧我们管控牛逼不、我们的高可用牛逼不,这一套行云流水3秒钟不到全搞定</p><p>切换命令如下:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">#cat del3307.sh</span><br><span class="line">ipvsadm -d -t 127.0.0.1:3001 -r 127.0.0.1:3307 ; ipvsadm -a -t 127.0.0.1:3001 -r 127.0.0.1:3306 -m</span><br></pre></td></tr></table></figure><p>此时Sysbench运行状态,在第 32秒如期跌0:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">#/usr/local/bin/sysbench --debug=on --mysql-user='root' --mysql-password='123' --mysql-db='test' --mysql-host='127.0.0.1' --mysql-port='3001' --tables='16' --table-size='10000' --range-size='5' --db-ps-mode='disable' --skip-trx='on' --mysql-ignore-errors='all' --time='11080' --report-interval='1' --histogram='on' --threads=1 oltp_read_write run</span><br><span class="line">sysbench 1.1.0 (using bundled LuaJIT 2.1.0-beta3)</span><br><span class="line"></span><br><span class="line">Running the test with following options:</span><br><span class="line">Number of threads: 1</span><br><span class="line">Report intermediate results every 1 second(s)</span><br><span class="line">Debug mode enabled.</span><br><span class="line"></span><br><span class="line">Initializing random number generator from current time</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">Initializing worker threads...</span><br><span class="line"></span><br><span class="line">DEBUG: Worker thread (#0) started</span><br><span class="line">DEBUG: Reporting thread started</span><br><span class="line">DEBUG: Worker thread (#0) initialized</span><br><span class="line">Threads started!</span><br><span class="line"></span><br><span class="line">[ 1s ] thds: 1 tps: 51.89 qps: 947.00 (r/w/o: 739.44/207.56/0.00) lat (ms,95%): 35.59 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 2s ] thds: 1 tps: 60.03 qps: 1084.54 (r/w/o: 841.42/243.12/0.00) lat (ms,95%): 22.28 err/s 0.00 reconn/s: 0.00</span><br><span class="line">…………</span><br><span class="line">[ 29s ] thds: 1 tps: 68.00 qps: 1223.01 (r/w/o: 952.00/271.00/0.00) lat (ms,95%): 16.12 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 30s ] thds: 1 tps: 66.00 qps: 1188.00 (r/w/o: 924.00/264.00/0.00) lat (ms,95%): 16.71 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 31s ] thds: 1 tps: 67.00 qps: 1203.96 (r/w/o: 937.97/265.99/0.00) lat (ms,95%): 17.95 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 32s ] thds: 1 tps: 22.99 qps: 416.85 (r/w/o: 321.88/94.96/0.00) lat (ms,95%): 15.55 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 33s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 34s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 35s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br></pre></td></tr></table></figure><p>5分钟后故障报告大批量涌进来,客户:怎么回事,我们的业务挂掉10分钟了,报错都是访问MySQL 超时,赶紧给我看看,从监控确实看到10分钟后客户业务还没恢复:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br></pre></td><td class="code"><pre><span class="line">[ 601s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 602s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 603s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 604s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br></pre></td></tr></table></figure><p>这时 oncall 都被从被窝里拎了起来,不知谁说了一句赶紧恢复吧,先试试把应用重启,5秒钟后应用重启完毕,业务恢复,大家开心地笑了,又成功防御住一次故障升级,还是重启大法好!</p><p>在业务/Sysbench QPS跌0 期间可以看到 3307被摘掉,3306 成功挂上去了,但是没有新连接建向 3306,业务/Sysbench 使劲薅着 3307</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">#ipvsadm -L -n --stats -t 127.0.0.1:3001</span><br><span class="line">Prot LocalAddress:Port Conns InPkts OutPkts InBytes OutBytes</span><br><span class="line"> -> RemoteAddress:Port</span><br><span class="line">TCP 127.0.0.1:3001 2 660294 661999 78202968 184940K</span><br><span class="line"> -> 127.0.0.1:3306 0 0 0 0 0</span><br><span class="line"> </span><br><span class="line">#ipvsadm -Lcn | head -10</span><br><span class="line">IPVS connection entries</span><br><span class="line">pro expire state source virtual destination</span><br><span class="line">TCP 13:11 ESTABLISHED 127.0.0.1:33864 127.0.0.1:3001 127.0.0.1:3307</span><br><span class="line"></span><br><span class="line">#netstat -anto |grep -E "Recv|33864|3001|33077"</span><br><span class="line">Proto Recv-Q Send-Q Local Address Foreign Address State Timer</span><br><span class="line">tcp 0 248 127.0.0.1:33864 127.0.0.1:3001 ESTABLISHED probe (33.48/0/8)</span><br><span class="line">tcp6 0 11 127.0.0.1:3307 127.0.0.1:33864 ESTABLISHED on (49.03/13/0)</span><br></pre></td></tr></table></figure><p>直到 900多秒后 OS 重试了15次发现都失败,于是向业务/Sysbench 返回连接异常,触发业务/Sysbench 释放异常连接重建新连接,新连接指向了新的 Master 3306,业务恢复正常</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">[ 957s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">DEBUG: Ignoring error 2013 Lost connection to MySQL server during query,</span><br><span class="line">DEBUG: Reconnecting </span><br><span class="line">DEBUG: Reconnected</span><br><span class="line">[ 958s ] thds: 1 tps: 53.00 qps: 950.97 (r/w/o: 741.98/208.99/0.00) lat (ms,95%): 30.26 err/s 0.00 reconn/s: 1.00</span><br><span class="line">[ 959s ] thds: 1 tps: 64.00 qps: 1154.03 (r/w/o: 896.02/258.01/0.00) lat (ms,95%): 22.69 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 960s ] thds: 
1 tps: 66.00 qps: 1184.93 (r/w/o: 923.94/260.98/0.00) lat (ms,95%): 25.28 err/s 0.00 reconn/s: 0.00</span><br></pre></td></tr></table></figure><p>到这里重现了故障中经常碰到的业务需要900多秒才能慢慢恢复,这个问题也就是 <strong>TCP 长连接流量黑洞</strong></p><p>如果我们<strong>把 net.ipv4.tcp_retries2 改成5</strong> 再来做这个实验,就会发现业务/Sysbench 只需要20秒就能恢复了,也就是这个流量黑洞从900多秒变成了20秒,这回 oncall 不用再被从被窝里拎出来了吧:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line">[ 62s ] thds: 1 tps: 66.00 qps: 1191.00 (r/w/o: 924.00/267.00/0.00) lat (ms,95%): 17.63 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 63s ] thds: 1 tps: 63.00 qps: 1123.01 (r/w/o: 874.00/249.00/0.00) lat (ms,95%): 17.63 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 64s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 65s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 66s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 67s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 68s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 69s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 70s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 71s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 72s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 73s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 74s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 75s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 76s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 77s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 78s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat 
(ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 79s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 80s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 81s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 82s ] thds: 1 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s 0.00 reconn/s: 0.00</span><br><span class="line">DEBUG: Ignoring error 2013 Lost connection to MySQL server during query,</span><br><span class="line">DEBUG: Reconnecting </span><br><span class="line">DEBUG: Reconnected</span><br><span class="line">[ 83s ] thds: 1 tps: 26.00 qps: 457.01 (r/w/o: 357.01/100.00/0.00) lat (ms,95%): 16.41 err/s 0.00 reconn/s: 1.00</span><br><span class="line">[ 84s ] thds: 1 tps: 60.00 qps: 1086.94 (r/w/o: 846.96/239.99/0.00) lat (ms,95%): 26.68 err/s 0.00 reconn/s: 0.00</span><br><span class="line">[ 85s ] thds: 1 tps: 63.00 qps: 1134.02 (r/w/o: 882.01/252.00/0.00) lat (ms,95%): 23.10 err/s 0.00 reconn/s: 0.00</span><br></pre></td></tr></table></figure><h3 id="LVS-Nginx-上重现"><a href="#LVS-Nginx-上重现" class="headerlink" title="LVS + Nginx 上重现"></a>LVS + Nginx 上重现</h3><p>NGINX上重现这个问题:<a href="https://asciinema.org/a/649890" target="_blank" rel="noopener">https://asciinema.org/a/649890</a> 3分钟的录屏,这个视频构造了一个LVS 的HA切换过程,LVS后有两个Nginx,模拟一个Nginx(Master) 断网后,将第二个Nginx(Slave) 加入到LVS 并将第一个Nginx(Master) 从LVS 摘除,期望业务能立即恢复,但实际上可以看到之前的所有长连接都没有办法恢复,进入一个流量黑洞</p><h2 id="TCP-长连接流量黑洞原理总结"><a href="#TCP-长连接流量黑洞原理总结" class="headerlink" title="TCP 长连接流量黑洞原理总结"></a>TCP 长连接流量黑洞原理总结</h2><p>TCP 长连接在发送包的时候,如果没收到ack 默认会进行15次重传(net.ipv4.tcp_retries2=15, 这个不要较真,会根据RTO 时间大致是15次),累加起来大概是924秒,所以我们经常看到业务需要15分钟左右才恢复。这个问题存在所有TCP长连接中(几乎没有业务还在用短连接吧?),问题的本质和 LVS/k8s Service 都没关系</p><p>我这里重现带上 LVS 只是为了场景演示方便 </p><p>这个问题的本质就是如果Server突然消失(宕机、断网,来不及发 RST )客户端如果正在发东西给Server就会遵循TCP 重传逻辑不断地TCP retran , 如果一直收不到Server 的ack,大约重传15次,900秒左右。所以不是因为有 LVS 导致了这个问题,而是在某些场景下 LVS 有能力处理得更优雅,比如删除 RealServer的时候 LVS 完全可以感知这个动作并 reset 掉其上所有长连接</p><p>为什么在K8S 上这个问题更明显呢,K8S 讲究的就是服务不可靠,随时干掉POD(切断网络),如果干POD 之前能kill -9(触发reset)、或者close 业务触发断开连接那还好,但是大多时候啥都没干,有强摘POD、有直接隔离等等,这些操作都会导致对端只能TCP retran</p><h2 id="怎么解决"><a href="#怎么解决" class="headerlink" title="怎么解决"></a>怎么解决</h2><h3 id="业务方"><a href="#业务方" class="headerlink" title="业务方"></a>业务方</h3><p>业务方要对自己的请求超时时间有控制和兜底,不能任由一个请求长时间 Hang 在那里</p><p>比如JDBC URL 支持设置 SocketTimeout、ConnectTimeout,我相信其他产品也有类似的参数,业务方要设置这些值,不设置就是如上重现里演示的900多秒后才恢复</p><h4 id="SocketTimeout"><a href="#SocketTimeout" class="headerlink" title="SocketTimeout"></a>SocketTimeout</h4><p>只要是连接有机会设置 SocketTimeout 就一定要设置,具体值可以根据你们能接受的慢查询来设置;分析、AP类的请求可以设置大一点</p><p><strong>最重要的:任何业务只要你用到了TCP 长连接一定要配置一个恰当的SocketTimeout</strong>,比如 Jedis 是连接池模式,底层超时之后,会销毁当前连接,下一次重新建连,就会连接到新的切换节点上去并恢复</p><h4 id="RFC-5482-TCP-USER-TIMEOUT"><a href="#RFC-5482-TCP-USER-TIMEOUT" class="headerlink" title="RFC 5482 TCP_USER_TIMEOUT"></a><a href="https://datatracker.ietf.org/doc/html/rfc5482" target="_blank" rel="noopener">RFC 5482</a> <code>TCP_USER_TIMEOUT</code></h4><p><a href="https://datatracker.ietf.org/doc/html/rfc5482" target="_blank" rel="noopener">RFC 5482</a> 中增加了<code>TCP_USER_TIMEOUT</code>这个配置,通常用于定制当 TCP 网络连接中出现数据传输问题时,可以等待多长时间前释放网络资源,对应Linux 这个 <a href="https://github.com/torvalds/linux/commit/dca43c75e7e545694a9dd6288553f55c53e2a3a3" target="_blank" rel="noopener">commit 
</a></p><p><code>TCP_USER_TIMEOUT</code> 是一个整数值,它指定了当 TCP 连接的数据包在发送后多长时间内未被确认(即没有收到 ACK),TCP 连接会考虑释放这个连接。</p><p>打个比方,设置 <code>TCP_USER_TIMEOUT</code> 后,应用程序就可以指定说:“如果在 30 秒内我发送的数据没有得到确认,那我就认定网络连接出了问题,不再尝试继续发送,而是直接断开连接。”这对于确保连接质量和维护用户体验是非常有帮助的。</p><p>在 Linux 中,可以使用 <code>setsockopt</code> 函数来设置某个特定 socket 的 <code>TCP_USER_TIMEOUT</code> 值:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">int timeout = 30000; // 30 seconds</span><br><span class="line">setsockopt(sock, IPPROTO_TCP, TCP_USER_TIMEOUT, (char *)&timeout, sizeof(timeout));</span><br></pre></td></tr></table></figure><p>在这行代码中,<code>sock</code> 是已经 established 的 TCP socket,我们将该 socket 的 <code>TCP_USER_TIMEOUT</code> 设置为 30000 毫秒,也就是 30 秒。如果设置成功,这个 TCP 连接在发送数据包后 30 秒内如果没有收到 ACK 确认,将开始进行 TCP 连接的释放流程。</p><p>TCP_USER_TIMEOUT 相较 SocketTimeout 可以做到更精确(不影响慢查询),SocketTimeout 超时是不区分ACK 还是请求响应时间的,但是 TCP_USER_TIMEOUT 要求下层的API、OS 都支持。比如 JDK 不支持 TCP_USER_TIMEOUT,但是 <a href="https://github.com/tomasol/netty/commit/3010366d957d7b8106e353f99e15ccdb7d391d8f#diff-a998f73b7303461ca171432d10832884c6e7b0955d9f5634b9a8302b42a4706c" target="_blank" rel="noopener">Netty 框架自己搞了Native</a> 来实现对 TCP_USER_TIMEOUT 以及其它OS 参数的设置,在这些基础上<a href="https://github.com/redis/lettuce/pull/2499" target="_blank" rel="noopener">Redis 的Java 客户端 lettuce 依赖了 Netty ,所以也可以设置 TCP_USER_TIMEOUT</a></p><p>原本我是想在Druid 上提个feature 来支持 TCP_USER_TIMEOUT,这样集团绝大部分业务都可以无感知解决掉这个问题,但查下来发现 JDK 不支持设置这个值,想要在Druid 里面实现设置 TCP_USER_TIMEOUT 的话,得像 Netty 一样走Native 绕过JDK 来设置,这对 Druid 而言有点重了</p><h4 id="ConnectTimeout"><a href="#ConnectTimeout" class="headerlink" title="ConnectTimeout"></a>ConnectTimeout</h4><p>这个值是针对新连接创建超时时间设置,一般设置3-5秒就够长了</p><h4 id="连接池"><a href="#连接池" class="headerlink" title="连接池"></a>连接池</h4><p>建议参考这篇 <a href="https://help.aliyun.com/document_detail/181399.html" target="_blank" rel="noopener">《数据库连接池配置推荐》</a> 这篇里的很多建议也适合业务、应用等,你把数据库看成一个普通服务就好理解了</p><p>补充下如果用的是Druid 数据库连接池不要用它来设置你的 SocketTimeout 参数,因为他有bug 导致你觉得设置了但实际没设置上,<a href="https://github.com/alibaba/druid/releases/tag/1.2.22" target="_blank" rel="noopener">2024-03-16号的1.2.22</a>这个Release 才fix,所以强烈建议你讲 SocketTimeout 写死在JDBC URL 中简单明了</p><h3 id="OS-兜底"><a href="#OS-兜底" class="headerlink" title="OS 兜底"></a>OS 兜底</h3><p>假如业务是一个AP查询/一次慢请求,一次查询/请求就是需要半个小时,将 SocketTimeout 设置太小影响正常的查询,那么可以将如下 OS参数改小,从 OS 层面进行兜底</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">net.ipv4.tcp_retries2 = 8</span><br><span class="line">net.ipv4.tcp_syn_retries = 4</span><br></pre></td></tr></table></figure><h4 id="keepalive"><a href="#keepalive" class="headerlink" title="keepalive"></a>keepalive</h4><p>keepalive 默认 7200秒太长了,建议改成20秒,可以在OS 镜像层面固化,然后各个业务可以 patch 自己的值;</p><p>如果一条连接限制超过 900 秒 LVS就会Reset 这条连接,但是我们将keepalive 设置小于900秒的话,即使业务上一直闲置,因为有 keepalive 触发心跳包,让 LVS 不至于 Reset,这也就避免了当业务取连接使用的时候才发现连接已经不可用被断开了,往往这个时候业务抛错误的时间很和真正 Reset 时间还差了很多,不好排查</p><p>在触发 TCP retransmission 后会停止 keepalive 探测</p><h3 id="LVS"><a href="#LVS" class="headerlink" title="LVS"></a>LVS</h3><p>如果你们试用了aliyun的SLB,当摘除节点的时候支持你设置一个时间,过了这个时间 aliyun的SLB 就会向这些连接的客户端发 Reset 干掉这些流量,让客户端触发新建连接,从故障中快速恢复,这是一个实例维度的参数,建议云上所有产品都支持起来,管控可以在购买 aliyun的SLB 的时候设置一个默认值:</p><p> <code>connection_drain_timeout</code> </p><h2 id="其它"><a href="#其它" class="headerlink" title="其它"></a>其它</h2><h3 id="神奇的900秒"><a href="#神奇的900秒" class="headerlink" 
title="神奇的900秒"></a>神奇的900秒</h3><p>上面阐述的长连接流量黑洞一般是900+秒就恢复了,有时候我们经常在日志中看到 CommunicationsException: Communications link failure 900秒之类的错误,恰好 LVS 也是设置的 900秒闲置 Reset</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">#ipvsadm -L --timeout</span><br><span class="line">Timeout (tcp tcpfin udp): 900 120 300</span><br></pre></td></tr></table></figure><h3 id="为什么这个问题这几年才明显暴露"><a href="#为什么这个问题这几年才明显暴露" class="headerlink" title="为什么这个问题这几年才明显暴露"></a>为什么这个问题这几年才明显暴露</h3><ul><li>工程师们混沌了几十年</li><li>之前因为出现频率低重启业务就糊弄过去了</li><li>对新连接不存在这个问题</li><li>有些连接池配置了Check 机制(Check机制一般几秒钟超时 fail)</li><li>微服务多了</li><li>云上 LVS 普及了</li><li>k8s service 大行其道</li></ul><h3 id="我用的-7层是不是就没有这个问题了?"><a href="#我用的-7层是不是就没有这个问题了?" class="headerlink" title="我用的 7层是不是就没有这个问题了?"></a>我用的 7层是不是就没有这个问题了?</h3><p>幼稚,你4层都挂了7层还能蹦跶,再说一遍只要是 TCP 长连接就有这个问题</p><h3 id="极端情况"><a href="#极端情况" class="headerlink" title="极端情况"></a>极端情况</h3><p>A 长连接 访问B 服务,B服务到A网络不通,假如B发生HA,一般会先Reset/断开B上所有连接(比如 MySQL 会去kill 所有processlist;比如重启MySQL——假如这里的B是MySQL),但是因为网络不通这里的reset、fin网络包都无法到达A,所以B是无法兜底这个异常场景, A无法感知B不可用了,会使用旧连接大约15分钟</p><p>最可怕的是 B 服务不响应,B所在的OS 还在响应,那么在A的视角 网络是正常的,这时只能A自己来通过超时兜底</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>这种问题在 LVS 场景下暴露更明显了,但是又和LVS 没啥关系,任何业务长连接都会导致这个 900秒左右的流量黑洞,首先要在业务层面重视这个问题,要不以后数据库一挂掉还得重启业务才能从故障中将恢复,所以业务层面处理好了可以避免900秒黑洞和重启业务,达到快速从故障中恢复</p><p>再强调下这个问题如果去掉LVS/k8s Service/软负载等让两个服务直连,然后拔网线也会同样出现</p><p>最佳实践总结:</p><ul><li>如果你的业务支持设置 SocketTimeout 那么请一定要设置,但不一定适合分析类就是需要长时间返回的请求</li><li>最好的方式是设置 OS 层面的 TCP_USER_TIMEOUT 参数,只要长时间没有 ack 就报错返回,但 JDK 不支持直接设置</li><li>如果用了 ALB/SLB 就一定要配置 connection_drain_timeout 这个参数</li><li>OS 镜像层面也可以将 tcp_retries2 设置为5-10次做一个兜底</li><li>对你的超时时间做到可控、可预期</li></ul><h2 id="相关故障和资料"><a href="#相关故障和资料" class="headerlink" title="相关故障和资料"></a>相关故障和资料</h2><p>ALB 黑洞问题详述:<a href="https://mp.weixin.qq.com/s/BJWD2V_RM2rnU1y7LPB9aw" target="_blank" rel="noopener">https://mp.weixin.qq.com/s/BJWD2V_RM2rnU1y7LPB9aw</a></p><p>数据库故障引发的“血案” :<a href="https://www.cnblogs.com/nullllun/p/15073022.html" target="_blank" rel="noopener">https://www.cnblogs.com/nullllun/p/15073022.html</a> 这篇描述较细致,推荐看看</p><p>tcp_retries2 的解释:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line">tcp_retries1 - INTEGER</span><br><span class="line"> This value influences the time, after which TCP decides, that</span><br><span class="line"> something is wrong due to unacknowledged RTO retransmissions,</span><br><span class="line"> and reports this suspicion to the network layer.</span><br><span class="line"> See tcp_retries2 for more details.</span><br><span class="line"></span><br><span class="line"> RFC 1122 
recommends at least 3 retransmissions, which is the</span><br><span class="line"> default.</span><br><span class="line"></span><br><span class="line">tcp_retries2 - INTEGER</span><br><span class="line"> This value influences the timeout of an alive TCP connection,</span><br><span class="line"> when RTO retransmissions remain unacknowledged.</span><br><span class="line"> Given a value of N, a hypothetical TCP connection following</span><br><span class="line"> exponential backoff with an initial RTO of TCP_RTO_MIN would</span><br><span class="line"> retransmit N times before killing the connection at the (N+1)th RTO.</span><br><span class="line"></span><br><span class="line"> The default value of 15 yields a hypothetical timeout of 924.6</span><br><span class="line"> seconds and is a lower bound for the effective timeout.</span><br><span class="line"> TCP will effectively time out at the first RTO which exceeds the</span><br><span class="line"> hypothetical timeout.</span><br><span class="line"></span><br><span class="line"> RFC 1122 recommends at least 100 seconds for the timeout,</span><br><span class="line"> which corresponds to a value of at least 8.</span><br></pre></td></tr></table></figure><p>tcp_retries2 默认值为15,根据RTO的值来决定,相当于13-30分钟(RFC1122规定,必须大于100秒),但是这是很多年前的拍下来古董参数值,现在网络条件好多了,尤其是内网,个人认为改成 5-10 是比较恰当 azure 建议:<a href="https://learn.microsoft.com/en-us/azure/azure-cache-for-redis/cache-best-practices-connection" target="_blank" rel="noopener">https://learn.microsoft.com/en-us/azure/azure-cache-for-redis/cache-best-practices-connection</a> ,Oracle RAC的建议值是3:<a href="https://access.redhat.com/solutions/726753" target="_blank" rel="noopener">https://access.redhat.com/solutions/726753</a></p>]]></content>
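<p>把上面"业务方 + OS 兜底"的建议落成一个最小示例(参数值仅供参考,按业务能接受的慢查询上限调整;127.0.0.1:3306 只是示例地址):超时直接写死在 JDBC URL 里,同时把 tcp_retries2 调小做系统层兜底:</p><pre><code>## JDBC URL 里写死超时(毫秒),不依赖连接池参数是否生效
jdbc:mysql://127.0.0.1:3306/test?connectTimeout=3000&socketTimeout=30000

## OS 兜底:大约重传 8 次/100 秒左右就放弃,替代默认的 15 次/900 多秒
sysctl -w net.ipv4.tcp_retries2=8
sysctl -w net.ipv4.tcp_syn_retries=4
## 需要持久化的话写入 /etc/sysctl.conf 后执行 sysctl -p</code></pre>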
<summary type="html">
<h1 id="长连接黑洞重现和分析"><a href="#长连接黑洞重现和分析" class="headerlink" title="长连接黑洞重现和分析"></a>长连接黑洞重现和分析</h1><p>这是一个存在多年,遍及各个不同的业务又反反复复地在集团内部出现的一个问题,本
</summary>
<category term="network" scheme="https://plantegg.github.io/categories/network/"/>
<category term="Linux" scheme="https://plantegg.github.io/tags/Linux/"/>
<category term="LVS" scheme="https://plantegg.github.io/tags/LVS/"/>
<category term="network" scheme="https://plantegg.github.io/tags/network/"/>
<category term="SocketTimeout" scheme="https://plantegg.github.io/tags/SocketTimeout/"/>
<category term="TCP_USER_TIMEOUT" scheme="https://plantegg.github.io/tags/TCP-USER-TIMEOUT/"/>
</entry>
<entry>
<title>十年后数据库还是不敢拥抱NUMA-续篇</title>
<link href="https://plantegg.github.io/2024/05/03/%E5%8D%81%E5%B9%B4%E5%90%8E%E6%95%B0%E6%8D%AE%E5%BA%93%E8%BF%98%E6%98%AF%E4%B8%8D%E6%95%A2%E6%8B%A5%E6%8A%B1NUMA-%E7%BB%AD%E7%AF%87/"/>
<id>https://plantegg.github.io/2024/05/03/十年后数据库还是不敢拥抱NUMA-续篇/</id>
<published>2024-05-03T04:30:03.000Z</published>
<updated>2024-11-20T10:00:54.910Z</updated>
<content type="html"><![CDATA[<h1 id="十年后数据库还是不敢拥抱NUMA-续篇"><a href="#十年后数据库还是不敢拥抱NUMA-续篇" class="headerlink" title="十年后数据库还是不敢拥抱NUMA-续篇"></a>十年后数据库还是不敢拥抱NUMA-续篇</h1><h2 id="背景"><a href="#背景" class="headerlink" title="背景"></a>背景</h2><p><a href="https://plantegg.github.io/2021/05/14/%E5%8D%81%E5%B9%B4%E5%90%8E%E6%95%B0%E6%8D%AE%E5%BA%93%E8%BF%98%E6%98%AF%E4%B8%8D%E6%95%A2%E6%8B%A5%E6%8A%B1NUMA/">十年后数据库还是不敢拥抱NUMA</a>, 这篇经典的纠正大家对NUMA 认知的文章一晃发布快3年了,这篇文章的核心结论是:</p><ul><li>之所以有不同的NUMA Node 是不同的CPU Core 到不同的内存距离远近不一样所决定的,这是个物理距离</li><li>程序跑在不同的核上要去读写内存可以让性能差异巨大,所以我们要尽量让一个程序稳定跑在一个Node 内</li><li>默认打开NUMA Node 其实挺好的</li></ul><p>写这个续篇是我收到很多解释,因为跨Node 导致性能抖动,所以集团在物理机OS 的启动参数里设置了 numa=off ,也就是不管BIOS 中如何设置,我们只要在OS 层面设置一下 numa=off 就能让程序稳定下来不再抖了!</p><p>我这几年也认为这是对的,只是让我有点不理解,既然不区分远近了,那物理上存在的远近距离(既抖动)如何能被消除掉的呢?</p><p>所以这个续篇打算通过测试来验证下这个问题</p><h2 id="设置"><a href="#设置" class="headerlink" title="设置"></a>设置</h2><p>BIOS 中有 numa node 设置的开关(注意这里是内存交错/交织),不同的主板这个BIOS设置可能不一样,但是大同小异,基本都有这个参数</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/FrVuhXNHEf2LzigZPHHV6c7UNKrP-5057597.png" alt="img"></p><p>Linux 启动引导参数里也可以设置numa=on(默认值)/off ,linux 引导参数设置案例:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">#cat /proc/cmdline</span><br><span class="line">BOOT_IMAGE=/vmlinuz-3.10.0-327.x86_64 ro crashkernel=auto vconsole.font=latarcyrheb-sun16 vconsole.keymap=us BIOSdevname=0 console=tty0 console=ttyS0,115200 scsi_mod.scan=sync intel_idle.max_cstate=0 pci=pcie_bus_perf ipv6.disable=1 rd.driver.pre=ahci numa=on nosmt=force</span><br></pre></td></tr></table></figure><p>注意如上的 numa=on 也可以改为 numa=off</p><p>看完全置篇要记住一条铁律:CPU到内存的距离是物理远近决定的,你软件层面做些设置是没法优化这个距离,也就是没法优化这个时延 (这是个核心知识点,你要死死记住和理解,后面的一切实验数据都回过头来看这个核心知识点并揣摩)</p><h2 id="实验"><a href="#实验" class="headerlink" title="实验"></a>实验</h2><p>测试机器CPU,如下是BIOS numa=on、cmdline numa=off所看到的,一个node</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line">#lscpu</span><br><span class="line">Architecture: x86_64</span><br><span class="line">CPU op-mode(s): 32-bit, 64-bit</span><br><span class="line">Byte Order: Little Endian</span><br><span class="line">CPU(s): 96</span><br><span class="line">On-line CPU(s) list: 0-95</span><br><span class="line">Thread(s) per core: 2</span><br><span class="line">Core(s) per socket: 24</span><br><span class="line">Socket(s): 2</span><br><span class="line">NUMA node(s): 1</span><br><span class="line">Vendor ID: GenuineIntel</span><br><span class="line">CPU family: 6</span><br><span class="line">Model: 85</span><br><span class="line">Model name: Intel(R) 
Xeon(R) Platinum 8163 CPU @ 2.50GHz</span><br><span class="line">Stepping: 4</span><br><span class="line">CPU MHz: 2500.000</span><br><span class="line">CPU max MHz: 3100.0000</span><br><span class="line">CPU min MHz: 1000.0000</span><br><span class="line">BogoMIPS: 4998.89</span><br><span class="line">Virtualization: VT-x</span><br><span class="line">L1d cache: 32K</span><br><span class="line">L1i cache: 32K</span><br><span class="line">L2 cache: 1024K</span><br><span class="line">L3 cache: 33792K</span><br><span class="line">NUMA node0 CPU(s): 0-95</span><br></pre></td></tr></table></figure><p>测试工具是<a href="https://github.com/intel/lmbench" target="_blank" rel="noopener">lmbench</a>,测试命令:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">for i in $(seq 0 6 95); do echo core:$i; numactl -C $i -m 0 ./bin/lat_mem_rd -W 5 -N 5 -t 64M; done >lat.log 2>&1</span><br></pre></td></tr></table></figure><p>上述测试命令始终将内存绑定在 node0 上,然后用不同的物理core来读写这块内存,按照<a href="https://ata.atatech.org/articles/11000205974" target="_blank" rel="noopener">前一篇</a> 这个时延肯定有快慢之分</p><p>BIOS和引导参数各有两种设置方式,组合起来就是四种,我们分别设置并跑一下内存时延,测试结果:</p><table><thead><tr><th></th><th>BIOS ON</th><th>BIOS OFF</th></tr></thead><tbody><tr><td>cmdline numa=on(默认值)</td><td>NUMA 开启,内存在Node内做交织,就近有快慢之分</td><td>bios 关闭后numa后,OS层面完全不知道下层的结构,默认全局内存做交织,时延是个平均值</td></tr><tr><td>cmdline numa=off</td><td>交织关闭,效果同上</td><td>同上</td></tr></tbody></table><p>测试原始数据如下(测试结果文件名 lat.log.BIOSON.cmdlineOff 表示BIOS ON,cmdline OFF ):</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line">//从下面两组测试来看,BIOS层面 on后,不管OS 层面是否on,都不会跨node 做交织,抖动存在</span><br><span class="line">//BIOS on 即使在OS层面关闭numa也不跨node做内存交织,抖动存在</span><br><span class="line">//默认从内存高地址开始分配空间,所以0核要慢</span><br><span class="line">#grep -E "core|64.00000" lat.log.BIOSON.cmdlineOff </span><br><span class="line">core:0 //第0号核</span><br><span class="line">64.00000 100.717 //64.0000为64MB, 100.717 是平均时延100.717ns 即0号核访问node0 下的内存64MB的平均延时是100纳秒</span><br><span class="line">core:24</span><br><span 
class="line">64.00000 68.484</span><br><span class="line">core:48</span><br><span class="line">64.00000 101.070</span><br><span class="line">core:72</span><br><span class="line">64.00000 68.483</span><br><span class="line">#grep -E "core|64.00000" lat.log.BIOSON.cmdlineON</span><br><span class="line">core:0</span><br><span class="line">64.00000 67.094</span><br><span class="line">core:24</span><br><span class="line">64.00000 100.237</span><br><span class="line">core:48</span><br><span class="line">64.00000 67.614</span><br><span class="line">core:72</span><br><span class="line">64.00000 101.096</span><br><span class="line"></span><br><span class="line">//从下面两组测试来看只要BIOS off了内存就会跨 node 交织,大规模测试下内存 latency 是个平均值</span><br><span class="line">#grep -E "core|64.00000" lat.log.BIOSOff.cmdlineOff //BIOS off 做内存交织,latency就是平均值</span><br><span class="line">core:0</span><br><span class="line">64.00000 85.657 //85 恰好是最大100,最小68的平均值</span><br><span class="line">core:24</span><br><span class="line">64.00000 85.741</span><br><span class="line">core:48</span><br><span class="line">64.00000 85.977</span><br><span class="line">core:72</span><br><span class="line">64.00000 86.671</span><br><span class="line"></span><br><span class="line">//BIOS 关闭后numa后,OS层面完全不知道下层的结构,默认一定是做交织</span><br><span class="line">#grep -E "core|64.00000" lat.log.BIOSOff.cmdlineON</span><br><span class="line">core:0</span><br><span class="line">64.00000 89.123</span><br><span class="line">core:24</span><br><span class="line">64.00000 87.137</span><br><span class="line">core:48</span><br><span class="line">64.00000 87.239</span><br><span class="line">core:72</span><br><span class="line">64.00000 87.323</span><br></pre></td></tr></table></figure><p>从数据可以看到在BIOS 设置ON后,无论 OS cmdline 启动参数里是否设置了 ON 还是 OFF,内存延时都是抖动且一致的(这个有点诧异,说好的消除抖动的呢?)。如果BIOS 设置OFF后内存延时是个稳定的平均值(这个比较好理解)</p><h2 id="疑问"><a href="#疑问" class="headerlink" title="疑问"></a>疑问</h2><ul><li>内存交错时为什么 lmbench 测试得到的时延是平均值,而不是短板效应的最慢值?</li></ul><p>测试软件只能通过大规模数据的读写来测试获取一个平均值,所以当一大块内存读取时,虽然通过交织大块内存被切分到了快慢物理内存上,但是因为规模大慢的被平均掉了。(欢迎内核大佬指正)</p><ul><li>什么是内存交织?</li></ul><p>我的理解假如你有8块物理内存条,如果你有一个int 那么只能在其中一块上,如果你有1MB的数据那么会按cacheline 拆成多个块然后分别放到8块物理内存条上(有快有慢)这样带宽更大,最后测试得到一个平均值</p><p>如果你开启numa那么只会就近交织,比如0-3号内存条在0号core所在的node,OS 做内存交织的时候只会拆分到这0-3号内存条上,那么时延总是最小的那个,如上测试中的60多纳秒。</p><p>这个问题一直困扰了我几年,所以我最近再次测试验证了一下,主要是对 BIOS=on 且 cmdline=off 时有点困扰</p><h2 id="Intel-的-mlc-验证"><a href="#Intel-的-mlc-验证" class="headerlink" title="Intel 的 mlc 验证"></a>Intel 的 mlc 验证</h2><p>测试参数: BIOS=on 同时 cmdline off</p><p>用<a href="https://www.intel.com/content/www/us/en/developer/articles/tool/intelr-memory-latency-checker.html" target="_blank" rel="noopener">Intel 的 mlc 验证下</a>,这个结果有点意思,latency稳定在 145 而不是81 和 145两个值随机出现,应该是mlc默认选到了0核,对应lmbench的这组测试数据(为什么不是100.717, 因为测试方法不一样):</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">//如下是</span><br><span class="line">//从下面两种测试来看,BIOS层面 on后,不管OS 层面是否on,都不会跨node 做交织,抖动存在</span><br><span class="line">//BIOS on 即使在OS层面关闭numa也不跨node做内存交织,抖动存在</span><br><span class="line">#grep -E "core|64.00000" lat.log.BIOSON.cmdlineOff </span><br><span 
class="line">core:0</span><br><span class="line">64.00000 100.717</span><br><span class="line">core:24</span><br><span class="line">64.00000 68.484</span><br><span class="line">core:48</span><br><span class="line">64.00000 101.070</span><br><span class="line">core:72</span><br><span class="line">64.00000 68.483</span><br></pre></td></tr></table></figure><p>此时对应的mlc</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line">#./mlc</span><br><span class="line">Intel(R) Memory Latency Checker - v3.9</span><br><span class="line">Measuring idle latencies (in ns)...</span><br><span class="line"> Numa node</span><br><span class="line">Numa node 0</span><br><span class="line"> 0 145.8 //多次测试稳定都是145纳秒</span><br><span class="line"></span><br><span class="line">Measuring Peak Injection Memory Bandwidths for the system</span><br><span class="line">Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)</span><br><span class="line">Using all the threads from each core if Hyper-threading is enabled</span><br><span class="line">Using traffic with the following read-write ratios</span><br><span class="line">ALL Reads : 110598.7</span><br><span class="line">3:1 Reads-Writes : 93408.5</span><br><span class="line">2:1 Reads-Writes : 89249.5</span><br><span class="line">1:1 Reads-Writes : 64137.3</span><br><span class="line">Stream-triad like: 77310.4</span><br><span class="line"></span><br><span class="line">Measuring Memory Bandwidths between nodes within system</span><br><span class="line">Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)</span><br><span class="line">Using all the threads from each core if Hyper-threading is enabled</span><br><span class="line">Using Read-only traffic type</span><br><span class="line"> Numa node</span><br><span class="line">Numa node 0</span><br><span class="line"> 0 110598.4</span><br><span class="line"></span><br><span class="line">Measuring Loaded Latencies for the system</span><br><span class="line">Using all the threads from each core if Hyper-threading is enabled</span><br><span class="line">Using Read-only traffic type</span><br><span class="line">Inject Latency Bandwidth</span><br><span class="line">Delay (ns) MB/sec</span><br><span class="line">==========================</span><br><span class="line"> 00000 506.00 111483.5</span><br><span class="line"> 00002 505.74 
112576.9</span><br><span class="line"> 00008 505.87 112644.3</span><br><span class="line"> 00015 508.96 112643.6</span><br><span class="line"> 00050 574.36 112701.5</span><br></pre></td></tr></table></figure><p>当两个参数都为 on 时的mlc 测试结果:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">#./mlc</span><br><span class="line">Intel(R) Memory Latency Checker - v3.9</span><br><span class="line">Measuring idle latencies (in ns)...</span><br><span class="line"> Numa node</span><br><span class="line">Numa node 0 1</span><br><span class="line"> 0 81.6 145.9</span><br><span class="line"> 1 144.9 81.2</span><br><span class="line"></span><br><span class="line">Measuring Peak Injection Memory Bandwidths for the system</span><br><span class="line">Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)</span><br><span class="line">Using all the threads from each core if Hyper-threading is enabled</span><br><span class="line">Using traffic with the following read-write ratios</span><br><span class="line">ALL Reads : 227204.2</span><br><span class="line">3:1 Reads-Writes : 212432.5</span><br><span class="line">2:1 Reads-Writes : 210423.3</span><br><span class="line">1:1 Reads-Writes : 196677.2</span><br><span class="line">Stream-triad like: 189691.4</span><br></pre></td></tr></table></figure><p>说明:mlc和 lmbench 测试结果不一样,mlc 时81和145,lmbench测试是68和100,这是两种测试方法的差异而已,但是快慢差距基本是一致的</p><h2 id="结论"><a href="#结论" class="headerlink" title="结论"></a>结论</h2><p>在OS 启动引导参数里设置 numa=off 完全没有必要、也不能解决抖动的问题,反而设置了 numa=off 只能是掩耳盗铃,让用户看不到 NUMA 结构</p>]]></content>
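<p>如果想在自己机器上复核上面的结论,可以按下面的顺序先看内核启动参数和 NUMA 拓扑,再用 lmbench 绑核对比(示例命令:lat_mem_rd 路径、核号按实际机器调整,哪个核属于哪个 socket 以 numactl --hardware 的输出为准):</p><pre><code>## 当前内核启动参数里的 numa=on/off,以及 OS 看到的 node 数、内存分布和 distance
cat /proc/cmdline
numactl --hardware
numastat

## 内存固定绑在 node0,分别用两个不同 socket 上的核来读(本文测试机上 0 和 24 正好分属两路)
numactl -C 0  -m 0 ./bin/lat_mem_rd -W 5 -N 5 -t 64M
numactl -C 24 -m 0 ./bin/lat_mem_rd -W 5 -N 5 -t 64M</code></pre><p>BIOS 开 NUMA 时,这两条命令的 64MB 时延会出现约 68ns 对 100ns 的差距;BIOS 关 NUMA(内存全局交织)时两者都会落在 85ns 左右的平均值上,和上文的测试数据一致。</p>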
<summary type="html">
<h1 id="十年后数据库还是不敢拥抱NUMA-续篇"><a href="#十年后数据库还是不敢拥抱NUMA-续篇" class="headerlink" title="十年后数据库还是不敢拥抱NUMA-续篇"></a>十年后数据库还是不敢拥抱NUMA-续篇</h1><h2 i
</summary>
<category term="CPU" scheme="https://plantegg.github.io/categories/CPU/"/>
<category term="CPU" scheme="https://plantegg.github.io/tags/CPU/"/>
<category term="performance" scheme="https://plantegg.github.io/tags/performance/"/>
<category term="BIOS" scheme="https://plantegg.github.io/tags/BIOS/"/>
<category term="numa" scheme="https://plantegg.github.io/tags/numa/"/>
</entry>
<entry>
<title>流量一样但为什么CPU使用率差别很大</title>
<link href="https://plantegg.github.io/2024/04/26/%E6%B5%81%E9%87%8F%E4%B8%80%E6%A0%B7%E4%BD%86%E4%B8%BA%E4%BB%80%E4%B9%88CPU%E4%BD%BF%E7%94%A8%E7%8E%87%E5%B7%AE%E5%88%AB%E5%BE%88%E5%A4%A7/"/>
<id>https://plantegg.github.io/2024/04/26/流量一样但为什么CPU使用率差别很大/</id>
<published>2024-04-26T04:30:03.000Z</published>
<updated>2024-11-20T10:00:53.721Z</updated>
<content type="html"><![CDATA[<h1 id="流量一样但为什么CPU使用率差别很大"><a href="#流量一样但为什么CPU使用率差别很大" class="headerlink" title="流量一样但为什么CPU使用率差别很大"></a>流量一样但为什么CPU使用率差别很大</h1><p>这是我翻到2013年的一篇文章,当时惊动所有公司高人,最后分析得知原因后所有人都跪拜,你要知道那是2013年,正好10年过去了,如果是现在用我们星球的理论去套的话,简直不要太容易</p><h2 id="问题描述"><a href="#问题描述" class="headerlink" title="问题描述"></a>问题描述</h2><blockquote><p>同样大小内存、同样的CPU、同样数量的请求、几乎可以忽略的io,两个机器的load却差异挺大。一个机器的load是12左右,另外一个机器却是30左右</p><p>你可以理解这是两台一摸一样的物理机挂在一个LVS 下,LVS 分发流量绝对均衡</p></blockquote><p>所以要找出为什么?</p><h2 id="分析"><a href="#分析" class="headerlink" title="分析"></a>分析</h2><p>两台机器的资源使用率:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line">//load低、CPU使用率低 的物理机,省略一部分核</span><br><span class="line">Cpu0 : 67.1%us, 1.6%sy, 0.0%ni, 30.6%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st</span><br><span class="line">Cpu1 : 64.1%us, 1.6%sy, 0.0%ni, 34.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st</span><br><span class="line">Cpu2 : 63.0%us, 1.6%sy, 0.0%ni, 35.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st</span><br><span class="line">Cpu3 : 60.0%us, 1.3%sy, 0.0%ni, 38.4%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st</span><br><span class="line">Cpu4 : 59.8%us, 1.3%sy, 0.0%ni, 37.9%id, 1.0%wa, 0.0%hi, 0.0%si, 0.0%st</span><br><span class="line">Cpu5 : 56.7%us, 1.0%sy, 0.0%ni, 42.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st</span><br><span class="line">Cpu6 : 63.4%us, 1.3%sy, 0.0%ni, 34.6%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st</span><br><span class="line">Cpu7 : 62.5%us, 2.0%sy, 0.0%ni, 35.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st</span><br><span class="line">Cpu8 : 58.5%us, 1.3%sy, 0.0%ni, 39.5%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st</span><br><span class="line">Cpu9 : 55.8%us, 1.6%sy, 0.0%ni, 42.2%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st</span><br><span class="line"></span><br><span class="line">//load高、CPU使用率高 的物理机,省略一部分核</span><br><span class="line">Cpu0 : 90.1%us, 1.9%sy, 0.0%ni, 7.1%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st</span><br><span class="line">Cpu1 : 88.5%us, 2.9%sy, 0.0%ni, 8.0%id, 0.0%wa, 0.0%hi, 0.6%si, 0.0%st</span><br><span class="line">Cpu2 : 90.4%us, 1.9%sy, 0.0%ni, 7.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st</span><br><span class="line">Cpu3 : 86.9%us, 2.6%sy, 0.0%ni, 10.2%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st</span><br><span class="line">Cpu4 : 87.5%us, 1.9%sy, 0.0%ni, 10.2%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st</span><br><span class="line">Cpu5 : 87.3%us, 1.9%sy, 0.0%ni, 10.5%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st</span><br><span class="line">Cpu6 : 90.4%us, 2.9%sy, 0.0%ni, 6.4%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st</span><br><span class="line">Cpu7 : 90.1%us, 1.9%sy, 0.0%ni, 7.6%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st</span><br><span class="line">Cpu8 : 89.5%us, 2.6%sy, 0.0%ni, 6.7%id, 0.0%wa, 0.0%hi, 1.3%si, 0.0%st</span><br><span class="line">Cpu9 : 90.7%us, 1.9%sy, 0.0%ni, 7.4%id, 0.0%wa, 0.0%hi, 0.0%si, 
0.0%st</span><br></pre></td></tr></table></figure><p>可以分析产出为什么低,检查CPU是否降频、内存频率是否有差异——检查结果一致</p><p>10年前经过一阵 perf top 看热点后终于醒悟过来知道得去看 IPC,也就是相同CPU使用率下,其中慢的机器产出低了一半,那么继续通过perf看IPC:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/FrsOfjsmHa6Zwv67IBgTd-GTI2fT.png" alt="img"></p><p>可以看到两台机器的IPC是 0.3 VS 0.55,和CPU使用率差异基本一致,instructions几乎一样(意味着流量一样,LVS 不背锅),但是处理同样的instructions 用掉的cpu-clock 几乎差了一倍,这应该是典型的内存时延大了一倍导致的。IPC 大致等于 instrunctions/cpu-clock (IPC:instrunctions per cycles)</p><p>经检查这两台物理机都是两路,虽然CPU型号/内存频率一致,但是主板间跨Socket的 QPI带宽差了一倍(主板是两个不同的服务商提供)。可以通过绑核测试不同Socket/Node 下内存时延来确认这个问题</p><p>这是同一台机器下两个Socket 的内存带宽,所以如果跨Socket 内存访问多了就会导致时延更高、CPU使用率更高</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/FmaZP2Wf1xiSoHyi2xHslbAVr71_.png" alt="img"></p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>在今天我们看到这种问题就很容易了,但我还是要感叹一下在入门前简直太神奇,入门后也不过尔尔,希望你也早点入门。</p><p>第一:向CPU要产出,同样的使用率产出得一样,不一样的话肯定是偷懒了,偷懒的直接证据就是 IPC 低了,导致IPC 低最常见的是内存时延高(内存频率、跨Node/Socket 等,或者内存碎片);延伸阅读:<a href="https://t.zsxq.com/10fYf762S" target="_blank" rel="noopener">性能的本质 IPC</a> ,也是本星球唯二的必读实验</p><p>第二:测试工具很完善了,<a href="https://github.com/intel/lmbench" target="_blank" rel="noopener">lmbench</a> , 怎么用lmbench <a href="https://plantegg.github.io/2022/01/13/%E4%B8%8D%E5%90%8CCPU%E6%80%A7%E8%83%BD%E5%A4%A7PK/">可以看这篇</a> ; 怎么使用perf <a href="https://plantegg.github.io/2021/05/16/Perf_IPC%E4%BB%A5%E5%8F%8ACPU%E5%88%A9%E7%94%A8%E7%8E%87/">Perf IPC以及CPU性能</a></p><p>,学成后装逼可以看 <a href="https://plantegg.github.io/2022/03/15/%E8%AE%B0%E4%B8%80%E6%AC%A1%E5%90%AC%E9%A3%8E%E6%89%87%E5%A3%B0%E9%9F%B3%E6%9D%A5%E5%AE%9A%E4%BD%8D%E6%80%A7%E8%83%BD/">听风扇声音来定位性能瓶颈</a> </p><p>我以前说过每个领域都有一些核心知识点,IPC 就是CPU领域的核心知识点,和tcp的rmem/wmem 一样很容易引导你入门</p><p>计算机专业里非要挑几个必学的知识点肯定得有计算机组成原理,但计算机组成原理内容太多,都去看也不现实,况且很多过时的东西,那么我只希望你能记住计算机组成原理里有个最核心的麻烦:内存墙——CPU 访问内存太慢导致了内存墙是我们碰到众多性能问题的最主要、最核心的一个,结合今天这个案例掌握IPC后再来学内存墙,再到理解计算机组成原理就对了,从一个实用的小点入手。</p><p>计算机专业里除掉组成原理(有点高大上,没那么接地气),另外一个我觉得最有用的是网络——看着low但是接地气,问题多,很实用</p><p>2011年的文章:</p><h4 id="详解服务器内存带宽计算和使用情况测量"><a href="#详解服务器内存带宽计算和使用情况测量" class="headerlink" title="详解服务器内存带宽计算和使用情况测量"></a><strong><a href="http://blog.yufeng.info/archives/1511" target="_blank" rel="noopener">详解服务器内存带宽计算和使用情况测量</a></strong></h4><p>更好的工具来发现类似问题:<a href="https://github.com/intel/numatop" target="_blank" rel="noopener">https://github.com/intel/numatop</a></p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/FlOhgPPnxN3DcMRPUvNvbZOuQy0q.png" alt="img"></p><h2 id="如果你觉得看完对你很有帮助可以通过如下方式找到我"><a href="#如果你觉得看完对你很有帮助可以通过如下方式找到我" class="headerlink" title="如果你觉得看完对你很有帮助可以通过如下方式找到我"></a>如果你觉得看完对你很有帮助可以通过如下方式找到我</h2><p>find me on twitter: <a href="https://twitter.com/plantegg" target="_blank" rel="noopener">@plantegg</a></p><p>知识星球:<a href="https://t.zsxq.com/0cSFEUh2J" target="_blank" rel="noopener">https://t.zsxq.com/0cSFEUh2J</a></p><p>开了一个星球,在里面讲解一些案例、知识、学习方法,肯定没法让大家称为顶尖程序员(我自己都不是),只是希望用我的方法、知识、经验、案例作为你的垫脚石,帮助你快速、早日成为一个基本合格的程序员。</p><p>争取在星球内:</p><ul><li>养成基本动手能力</li><li>拥有起码的分析推理能力–按我接触的程序员,大多都是没有逻辑的</li><li>知识上教会你几个关键的知识点</li></ul><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240324161113874.png" alt="image-20240324161113874" style="zoom:50%;">]]></content>
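<p>照着这个案例,如果想快速判断"同样流量、CPU 使用率差一倍"是不是 IPC/内存时延的问题,可以分两步验证(示例命令,采样时长、核号按实际调整;第二步的前提是机器能看到两个 NUMA node):</p><pre><code>## 1. 全机采样 10 秒,对比两台机器的 IPC(perf 输出里的 insn per cycle 就是 IPC)
perf stat -e instructions,cycles -a sleep 10

## 2. 绑核对比本地/远端 node 的内存时延,确认是不是跨 Socket 访问拖累了 IPC
numactl -C 0 -m 0 ./bin/lat_mem_rd -W 5 -N 5 -t 64M   # 本地 node
numactl -C 0 -m 1 ./bin/lat_mem_rd -W 5 -N 5 -t 64M   # 远端 node</code></pre><p>两台机器 instructions 接近而 cycles(CPU 时间)差出一截,IPC 自然就低,这正对应上文 0.3 和 0.55 的差距。</p>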
<summary type="html">
<h1 id="流量一样但为什么CPU使用率差别很大"><a href="#流量一样但为什么CPU使用率差别很大" class="headerlink" title="流量一样但为什么CPU使用率差别很大"></a>流量一样但为什么CPU使用率差别很大</h1><p>这是我翻到2
</summary>
<category term="CPU" scheme="https://plantegg.github.io/categories/CPU/"/>
<category term="CPU" scheme="https://plantegg.github.io/tags/CPU/"/>
<category term="performance" scheme="https://plantegg.github.io/tags/performance/"/>
<category term="perf" scheme="https://plantegg.github.io/tags/perf/"/>
</entry>
<entry>
<title>SocketTimeout 后客户端怎么做和服务端怎么做</title>
<link href="https://plantegg.github.io/2024/04/10/SocketTimeout%20%E5%90%8E%E5%AE%A2%E6%88%B7%E7%AB%AF%E6%80%8E%E4%B9%88%E5%81%9A%E3%80%81%E6%9C%8D%E5%8A%A1%E7%AB%AF%E6%80%8E%E4%B9%88%E5%81%9A/"/>
<id>https://plantegg.github.io/2024/04/10/SocketTimeout 后客户端怎么做、服务端怎么做/</id>
<published>2024-04-10T09:30:03.000Z</published>
<updated>2024-12-30T02:31:19.324Z</updated>
<content type="html"><![CDATA[<h1 id="SocketTimeout-后客户端怎么做和服务端怎么做"><a href="#SocketTimeout-后客户端怎么做和服务端怎么做" class="headerlink" title="SocketTimeout 后客户端怎么做和服务端怎么做"></a>SocketTimeout 后客户端怎么做和服务端怎么做</h1><h2 id="背景"><a href="#背景" class="headerlink" title="背景"></a>背景</h2><p>希望通过一个极简,几乎是人人都可以上手验证的实验来触及到一些深度的内容,然后再看看是否会激发你进一步自己设计类似实验和验证过程等</p><p>关于这种简单类型的实验欢迎给我提意见:比如你会不会做;太难、太容易?能学到东西吗?效果如何?我要如何改进</p><h2 id="安装JDK和MySQL"><a href="#安装JDK和MySQL" class="headerlink" title="安装JDK和MySQL"></a>安装JDK和MySQL</h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">yum install -y java-1.8.0-openjdk.x86_64 java-1.8.0-openjdk-devel.x86_64 podman-docker.noarch wireshark </span><br><span class="line"></span><br><span class="line">//启动MySQL Server,root密码123</span><br><span class="line">docker run -it -d --net=host -e MYSQL_ROOT_PASSWORD=123 --name=plantegg mysql</span><br><span class="line"></span><br><span class="line">docker run --net=host -v /root/mysql/my3306.cnf:/etc/my.cnf -it -d -e MYSQL_ROOT_PASSWORD=123 --name=mysql3306 mysql:8.0</span><br><span class="line"></span><br><span class="line">//可能需要的MySQL 账号命令</span><br><span class="line">//8.0密码问题,可以设置配置:</span><br><span class="line">ALTER USER 'test'@'localhost' IDENTIFIED WITH mysql_native_password BY '123';</span><br><span class="line">ALTER USER 'root'@'%' IDENTIFIED WITH mysql_native_password BY '123';</span><br></pre></td></tr></table></figure><p>测试环境机器是<a href="https://www.aliyun.com/daily-act/ecs/activity_selection" target="_blank" rel="noopener">99块一年购买的aliyun ECS</a>,OS版本选ALinux3,对应内核版本:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">5.10.134-15.al8.x86_64</span><br></pre></td></tr></table></figure><p>测试使用的MySQL 版本:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line">mysql> \s</span><br><span class="line">--------------</span><br><span class="line">mysql Ver 8.0.32 for Linux on x86_64 (Source distribution)</span><br><span class="line"></span><br><span class="line">Connection id:9</span><br><span class="line">Current database:test</span><br><span class="line">Current user:[email protected]</span><br><span class="line">SSL:Not in use</span><br><span class="line">Current pager:stdout</span><br><span class="line">Using outfile:''</span><br><span class="line">Using 
delimiter:;</span><br><span class="line">Server version:8.2.0 MySQL Community Server - GPL</span><br><span class="line">Protocol version:10</span><br><span class="line">Connection:127.1 via TCP/IP</span><br><span class="line">Server characterset:utf8mb4</span><br><span class="line">Db characterset:utf8mb4</span><br><span class="line">Client characterset:utf8mb4</span><br><span class="line">Conn. characterset:utf8mb4</span><br><span class="line">TCP port:3306</span><br><span class="line">Binary data as:Hexadecimal</span><br><span class="line">Uptime:15 hours 46 min 24 sec</span><br><span class="line"></span><br><span class="line">Threads: 2 Questions: 34 Slow queries: 0 Opens: 176 Flush tables: 3 Open tables: 95 Queries per second avg: 0.000</span><br></pre></td></tr></table></figure><h2 id="客户端"><a href="#客户端" class="headerlink" title="客户端"></a>客户端</h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br></pre></td><td class="code"><pre><span class="line">测试代码(复制粘贴就可以编译运行了,运行时需要下载jdbc mysql driver,链接见附录):</span><br><span class="line">import 
java.sql.Connection;</span><br><span class="line">import java.sql.DriverManager;</span><br><span class="line">import java.sql.ResultSet;</span><br><span class="line">import java.sql.SQLException;</span><br><span class="line">import java.sql.Statement;</span><br><span class="line">import java.sql.PreparedStatement;</span><br><span class="line">public class Test { //不要琢磨代码规范、为什么要这么写,就是为了方便改吧改吧做很多不同的验证试验</span><br><span class="line"> public static void main(String args[]) throws NumberFormatException, InterruptedException, ClassNotFoundException {</span><br><span class="line"> Class.forName("com.mysql.jdbc.Driver");</span><br><span class="line"> String url = args[0];</span><br><span class="line"> String user = args[1];</span><br><span class="line"> String pass = args[2];</span><br><span class="line"> String sql = args[3];</span><br><span class="line"> String interval = args[4];</span><br><span class="line"> try {</span><br><span class="line"> Connection conn = DriverManager.getConnection(url, user, pass);</span><br><span class="line"> while (true) {</span><br><span class="line"> PreparedStatement stmt = conn.prepareStatement(sql);</span><br><span class="line"> //stmt.setFetchSize(Integer.MIN_VALUE); //这句是表示开流式读取,但是每条SQL 都会先发set net_write_timeout=600 给Server</span><br><span class="line"> stmt.setString(1, interval);</span><br><span class="line"> ResultSet rs = stmt.executeQuery();</span><br><span class="line"> rs.close();</span><br><span class="line"> stmt.close();</span><br><span class="line"></span><br><span class="line"> PreparedStatement stmt2 = conn.prepareStatement(sql);</span><br><span class="line"> stmt2.setString(1, interval);</span><br><span class="line"> rs = stmt2.executeQuery();</span><br><span class="line">while (rs.next()) {</span><br><span class="line"> System.out.println("fine");</span><br><span class="line">}</span><br><span class="line"> rs.close();</span><br><span class="line"> stmt2.close();</span><br><span class="line"></span><br><span class="line"> Thread.sleep(Long.valueOf(interval));</span><br><span class="line">break;</span><br><span class="line"> }</span><br><span class="line">conn.close();</span><br><span class="line"> } catch (SQLException e) {</span><br><span class="line"> e.printStackTrace();</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">#javac Test.java //编译,需要提前安装JDK</span><br><span class="line">//执行,需要下载jdbc jar驱动,见附录,还需要有一个数据库,随便建个表,或者查里面自带的库都可以</span><br><span class="line">#java -cp .:./mysql-connector-java-5.1.45.jar Test "jdbc:mysql://127.0.0.1:3306/test?useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&connectTimeout=500&socketTimeout=1700" root 123 "select sleep(10), id from sbtest1 where id= ?" 100 //设置了1.7秒超时查询还不返回的话业务代码报错,堆栈如下:</span><br><span class="line">com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure //连接异常</span><br><span class="line"></span><br><span class="line">The last packet successfully received from the server was 1,701(1700ms) milliseconds ago. 
The last packet sent successfully to the server was 1,701 milliseconds ago.</span><br><span class="line">at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)</span><br><span class="line">at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)</span><br><span class="line">at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)</span><br><span class="line">at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)</span><br><span class="line">at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)</span><br><span class="line">at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)</span><br><span class="line">at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3559)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3459)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3900)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527)</span><br><span class="line">at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1283)</span><br><span class="line">at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:783)</span><br><span class="line">at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1966)</span><br><span class="line">at Test.main(Test.java:30)</span><br><span class="line">Caused by: java.net.SocketTimeoutException: Read timed out // 异常,JDBC Driver 会调 Socket.setSoTimeout 来设置超时时间给 timeRead使用</span><br><span class="line">at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:284) //timedRead 函数可以设置读取超时(timeout)</span><br><span class="line">at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:310)</span><br><span class="line">at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:351)</span><br><span class="line">at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:802)</span><br><span class="line">at java.base/java.net.Socket$SocketInputStream.read(Socket.java:919)</span><br><span class="line">at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)</span><br><span class="line">at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)</span><br><span class="line">at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3008)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3469)</span><br><span class="line">... 
7 more</span><br></pre></td></tr></table></figure><p>源码参考:<a href="https://sourcegraph.com/github.com/openjdk/jdk/-/blob/src/java.base/share/classes/sun/nio/ch/NioSocketImpl.java?L291:18-291:26" target="_blank" rel="noopener">https://sourcegraph.com/github.com/openjdk/jdk/-/blob/src/java.base/share/classes/sun/nio/ch/NioSocketImpl.java?L291:18-291:26</a></p><p>客户端读到一半的时候 MySQL Hang 了,也会触发 SocketTimeoutException 异常,同时客户端还会看到 Consider raising value of ‘net_write_timeout’ on the server(测试代码开启了流式读取)</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line">java -cp .:./mysql-connector-java-5.1.45.jar Test "jdbc:mysql://gf1:3307/test?useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&connectTimeout=500&socketTimeout=1500&netTimeoutForStreamingResults=0" root 123 "select *, id from streaming " 5000</span><br><span class="line"></span><br><span class="line">timestamp:1734084150084 id:1 count:60798</span><br><span class="line">timestamp:1734084150084 id:2 count:60799</span><br><span class="line">timestamp:1734084151594 //读到 60799行后,MySQL 卡了,读卡了 1500ms 后报错</span><br><span class="line">com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Application was streaming results when the connection failed. 
Consider raising value of 'net_write_timeout' on the server.</span><br><span class="line">at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)</span><br><span class="line">at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)</span><br><span class="line">at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)</span><br><span class="line">at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)</span><br><span class="line">at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)</span><br><span class="line">at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)</span><br><span class="line">at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3559)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3459)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3900)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:873)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1996)</span><br><span class="line">at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:374)</span><br><span class="line">at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:354)</span><br><span class="line">at com.mysql.jdbc.ResultSetImpl.next(ResultSetImpl.java:6312)</span><br><span class="line">at Test.main(Test.java:38)</span><br><span class="line">Caused by: java.net.SocketTimeoutException: Read timed out</span><br><span class="line">at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:288)</span><br><span class="line">at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:314)</span><br><span class="line">at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:355)</span><br><span class="line">at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:808)</span><br><span class="line">at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)</span><br><span class="line">at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)</span><br><span class="line">at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)</span><br><span class="line">at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3008)</span><br><span class="line">at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3469)</span><br><span class="line">... 
8 more</span><br></pre></td></tr></table></figure><h2 id="服务端对应的抓包"><a href="#服务端对应的抓包" class="headerlink" title="服务端对应的抓包"></a>服务端对应的抓包</h2><p>如果OS 比较老,安装的tshark 也较老,那么命令参数略微不一样,主要是 col.Info 这个列,没有 _ws 前缀:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">#tshark -i eth0 port 3306 -d tcp.port==3306,mysql -nn -T fields -e frame.number -e frame.time_delta -e tcp.srcport -e tcp.dstport -e col.Info -e mysql.query</span><br></pre></td></tr></table></figure><p>如果是阿里云 99 买了ECS,安装的内核版本较高比如4.19,那么配套安装的tshark也较高,就用如下命令:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line">#tshark -i eth0 -Y "tcp.port==3306" -d tcp.port==3306,mysql -T fields -e frame.number -e frame.time -e frame.time_delta -e tcp.srcport -e tcp.dstport -e tcp.len -e _ws.col.Info -e mysql.query</span><br><span class="line"> //第二列是时间间隔</span><br><span class="line">10.000000000302603306kingdomsonline > mysql [SYN] Seq=0 Win=42340 Len=0 MSS=1460 SACK_PERM=1 WS=512</span><br><span class="line">20.000024473330630260mysql > kingdomsonline [SYN, ACK] Seq=0 Ack=1 Win=29200 Len=0 MSS=1460 SACK_PERM=1 WS=128</span><br><span class="line">30.000271938302603306kingdomsonline > mysql [ACK] Seq=1 Ack=1 Win=42496 Len=0 //3次握手</span><br><span class="line">40.000660359330630260Server Greeting proto=10 version=8.2.0 //MySQL server主动发送版本、问候信息等</span><br><span class="line">50.000263009302603306kingdomsonline > mysql [ACK] Seq=1 Ack=78 Win=42496 Len=0</span><br><span class="line">60.039698745302603306Login Request user=test db=test //客户端验证账号密码</span><br><span class="line">70.000009044330630260mysql > kingdomsonline [ACK] Seq=78 Ack=243 Win=30336 Len=0</span><br><span class="line">80.000171281330630260Response</span><br><span class="line">90.000260062302603306kingdomsonline > mysql [ACK] Seq=243 Ack=126 Win=42496 Len=0</span><br><span class="line">100.000298127302603306Request Unknown (168)</span><br><span class="line">110.000142114330630260Response OK </span><br><span 
class="line">120.000255322302603306kingdomsonline > mysql [ACK] Seq=267 Ack=137 Win=42496 Len=0</span><br><span class="line">130.003596187302603306Request Query/* mysql-connector-java-5.1.45 ( Revision: 9131eefa398531c7dc98776e8a3fe839e544c5b2 ) */SELECT @@session.auto_increment_increment AS auto_increment_increment, @@character_set_client AS character_set_client, @@character_set_connection AS character_set_connection, @@character_set_results AS character_set_results, @@character_set_server AS character_set_server, @@collation_server AS collation_server, @@init_connect AS init_connect, @@interactive_timeout AS interactive_timeout, @@license AS license, @@lower_case_table_names AS lower_case_table_names, @@max_allowed_packet AS max_allowed_packet, @@net_buffer_length AS net_buffer_length, @@net_write_timeout AS net_write_timeout, @@have_query_cache AS have_query_cache, @@sql_mode AS sql_mode, @@system_time_zone AS system_time_zone, @@time_zone AS time_zone, @@transaction_isolation AS transaction_isolation, @@wait_timeout AS wait_timeout</span><br><span class="line">140.000328419330630260Response</span><br><span class="line">150.000266581302603306kingdomsonline > mysql [ACK] Seq=1164 Ack=1208 Win=42496 Len=0</span><br><span class="line">160.022407439302603306Request QuerySHOW WARNINGS</span><br><span class="line">170.000058143330630260Response</span><br><span class="line">180.000267585302603306kingdomsonline > mysql [ACK] Seq=1182 Ack=1411 Win=42496 Len=0</span><br><span class="line">190.001776177302603306Request QuerySET NAMES utf8mb4 //客户端设置charset</span><br><span class="line">200.000052102330630260Response OK</span><br><span class="line">210.000263257302603306kingdomsonline > mysql [ACK] Seq=1204 Ack=1422 Win=42496 Len=0</span><br><span class="line">220.000175172302603306Request QuerySET character_set_results = NULL</span><br><span class="line">230.000046756330630260Response OK</span><br><span class="line">240.000258191302603306kingdomsonline > mysql [ACK] Seq=1241 Ack=1433 Win=42496 Len=0</span><br><span class="line">250.000185322302603306Request QuerySET autocommit=1</span><br><span class="line">260.000037833330630260Response OK</span><br><span class="line">270.000255747302603306kingdomsonline > mysql [ACK] Seq=1262 Ack=1444 Win=42496 Len=0</span><br><span class="line">280.011132112302603306Request Prepare Statementselect sleep(10), id from sbtest1 where id= ? 
//进一步学习</span><br><span class="line">290.000171861330630260Response //作业:Prepared Statement 放回了啥?</span><br><span class="line">300.000290736302603306kingdomsonline > mysql [ACK] Seq=1312 Ack=1570 Win=42496 Len=0</span><br><span class="line">310.000613187302603306Request Execute Statement //客户端发送SQL请求</span><br><span class="line">320.039923585330630260mysql > kingdomsonline [ACK] Seq=1570 Ack=1334 Win=32128 Len=0</span><br><span class="line">331.675682641302603306kingdomsonline > mysql [FIN, ACK] Seq=1334 Ack=1570 Win=42496 Len=0 //1.7秒后客户端发fin主动断开</span><br><span class="line">340.039320026330630260mysql > kingdomsonline [ACK] Seq=1570 Ack=1335 Win=32128 Len=0</span><br><span class="line"></span><br><span class="line">//MySQL 还完全不知道客户端fin了,继续发送响应结果。tcp断开在OS 层面处理,业务再使用这个已断开的连接时OS 会返回错误</span><br><span class="line">353.245406398330630260Response</span><br><span class="line">360.000041708330630260Server Greeting Error 1158 //MySQL 感知到OS返回的错误,发送错误码(已经没有用了),不过客户端已经断开收不到了</span><br><span class="line">370.000053987330630260mysql > kingdomsonline [FIN, ACK] Seq=1742 Ack=1335 Win=32128 Len=0</span><br><span class="line">380.000165707302603306kingdomsonline > mysql [RST] Seq=1335 Win=0 Len=0 //连接都断开了,客户端已经退出,客户端的OS代发reset </span><br><span class="line">390.000017860302603306kingdomsonline > mysql [RST] Seq=1335 Win=0 Len=0</span><br><span class="line">400.000082025302603306kingdomsonline > mysql [RST] Seq=1335 Win=0 Len=0</span><br></pre></td></tr></table></figure><p>如果你的 tshark 版本较高,以上命令行可以改为:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">tshark -i lo -Y "tcp.port==3306" -T fields -e frame.number -e frame.time_delta -e tcp.srcport -e tcp.dstport -e _ws.col.Info -e mysql.query</span><br></pre></td></tr></table></figure><p>GPT4.0 Turbo 对上面这个 tshark 命令的解释:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line">你提到的命令使用tshark捕获在端口3306(MySQL的默认端口)上的网络流量,并提取特定的字段进行显示。tshark是Wireshark的命令行版本,一个非常强大的网络协议分析工具。</span><br><span class="line"></span><br><span class="line">这条命令的组成部分解释如下:</span><br><span class="line"></span><br><span class="line">-i eth0: 指定tshark监听eth0接口上的流量。这是网络接口的名称,可以根据实际情况进行修改。</span><br><span class="line"></span><br><span class="line">port 3306: 指定tshark只捕获目标或源端口为3306的TCP流量,这是MySQL服务器监听的标准端口。</span><br><span class="line"></span><br><span class="line">-T fields: 设置tshark的输出格式为字段值(fields),意味着输出将以列的形式显示指定的字段。</span><br><span class="line"></span><br><span class="line">-e frame.number: 
添加一个字段,显示每个捕获的帧的编号。</span><br><span class="line"></span><br><span class="line">-e frame.time_delta: 添加一个字段,显示自上一帧以来经过的时间(以秒为单位)。</span><br><span class="line"></span><br><span class="line">-e tcp.srcport: 添加一个字段,显示TCP源端口号。</span><br><span class="line"></span><br><span class="line">-e tcp.dstport: 添加一个字段,显示TCP目标端口号。</span><br><span class="line"></span><br><span class="line">-e col.Info: 添加一个字段,显示捕获帧的信息列。</span><br><span class="line"></span><br><span class="line">-e mysql.query: 添加一个字段,如果捕获的TCP流量是MySQL流量,显示MySQL查询。</span><br><span class="line"></span><br><span class="line">这个命令的输出将是以空格分隔的列,每一列代表上述参数指定的一个字段。每一行代表一个捕获的网络帧,其中包含MySQL查询。</span><br><span class="line"></span><br><span class="line">请注意,由于这个命令包含了mysql.query,它需要tshark能够解析MySQL协议。而且,如果MySQL流量是加密的(例如使用SSL/TLS),那么tshark将无法直接显示查询内容。</span><br><span class="line"></span><br><span class="line">在运行这个命令前,你需要有足够的权限来捕获网络流量,通常需要root权限或者足够的权限来访问网络接口。如果你不是root用户,可以使用sudo运行tshark命令。</span><br><span class="line"></span><br><span class="line">sudo tshark -i eth0 port 3306 -T fields -e frame.number -e frame.time_delta -e tcp.srcport -e tcp.dstport -e col.Info -e mysql.query</span><br><span class="line">这个命令将实时显示MySQL查询相关的流量信息。如果你想要保存结果到一个文件,可以使用重定向或-w选项。如果只想要捕获一定数量的包,可以使用-c选项后跟要捕获的包的数量。</span><br></pre></td></tr></table></figure><p>最后用一张大截图来演示这个实验:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240111121421625.png" alt="image-20240111121421625"></p><h2 id="PreparedStatement-验证"><a href="#PreparedStatement-验证" class="headerlink" title="PreparedStatement 验证"></a>PreparedStatement 验证</h2><p>测试代码,用Prepared执行三次查询:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"> PreparedStatement stmt = conn.prepareStatement(sql);</span><br><span class="line"><span class="comment">// stmt.setFetchSize(Integer.MIN_VALUE);</span></span><br><span class="line"> stmt.setString(<span class="number">1</span>, interval);</span><br><span class="line"> ResultSet rs = stmt.executeQuery();</span><br><span class="line"> rs.close();</span><br><span class="line"> stmt.close();</span><br><span class="line"></span><br><span class="line"> PreparedStatement stmt2 = conn.prepareStatement(sql);</span><br><span class="line"> stmt2.setString(<span class="number">1</span>, interval);</span><br><span class="line"> rs = stmt2.executeQuery();</span><br><span class="line"> <span class="comment">//Thread.sleep(60000);</span></span><br><span class="line"> <span class="keyword">while</span> (rs.next()) {</span><br><span class="line"> System.out.println(<span class="string">"fine"</span>);</span><br><span class="line"> }</span><br><span class="line"> rs = stmt2.executeQuery();</span><br><span class="line"> <span class="comment">//Thread.sleep(60000);</span></span><br><span class="line"> <span class="keyword">while</span> 
(rs.next()) {</span><br><span class="line"> System.out.println(<span class="string">"fine"</span>);</span><br><span class="line"> }</span><br><span class="line"> rs.close();</span><br><span class="line"> stmt2.close();</span><br></pre></td></tr></table></figure><p>如图绿色是Prepared过程不会真执行 Select 查数据,只是把这条SQL 发给Server,让Server 提前编译,可以看出来编译时间0.000146秒(绿色方框),因为SQL 非常简单;三个红色线分别是3次真正的查询,都走了Prepared(不再传 Select了),不过时间很不稳定,所以这个统计必须大批量。红色方框是三次通过Prepared 执行 Select 查数据的执行时间:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240322103643718.png" alt="image-20240322103643718"></p><h2 id="结论"><a href="#结论" class="headerlink" title="结论"></a>结论</h2><p>作为一个CRUD boy从以上实验中你可以学到哪些东西?</p><ul><li>客户端报错堆栈要熟悉,Communications link failure (很多原因可以导致这个错误哈)和 java.net.SocketTimeoutException: Read timed out</li><li>JDBC 连接参数要配置socketTimeout,不配置会导致很多很多故障,显得CRUD boy太业余</li><li>抓包,从抓包中学到每个动作,反过来分析原因,比如这次报错就是客户端发送了查询过1.7秒主动断开,所以问题在客户端,1.7秒也要敏感</li><li>最重要的是学到这个实验过程,比如再自己去试试分析 PreparedStatement 的工作原理,如何才能让 PreparedStatement 生效</li></ul><h2 id="进一步学习"><a href="#进一步学习" class="headerlink" title="进一步学习"></a>进一步学习</h2><p>你可以把抓包保存,然后下载到wireshark中,能看到具体每一个包的详细内容,比如加密后的密码、Prepared statement是个啥(一个唯一id)</p><p>比如明明MySQL Server感知到了连接断开错误(Message: Got an error reading communication packets) 还要挣扎着返回这个错误信息给客户端有必要吗?</p><p>java 跑着,直接kill -9 java-pid 看看服务端收到什么包?(有经验后下次看到这样的症状就知道为啥了)</p><h2 id="参考"><a href="#参考" class="headerlink" title="参考"></a>参考</h2><p><a href="https://fromdual.com/mysql-error-codes-and-messages-1150-1199#error_er_net_read_error" target="_blank" rel="noopener">MySQL 1158错误信息的详细意思</a></p><p><a href="http://www.java2s.com/example/jar/m/download-mysqlconnectorjava5145jar-file.html" target="_blank" rel="noopener">mysql jdbc driver</a></p><h1 id="后续-Debug"><a href="#后续-Debug" class="headerlink" title="后续 Debug"></a>后续 Debug</h1><h2 id="为啥我的Java-代码跑半天也不报错:"><a href="#为啥我的Java-代码跑半天也不报错:" class="headerlink" title="为啥我的Java 代码跑半天也不报错:"></a>为啥我的Java 代码跑半天也不报错:</h2><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240115090336526.png" alt="image-20240115090336526"></p><p>jstack -p java-pid ,可以看到main 卡在执行SQL 后等结果的堆栈里,所以不是Java sleep了,等看对端MySQL 在干什么</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line">"Reference Handler" #2 
daemon prio=10 os_prio=0 tid=0x00007f6a5c0db000 nid=0x109d8e in Object.wait() [0x00007f6a60a75000]</span><br><span class="line"> java.lang.Thread.State: WAITING (on object monitor)</span><br><span class="line"> at java.lang.Object.wait(Native Method)</span><br><span class="line"> - waiting on <0x00000000f6b08d90> (a java.lang.ref.Reference$Lock)</span><br><span class="line"> at java.lang.Object.wait(Object.java:502)</span><br><span class="line"> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)</span><br><span class="line"> - locked <0x00000000f6b08d90> (a java.lang.ref.Reference$Lock)</span><br><span class="line"> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)</span><br><span class="line"></span><br><span class="line">"main" #1 prio=5 os_prio=0 tid=0x00007f6a5c04b000 nid=0x109d8a runnable [0x00007f6a638e9000]</span><br><span class="line"> java.lang.Thread.State: RUNNABLE</span><br><span class="line"> at java.net.SocketInputStream.socketRead0(Native Method)</span><br><span class="line"> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)</span><br><span class="line"> at java.net.SocketInputStream.read(SocketInputStream.java:171)</span><br><span class="line"> at java.net.SocketInputStream.read(SocketInputStream.java:141)</span><br><span class="line"> at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)</span><br><span class="line"> at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)</span><br><span class="line"> at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)</span><br><span class="line"> - locked <0x00000000f6b71370> (a com.mysql.jdbc.util.ReadAheadInputStream)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3008)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3469)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3459)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3900)</span><br><span class="line"> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527)</span><br><span class="line"> at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1283)</span><br><span class="line"> - locked <0x00000000f6b0a228> (a com.mysql.jdbc.JDBC4Connection)</span><br><span class="line"> at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:783)</span><br><span class="line"> - locked <0x00000000f6b0a228> (a com.mysql.jdbc.JDBC4Connection)</span><br><span class="line"> at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1966)</span><br><span class="line"> - locked <0x00000000f6b0a228> (a com.mysql.jdbc.JDBC4Connection)</span><br><span class="line"> at Test.main(Test.java:30)</span><br><span class="line"></span><br><span class="line">"VM Thread" os_prio=0 tid=0x00007f6a5c0d1000 nid=0x109d8d runnable</span><br><span class="line"></span><br><span class="line">"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f6a5c05e000 nid=0x109d8b runnable</span><br><span class="line"></span><br><span class="line">"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f6a5c060000 nid=0x109d8c runnable</span><br></pre></td></tr></table></figure><h2 id="抓包"><a href="#抓包" class="headerlink" title="抓包"></a>抓包</h2><p>为啥抓不到任何包?</p><p><img 
src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240115090420370.png" alt="image-20240115090420370"></p><p>先确认3306 端口是你的MySQL 在跑:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"># ss -lntp |grep 3306</span><br><span class="line">LISTEN 0 151 *:3306 *:* users:(("mysqld",pid=1023638,fd=22))</span><br><span class="line">LISTEN 0 70 *:33060 *:* users:(("mysqld",pid=1023638,fd=20))</span><br></pre></td></tr></table></figure><p><em>:3306 中的‘“</em>” 表示MySQLD 监听本机任何网卡的3306端口,查看一下网卡名字:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"># ip a</span><br><span class="line">1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000</span><br><span class="line"> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00</span><br><span class="line"> inet 127.0.0.1/8 scope host lo</span><br><span class="line"> valid_lft forever preferred_lft forever</span><br><span class="line"> inet6 ::1/128 scope host</span><br><span class="line"> valid_lft forever preferred_lft forever</span><br><span class="line">2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000</span><br><span class="line"> link/ether 00:16:3e:39:b5:e0 brd ff:ff:ff:ff:ff:ff</span><br><span class="line"> altname enp0s5</span><br><span class="line"> altname ens5</span><br><span class="line"> inet 172.17.151.5/20 brd 172.17.159.255 scope global dynamic noprefixroute eth0</span><br><span class="line"> valid_lft 309352989sec preferred_lft 309352989sec</span><br><span class="line"> inet6 fe80::216:3eff:fe39:b5e0/64 scope link</span><br><span class="line"> valid_lft forever preferred_lft forever</span><br><span class="line">3: cni-podman0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000</span><br><span class="line"> link/ether de:a8:06:82:76:00 brd ff:ff:ff:ff:ff:ff</span><br><span class="line"> inet 10.88.0.1/16 brd 10.88.255.255 scope global cni-podman0</span><br><span class="line"> valid_lft forever preferred_lft forever</span><br><span class="line"> inet6 fe80::dca8:6ff:fe82:7600/64 scope link</span><br><span class="line"> valid_lft forever preferred_lft forever</span><br><span class="line">4: veth82cad224@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni-podman0 state UP group default</span><br><span class="line"> link/ether 4e:1b:4a:0d:a9:e2 brd ff:ff:ff:ff:ff:ff link-netns netns-58786150-bf63-2ae1-242f-cf221eed34fe</span><br><span class="line"> inet6 
fe80::4c1b:4aff:fe0d:a9e2/64 scope link</span><br><span class="line"> valid_lft forever preferred_lft forever</span><br></pre></td></tr></table></figure><p>尝试 tshark -i any – any是个什么鬼,展开学习下抓包参数</p><p>这里对select sleep 不确定的话可以Google sleep的单位、用法;也可以MySQL Client 验证一下</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"># mysql -h127.1 -uroot -p123</span><br><span class="line">Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.</span><br><span class="line"></span><br><span class="line">mysql> select sleep(1.4);</span><br><span class="line">+------------+</span><br><span class="line">| sleep(1.4) |</span><br><span class="line">+------------+</span><br><span class="line">| 0 |</span><br><span class="line">+------------+</span><br><span class="line">1 row in set (1.40 sec)</span><br><span class="line"></span><br><span class="line">为什么不用mysql client做这个SocketTimeout的实验:mysql似乎没有SocketTimeout这个参数:</span><br><span class="line">mysql --help |grep -i time</span><br><span class="line"> and reconnecting may take a longer time. Disable with</span><br><span class="line"> --connect-timeout=# Number of seconds before connection timeout.</span><br><span class="line">connect-timeout 0</span><br></pre></td></tr></table></figure><h2 id="终于能抓到包了"><a href="#终于能抓到包了" class="headerlink" title="终于能抓到包了"></a>终于能抓到包了</h2><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240115090938622.png" alt="image-20240115090938622"></p><h2 id="Kill-Java"><a href="#Kill-Java" class="headerlink" title="Kill Java"></a>Kill Java</h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">560.014671013401023306Request Prepare Statementselect sleep(60), id from sbtest1 where id= ?</span><br><span class="line">570.000253230330640102Response</span><br><span class="line">580.00017311040102330640102 → 3306 [ACK] Seq=1312 Ack=1570 Win=65536 Len=0 TSval=478689604 TSecr=478689604</span><br><span class="line">590.000602784401023306Request Execute Statement</span><br><span class="line">600.0409031273306401023306 → 40102 [ACK] Seq=1570 Ack=1334 Win=65536 Len=0 TSval=478689645 TSecr=478689604</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">770.39702382040102330640102 → 3306 [FIN, ACK] Seq=1334 Ack=1570 Win=65536 Len=0 TSval=478705206 TSecr=478689645</span><br><span class="line">780.0404249263306401023306 → 40102 [ACK] Seq=1570 Ack=1335 Win=65536 Len=0 TSval=478705246 TSecr=478705206</span><br><span class="line">830.793390527330640102Response</span><br><span 
class="line">840.00001652240102330640102 → 3306 [RST] Seq=1335 Win=0 Len=0</span><br></pre></td></tr></table></figure><h2 id="mysql-kill-pid"><a href="#mysql-kill-pid" class="headerlink" title="mysql kill pid"></a>mysql kill pid</h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"># tshark -i lo -Y "tcp.port==59636" -T fields -e frame.number -e frame.time_delta -e tcp.srcport -e tcp.dstport -e _ws.col.Info -e mysql.query</span><br><span class="line">Running as user "root" and group "root". This could be dangerous.</span><br><span class="line">Capturing on 'Loopback'</span><br><span class="line">//</span><br><span class="line">850.0000422613306596363306 → 59636 [FIN, ACK] Seq=1 Ack=1 Win=512 Len=0 TSval=478849322 TSecr=478831136</span><br><span class="line">920.00810647059636330659636 → 3306 [FIN, ACK] Seq=1 Ack=2 Win=512 Len=0 TSval=478849333 TSecr=478849322</span><br><span class="line">930.0000086123306596363306 → 59636 [ACK] Seq=2 Ack=2 Win=512 Len=0 TSval=478849333 TSecr=478849333</span><br></pre></td></tr></table></figure><h2 id="kill-mysqld-pid"><a href="#kill-mysqld-pid" class="headerlink" title="kill mysqld pid"></a>kill mysqld pid</h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">]# tcpdump -i lo port 50436</span><br><span class="line">dropped privs to tcpdump</span><br><span class="line">tcpdump: verbose output suppressed, use -v or -vv for full protocol decode</span><br><span class="line">listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">09:32:50.138291 IP localhost.mysql > localhost.50436: Flags [F.], seq 206507581, ack 2788331892, win 512, options [nop,nop,TS val 479088777 ecr 479041791], length 0</span><br><span class="line">09:32:50.150621 IP localhost.50436 > localhost.mysql: Flags [F.], seq 1, ack 1, win 512, options [nop,nop,TS val 479088789 ecr 479088777], length 0</span><br><span class="line">09:32:50.150640 IP localhost.mysql > localhost.50436: Flags [.], ack 2, win 512, options [nop,nop,TS val 479088789 ecr 479088789], length 0</span><br></pre></td></tr></table></figure><h2 id="视频学习"><a href="#视频学习" class="headerlink" title="视频学习"></a>视频学习</h2><p>如果你也想试试这个实验的话,可以参考我们的视频:<a href="https://meeting.tencent.com/user-center/shared-record-info?id=c0962ad4-16bc-4ac8-83ab-2e302c372e73&is-single=false&record_type=2&from=3" target="_blank" rel="noopener">https://meeting.tencent.com/user-center/shared-record-info?id=c0962ad4-16bc-4ac8-83ab-2e302c372e73&is-single=false&record_type=2&from=3</a></p><h2 id="如果你觉得看完对你很有帮助可以通过如下方式找到我"><a href="#如果你觉得看完对你很有帮助可以通过如下方式找到我" class="headerlink" title="如果你觉得看完对你很有帮助可以通过如下方式找到我"></a>如果你觉得看完对你很有帮助可以通过如下方式找到我</h2><p>find me on twitter: <a href="https://twitter.com/plantegg" target="_blank" rel="noopener">@plantegg</a></p><p>知识星球:<a href="https://t.zsxq.com/0cSFEUh2J" target="_blank" 
rel="noopener">https://t.zsxq.com/0cSFEUh2J</a></p><p>开了一个星球,在里面讲解一些案例、知识、学习方法,肯定没法让大家称为顶尖程序员(我自己都不是),只是希望用我的方法、知识、经验、案例作为你的垫脚石,帮助你快速、早日成为一个基本合格的程序员。</p><p>争取在星球内:</p><ul><li>养成基本动手能力</li><li>拥有起码的分析推理能力–按我接触的程序员,大多都是没有逻辑的</li><li>知识上教会你几个关键的知识点</li></ul><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240324161113874.png" alt="image-20240324161113874" style="zoom:50%;"><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">JDBC客户端MySQL服务器初始化阶段建立数据库连接连接成功配置参数socketTimeout=1459msnetTimeoutForStreamingResults=1s设置流式查询setAutoCommit(false)setFetchSize(Integer.MIN_VALUE)执行查询: SELECT * FROM data_table开始准备结果集返回第一批数据开始处理第一条数据Thread.sleep(1500ms)net_write_timeout计时开始(1s)等待客户端处理...1秒后超时关闭连接仍在sleep(1500ms)尝试读取下一条数据连接已关闭抛出CommunicationsExceptionJDBC客户端MySQL服务器</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html">
<h1 id="SocketTimeout-后客户端怎么做和服务端怎么做"><a href="#SocketTimeout-后客户端怎么做和服务端怎么做" class="headerlink" title="SocketTimeout 后客户端怎么做和服务端怎么做"></a>So
</summary>
<category term="MySQL" scheme="https://plantegg.github.io/categories/MySQL/"/>
<category term="MySQL" scheme="https://plantegg.github.io/tags/MySQL/"/>
<category term="SocketTimeout" scheme="https://plantegg.github.io/tags/SocketTimeout/"/>
<category term="tcpdump" scheme="https://plantegg.github.io/tags/tcpdump/"/>
</entry>
<entry>
<title>无招胜有招--一周年总结</title>
<link href="https://plantegg.github.io/2024/03/25/%E6%97%A0%E6%8B%9B%E8%83%9C%E6%9C%89%E6%8B%9B/"/>
<id>https://plantegg.github.io/2024/03/25/无招胜有招/</id>
<published>2024-03-25T09:30:03.000Z</published>
<updated>2024-11-20T10:00:55.387Z</updated>
<content type="html"><![CDATA[<h1 id="无招胜有招–一周年总结"><a href="#无招胜有招–一周年总结" class="headerlink" title="无招胜有招–一周年总结"></a>无招胜有招–一周年总结</h1><p>大家抱着美好和雄赳赳的目标来到<a href="https://wx.zsxq.com/dweb2/index/group/15552551584552" target="_blank" rel="noopener">这个知识星球</a>,开始的时候兴奋地以为找到了银弹(其实银弹是有的,在文章最后),经过一段时间后大概率发现没什么变化,然后就回到了以前的老路子上,我觉得关键问题是你没获取到星球的精华,所以这篇我打算反复再唠叨一下</p><h2 id="知识效率-工程效率"><a href="#知识效率-工程效率" class="headerlink" title="知识效率 工程效率"></a><strong><a href="https://t.zsxq.com/14IBWajEq" target="_blank" rel="noopener">知识效率 工程效率</a></strong></h2><p>虽然我们现在通过这篇《<a href="https://t.zsxq.com/14IBWajEq" target="_blank" rel="noopener">知识效率 工程效率</a>》知道了两者的差别, 但是还是需要记住通过积累可以将我们的学习能力从工程效率升级到知识效率(厚积薄发),大部分时候没有做到薄发,是因为你以为理解了、积累了实际没理解</p><h2 id="核心知识点"><a href="#核心知识点" class="headerlink" title="核心知识点"></a><strong>核心知识点</strong></h2><p>尽力寻找每个领域的核心知识点,核心知识点的定义就是通过一两个这样的知识点能撬动对整个领域的理解,也就是常说的<a href="http://www.baidu.com/link?url=9Hv8LOY09wOqjLFX-UuX35AxJjTDjmkHcSPm3ReeTWO-4rH-46hmz6aR4b-WP7PwZHUGkxEBhWt1iqHkM8uM56Au6Ada4lg6angCByW3J-BLDkxE45Aq-QqOTWzRspa4" target="_blank" rel="noopener">纲挈目张</a></p><p>比如网络领域里:一个网络包是怎么流转的+抓包。假如你理解<a href="https://plantegg.github.io/2019/05/15/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E7%BD%91%E7%BB%9C--%E4%B8%80%E4%B8%AA%E7%BD%91%E7%BB%9C%E5%8C%85%E7%9A%84%E6%97%85%E7%A8%8B/">网络包的流转后</a>再去看<a href="https://plantegg.github.io/2019/06/20/%E5%B0%B1%E6%98%AF%E8%A6%81%E4%BD%A0%E6%87%82%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1--lvs%E5%92%8C%E8%BD%AC%E5%8F%91%E6%A8%A1%E5%BC%8F/">LVS 负载均衡的原理</a>你就发现只需要看一次你就能很好掌握LVS各个负载均衡的本质,而在这之前你反复看反复忘。掌握了这个知识点基本就可以通关整个领域,剩下的只是无招胜有招碰到一个挨个积累的问题了。</p><p>比如CPU领域理解超线程+IPC+会用perf和内存延时,理解超线程的本质是为什么一个核能干两个核的工作(这和操作系统的分时多任务背后原理是想通的),那是因为我们的程序没法吃满流水线(也就是没法用完一个核的计算能力,用IPC去衡量),没吃满闲置的时候就可以虚拟给另外一个进程用,比如CPU 跑起来最高IPC都能到4,但是无论你找一个Java还是MySQL 去看他们的IPC基本都在1以内,纯计算场景的IPC会高一点,IPC 可以到4但只跑到1的话也就是只用满了25%的能力,那当然可以再虚出来一个超线程提高效率。IPC 之所以低就是因为内存延时大,这么多年CPU的处理能力一直按摩尔定律在提升但是内存延时没有怎么提升,导致基本上我们常见的业务场景(Nginx/MySQL/Redis 等)都是CPU在等从内存取数据(所以搞了L1、L2、L3一堆cache)。</p><p>发散一下或者说<strong>留个作业</strong>你去看看<a href="https://plantegg.github.io/2021/05/14/%E5%8D%81%E5%B9%B4%E5%90%8E%E6%95%B0%E6%8D%AE%E5%BA%93%E8%BF%98%E6%98%AF%E4%B8%8D%E6%95%A2%E6%8B%A5%E6%8A%B1NUMA/">NUMA 的原理或者说本质就是为了让CPU知道就近分配读取内存以提升效率</a>。</p><p>你看<strong>整本计算机组成原理+性能的本质都在这一个知识点的范围内进行延伸和突破</strong>。</p><p>如果你发现一个核心知识点也欢迎写成博客文章分享出来</p><h2 id="读日志、错误信息"><a href="#读日志、错误信息" class="headerlink" title="读日志、错误信息"></a><strong>读日志、错误信息</strong></h2><p>我的经验只是大概20%左右的程序员会去耐心读别人的日志、报错信息,大部分摊摊手求助、放弃了</p><p>日志是最好的学习机会,我知道别人的日志写得很烂,但是你要能耐心多琢磨一点就会比别人更专业一点</p><h2 id="对知识的可观测性"><a href="#对知识的可观测性" class="headerlink" title="对知识的可观测性"></a><strong>对知识的可观测性</strong></h2><p>抓包、perf的使用这些平时要多积累,这点没有捷径,一个好的工程师肯定有<a href="https://plantegg.github.io/2016/10/12/ss%E7%94%A8%E6%B3%95%E5%A4%A7%E5%85%A8/">一堆好的锤子、瑞士军刀、工具包</a>的。在你掌握了知识点后要转化为工作效率,就得多积累这些工具,很多次我们碰到一个好的问题没分析出来是因为我们这种没有门槛的积累不够导致放弃了</p><p>比如需要抓包确认下,不会,一看tcpdump 一堆参数头疼放弃;比如想要知道长连接还是短连接,或者自己设置的长连接有没有生效,不会用netstat -o 这个参数去确认等;比如要下载个源码自己make/install 中间报了几个错误不仔细看放弃;</p><p>反过来回到我们所说的工程效率,就是靠这些工具帮你实现可视、可以触摸,网络之所以大多数同学在大学都学过但是最后基本学懂,就是因为这些网络的东东你只看理论很难立即,但是让你抓过一次包分析下就会恍然大悟——这就是关键门槛你能跨过去</p><h2 id="好习惯"><a href="#好习惯" class="headerlink" title="好习惯"></a><strong>好习惯</strong></h2><p>在星球里我更希望你带走一个好的习惯而不是一个具体知识点,虽然星球里的具体知识点、案例胜过很多教材,但他们总有过时、用不上的时候,唯有好的习惯可以跟随你,帮你实现无招胜有招</p><h3 id="记笔记"><a href="#记笔记" class="headerlink" 
title="记笔记"></a><strong>记笔记</strong></h3><p>放低身段,不要高估自己的能力(认为自己是知识效率),放低后你要怎么做呢:记笔记、记笔记、记笔记</p><p>只要是你在学习就要或者看书、看资料的时候觉得自己有点通透了,赶紧记录下来,因为大概率一个星期你就忘了,半年你就完全不记得自己以前看过一次了,我好多次看到一篇好文章就感叹自己学到了,兴奋地拉到文章最后想去评论下,结果发现居然有了自己的评论在下面 :)</p><h3 id="动手"><a href="#动手" class="headerlink" title="动手"></a><strong>动手</strong></h3><p>动手,看到后理解了,也记了笔记,其实最好还是要自己去重现,记下自己看到的现象和理解,动手又会有一堆门槛,搭环境、客观则、怎么验证等等,这个时候我前面说的可观测性里面积累的一大堆工具可以让你如有神助、重现起来效率就是比别人高</p><h3 id="汇总输出"><a href="#汇总输出" class="headerlink" title="汇总输出"></a><strong>汇总输出</strong></h3><p>最后笔记记完还没完,笔记基本是零散的,你反复积累后到了一定的时机就是要把他们总结汇总成一篇完整度较高的博客文章,这里当然有自己的虚荣心在这里,但更多的是为了自己查询方便,有了新的理解或者使用姿势我经常更新补充10年前的博客文章,不会写一篇新的,这个补充知识让我的知识结构更完善,不是为了多发一篇博文,我现在解决问题、使用工具基本要靠翻自己的博客文章照着操作</p><h3 id="慢就是快、少就是多"><a href="#慢就是快、少就是多" class="headerlink" title="慢就是快、少就是多"></a><strong>慢就是快、少就是多</strong></h3><p>往往我们喜欢求快,以为自己一看就懂;求多以为自己越看的多越厉害</p><h3 id="不要等着时间流投喂"><a href="#不要等着时间流投喂" class="headerlink" title="不要等着时间流投喂"></a><strong>不要等着时间流投喂</strong></h3><p>看这篇置顶:<a href="https://t.zsxq.com/14Yel6KBg" target="_blank" rel="noopener">https://t.zsxq.com/14Yel6KBg</a></p><h2 id="纲举目张"><a href="#纲举目张" class="headerlink" title="纲举目张"></a><strong>纲举目张</strong></h2><p>对公司的业务、一个软件的运转流程都要尽量做到理解</p><p>比如学MySQL 要尽量知道从一条SQL 怎么进来,进行哪些处理后得到了查询结果;比如前面讲过的一个网络包是怎么到达对端的;比如你们公司的请求是怎么从客户端到达服务端(中间经过了LVS、Nginx吗),服务端又是那些服务得依赖和调用,有没有Redis、MQ、Database,最后数据又是怎么返回的,我知道这在一个公司很难(屎山很复杂),但目前没有更好的方法让你快速掌握并立足</p><p>为什么出现问题后总有一两个人很快能猜出来问题可能在哪个环节,这一部分是经验但更多的是对系统的了解,你都不知道有Redis存在一旦出错了你肯定猜不到Redis这里来</p><p>可以看看我之前说的实习生的故事,完全真实哈:</p><blockquote><p>讲一个我碰到的实习生的事情</p><p>北邮毕业直接后直接到我司实习</p><p>特点:英语好、动手能力强、爱琢磨,除了程序、电脑没有其它爱好 :)</p><p>实习期间因为英语好把我司文档很快就翻烂了,对产品、业务逻辑的理解基本是顶尖的</p><p>实习期间很快成为所有老员工的红人,都离不开他,搭环境、了解业务流程</p><p>因为别人的习惯都是盯着自己眼前的这一趴,只有他对业务非常熟悉</p><p>实习后很快就转正了,又3年后transfer 去了美国总部</p><p>连女朋友都是老员工给牵线的,最后领证一起去了美国。为啥老员工这么热情,是大家真心喜欢他 </p></blockquote><p>再看看张一鸣自述的第一年的工作:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/FuQNw04aH2PQwnApyAKY1dXRh-nt.png" alt="img"></p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a><strong>总结</strong></h2><p>我前面所说的我也没做太好,希望大家能做得更好,我第一次感受无招胜有招就是<a href="https://plantegg.github.io/2022/01/01/%E4%B8%89%E4%B8%AA%E6%95%85%E4%BA%8B/">故事一里面</a>,到故事二过去差不多10年,这10年里我一直在琢磨怎么才能无招胜有招,也有在积累,但是花了10年肯定效率不算高,所以在星球里我希望通过我的经验帮你们缩短一些时间</p><p>上面讲再多如果你只是看看那根本还是没用,买再多的课也没用,关键是看触动后能否有点改变。你可以从里面试着挑几个你认为容易操作,比如记笔记、比如不要等着时间流投喂,或者有感触的试试先改变或者遵循下看看能不能获得一些变化进而形成正向循环</p><p>或者从评论里开始说说你星球这一年真正有哪些改变、学到了啥、你的感悟,不方便的也可以微信我私聊一下</p><p>这篇就当成整个星球学习的一个总结吧</p>]]></content>
<summary type="html">
<h1 id="无招胜有招–一周年总结"><a href="#无招胜有招–一周年总结" class="headerlink" title="无招胜有招–一周年总结"></a>无招胜有招–一周年总结</h1><p>大家抱着美好和雄赳赳的目标来到<a href="https://wx
</summary>
<category term="others" scheme="https://plantegg.github.io/categories/others/"/>
<category term="星球" scheme="https://plantegg.github.io/tags/%E6%98%9F%E7%90%83/"/>
<category term="案例" scheme="https://plantegg.github.io/tags/%E6%A1%88%E4%BE%8B/"/>
</entry>
<entry>
<title>网球肘 过劳性(持续)肌腱病的治疗</title>
<link href="https://plantegg.github.io/2024/03/14/%E7%BD%91%E7%90%83%E8%82%98/"/>
<id>https://plantegg.github.io/2024/03/14/网球肘/</id>
<published>2024-03-14T04:30:03.000Z</published>
<updated>2024-11-20T10:00:53.364Z</updated>
<content type="html"><![CDATA[<h1 id="网球肘-过劳性-持续-肌腱病的治疗"><a href="#网球肘-过劳性-持续-肌腱病的治疗" class="headerlink" title="网球肘 过劳性(持续)肌腱病的治疗"></a><a href="https://www.haoyishu.com/web/article/4947" target="_blank" rel="noopener">网球肘 过劳性(持续)肌腱病的治疗</a></h1><p>因为长期打球,导致手肘部分疼痛难耐,2024年1月开始进行了长时间的休息期,中间2024的2月是春节,所以总共修了快2个月,还不见好,于是去医院,其实医院给的治疗方案也不好,但是医师告诉了我一个关键词这个病叫:<strong>网球肘</strong></p><p>知道关键词后就开始了自我寻求治疗方案的过程,记下来供参考,到2024年3月14号,最近两周多次打球验证我的网球肘基本好了,所以说一下治疗过程</p><h2 id="个人总结"><a href="#个人总结" class="headerlink" title="个人总结"></a>个人总结</h2><p>网球肘的核心是肌肉过劳发炎了,所以关键是如何消炎</p><p>一定要用:<a href="https://www.tidepharm.com/productinfo/162025.html" target="_blank" rel="noopener">氟比洛芬凝胶贴膏</a> ,而且每天两贴尽量不要断,期间通过大拇指使劲按压疼痛部分来感受验证的减轻,一般连续贴3-5天会有明显的效果,如果无效请去医院</p><p>口服消炎药也可以试试,我估计针对性不强(瞎猜的,希望你试试后来告诉我)。至于体外冲击波可以尝试尝试,我个人的经验觉得还不足以证明其有效</p><p><strong>后面的可以不用看了</strong></p><h2 id="治疗"><a href="#治疗" class="headerlink" title="治疗"></a>治疗</h2><p>网球肘已经有几个月了,开始我没在意以为就是肌肉劳累,休息休息就会好,直到过年的时候我真正歇了一个多月,过完年偶尔一用力居然又开是疼,让我计划去医院看看,之前自己在社区医院开过几盒:<a href="https://www.tidepharm.com/productinfo/162025.html" target="_blank" rel="noopener">氟比洛芬凝胶贴膏</a></p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240305141525956.png" alt="image-20240305141525956"></p><p>(家中请常备这个药,膏药里的神奇)</p><p>【适应症】<br>下列疾病及症状的镇痛、消炎:<br>骨关节炎、肩周炎、肌腱及腱鞘炎、腱鞘周围炎、肱骨外上髁炎 ( 网球肘 )、肌肉痛、外伤所致肿胀、疼痛</p><p>过年期间自己也偶尔贴一下,但是效果不明显(应该是没有连续贴导致的效果不好)。</p><p>这次去医院正规想看看,但是大医院挂不上号,于是去了一个小医院(社区医院推荐的,说这家别的不行,刚好看运动医学还不错),到医院大夫一听就笑着问我知不知道有一种病叫:网球肘。这是我第一次听说这个病,大夫用大拇指按压我的伤口附近,确实非常疼,结合我经常打球基本确诊。</p><p>然后给我开了两次体外冲击波物理治疗,当场治疗了一次,过程中很痛,打完的当时再按压就不疼了,但是过几个小时还是照旧(这也在医师的预料中),给我开了两次这个治疗,我只去了一次</p><blockquote><p>体外冲击波疗法(extracorporeal shock wave therapy, ESWT)是<strong>一种非侵入性、安全、有效治疗多种疾病的方法</strong>,在临床多个学科中得到了广泛应用,但临床应用不规范、治疗关键技术不一致、治疗方案不统一、培训体系不健全等问题严重制约了ESWT的临床推广应用。</p><p><a href="https://rs.yiigle.com/CN101658202302/1459029.htm#:~:text=%E4%BD%93%E5%A4%96%E5%86%B2%E5%87%BB%E6%B3%A2%E7%96%97%E6%B3%95%EF%BC%88extracorporeal%20shock,ESWT%E7%9A%84%E4%B8%B4%E5%BA%8A%E6%8E%A8%E5%B9%BF%E5%BA%94%E7%94%A8%E3%80%82" target="_blank" rel="noopener">https://rs.yiigle.com/CN101658202302/1459029.htm#:~:text=%E4%BD%93%E5%A4%96%E5%86%B2%E5%87%BB%E6%B3%A2%E7%96%97%E6%B3%95%EF%BC%88extracorporeal%20shock,ESWT%E7%9A%84%E4%B8%B4%E5%BA%8A%E6%8E%A8%E5%B9%BF%E5%BA%94%E7%94%A8%E3%80%82</a></p></blockquote><p>回到家我就开始了对“网球肘”的学习,<a href="https://zhuanlan.zhihu.com/p/626645077" target="_blank" rel="noopener">中间找到这篇最关键的经验贴</a>【你一定要看】,我把这里面最有价值的引用一下:</p><p>比较对症的治疗方法是,内服 <a href="https://www.zhihu.com/search?type=content&q=%E6%B4%9B%E7%B4%A2%E6%B4%9B%E8%8A%AC%E9%92%A0%E7%89%87" target="_blank" rel="noopener">洛索洛芬钠片</a>,外贴 <a href="https://www.zhihu.com/search?q=%E6%B0%9F%E6%AF%94%E6%B4%9B%E8%8A%AC%E5%87%9D%E8%83%B6%E8%B4%B4%E8%86%8F&search_source=Entity&hybrid_search_source=Entity&hybrid_search_extra=%7B%22sourceType%22:%22answer%22,%22sourceId%22:%2244406930%22%7D" target="_blank" rel="noopener">氟比洛芬凝胶贴膏</a>(商标是泽普思),尤其要注意用量和时机</p><ul><li><p><strong>洛索洛芬钠片——我这次没吃这个</strong></p></li><li><ul><li>一日三次,每次两片(60mg/片)</li><li>饭后服用(<strong>切记!</strong>)</li></ul></li><li><p><strong>氟比洛芬凝胶贴膏</strong>——这点最重要,我不再像以前一样偶尔贴,而是连续一周每天两贴</p></li><li><ul><li>每次一帖,白天/晚上 各一帖</li><li>除了洗澡之外,尽量连续贴</li></ul></li></ul><p>按照上面的做法,经过5天后我的网球肘真的神奇地好了,中间还阳了3天(所以第二次冲击波治疗我也没去)</p><h3 id="经验"><a href="#经验" class="headerlink" title="经验"></a>经验</h3><p>我的判断是网球肘消炎很重要,应该还是氟比洛芬凝胶贴膏起了关键作用,但是要注意:连续贴一周,每天两贴</p><p>至于冲击波是否有效果,我目前觉得可能有效果,但是证据还不够</p><p><strong>知道这个病的名字很重要,这样就有了搜索关键字,看别人描述相对来说我这次不算严重</strong></p><h3 id="UpToDate-临床顾问"><a 
href="#UpToDate-临床顾问" class="headerlink" title="UpToDate 临床顾问"></a>UpToDate 临床顾问</h3><p>知道名字后,我在淘宝上购买了 UpToDate 临床顾问论文库的账号(收录了几乎所有的医学论文,但是只对收费会员开放),专业点说如果你好好研究 UpToDate,再结合自身状况可以得到比很多专业医师更专业的治疗</p><p>但是这次查到的治疗方案都是普通的消炎、镇痛(对乙氨基酚)等,但是不妨碍你下次可以继续到这里查,一般买个3天的账号才几块钱</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240305145408804.png" alt="image-20240305145408804"></p><p>另外也推荐大家看默沙东手册(完全免费,有<a href="https://www.msdmanuals.cn/home/injuries-and-poisoning/sports-injuries/lateral-epicondylitis" target="_blank" rel="noopener">网页</a>和app版本):</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240305145706453.png" alt="image-20240305145706453"></p><h3 id="自我检查"><a href="#自我检查" class="headerlink" title="自我检查"></a>自我检查</h3><p>自己按压疼痛的地方确认在什么地方,结合平时的运动和习惯,是否恢复也可以通过按压和发力来确认</p><h3 id="非甾体类抗炎药-NSAID"><a href="#非甾体类抗炎药-NSAID" class="headerlink" title="非甾体类抗炎药(NSAID)"></a>非甾体类抗炎药(NSAID)</h3><p><strong>非甾体抗炎药</strong>(non-steroidal anti-inflammatory drugs,NSAIDs)又称<strong>非类固醇抗炎药</strong>,简称<strong>非甾体类</strong>,是一类具有解热<a href="https://zh.wikipedia.org/wiki/%E9%95%87%E7%97%9B%E8%8D%AF" target="_blank" rel="noopener">镇痛</a>效果的药物,在施用较高剂量时也具有<a href="https://zh.wikipedia.org/wiki/%E6%8A%97%E7%82%8E%E6%80%A7" target="_blank" rel="noopener">消炎作用</a>。</p><p>NSAID 包括<strong>布洛芬(Advil、Motrin IB 等)、萘普生纳(Aleve、Anaprox DS 等)、双氯芬酸钠和塞来昔布(Celebrex)</strong></p><p>非甾体抗炎药中,属<a href="https://zh.wikipedia.org/wiki/%E9%98%BF%E6%96%AF%E5%8C%B9%E6%9E%97" target="_blank" rel="noopener">阿司匹林</a>、<a href="https://zh.wikipedia.org/wiki/%E5%B8%83%E6%B4%9B%E8%8A%AC" target="_blank" rel="noopener">伊布洛芬</a>、<a href="https://zh.wikipedia.org/wiki/%E7%94%B2%E8%8A%AC%E9%82%A3%E9%85%B8" target="_blank" rel="noopener">甲芬那酸</a>、<a href="https://zh.wikipedia.org/wiki/%E8%90%98%E6%99%AE%E7%94%9F" target="_blank" rel="noopener">萘普生</a>最为著名,在绝大多数国家都可作为<a href="https://zh.wikipedia.org/wiki/%E9%9D%9E%E8%99%95%E6%96%B9%E8%97%A5" target="_blank" rel="noopener">非处方药</a>销售[<a href="https://zh.wikipedia.org/wiki/%E9%9D%9E%E7%94%BE%E4%BD%93%E6%8A%97%E7%82%8E%E8%8D%AF#cite_note-The_Physician_and_Sportsmedicine_2010-4" target="_blank" rel="noopener">4]</a>。</p><p><a href="https://zh.wikipedia.org/wiki/%E5%AF%B9%E4%B9%99%E9%85%B0%E6%B0%A8%E5%9F%BA%E9%85%9A" target="_blank" rel="noopener">对乙酰氨基酚</a>因其抗炎作用微弱,而通常不被归为非甾体抗炎药,它主要通过抑制分布在中枢神经系统的<a href="https://zh.wikipedia.org/wiki/%E7%92%B0%E6%B0%A7%E5%90%88%E9%85%B6" target="_blank" rel="noopener">环氧合酶</a>-2,以减少<a href="https://zh.wikipedia.org/wiki/%E5%89%8D%E5%88%97%E8%85%BA%E7%B4%A0" target="_blank" rel="noopener">前列腺素</a>的生成,从而缓解疼痛,但由于<a href="https://zh.wikipedia.org/wiki/%E7%92%B0%E6%B0%A7%E5%90%88%E9%85%B6" target="_blank" rel="noopener">环氧合酶</a>-2在周边组织中数量较少,因此作用微弱</p><h3 id="抗炎治疗"><a href="#抗炎治疗" class="headerlink" title="抗炎治疗"></a>抗炎治疗</h3><p><strong>抗炎治疗</strong> — 尽管抗炎治疗多年来都是肘部肌腱病的主要疗法,但支持性证据仅来自成功个案和极少数研究。在医学界对肌腱病有了科学认识之后,抗炎疗法对肘部肌腱病和其他慢性退行性肌腱病的作用也出现了争议。抗炎治疗包括冰敷、NSAID、离子透入疗法和注射糖皮质激素。对于LET,冰敷联合离心力量及柔韧性训练并未优于单纯离心力量训练[<a href="http://www.uptodate.zd.hggfdd.top/contents/zh-Hans/elbow-tendinopathy-tennis-and-golf-elbow/abstract/51" target="_blank" rel="noopener">51</a>]。(参见上文[‘病理生理学’](<a href="http://www.uptodate.zd.hggfdd.top/contents/zh-Hans/elbow-tendinopathy-tennis-and-golf-elbow?search=Tennis" target="_blank" rel="noopener">http://www.uptodate.zd.hggfdd.top/contents/zh-Hans/elbow-tendinopathy-tennis-and-golf-elbow?search=Tennis</a> Elbow 
冲击波&source=search_result&selectedTitle=3~150&usage_type=default&display_rank=3#H4))</p><h3 id="体外冲击波疗法(extracorporeal-shock-wave-therapy-ESWT)"><a href="#体外冲击波疗法(extracorporeal-shock-wave-therapy-ESWT)" class="headerlink" title="体外冲击波疗法(extracorporeal shock wave therapy, ESWT)"></a>体外冲击波疗法(extracorporeal shock wave therapy, ESWT)</h3><p><strong>体外震波治疗和其他电物理疗法</strong> — 声波已用于治疗慢性LET。总体而言,支持体外震波治疗(extracorporeal shock wave therapy, ESWT)和其他“电物理”疗法的证据并不令人信服,所以我们不予以推荐[<a href="http://www.uptodate.zd.hggfdd.top/contents/zh-Hans/elbow-tendinopathy-tennis-and-golf-elbow/abstract/82" target="_blank" rel="noopener">82</a>]。该操作通常会令患者不适,有些研究显示ESWT有一定益处[<a href="http://www.uptodate.zd.hggfdd.top/contents/zh-Hans/elbow-tendinopathy-tennis-and-golf-elbow/abstract/83" target="_blank" rel="noopener">83</a>],但也有许多研究未发现ESWT有益[<a href="http://www.uptodate.zd.hggfdd.top/contents/zh-Hans/elbow-tendinopathy-tennis-and-golf-elbow/abstract/84,85" target="_blank" rel="noopener">84,85</a>]。</p><p><a href="https://www.sohu.com/a/722027555_100107953" target="_blank" rel="noopener">体外冲击波疗法临床应用中国疼痛学专家共识 2023版</a> </p><h2 id="定义"><a href="#定义" class="headerlink" title="定义"></a>定义</h2><p>“网球肘”(Tennis Elbow)又名肱骨外上髁炎(lateral epicondylitis),以网球运动员发病率高而得名</p><p>广义的网球肘可分为具有不同临床特点的四个类型:</p><ol><li>外侧网球肘:亦称肱骨外上髁炎,即经典的网球肘,主要累及附于肱骨外上髁的桡侧腕短伸肌腱起点。</li><li>内侧网球肘:亦称肱骨内上髁炎或高尔夫球肘,主要累及附于肱骨内上髁的屈肌和旋前圆肌腱起点。</li><li>后侧网球肘:亦称三头肌腱炎。</li><li>混合型网球肘:内外侧网球肘同时发生,并不少见</li></ol><h2 id="读懂医疗发票"><a href="#读懂医疗发票" class="headerlink" title="读懂医疗发票"></a>读懂医疗发票</h2><p>自付二:对有自付比例的药品、检查费收取的自费部分;比如药品:10%或50%;检查费:8%;材料费:30%;</p><p>自付一:根据下图,报销比例是90%,也就是你自己还要出总医药费的10%;但是要注意<strong>总医药费要先减掉自付二的部分</strong></p><p>举例:一张发票开了一盒泰诺(酚麻美敏片) 13.47块(乙类清单 10%自付),还有一盒无自付的头孢 5.17块,共18.64</p><p>最后发票显示自付二:1.35,就是 13.47 × 10% ——这个10%是因为该药有部分自费;自付一:1.73,是 (18.64 - 13.47 × 10%) × 10% ——这个10%就是达到起付线1800后报销90%、自己再出10%</p><p>如果你没有达到起付线,就是100%自付,那么不存在自付二,付款金额全部显示为自付一</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/W020230920365631669567.png" alt="职工基本医疗保险门(急)诊待遇标准"></p><h2 id="痛风"><a href="#痛风" class="headerlink" title="痛风"></a><a href="https://mp.weixin.qq.com/s/hujlWS3Q0xde0z0rDKiPpQ" target="_blank" rel="noopener">痛风</a></h2><p>痛风是由于血中尿酸含量过高(高尿酸血症)而导致尿酸盐结晶沉积在关节内的疾病。沉积的结晶导致关节内和关节周围出现疼痛性炎症的发作。通常具有家族遗传性</p><p>痛风在男性中的发病率高于女性,通常发生在中年男性和绝经期后的女性。很少发生于年轻人,但如果小于 30 岁的人发生痛风,其病情一般较重。</p><p>食物和痛风没关系。如果发作了就吃止痛药。不疼的时候吃非布司他(找医生开,这是处方药)。定期检查尿酸是否有降下来。 然后非常容易被忽视的一点是注意别喝含糖饮料,干脆戒掉。</p><p>而止痛的药物无非是:</p><ul><li>非甾体抗炎药(NSAIDs)</li><li>秋水仙碱</li><li>糖皮质激素</li></ul><p>秋水仙碱或秋水仙素,考虑到其肝肾毒性,我是肯定坚决不吃。至于糖皮质激素,也就是俗称的激素,偶尔用一次或许也可以接受,但不能持续用</p><h2 id="风湿性多肌痛"><a href="#风湿性多肌痛" class="headerlink" title="风湿性多肌痛"></a>风湿性多肌痛</h2><p>风湿性多肌痛是关节滑膜的炎症,是一种能引起颈、肩、髋部肌肉疼痛和僵硬的疾病。</p><p>风湿性多肌痛发病年龄在 55 岁以上,病因尚不清楚,女性的患病率较男性高。风湿性多肌痛可与 <a href="https://www.msdmanuals.cn/home/bone-joint-and-muscle-disorders/vasculitic-disorders/giant-cell-arteritis" target="_blank" rel="noopener">巨细胞(颞)动脉炎</a>同时出现,也可在它之前或之后出现。有学者认为这两种疾病是同一种病变的不同表现。风湿性多肌痛似乎比巨细胞动脉炎更常见。</p><h2 id="类风湿性关节炎-RA"><a href="#类风湿性关节炎-RA" class="headerlink" title="类风湿性关节炎 (RA)"></a>类风湿性关节炎 (RA)</h2><p>类风湿性关节炎是一种炎症性关节炎,表现为关节的炎症,受累关节通常包括手脚关节,可导致关节肿胀、疼痛以及常常遭到破坏。</p>]]></content>
<summary type="html">
<h1 id="网球肘-过劳性-持续-肌腱病的治疗"><a href="#网球肘-过劳性-持续-肌腱病的治疗" class="headerlink" title="网球肘 过劳性(持续)肌腱病的治疗"></a><a href="https://www.haoyishu.com/w
</summary>
<category term="others" scheme="https://plantegg.github.io/categories/others/"/>
<category term="网球肘" scheme="https://plantegg.github.io/tags/%E7%BD%91%E7%90%83%E8%82%98/"/>
<category term="氟比洛芬凝胶" scheme="https://plantegg.github.io/tags/%E6%B0%9F%E6%AF%94%E6%B4%9B%E8%8A%AC%E5%87%9D%E8%83%B6/"/>
<category term="泽普思" scheme="https://plantegg.github.io/tags/%E6%B3%BD%E6%99%AE%E6%80%9D/"/>
</entry>
<entry>
<title>从一道面试题谈起</title>
<link href="https://plantegg.github.io/2024/03/05/%E4%BB%8E%E4%B8%80%E9%81%93%E9%9D%A2%E8%AF%95%E9%A2%98%E8%B0%88%E8%B5%B7/"/>
<id>https://plantegg.github.io/2024/03/05/从一道面试题谈起/</id>
<published>2024-03-05T09:30:03.000Z</published>
<updated>2024-11-20T10:00:53.308Z</updated>
<content type="html"><![CDATA[<h1 id="从一道面试题谈起"><a href="#从一道面试题谈起" class="headerlink" title="从一道面试题谈起"></a>从一道面试题谈起</h1><p>这是一道BAT 的面试题,针对的是应届生,其实我觉得这种题目也适合所有面试人,比刷算法题、八股文要有用、实际多了</p><h2 id="题目"><a href="#题目" class="headerlink" title="题目"></a>题目</h2><p>给你几天时间自己在家可以借助任何资源用测试工具Sysbench 完成一次MySQL数据的性能测试,并编写测试报告(自行搭建数据库)</p><p>sysbench压MySQL常用有只读、读写、只写、update等6个场景</p><h2 id="结果"><a href="#结果" class="headerlink" title="结果"></a>结果</h2><p>这个候选人把他的结果发给我看了,我看完一惊要坏事,这个结果估计要不及格了</p><p>他用 sysbench 跑了一下只读、读写、只写等场景然后截图就没有了!</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20230908223348050.png" alt="image-20230908223348050"></p><p>(如上图,大概就是6/7个这样的截图就没有了!)</p><p>我看到这个结果是很震惊的,你希望面试官挨个去看截图?<strong>最起码要有测试结果表格当做结论汇总吧</strong>。</p><p>如果你不知道怎么做可以先去搜一下别人做的测试报告,你可以按照别人的测试流程完全走一遍,基本算是模仿,要有结论的话也能得60分。</p><h2 id="60分的答案"><a href="#60分的答案" class="headerlink" title="60分的答案"></a>60分的答案</h2><p>每个场景增加1/8/16/32等并发,然后按照6个场景不同并发做成一个表格,并观察rt、cpu的指标最后汇总形成图表、给出结论分析,比如拐点在哪里、为什么</p><p>我觉得这个面试题好就好在这里的分析可以无穷展开,适合新手也适合多年的老手,任何结论理由你都可以写上去,只要有理有据有分析</p><h2 id="80分的答案"><a href="#80分的答案" class="headerlink" title="80分的答案"></a>80分的答案</h2><p>给自己出一个拟题,比如对比5.7和8.0的性能差异,8.0相对5.7在哪些场景有优化、优劣势,比如<a href="http://dimitrik.free.fr/blog/posts/mysql-performance-80-iobound-oltprw-vs-percona57.html" target="_blank" rel="noopener">这个测试报告</a></p><p>比如官方说的8.0在全局锁、pagesize等方面有些有优化,那么就针对性地设置场景来测试这些功能。</p><p>比如这是如上链接测试报告中间有数据图表:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20230908224210461.png" alt="image-20230908224210461"></p><p>最后有结论和分析:</p><ul><li>the main impact in the given <strong>IO-bound</strong> OLTP_RW workload is only DBLWR and nothing else !</li><li>and again, if your workload has more than 32 <em>concurrent</em> users sessions + using a very fast flash storage..</li><li>so far, impatient to see DBLWR fixed in MySQL 8.0 ;-))</li><li>using <strong>4K page size</strong> is absolutely to consider for any IO-bound workloads !</li><li><strong>NOTE</strong> : every Linux vendor today is claiming that 4K IO writes in Linux are <em>atomic</em> ! – and if this is really true for your platform, then you can safely disable DBLWR if you’re using 4K page and already reach <strong>x2 times higher TPS</strong> with MySQL 8.0 today in the given IO-bound OLTP_RW or any similar ! ;-)) – the same x2 times higher TPS was <a href="http://dimitrik.free.fr/blog/posts/mysql-performance-80-ga-iobound-tpcc.html" target="_blank" rel="noopener">also observed on IO-bound TPCC</a> even with an old SSD drive !</li><li>while if your workload is not IO-bound (having active dataset mostly cached in BP, none or very low IO reads) – then DBLWR is not your main impact ! – you may always tune your MySQL instance to make it mostly “invisible”..</li><li><strong>Binlog</strong> – is the main impact in this case.. Unfortunately it’s another old historical PITA in MySQL Server, and it’s largely a time now to get it fixed (or come with a more advanced alternative).. – “nature is always finding its way”, so let’s see..</li><li>no comments on MariaDB 10.3 performance.. 
– but a good live example that just copying InnoDB code from MySQL 5.7 is not enough to get it running right..</li></ul><p>之所以有80分是因为超出了面试官的期待,给出了一个更高级的结论,面试官肯定很愿意约你过去谈谈</p><h2 id="还有没有更高的分"><a href="#还有没有更高的分" class="headerlink" title="还有没有更高的分"></a>还有没有更高的分</h2><p>也许有,但是不好说,80分那个已经很优秀了,挖掘能力强的应届生会搞出来(肯定没有这么细致和周到,但是有几个关键点的结论就够80分了)。再想出彩一点,可以根据我星球的这个案例 <a href="https://plantegg.github.io/2021/05/14/%E5%8D%81%E5%B9%B4%E5%90%8E%E6%95%B0%E6%8D%AE%E5%BA%93%E8%BF%98%E6%98%AF%E4%B8%8D%E6%95%A2%E6%8B%A5%E6%8A%B1NUMA/">https://plantegg.github.io/2021/05/14/%E5%8D%81%E5%B9%B4%E5%90%8E%E6%95%B0%E6%8D%AE%E5%BA%93%E8%BF%98%E6%98%AF%E4%B8%8D%E6%95%A2%E6%8B%A5%E6%8A%B1NUMA/</a> 去搞几台物理机开关NUMA 验证一下,然后给出一份对性能影响的测试数据报告</p><p>或者我博客这篇也行 <a href="https://plantegg.github.io/2019/12/16/Intel%20PAUSE%E6%8C%87%E4%BB%A4%E5%8F%98%E5%8C%96%E6%98%AF%E5%A6%82%E4%BD%95%E5%BD%B1%E5%93%8D%E8%87%AA%E6%97%8B%E9%94%81%E4%BB%A5%E5%8F%8AMySQL%E7%9A%84%E6%80%A7%E8%83%BD%E7%9A%84/%EF%BC%8C%E6%89%BE%E4%B8%8D%E5%90%8CIntel%E6%9C%BA%E5%99%A8%E9%AA%8C%E8%AF%81">https://plantegg.github.io/2019/12/16/Intel%20PAUSE%E6%8C%87%E4%BB%A4%E5%8F%98%E5%8C%96%E6%98%AF%E5%A6%82%E4%BD%95%E5%BD%B1%E5%93%8D%E8%87%AA%E6%97%8B%E9%94%81%E4%BB%A5%E5%8F%8AMySQL%E7%9A%84%E6%80%A7%E8%83%BD%E7%9A%84/,找不同Intel机器验证</a></p><p>给出不同的MySQL参数在不同Intel 芯片下性能的差异报告:</p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20221026153750159.png" alt="image-20221026153750159"></p><p>这种结论抛出去肯定会让面试官惊到,并对你刮目相看,至少说明你能在某个点上钻研很深,到哪里要的都是火车头,而不是普通工程师。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>从一个简单的面试题就可以看出应试人员的主观能动性,最起码你要会抄:先去抄别人的测试报告,验证一遍,再思考清楚每一个数据背后的原因(面试大概率会问)</p><p>但是大部分工程师都想临时抱佛脚,其实面试官可能知道你不懂,但是希望看到给你几天时间后你的深度挖掘和学习能力</p><p>最后,从对一个问题的深挖和总结能力上能看出候选人的天花板,而我们大部分时候都是凑合可以、又不是不能用,逼着自己向前精进一步总是很难的。</p>
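<p>附:对应上面「60分的答案」里说的按“场景 × 并发”扫一遍并汇总成表格,可以参考下面这个 Python 脚本草稿(非原文内容,只是一个最小示意;其中主机、账号、库名、表规模等参数都是假设值,假设已经先用 sysbench 的 prepare 造好了 sbtest 数据,CPU 利用率等指标还需要另外用 top/vmstat 之类观察):</p><pre><code>#!/usr/bin/env python3
# 草稿:按“场景 x 并发”矩阵批量跑 sysbench 1.0 自带的 oltp_* 场景,并汇总关键结果
# 注意:连接参数、表数量、表大小均为假设值,请按自己的环境修改
import itertools
import re
import subprocess

MYSQL = ["--mysql-host=127.0.0.1", "--mysql-port=3306",
         "--mysql-user=sbtest", "--mysql-password=sbtest",
         "--mysql-db=sbtest", "--tables=10", "--table-size=1000000"]
SCENARIOS = ["oltp_read_only", "oltp_read_write", "oltp_write_only",
             "oltp_point_select", "oltp_update_index", "oltp_update_non_index"]
THREADS = [1, 8, 16, 32, 64]

def run_one(scenario, threads):
    cmd = ["sysbench", scenario, *MYSQL,
           "--threads=%d" % threads, "--time=60", "--report-interval=10", "run"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    with open("%s_%d.log" % (scenario, threads), "w") as f:  # 保留原始输出备查
        f.write(out)
    return out

def pick(out, key):
    # 从 sysbench 输出里抓取形如 "key: value" 的第一行,取其值
    m = re.search(key + r"\s*:\s*(.+)", out)
    return m.group(1).strip() if m else "n/a"

if __name__ == "__main__":
    print("scenario,threads,transactions,latency_avg_ms,latency_95th_ms")
    for scenario, threads in itertools.product(SCENARIOS, THREADS):
        out = run_one(scenario, threads)
        print(",".join([scenario, str(threads),
                        pick(out, "transactions"),
                        pick(out, "avg"),
                        pick(out, "95th percentile")]))
</code></pre><p>把打印出来的结果贴进表格,再对着并发数画 TPS/RT 曲线,拐点(RT 开始陡增、TPS 不再上涨的位置)就比较容易讲清楚了。</p>]]></content>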
<summary type="html">
<h1 id="从一道面试题谈起"><a href="#从一道面试题谈起" class="headerlink" title="从一道面试题谈起"></a>从一道面试题谈起</h1><p>这是一道BAT 的面试题,针对的是应届生,其实我觉得这种题目也适合所有面试人,比刷算法题、八
</summary>
<category term="MySQL" scheme="https://plantegg.github.io/categories/MySQL/"/>
<category term="MySQL" scheme="https://plantegg.github.io/tags/MySQL/"/>
<category term="Sysbench" scheme="https://plantegg.github.io/tags/Sysbench/"/>
</entry>
<entry>
<title>必读 成长路径</title>
<link href="https://plantegg.github.io/2024/02/20/%E5%BF%85%E8%AF%BB%20%E6%98%9F%E7%90%83%E6%88%90%E9%95%BF%E8%B7%AF%E5%BE%84/"/>
<id>https://plantegg.github.io/2024/02/20/必读 星球成长路径/</id>
<published>2024-02-20T09:30:03.000Z</published>
<updated>2024-11-20T10:00:52.726Z</updated>
<content type="html"><![CDATA[<h1 id="必读-成长路径"><a href="#必读-成长路径" class="headerlink" title="必读 成长路径"></a>必读 成长路径</h1><p>我的<a href="https://plantegg.github.io/2023/05/10/%E7%A8%8B%E5%BA%8F%E5%91%98%E6%A1%88%E4%BE%8B%E6%98%9F%E7%90%83%E4%BB%8B%E7%BB%8D/">星球介绍</a></p><p>这篇是关于我星球里的内容、目标以及如何达到这个目标的一些概述</p><h2 id="星球目标"><a href="#星球目标" class="headerlink" title="星球目标"></a>星球目标</h2><p>本星球致力深度分析各种程序员领域疑难案例,通过案例带动对基础核心知识的理解,同时强化动手能力</p><p>一年星球没法让大家称为顶尖程序员(我自己都不是),只是希望用我的方法、知识、经验、案例作为你的垫脚石,帮助你快速、早日成为一个基本合格的程序员。</p><h2 id="必会技能"><a href="#必会技能" class="headerlink" title="必会技能"></a>必会技能</h2><p>在星球一年的时间你能学到什么(跟着做一定可以学会的):</p><ul><li>网络入门,抓包分析网络能力,wireshark使用 网络篇章索引:<a href="https://articles.zsxq.com/id_jr1w5wvb8j9f.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_jr1w5wvb8j9f.html</a> </li><li><a href="https://t.zsxq.com/17CmWErZB" target="_blank" rel="noopener">QPS、RT和并发的关系</a>,记住查瓶颈追着 RT跑(哪里RT增加快瓶颈就在哪里)</li><li>IPC是什么和性能的本质</li><li><a href="https://wx.zsxq.com/dweb2/index/topic_detail/411522214118158" target="_blank" rel="noopener">养成</a><a href="https://wx.zsxq.com/dweb2/index/topic_detail/411522214118158" target="_blank" rel="noopener">做会</a><a href="https://wx.zsxq.com/dweb2/index/topic_detail/411522214118158" target="_blank" rel="noopener">而不是</a><a href="https://wx.zsxq.com/dweb2/index/topic_detail/411522214118158" target="_blank" rel="noopener">学会</a><a href="https://wx.zsxq.com/dweb2/index/topic_detail/411522214118158" target="_blank" rel="noopener">的习惯</a></li></ul><h2 id="视频素材"><a href="#视频素材" class="headerlink" title="视频素材"></a>视频素材</h2><p><strong>如果你发现看文章、做实验有些障碍,我特意录制了视频做演示(如果你基础好,看文章就能看懂并把实验做完,其实没必要看视频)</strong>:<a href="https://articles.zsxq.com/id_blqwkgux7i0a.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_blqwkgux7i0a.html</a> </p><p>视频内容目前已经完成了:</p><ul><li>抓包技巧演示</li><li>QPS、并发、RT 的关系</li><li>tcp-rt 展示和在性能定位中的使用</li><li>瓶颈定位分析——追着RT 跑</li><li>单机内瓶颈定位</li><li>认识CPU 和 Cache,以及测试Cache、内存时延</li></ul><p>我在星球内一直强调视频不是高效的学习方法,因为你没有办法仔细思索、随时前后反复看等等,看完视频容易形成学懂了的错觉实际很快就忘了,但是我录完这些视频看大家的反馈我发现视频也有优点那就是:很直观、门槛低等,但是一定要注意一个错觉:以为看视频看懂了。但实际就是看视频看完了忘得比看文章快多了,所以看完视频一定要再去实验一下,实验所需要的素材基本都在星球内有了,代码等我都放在了github上</p><h2 id="挑战技能"><a href="#挑战技能" class="headerlink" title="挑战技能"></a>挑战技能</h2><p>有些技能不好描述,或者说是一些暗知识,我们尽量去讨论这些技能的逻辑,同时对一些特别有效的工具、知识会重点突破,这些恰恰是我希望你们最终能掌握的:</p><ul><li>分析解决问题的能力,在一定的知识的基础上靠谱地去分析</li><li>掌握技能而不是死知识</li><li>掌握核心知识点,核心知识点是指理解了一个点很容易对一个领域有较大的突破,比如IPC对于CPU性能、比如内存墙对计算机组成原理的理解、比如RT 对性能瓶颈的定位等</li></ul><p>知识总是学不完的,况且大多时候我们有了知识也解决不了问题,所以我们更注重能力的训练,比如这个提问:<a href="https://t.zsxq.com/0cfBnpmLw" target="_blank" rel="noopener">https://t.zsxq.com/0cfBnpmLw</a></p><h2 id="节奏安排"><a href="#节奏安排" class="headerlink" title="节奏安排"></a>节奏安排</h2><ul><li>一个月完成这一年唯一的一个必做作业:<a href="https://t.zsxq.com/0cUhJcVNa" target="_blank" rel="noopener">https://t.zsxq.com/0cUhJcVNa</a> 目的体验做会和学会的差别</li><li>一个月QPS、并发、RT的关系:<a href="https://t.zsxq.com/0dCmWErZB" target="_blank" rel="noopener">https://t.zsxq.com/0dCmWErZB</a> 性能、瓶颈定位的最核心理论</li><li>一个月补CPU基础,核心可以从内存墙、IPC、NUMA 入手,星球内都有不错的案例,可以查看 CPU 专栏</li><li>一个月用来实践性能瓶颈定位,比如就用Sysbench + MySQL 来构造:<a href="https://articles.zsxq.com/id_blqwkgux7i0a.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_blqwkgux7i0a.html</a> </li><li>……补充中</li></ul><p>如果你发现这个节奏你跟不上,那么就先去<a href="https://articles.zsxq.com/id_blqwkgux7i0a.html" target="_blank" 
rel="noopener">看视频</a>,然后再按这个节奏来,如果还不行可以再去看视频,如果视频看不懂可以到微信群里讨论或者就视频里的哪个点提问,如果觉得看懂了,但是还是没法独立实验,那可以这个看懂了还是错觉,或者是基础缺的太多了</p><p>请先浏览星球专栏里的必看资源以及学习方法,做到做会而不是看会。另外每个主题后面的留言也很有价值</p><p>本星球大部分理论指导部分请看视频:<a href="https://t.zsxq.com/0dF2WvzEF" target="_blank" rel="noopener">https://t.zsxq.com/0dF2WvzEF</a> (5-10节共90分钟),视频中的理论要和案例结合</p><h2 id="案例选择"><a href="#案例选择" class="headerlink" title="案例选择"></a>案例选择</h2><p>星球选用的案例尽量典型普适性强,代表基础组件基本原理等知识。</p><p>分析手段尽量通用,分析过程一定要逻辑合理每个疑问都能回答清晰。</p><p>搞清楚一个案例基本能横扫一个领域,其次在一个案例后再带3/5个相关小案例可以帮你丰富场景,多角度理解</p><p>基于以上目标一年内选择了如下4个案例:</p><ul><li>TCP传输性能–对应星球有一年唯一的必做实验让大家上手:<a href="https://t.zsxq.com/0dUhJcVNa" target="_blank" rel="noopener">https://t.zsxq.com/0dUhJcVNa</a> <strong>目标:动手</strong></li><li>[历时3年的Nginx卡顿分析](<a href="https://github.com/plantegg/programmer_case/blame/main/performance/Nginx" target="_blank" rel="noopener">https://github.com/plantegg/programmer_case/blame/main/performance/Nginx</a> resueport 导致偶发性卡顿.md)–Nginx的架构本身的设计缺陷带来的卡顿,<strong>修复放来来自TCP传输性能,知识之间的联系</strong></li><li><a href="https://t.zsxq.com/0f00mI5gF" target="_blank" rel="noopener">MySQL有的连接一直慢、有的连接一直快,为什么</a>?目的:<strong>Wireshark分析的巧用,这个方法普适性极强</strong></li><li>同样的QPS,但CPU使用率相差3倍是为什么。<strong>目标:实现对CPU理解的入门</strong></li></ul><p>详细描述请看这里:<a href="https://t.zsxq.com/0cyPswpVB" target="_blank" rel="noopener">https://t.zsxq.com/0cyPswpVB</a></p><h2 id="本星球口头禅"><a href="#本星球口头禅" class="headerlink" title="本星球口头禅"></a>本星球口头禅</h2><p><strong>慢就是快,做会而不是看会,无招胜有招</strong></p><p>慢就是快指的是不要贪多,而是要彻底搞懂一个问题、一个知识点,让这个点成为一个支柱长成体系,贪多往往啥都没有掌握</p><p>做会而不是看会:程序员是工程类(也有科学家,但我们CRUD boy肯定不是),尤其像网络包、CPU流水线都是看不到无法感受,所以建议你去抓包、去做实验体会、触摸到每个包就能够更好地理解,所以星球强调做案例</p><p>无招胜有招:尽量找我普适性强的技能,比如ping ping神功,比如抓包,比如Google搜索,你会反复看到我的案例中使用这些技能</p><h2 id="如何在本星球获得成长的基本步骤"><a href="#如何在本星球获得成长的基本步骤" class="headerlink" title="如何在本星球获得成长的基本步骤"></a>如何在本星球获得成长的基本步骤</h2><p>多和以前的学习方式对比,学了一大堆几个月后全忘了,学了很多不会解决问题,学了很多但要靠反复刷。你不应该继续像以前一样忙忙碌碌但是收获很小</p><ul><li>该买的书买了:<a href="https://t.zsxq.com/0c3P6gpJE" target="_blank" rel="noopener">https://t.zsxq.com/0c3P6gpJE</a></li><li>该做的实验做了:<a href="https://t.zsxq.com/0cUhJcVNa" target="_blank" rel="noopener">https://t.zsxq.com/0cUhJcVNa</a> ,反复试过后,不懂的尽量提问</li><li>该看的视频看过了:<a href="https://articles.zsxq.com/id_blqwkgux7i0a.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_blqwkgux7i0a.html</a> (实验你能独立完成就不用看视频了)</li><li>薅住几个case使劲干,能干多深干多深,看不懂的慢慢想,最好能在工作中找到场景实践一下</li><li>学习方法一定要看</li><li>不要急于求成,贪多不化,尽量单点突破(就是一个点使劲往深里干),彻底学懂一个后你会感受到加速</li><li>体会到动手做和看书的差异,体会到深度学习案例和看书的差异</li><li>不要相信自己看会了,不要相信自己的记忆能力</li><li>为什么你有知识但是没有能力:<a href="https://t.zsxq.com/0cfBnpmLw" target="_blank" rel="noopener">https://t.zsxq.com/0cfBnpmLw</a></li><li>养成记笔记,然后总结输出的习惯</li><li>必看专栏一定要高优先级先看</li></ul><p>最好能有自己的总结输出,比如博客文章,写文章是一次最好的总结,不一定要发出来,也不一定一次写完美了,我经常修改7、8年前的文章,因为随着经验的丰富有了更深入、不同的理解,这时不要写一篇新的,我都是在原来的基础上修改、扩充,这才是体系建设</p><h2 id="成长案例"><a href="#成长案例" class="headerlink" title="成长案例"></a>成长案例</h2><p>这是大学刚毕业几个月的新同学写的博客:<a href="https://yishenggong.com/2023/05/06/why-does-my-network-speed-drop-cn" target="_blank" rel="noopener">https://yishenggong.com/2023/05/06/why-does-my-network-speed-drop-cn/</a> </p><p><a href="https://yishenggong.com/2023/05/22/is-20m-of-rows-still-a-valid-soft-limit-of-mysql-table-in-2023" target="_blank" rel="noopener">https://yishenggong.com/2023/05/22/is-20m-of-rows-still-a-valid-soft-limit-of-mysql-table-in-2023/</a> 你可以比较他加入星球前后的博客文章(20230315 加入星球), 第二篇是英文版上了hacker news前三</p><p>我观察到的学员成长好习惯:</p><ul><li>动手动手,不论事情大小先干起来;</li><li>有自己的节奏,不贪多先把一篇文章、一个知识点薅扎实了</li></ul><h2 id="欢迎在星球里提问"><a 
href="#欢迎在星球里提问" class="headerlink" title="欢迎在星球里提问"></a>欢迎在星球里提问</h2><p>欢迎大家提问,越具体越好</p><p>比如这个问题就很具体、很好: <a href="https://t.zsxq.com/0enzptS47" target="_blank" rel="noopener">https://t.zsxq.com/0enzptS47</a> (千万不要微信上问,回答了也没有价值)</p><p>我自己一个人写写出来的东西难免自嗨,但是如果是你碰到的实际业务问题我觉得就更有代表性一些</p><p>提问肯定尽力要把问题描述具体,好重现,典型的就是之前 aws 流量降速导致MySQL QPS下降,提问的同学做得特别好的就是把这个问题自己反复分析后发现是网络流量被限速了,然后问题就很容易描述和重现,最后一大帮人帮忙分析问题,最后的结果大家都很开心学到了东西。问题在这里:<a href="https://articles.zsxq.com/id_iq5a872u8sux.html" target="_blank" rel="noopener">https://articles.zsxq.com/id_iq5a872u8sux.html</a> </p><p>你要是通过星球里的方法帮你解决了实际问题这是星球的最终目的,我当然最开心,如果你提了一个你工作中的问题大家一起帮你分析、讨论并最终解决了这就是最好的N对1的私教训练——觉得适合你的能力提升</p><p>我有时候绞尽脑汁写了文章然后大家不关心,有时候一个普通问题似乎大家都很嗨,我也喜欢能让你们很嗨的问题(即使我不懂也可以一起讨论)</p><h2 id="专栏介绍"><a href="#专栏介绍" class="headerlink" title="专栏介绍"></a>专栏介绍</h2><p>必看(一定要看的,我尽量控制必看的少)、实战案例(年度计划一定要分享和搞清楚的案例)、动手实验(做会一直是本星球的重要原则)、学习方法(磨刀不误砍柴工),剩下的就是按类别分比较好理解</p><h2 id="其它"><a href="#其它" class="headerlink" title="其它"></a>其它</h2><p>星主自我介绍:<a href="https://t.zsxq.com/0c33AXrCi" target="_blank" rel="noopener">https://t.zsxq.com/0c33AXrCi</a></p><p>或者在推特找我:<a href="https://twitter.com/plantegg" target="_blank" rel="noopener">https://twitter.com/plantegg</a> </p><p>个人博客:<a href="https://plantegg.github.io/2022/01/01/%E4%B8%89%E4%B8%AA%E6%95%85%E4%BA%8B">https://plantegg.github.io/2022/01/01/%E4%B8%89%E4%B8%AA%E6%95%85%E4%BA%8B/</a> </p><p>博客存放在github,图多的文章会慢一些,可以刷新几次。 </p><p>建议大家多用PC版星球( <a href="https://wx.zsxq.com/" target="_blank" rel="noopener">https://wx.zsxq.com</a> ),第一次记住密码后也很方便,主要是打字看图更合适些</p><p>画图工具和素材:<a href="https://t.zsxq.com/0enaoOUBp" target="_blank" rel="noopener">https://t.zsxq.com/0enaoOUBp</a> </p><p>知识星球:<a href="https://t.zsxq.com/0cSFEUh2J" target="_blank" rel="noopener">https://t.zsxq.com/0cSFEUh2J</a> 或者看看星球的介绍:<a href="https://plantegg.github.io/2023/05/10/%E7%A8%8B%E5%BA%8F%E5%91%98%E6%A1%88%E4%BE%8B%E6%98%9F%E7%90%83%E4%BB%8B%E7%BB%8D/">https://plantegg.github.io/2023/05/10/%E7%A8%8B%E5%BA%8F%E5%91%98%E6%A1%88%E4%BE%8B%E6%98%9F%E7%90%83%E4%BB%8B%E7%BB%8D/</a> </p><p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20240324161113874.png" alt="image-20240324161113874"></p><img src="https://cdn.jsdelivr.net/gh/plantegg/plantegg.github.io/images/951413iMgBlog/image-20230407232314969.png" alt="image-20230407232314969" style="zoom:50%;">]]></content>
<summary type="html">
<h1 id="必读-成长路径"><a href="#必读-成长路径" class="headerlink" title="必读 成长路径"></a>必读 成长路径</h1><p>我的<a href="https://plantegg.github.io/2023/05/10/%
</summary>
<category term="others" scheme="https://plantegg.github.io/categories/others/"/>
<category term="星球" scheme="https://plantegg.github.io/tags/%E6%98%9F%E7%90%83/"/>
</entry>
<entry>
<title>保险</title>
<link href="https://plantegg.github.io/2024/01/18/%E4%BF%9D%E9%99%A9/"/>
<id>https://plantegg.github.io/2024/01/18/保险/</id>
<published>2024-01-18T04:30:03.000Z</published>
<updated>2024-10-21T08:58:45.456Z</updated>
<content type="html"><![CDATA[<h2 id="保险"><a href="#保险" class="headerlink" title="保险"></a>保险</h2><h2 id="我的观点"><a href="#我的观点" class="headerlink" title="我的观点"></a>我的观点</h2><ol><li>不推荐任何理财型保险(你简单认为一年保费过万的都不推荐)</li><li>推荐少量消费型的保险(就是那种几乎没人给你推销,一年几百、几千的保费,没事不返还任何钱给你)</li><li>不推荐重疾险,回报率低</li><li>资源有限就优先给家庭主要收入来源的人买保险,很多人一上来给小孩买,中年男人裸奔,这搞错了</li><li>最实惠的保险是相互宝那种,可惜被获利阶层伙同傻逼们干没了</li></ol><h2 id="理由"><a href="#理由" class="headerlink" title="理由"></a>理由</h2><p>基本逻辑:保险是保意外的,你想赚钱就去买房子、股票、基金、做生意(不是说这几年哈)。消费型的保险(比如人身意外伤害险、车险都算)才是保意外,以小博大,当然也是保的小概率。</p><p>任何一个保险扣除运营费用就是返还率,相互宝运营费用10%-8%,大多人没概念,这是极低了,没有营销成本,10%用在理赔的时候调查考证。但是一个理财型的保险20-30% 给一线销售,这就是为什么这些保险人反复、耐心跟你讲要买保险,为你服务,当然这是成本,值不值你自己考虑;这还没完,还有上级、经理、公司的运营工资等,要不保险公司凭什么养那么多领导家属;所以这是保险公司核心收入来源,也必然导致了价格奇高。</p><p>理赔很复杂,没事的时候当然好,真要理赔各种你没想到的事前告知,你连我这几百字都不愿意看,保险公司那条款你就更不愿意看了。所以我推荐意外险,死了就陪那种简单些,越复杂的你越搞不懂。卖保险的人是不会跟你说那么清晰的,实际上他自己都搞不清楚,真到了出险才是真正的考验!</p><h3 id="一家三口,只买一份保险,假设预算一年5000的话,给谁买?"><a href="#一家三口,只买一份保险,假设预算一年5000的话,给谁买?" class="headerlink" title="一家三口,只买一份保险,假设预算一年5000的话,给谁买?"></a>一家三口,只买一份保险,假设预算一年5000的话,给谁买?</h3><p>肯定是给创造家里主要收入来源那人,保险其实是给活人的福利,你给小朋友买,妈妈挂了,他惨不惨?收入一下子也没了,保险能给他生活费、学费?</p><p>如果给妈妈买,你看至少保额还可以供他几年。现在的父母觉得自己有爱、爱娃,当然是给小朋友买,所以我说是错的</p><p>你别拿有钱人人都买来扛哈。</p><h3 id="为什么不推荐重疾险"><a href="#为什么不推荐重疾险" class="headerlink" title="为什么不推荐重疾险"></a>为什么不推荐重疾险</h3><p>重疾险本来是挺好的,出险直接给钱,是医保外的补充,正如我上面所说赔付率太低了,你还不如把保费存起来,赌概率。</p><h3 id="买一年几百的意外险其实是能嫖到一年几万保费的人为你提供服务的"><a href="#买一年几百的意外险其实是能嫖到一年几万保费的人为你提供服务的" class="headerlink" title="买一年几百的意外险其实是能嫖到一年几万保费的人为你提供服务的"></a>买一年几百的意外险其实是能嫖到一年几万保费的人为你提供服务的</h3><p>这个自己想想</p><p>保险大家都需要,都希望有,但是保险行业是最需要革命和精简的,比银行还夸张,所以我不会花太多钱补贴这帮蛀得太厉害的蛀虫</p><h1 id="个人所得税综合汇算"><a href="#个人所得税综合汇算" class="headerlink" title="个人所得税综合汇算"></a>个人所得税综合汇算</h1><p>昨天晚上Review个税到很晚,**终于找不回来70%**,这里确实有需要补税的地方;但是还有一些抵扣给我漏了;</p><p>我这次多算主要有一个<strong>2022年的bug</strong>,这个bug导致当时要退我几万税(系统自动自己算Bug),我大喜<strong>装糊涂</strong>,各种配合税务局提交资料,最后税务局人工Review的时候发现了这个bug,当然钱也不会退我,不过打电话跟我解释了,我也装糊涂就过去了 </p><p>结果今年这Bug确实修复了,但是他娘的修复过头了,导致我多补70%的税,现在我只需要补30%,开心多了,这30% 是预期内的 </p><p>毕竟我从2019年对个税申报太熟悉了,如图是我研究后的一些经验 </p><p><strong>几个省税的点:</strong> </p><p>1)大部分情况下奖金、工资分开计税税更少,<strong>有极小概率合并计税缴税少</strong>(比如工资低奖金高、比如奖金落在盲区的话合,因为利用了工资税扣减数不用除12) </p><p>2) 目前奖金、工资可以合并也可以单独计税,二选一,默认单独计税——不懂就<strong>在个税app上申报的时候两种都试试,哪种缴税少就用哪种</strong> </p><p>3) <strong>股票比年终奖少扣税</strong>,同样100万,股票收入到手比年终奖多了17万(因为股票税没有盲区) </p><p>最后附送一个案例(如图),100万年终奖和100万股票收入的差别,同时100万年终奖跟工资合并计税税更少; 同时如果100万年终奖采用合并计税也比单独计税拿到手要多 </p><p>目前一个人有机会将税率做得比较低,就是把收入分成3份:工资、将近、股票,<strong>算下来几乎可以按综合年入的30%以内缴税(<strong>高管个税有其它优惠我粉丝都是屌丝就不展开了,很多高管缴税可能比你少——比如去成都、天津,现在多了海南</strong>)</strong> </p><p>另外因为2023年8月才出通知提高23年的附加抵扣额度,所以<strong>今年几乎每个人都要退税</strong>,如果没有退就好好Review以下,已经提交了的还<strong>可以重新退回来重新算</strong>;<strong>2022年、2021年算错了的现在还可以申请要回来!</strong> 现在打开个税APP 去Review,如果有找补回来记得给我发红包。如果你今年退税了辛苦评论区说下让我开心开心</p>]]></content>
<summary type="html">
<h2 id="保险"><a href="#保险" class="headerlink" title="保险"></a>保险</h2><h2 id="我的观点"><a href="#我的观点" class="headerlink" title="我的观点"></a>我的观点</h
</summary>
<category term="技巧" scheme="https://plantegg.github.io/categories/%E6%8A%80%E5%B7%A7/"/>
<category term="保险" scheme="https://plantegg.github.io/tags/%E4%BF%9D%E9%99%A9/"/>
</entry>
</feed>