-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build error #2
Comments
Delete Android.bp file on libfdt |
home/Dhina_17/android/lineage/out/soong/build.ninja Android.bperror: kernel/xiaomi/onclite/scripts/dtc-aosp/dtc/Android.bp:15:9: unrecognized property "dist" Then what about this? |
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
[ Upstream commit 551842446ed695641a00782cd118cbb064a416a1 ] ifmsh->csa is an RCU-protected pointer. The writer context in ieee80211_mesh_finish_csa() is already mutually exclusive with wdev->sdata.mtx, but the RCU checker did not know this. Use rcu_dereference_protected() to avoid a warning. fixes the following warning: [ 12.519089] ============================= [ 12.520042] WARNING: suspicious RCU usage [ 12.520652] 5.1.0-rc7-wt+ #16 Tainted: G W [ 12.521409] ----------------------------- [ 12.521972] net/mac80211/mesh.c:1223 suspicious rcu_dereference_check() usage! [ 12.522928] other info that might help us debug this: [ 12.523984] rcu_scheduler_active = 2, debug_locks = 1 [ 12.524855] 5 locks held by kworker/u8:2/152: [ 12.525438] #0: 00000000057be08c ((wq_completion)phy0){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.526607] #1: 0000000059c6b07a ((work_completion)(&sdata->csa_finalize_work)){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.528001] #2: 00000000f184ba7d (&wdev->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x2f/0x90 [ 12.529116] #3: 00000000831a1f54 (&local->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x47/0x90 [ 12.530233] #4: 00000000fd06f988 (&local->chanctx_mtx){+.+.}, at: ieee80211_csa_finalize_work+0x51/0x90 Signed-off-by: Thomas Pedersen <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
[ Upstream commit 6da9f775175e516fc7229ceaa9b54f8f56aa7924 ] When debugging options are turned on, the rcu_read_lock() function might not be inlined. This results in lockdep's print_lock() function printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller. For example: [ 10.579995] ============================= [ 10.584033] WARNING: suspicious RCU usage [ 10.588074] 4.18.0.memcg_v2+ #1 Not tainted [ 10.593162] ----------------------------- [ 10.597203] include/linux/rcupdate.h:281 Illegal context switch in RCU read-side critical section! [ 10.606220] [ 10.606220] other info that might help us debug this: [ 10.606220] [ 10.614280] [ 10.614280] rcu_scheduler_active = 2, debug_locks = 1 [ 10.620853] 3 locks held by systemd/1: [ 10.624632] #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70 [ 10.633232] #1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 [ 10.640954] #2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 These "rcu_read_lock+0x0/0x70" strings are not providing any useful information. This commit therefore forces inlining of the rcu_read_lock() function so that rcu_read_lock()'s caller is instead shown. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
[ Upstream commit 3f167e1921865b379a9becf03828e7202c7b4917 ] ipv4_pdp_add() is called in RCU read-side critical section. So GFP_KERNEL should not be used in the function. This patch make ipv4_pdp_add() to use GFP_ATOMIC instead of GFP_KERNEL. Test commands: gtp-link add gtp1 & gtp-tunnel add gtp1 v1 100 200 1.1.1.1 2.2.2.2 Splat looks like: [ 130.618881] ============================= [ 130.626382] WARNING: suspicious RCU usage [ 130.626994] 5.2.0-rc6+ #50 Not tainted [ 130.627622] ----------------------------- [ 130.628223] ./include/linux/rcupdate.h:266 Illegal context switch in RCU read-side critical section! [ 130.629684] [ 130.629684] other info that might help us debug this: [ 130.629684] [ 130.631022] [ 130.631022] rcu_scheduler_active = 2, debug_locks = 1 [ 130.632136] 4 locks held by gtp-tunnel/1025: [ 130.632925] #0: 000000002b93c8b7 (cb_lock){++++}, at: genl_rcv+0x15/0x40 [ 130.634159] #1: 00000000f17bc999 (genl_mutex){+.+.}, at: genl_rcv_msg+0xfb/0x130 [ 130.635487] #2: 00000000c644ed8e (rtnl_mutex){+.+.}, at: gtp_genl_new_pdp+0x18c/0x1150 [gtp] [ 130.636936] #3: 0000000007a1cde7 (rcu_read_lock){....}, at: gtp_genl_new_pdp+0x187/0x1150 [gtp] [ 130.638348] [ 130.638348] stack backtrace: [ 130.639062] CPU: 1 PID: 1025 Comm: gtp-tunnel Not tainted 5.2.0-rc6+ #50 [ 130.641318] Call Trace: [ 130.641707] dump_stack+0x7c/0xbb [ 130.642252] ___might_sleep+0x2c0/0x3b0 [ 130.642862] kmem_cache_alloc_trace+0x1cd/0x2b0 [ 130.643591] gtp_genl_new_pdp+0x6c5/0x1150 [gtp] [ 130.644371] genl_family_rcv_msg+0x63a/0x1030 [ 130.645074] ? mutex_lock_io_nested+0x1090/0x1090 [ 130.645845] ? genl_unregister_family+0x630/0x630 [ 130.646592] ? debug_show_all_locks+0x2d0/0x2d0 [ 130.647293] ? check_flags.part.40+0x440/0x440 [ 130.648099] genl_rcv_msg+0xa3/0x130 [ ... ] Fixes: 459aa66 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
[ Upstream commit 181fa434d0514e40ebf6e9721f2b72700287b6e2 ] According to the PCI Local Bus specification Revision 3.0, section 6.8.1.3 (Message Control for MSI), endpoints that are Multiple Message Capable as defined by bits [3:1] in the Message Control for MSI can request a number of vectors that is power of two aligned. As specified in section 6.8.1.6 "Message data for MSI", the Multiple Message Enable field (bits [6:4] of the Message Control register) defines the number of low order message data bits the function is permitted to modify to generate its system software allocated vectors. The MSI controller in the Xilinx NWL PCIe controller supports a number of MSI vectors specified through a bitmap and the hwirq number for an MSI, that is the value written in the MSI data TLP is determined by the bitmap allocation. For instance, in a situation where two endpoints sitting on the PCI bus request the following MSI configuration, with the current PCI Xilinx bitmap allocation code (that does not align MSI vector allocation on a power of two boundary): Endpoint #1: Requesting 1 MSI vector - allocated bitmap bits 0 Endpoint #2: Requesting 2 MSI vectors - allocated bitmap bits [1,2] The bitmap value(s) corresponds to the hwirq number that is programmed into the Message Data for MSI field in the endpoint MSI capability and is detected by the root complex to fire the corresponding MSI irqs. The value written in Message Data for MSI field corresponds to the first bit allocated in the bitmap for Multi MSI vectors. The current Xilinx NWL MSI allocation code allows a bitmap allocation that is not a power of two boundaries, so endpoint #2, is allowed to toggle Message Data bit[0] to differentiate between its two vectors (meaning that the MSI data will be respectively 0x0 and 0x1 for the two vectors allocated to endpoint #2). This clearly aliases with the Endpoint #1 vector allocation, resulting in a broken Multi MSI implementation. Update the code to allocate MSI bitmap ranges with a power of two alignment, fixing the bug. Fixes: ab597d3 ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller") Suggested-by: Marc Zyngier <[email protected]> Signed-off-by: Bharat Kumar Gogada <[email protected]> [[email protected]: updated commit log] Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
[ Upstream commit 80e5302e4bc85a6b685b7668c36c6487b5f90e9a ] An impending change to enable HAVE_C_RECORDMCOUNT on powerpc leads to warnings such as the following: # modprobe kprobe_example ftrace-powerpc: Not expected bl: opcode is 3c4c0001 WARNING: CPU: 0 PID: 227 at kernel/trace/ftrace.c:2001 ftrace_bug+0x90/0x318 Modules linked in: CPU: 0 PID: 227 Comm: modprobe Not tainted 5.2.0-rc6-00678-g1c329100b942 #2 NIP: c000000000264318 LR: c00000000025d694 CTR: c000000000f5cd30 REGS: c000000001f2b7b0 TRAP: 0700 Not tainted (5.2.0-rc6-00678-g1c329100b942) MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28228222 XER: 00000000 CFAR: c0000000002642fc IRQMASK: 0 <snip> NIP [c000000000264318] ftrace_bug+0x90/0x318 LR [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 Call Trace: [c000000001f2ba40] [0000000000000004] 0x4 (unreliable) [c000000001f2bad0] [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 [c000000001f2bb90] [c00000000020ff10] load_module+0x25b0/0x30c0 [c000000001f2bd00] [c000000000210cb0] sys_finit_module+0xc0/0x130 [c000000001f2be20] [c00000000000bda4] system_call+0x5c/0x70 Instruction dump: 419e0018 2f83ffff 419e00bc 2f83ffea 409e00cc 4800001c 0fe00000 3c62ff96 39000001 39400000 386386d0 480000c4 <0fe00000> 3ce20003 39000001 3c62ff96 ---[ end trace 4c438d5cebf78381 ]--- ftrace failed to modify [<c0080000012a0008>] 0xc0080000012a0008 actual: 01:00:4c:3c Initializing ftrace call sites ftrace record flags: 2000000 (0) expected tramp: c00000000006af4c Looking at the relocation records in __mcount_loc shows a few spurious entries: RELOCATION RECORDS FOR [__mcount_loc]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_ADDR64 .text.unlikely+0x0000000000000008 0000000000000008 R_PPC64_ADDR64 .text.unlikely+0x0000000000000014 0000000000000010 R_PPC64_ADDR64 .text.unlikely+0x0000000000000060 0000000000000018 R_PPC64_ADDR64 .text.unlikely+0x00000000000000b4 0000000000000020 R_PPC64_ADDR64 .init.text+0x0000000000000008 0000000000000028 R_PPC64_ADDR64 .init.text+0x0000000000000014 The first entry in each section is incorrect. Looking at the relocation records, the spurious entries correspond to the R_PPC64_ENTRY records: RELOCATION RECORDS FOR [.text.unlikely]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_REL64 .TOC.-0x0000000000000008 0000000000000008 R_PPC64_ENTRY *ABS* 0000000000000014 R_PPC64_REL24 _mcount <snip> The problem is that we are not validating the return value from get_mcountsym() in sift_rel_mcount(). With this entry, mcountsym is 0, but Elf_r_sym(relp) also ends up being 0. Fix this by ensuring mcountsym is valid before processing the entry. Signed-off-by: Naveen N. Rao <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Tested-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
commit cf3591ef832915892f2499b7e54b51d4c578b28c upstream. Revert the commit bd293d071ffe65e645b4d8104f9d8fe15ea13862. The proper fix has been made available with commit d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread"). Note that the fix offered by commit bd293d071ffe doesn't really prevent the deadlock from occuring - if we look at the stacktrace reported by Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex - i.e. it has already successfully taken the mutex. Changing the mutex from mutex_lock to mutex_trylock won't help with deadlocks that happen afterwards. PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0" #0 [ffff8813dedfb938] __schedule at ffffffff8173f405 #1 [ffff8813dedfb990] schedule at ffffffff8173fa27 #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186 #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8 #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81 #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio] #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio] #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio] #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778 #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f #13 [ffff8813dedfbec0] kthread at ffffffff810a8428 #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242 Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Fixes: bd293d071ffe ("dm bufio: fix deadlock with loop device") Depends-on: d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread") Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
[ Upstream commit 86968ef21596515958d5f0a40233d02be78ecec0 ] Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/117. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress 3 locks held by fsstress/650: #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50 #1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0 #2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810 CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 __ceph_setxattr+0x2b4/0x810 __vfs_setxattr+0x66/0x80 __vfs_setxattr_noperm+0x59/0xf0 vfs_setxattr+0x81/0xa0 setxattr+0x115/0x230 ? filename_lookup+0xc9/0x140 ? rcu_read_lock_sched_held+0x74/0x80 ? rcu_sync_lockdep_assert+0x2e/0x60 ? __sb_start_write+0x142/0x1a0 ? mnt_want_write+0x20/0x50 path_setxattr+0xba/0xd0 __x64_sys_lsetxattr+0x24/0x30 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7ff23514359a Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
[ Upstream commit af8a85a41734f37b67ba8ce69d56b685bee4ac48 ] Calling ceph_buffer_put() in fill_inode() may result in freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/070. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 3852, name: kworker/0:4 6 locks held by kworker/0:4/3852: #0: 000000004270f6bb ((wq_completion)ceph-msgr){+.+.}, at: process_one_work+0x1b8/0x5f0 #1: 00000000eb420803 ((work_completion)(&(&con->work)->work)){+.+.}, at: process_one_work+0x1b8/0x5f0 #2: 00000000be1c53a4 (&s->s_mutex){+.+.}, at: dispatch+0x288/0x1476 #3: 00000000559cb958 (&mdsc->snap_rwsem){++++}, at: dispatch+0x2eb/0x1476 #4: 000000000d5ebbae (&req->r_fill_mutex){+.+.}, at: dispatch+0x2fc/0x1476 #5: 00000000a83d0514 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: fill_inode.isra.0+0xf8/0xf70 CPU: 0 PID: 3852 Comm: kworker/0:4 Not tainted 5.2.0+ #441 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Workqueue: ceph-msgr ceph_con_workfn Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 fill_inode.isra.0+0xa9b/0xf70 ceph_fill_trace+0x13b/0xc70 ? dispatch+0x2eb/0x1476 dispatch+0x320/0x1476 ? __mutex_unlock_slowpath+0x4d/0x2a0 ceph_con_workfn+0xc97/0x2ec0 ? process_one_work+0x1b8/0x5f0 process_one_work+0x244/0x5f0 worker_thread+0x4d/0x3e0 kthread+0x105/0x140 ? process_one_work+0x5f0/0x5f0 ? kthread_park+0x90/0x90 ret_from_fork+0x3a/0x50 Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 22, 2019
[ Upstream commit fe163e534e5eecdfd7b5920b0dfd24c458ee85d6 ] syzbot reported: BUG: KMSAN: uninit-value in capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700 CPU: 0 PID: 10025 Comm: syz-executor379 Not tainted 4.20.0-rc7+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313 capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700 do_loop_readv_writev fs/read_write.c:703 [inline] do_iter_write+0x83e/0xd80 fs/read_write.c:961 vfs_writev fs/read_write.c:1004 [inline] do_writev+0x397/0x840 fs/read_write.c:1039 __do_sys_writev fs/read_write.c:1112 [inline] __se_sys_writev+0x9b/0xb0 fs/read_write.c:1109 __x64_sys_writev+0x4a/0x70 fs/read_write.c:1109 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 [...] The problem is that capi_write() is reading past the end of the message. Fix it by checking the message's length in the needed places. Reported-and-tested-by: [email protected] Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
[ Upstream commit 551842446ed695641a00782cd118cbb064a416a1 ] ifmsh->csa is an RCU-protected pointer. The writer context in ieee80211_mesh_finish_csa() is already mutually exclusive with wdev->sdata.mtx, but the RCU checker did not know this. Use rcu_dereference_protected() to avoid a warning. fixes the following warning: [ 12.519089] ============================= [ 12.520042] WARNING: suspicious RCU usage [ 12.520652] 5.1.0-rc7-wt+ #16 Tainted: G W [ 12.521409] ----------------------------- [ 12.521972] net/mac80211/mesh.c:1223 suspicious rcu_dereference_check() usage! [ 12.522928] other info that might help us debug this: [ 12.523984] rcu_scheduler_active = 2, debug_locks = 1 [ 12.524855] 5 locks held by kworker/u8:2/152: [ 12.525438] #0: 00000000057be08c ((wq_completion)phy0){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.526607] #1: 0000000059c6b07a ((work_completion)(&sdata->csa_finalize_work)){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.528001] #2: 00000000f184ba7d (&wdev->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x2f/0x90 [ 12.529116] #3: 00000000831a1f54 (&local->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x47/0x90 [ 12.530233] #4: 00000000fd06f988 (&local->chanctx_mtx){+.+.}, at: ieee80211_csa_finalize_work+0x51/0x90 Signed-off-by: Thomas Pedersen <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
[ Upstream commit 6da9f775175e516fc7229ceaa9b54f8f56aa7924 ] When debugging options are turned on, the rcu_read_lock() function might not be inlined. This results in lockdep's print_lock() function printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller. For example: [ 10.579995] ============================= [ 10.584033] WARNING: suspicious RCU usage [ 10.588074] 4.18.0.memcg_v2+ #1 Not tainted [ 10.593162] ----------------------------- [ 10.597203] include/linux/rcupdate.h:281 Illegal context switch in RCU read-side critical section! [ 10.606220] [ 10.606220] other info that might help us debug this: [ 10.606220] [ 10.614280] [ 10.614280] rcu_scheduler_active = 2, debug_locks = 1 [ 10.620853] 3 locks held by systemd/1: [ 10.624632] #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70 [ 10.633232] #1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 [ 10.640954] #2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 These "rcu_read_lock+0x0/0x70" strings are not providing any useful information. This commit therefore forces inlining of the rcu_read_lock() function so that rcu_read_lock()'s caller is instead shown. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
[ Upstream commit 3f167e1921865b379a9becf03828e7202c7b4917 ] ipv4_pdp_add() is called in RCU read-side critical section. So GFP_KERNEL should not be used in the function. This patch make ipv4_pdp_add() to use GFP_ATOMIC instead of GFP_KERNEL. Test commands: gtp-link add gtp1 & gtp-tunnel add gtp1 v1 100 200 1.1.1.1 2.2.2.2 Splat looks like: [ 130.618881] ============================= [ 130.626382] WARNING: suspicious RCU usage [ 130.626994] 5.2.0-rc6+ #50 Not tainted [ 130.627622] ----------------------------- [ 130.628223] ./include/linux/rcupdate.h:266 Illegal context switch in RCU read-side critical section! [ 130.629684] [ 130.629684] other info that might help us debug this: [ 130.629684] [ 130.631022] [ 130.631022] rcu_scheduler_active = 2, debug_locks = 1 [ 130.632136] 4 locks held by gtp-tunnel/1025: [ 130.632925] #0: 000000002b93c8b7 (cb_lock){++++}, at: genl_rcv+0x15/0x40 [ 130.634159] #1: 00000000f17bc999 (genl_mutex){+.+.}, at: genl_rcv_msg+0xfb/0x130 [ 130.635487] #2: 00000000c644ed8e (rtnl_mutex){+.+.}, at: gtp_genl_new_pdp+0x18c/0x1150 [gtp] [ 130.636936] #3: 0000000007a1cde7 (rcu_read_lock){....}, at: gtp_genl_new_pdp+0x187/0x1150 [gtp] [ 130.638348] [ 130.638348] stack backtrace: [ 130.639062] CPU: 1 PID: 1025 Comm: gtp-tunnel Not tainted 5.2.0-rc6+ #50 [ 130.641318] Call Trace: [ 130.641707] dump_stack+0x7c/0xbb [ 130.642252] ___might_sleep+0x2c0/0x3b0 [ 130.642862] kmem_cache_alloc_trace+0x1cd/0x2b0 [ 130.643591] gtp_genl_new_pdp+0x6c5/0x1150 [gtp] [ 130.644371] genl_family_rcv_msg+0x63a/0x1030 [ 130.645074] ? mutex_lock_io_nested+0x1090/0x1090 [ 130.645845] ? genl_unregister_family+0x630/0x630 [ 130.646592] ? debug_show_all_locks+0x2d0/0x2d0 [ 130.647293] ? check_flags.part.40+0x440/0x440 [ 130.648099] genl_rcv_msg+0xa3/0x130 [ ... ] Fixes: 459aa66 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
[ Upstream commit 181fa434d0514e40ebf6e9721f2b72700287b6e2 ] According to the PCI Local Bus specification Revision 3.0, section 6.8.1.3 (Message Control for MSI), endpoints that are Multiple Message Capable as defined by bits [3:1] in the Message Control for MSI can request a number of vectors that is power of two aligned. As specified in section 6.8.1.6 "Message data for MSI", the Multiple Message Enable field (bits [6:4] of the Message Control register) defines the number of low order message data bits the function is permitted to modify to generate its system software allocated vectors. The MSI controller in the Xilinx NWL PCIe controller supports a number of MSI vectors specified through a bitmap and the hwirq number for an MSI, that is the value written in the MSI data TLP is determined by the bitmap allocation. For instance, in a situation where two endpoints sitting on the PCI bus request the following MSI configuration, with the current PCI Xilinx bitmap allocation code (that does not align MSI vector allocation on a power of two boundary): Endpoint #1: Requesting 1 MSI vector - allocated bitmap bits 0 Endpoint #2: Requesting 2 MSI vectors - allocated bitmap bits [1,2] The bitmap value(s) corresponds to the hwirq number that is programmed into the Message Data for MSI field in the endpoint MSI capability and is detected by the root complex to fire the corresponding MSI irqs. The value written in Message Data for MSI field corresponds to the first bit allocated in the bitmap for Multi MSI vectors. The current Xilinx NWL MSI allocation code allows a bitmap allocation that is not a power of two boundaries, so endpoint #2, is allowed to toggle Message Data bit[0] to differentiate between its two vectors (meaning that the MSI data will be respectively 0x0 and 0x1 for the two vectors allocated to endpoint #2). This clearly aliases with the Endpoint #1 vector allocation, resulting in a broken Multi MSI implementation. Update the code to allocate MSI bitmap ranges with a power of two alignment, fixing the bug. Fixes: ab597d3 ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller") Suggested-by: Marc Zyngier <[email protected]> Signed-off-by: Bharat Kumar Gogada <[email protected]> [[email protected]: updated commit log] Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
[ Upstream commit 80e5302e4bc85a6b685b7668c36c6487b5f90e9a ] An impending change to enable HAVE_C_RECORDMCOUNT on powerpc leads to warnings such as the following: # modprobe kprobe_example ftrace-powerpc: Not expected bl: opcode is 3c4c0001 WARNING: CPU: 0 PID: 227 at kernel/trace/ftrace.c:2001 ftrace_bug+0x90/0x318 Modules linked in: CPU: 0 PID: 227 Comm: modprobe Not tainted 5.2.0-rc6-00678-g1c329100b942 #2 NIP: c000000000264318 LR: c00000000025d694 CTR: c000000000f5cd30 REGS: c000000001f2b7b0 TRAP: 0700 Not tainted (5.2.0-rc6-00678-g1c329100b942) MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28228222 XER: 00000000 CFAR: c0000000002642fc IRQMASK: 0 <snip> NIP [c000000000264318] ftrace_bug+0x90/0x318 LR [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 Call Trace: [c000000001f2ba40] [0000000000000004] 0x4 (unreliable) [c000000001f2bad0] [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 [c000000001f2bb90] [c00000000020ff10] load_module+0x25b0/0x30c0 [c000000001f2bd00] [c000000000210cb0] sys_finit_module+0xc0/0x130 [c000000001f2be20] [c00000000000bda4] system_call+0x5c/0x70 Instruction dump: 419e0018 2f83ffff 419e00bc 2f83ffea 409e00cc 4800001c 0fe00000 3c62ff96 39000001 39400000 386386d0 480000c4 <0fe00000> 3ce20003 39000001 3c62ff96 ---[ end trace 4c438d5cebf78381 ]--- ftrace failed to modify [<c0080000012a0008>] 0xc0080000012a0008 actual: 01:00:4c:3c Initializing ftrace call sites ftrace record flags: 2000000 (0) expected tramp: c00000000006af4c Looking at the relocation records in __mcount_loc shows a few spurious entries: RELOCATION RECORDS FOR [__mcount_loc]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_ADDR64 .text.unlikely+0x0000000000000008 0000000000000008 R_PPC64_ADDR64 .text.unlikely+0x0000000000000014 0000000000000010 R_PPC64_ADDR64 .text.unlikely+0x0000000000000060 0000000000000018 R_PPC64_ADDR64 .text.unlikely+0x00000000000000b4 0000000000000020 R_PPC64_ADDR64 .init.text+0x0000000000000008 0000000000000028 R_PPC64_ADDR64 .init.text+0x0000000000000014 The first entry in each section is incorrect. Looking at the relocation records, the spurious entries correspond to the R_PPC64_ENTRY records: RELOCATION RECORDS FOR [.text.unlikely]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_REL64 .TOC.-0x0000000000000008 0000000000000008 R_PPC64_ENTRY *ABS* 0000000000000014 R_PPC64_REL24 _mcount <snip> The problem is that we are not validating the return value from get_mcountsym() in sift_rel_mcount(). With this entry, mcountsym is 0, but Elf_r_sym(relp) also ends up being 0. Fix this by ensuring mcountsym is valid before processing the entry. Signed-off-by: Naveen N. Rao <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Tested-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
commit cf3591ef832915892f2499b7e54b51d4c578b28c upstream. Revert the commit bd293d071ffe65e645b4d8104f9d8fe15ea13862. The proper fix has been made available with commit d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread"). Note that the fix offered by commit bd293d071ffe doesn't really prevent the deadlock from occuring - if we look at the stacktrace reported by Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex - i.e. it has already successfully taken the mutex. Changing the mutex from mutex_lock to mutex_trylock won't help with deadlocks that happen afterwards. PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0" #0 [ffff8813dedfb938] __schedule at ffffffff8173f405 #1 [ffff8813dedfb990] schedule at ffffffff8173fa27 #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186 #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8 #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81 #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio] #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio] #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio] #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778 #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f #13 [ffff8813dedfbec0] kthread at ffffffff810a8428 #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242 Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Fixes: bd293d071ffe ("dm bufio: fix deadlock with loop device") Depends-on: d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread") Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
[ Upstream commit 86968ef21596515958d5f0a40233d02be78ecec0 ] Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/117. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress 3 locks held by fsstress/650: #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50 #1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0 #2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810 CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 __ceph_setxattr+0x2b4/0x810 __vfs_setxattr+0x66/0x80 __vfs_setxattr_noperm+0x59/0xf0 vfs_setxattr+0x81/0xa0 setxattr+0x115/0x230 ? filename_lookup+0xc9/0x140 ? rcu_read_lock_sched_held+0x74/0x80 ? rcu_sync_lockdep_assert+0x2e/0x60 ? __sb_start_write+0x142/0x1a0 ? mnt_want_write+0x20/0x50 path_setxattr+0xba/0xd0 __x64_sys_lsetxattr+0x24/0x30 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7ff23514359a Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
[ Upstream commit af8a85a41734f37b67ba8ce69d56b685bee4ac48 ] Calling ceph_buffer_put() in fill_inode() may result in freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/070. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 3852, name: kworker/0:4 6 locks held by kworker/0:4/3852: #0: 000000004270f6bb ((wq_completion)ceph-msgr){+.+.}, at: process_one_work+0x1b8/0x5f0 #1: 00000000eb420803 ((work_completion)(&(&con->work)->work)){+.+.}, at: process_one_work+0x1b8/0x5f0 #2: 00000000be1c53a4 (&s->s_mutex){+.+.}, at: dispatch+0x288/0x1476 #3: 00000000559cb958 (&mdsc->snap_rwsem){++++}, at: dispatch+0x2eb/0x1476 #4: 000000000d5ebbae (&req->r_fill_mutex){+.+.}, at: dispatch+0x2fc/0x1476 #5: 00000000a83d0514 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: fill_inode.isra.0+0xf8/0xf70 CPU: 0 PID: 3852 Comm: kworker/0:4 Not tainted 5.2.0+ #441 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Workqueue: ceph-msgr ceph_con_workfn Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 fill_inode.isra.0+0xa9b/0xf70 ceph_fill_trace+0x13b/0xc70 ? dispatch+0x2eb/0x1476 dispatch+0x320/0x1476 ? __mutex_unlock_slowpath+0x4d/0x2a0 ceph_con_workfn+0xc97/0x2ec0 ? process_one_work+0x1b8/0x5f0 process_one_work+0x244/0x5f0 worker_thread+0x4d/0x3e0 kthread+0x105/0x140 ? process_one_work+0x5f0/0x5f0 ? kthread_park+0x90/0x90 ret_from_fork+0x3a/0x50 Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Sep 24, 2019
[ Upstream commit fe163e534e5eecdfd7b5920b0dfd24c458ee85d6 ] syzbot reported: BUG: KMSAN: uninit-value in capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700 CPU: 0 PID: 10025 Comm: syz-executor379 Not tainted 4.20.0-rc7+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313 capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700 do_loop_readv_writev fs/read_write.c:703 [inline] do_iter_write+0x83e/0xd80 fs/read_write.c:961 vfs_writev fs/read_write.c:1004 [inline] do_writev+0x397/0x840 fs/read_write.c:1039 __do_sys_writev fs/read_write.c:1112 [inline] __se_sys_writev+0x9b/0xb0 fs/read_write.c:1109 __x64_sys_writev+0x4a/0x70 fs/read_write.c:1109 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 [...] The problem is that capi_write() is reading past the end of the message. Fix it by checking the message's length in the needed places. Reported-and-tested-by: [email protected] Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Oct 6, 2019
[ Upstream commit 79c92ca42b5a3e0ea172ea2ce8df8e125af237da ] When receiving a deauthentication/disassociation frame from a TDLS peer, a station should not disconnect the current AP, but only disable the current TDLS link if it's enabled. Without this change, a TDLS issue can be reproduced by following the steps as below: 1. STA-1 and STA-2 are connected to AP, bidirection traffic is running between STA-1 and STA-2. 2. Set up TDLS link between STA-1 and STA-2, stay for a while, then teardown TDLS link. 3. Repeat step #2 and monitor the connection between STA and AP. During the test, one STA may send a deauthentication/disassociation frame to another, after TDLS teardown, with reason code 6/7, which means: Class 2/3 frame received from nonassociated STA. On receive this frame, the receiver STA will disconnect the current AP and then reconnect. It's not a expected behavior, purpose of this frame should be disabling the TDLS link, not the link with AP. Cc: [email protected] Signed-off-by: Yu Wang <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Oct 6, 2019
commit f8659d68e2bee5b86a1beaf7be42d942e1fc81f4 upstream. Define the working variables to be unsigned long to be compatible with for_each_set_bit and change types as needed. While we are at it remove unused variables from a couple of functions. This was found because of the following KASAN warning: ================================================================== BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70 Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889 CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W 5.3.0-rc2-mm1+ #2 Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] Call Trace: dump_stack+0x9a/0xf0 ? find_first_bit+0x19/0x70 print_address_description+0x6c/0x332 ? find_first_bit+0x19/0x70 ? find_first_bit+0x19/0x70 __kasan_report.cold.6+0x1a/0x3b ? find_first_bit+0x19/0x70 kasan_report+0xe/0x12 find_first_bit+0x19/0x70 pma_get_opa_portstatus+0x5cc/0xa80 [hfi1] ? ret_from_fork+0x3a/0x50 ? pma_get_opa_port_ectrs+0x200/0x200 [hfi1] ? stack_trace_consume_entry+0x80/0x80 hfi1_process_mad+0x39b/0x26c0 [hfi1] ? __lock_acquire+0x65e/0x21b0 ? clear_linkup_counters+0xb0/0xb0 [hfi1] ? check_chain_key+0x1d7/0x2e0 ? lock_downgrade+0x3a0/0x3a0 ? match_held_lock+0x2e/0x250 ib_mad_recv_done+0x698/0x15e0 [ib_core] ? clear_linkup_counters+0xb0/0xb0 [hfi1] ? ib_mad_send_done+0xc80/0xc80 [ib_core] ? mark_held_locks+0x79/0xa0 ? _raw_spin_unlock_irqrestore+0x44/0x60 ? rvt_poll_cq+0x1e1/0x340 [rdmavt] __ib_process_cq+0x97/0x100 [ib_core] ib_cq_poll_work+0x31/0xb0 [ib_core] process_one_work+0x4ee/0xa00 ? pwq_dec_nr_in_flight+0x110/0x110 ? do_raw_spin_lock+0x113/0x1d0 worker_thread+0x57/0x5a0 ? process_one_work+0xa00/0xa00 kthread+0x1bb/0x1e0 ? kthread_create_on_node+0xc0/0xc0 ret_from_fork+0x3a/0x50 The buggy address belongs to the page: page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x17ffffc0000000() raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame: pma_get_opa_portstatus+0x0/0xa80 [hfi1] this frame has 1 object: [32, 36) 'vl_select_mask' Memory state around the buggy address: ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00 ^ ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 ================================================================== Cc: <[email protected]> Fixes: 7724105 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Marciniszyn <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Kaike Wan <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Oct 18, 2019
commit ab5758848039de9a4b249d46e4ab591197eebaf2 upstream. ccw console is created early in start_kernel and used before css is initialized or ccw console subchannel is registered. Until then console subchannel does not have a parent. For that reason assume subchannels with no parent are not pseudo subchannels. This fixes the following kasan finding: BUG: KASAN: global-out-of-bounds in sch_is_pseudo_sch+0x8e/0x98 Read of size 8 at addr 00000000000005e8 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc8-07370-g6ac43dd12538 #2 Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) Call Trace: ([<000000000012cd76>] show_stack+0x14e/0x1e0) [<0000000001f7fb44>] dump_stack+0x1a4/0x1f8 [<00000000007d7afc>] print_address_description+0x64/0x3c8 [<00000000007d75f6>] __kasan_report+0x14e/0x180 [<00000000018a2986>] sch_is_pseudo_sch+0x8e/0x98 [<000000000189b950>] cio_enable_subchannel+0x1d0/0x510 [<00000000018cac7c>] ccw_device_recognition+0x12c/0x188 [<0000000002ceb1a8>] ccw_device_enable_console+0x138/0x340 [<0000000002cf1cbe>] con3215_init+0x25e/0x300 [<0000000002c8770a>] console_init+0x68a/0x9b8 [<0000000002c6a3d6>] start_kernel+0x4fe/0x728 [<0000000000100070>] startup_continue+0x70/0xd0 Cc: [email protected] Reviewed-by: Sebastian Ott <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Oct 18, 2019
[ Upstream commit 0216234c2eed1367a318daeb9f4a97d8217412a0 ] We release wrong pointer on error path in cpu_cache_level__read function, leading to segfault: (gdb) r record ls Starting program: /root/perf/tools/perf/perf record ls ... [ perf record: Woken up 1 times to write data ] double free or corruption (out) Thread 1 "perf" received signal SIGABRT, Aborted. 0x00007ffff7463798 in raise () from /lib64/power9/libc.so.6 (gdb) bt #0 0x00007ffff7463798 in raise () from /lib64/power9/libc.so.6 #1 0x00007ffff7443bac in abort () from /lib64/power9/libc.so.6 #2 0x00007ffff74af8bc in __libc_message () from /lib64/power9/libc.so.6 #3 0x00007ffff74b92b8 in malloc_printerr () from /lib64/power9/libc.so.6 #4 0x00007ffff74bb874 in _int_free () from /lib64/power9/libc.so.6 #5 0x0000000010271260 in __zfree (ptr=0x7fffffffa0b0) at ../../lib/zalloc.. #6 0x0000000010139340 in cpu_cache_level__read (cache=0x7fffffffa090, cac.. #7 0x0000000010143c90 in build_caches (cntp=0x7fffffffa118, size=<optimiz.. ... Releasing the proper pointer. Fixes: 720e98b ("perf tools: Add perf data cache feature") Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected]: # v4.6+ Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Oct 18, 2019
[ Upstream commit 443f2d5ba13d65ccfd879460f77941875159d154 ] Observe a segmentation fault when 'perf stat' is asked to repeat forever with the interval option. Without fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.000211692 3,13,89,82,34,157 cycles 10.000380119 1,53,98,52,22,294 cycles 10.040467280 17,16,79,265 cycles Segmentation fault This problem was only observed when we use forever option aka -r 0 and works with limited repeats. Calling print_counter with ts being set to NULL, is not a correct option when interval is set. Hence avoid print_counter(NULL,..) if interval is set. With fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.019866622 3,15,14,43,08,697 cycles 10.039865756 3,15,16,31,95,261 cycles 10.059950628 1,26,05,47,158 cycles 5.009902655 3,14,52,62,33,932 cycles 10.019880228 3,14,52,22,89,154 cycles 10.030543876 66,90,18,333 cycles 5.009848281 3,14,51,98,25,437 cycles 10.029854402 3,15,14,93,04,918 cycles 5.009834177 3,14,51,95,92,316 cycles Committer notes: Did the 'git bisect' to find the cset introducing the problem to add the Fixes tag below, and at that time the problem reproduced as: (gdb) run stat -r0 -I500 sleep 1 <SNIP> Program received signal SIGSEGV, Segmentation fault. print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); (gdb) bt #0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 #1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938 #2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411 #3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370 #4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429 #5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473 #6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588 (gdb) Mostly the same as just before this patch: Program received signal SIGSEGV, Segmentation fault. 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep); (gdb) bt #0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 #1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670) at util/stat-display.c:1172 #2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656 #3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960 #4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310 #5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362 #6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406 #7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531 (gdb) Fixes: d4f63a4 ("perf stat: Introduce print_counters function") Signed-off-by: Srikar Dronamraju <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: [email protected] # v4.2+ Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Oct 30, 2019
[ Upstream commit 2190168aaea42c31bff7b9a967e7b045f07df095 ] On excessive bit errors for the FCP channel ingress fibre path, the channel notifies us. Previously, we only emitted a kernel message and a trace record. Since performance can become suboptimal with I/O timeouts due to bit errors, we now stop using an FCP device by default on channel notification so multipath on top can timely failover to other paths. A new module parameter zfcp.ber_stop can be used to get zfcp old behavior. User explanation of new kernel message: * Description: * The FCP channel reported that its bit error threshold has been exceeded. * These errors might result from a problem with the physical components * of the local fibre link into the FCP channel. * The problem might be damage or malfunction of the cable or * cable connection between the FCP channel and * the adjacent fabric switch port or the point-to-point peer. * Find details about the errors in the HBA trace for the FCP device. * The zfcp device driver closed down the FCP device * to limit the performance impact from possible I/O command timeouts. * User action: * Check for problems on the local fibre link, ensure that fibre optics are * clean and functional, and all cables are properly plugged. * After the repair action, you can manually recover the FCP device by * writing "0" into its "failed" sysfs attribute. * If recovery through sysfs is not possible, set the CHPID of the device * offline and back online on the service element. Fixes: 1da177e ("Linux-2.6.12-rc2") Cc: <[email protected]> #2.6.30+ Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jens Remus <[email protected]> Reviewed-by: Benjamin Block <[email protected]> Signed-off-by: Steffen Maier <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Oct 30, 2019
commit e4f8e513c3d353c134ad4eef9fd0bba12406c7c8 upstream. A long time ago we fixed a similar deadlock in show_slab_objects() [1]. However, it is apparently due to the commits like 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") and 03afc0e ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back by just reading files in /sys/kernel/slab which will generate a lockdep splat below. Since the "mem_hotplug_lock" here is only to obtain a stable online node mask while racing with NUMA node hotplug, in the worst case, the results may me miscalculated while doing NUMA node hotplug, but they shall be corrected by later reads of the same files. WARNING: possible circular locking dependency detected ------------------------------------------------------ cat/5224 is trying to acquire lock: ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at: show_slab_objects+0x94/0x3a8 but task is already holding lock: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (kn->count#45){++++}: lock_acquire+0x31c/0x360 __kernfs_remove+0x290/0x490 kernfs_remove+0x30/0x44 sysfs_remove_dir+0x70/0x88 kobject_del+0x50/0xb0 sysfs_slab_unlink+0x2c/0x38 shutdown_cache+0xa0/0xf0 kmemcg_cache_shutdown_fn+0x1c/0x34 kmemcg_workfn+0x44/0x64 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #1 (slab_mutex){+.+.}: lock_acquire+0x31c/0x360 __mutex_lock_common+0x16c/0xf78 mutex_lock_nested+0x40/0x50 memcg_create_kmem_cache+0x38/0x16c memcg_kmem_cache_create_func+0x3c/0x70 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #0 (mem_hotplug_lock.rw_sem){++++}: validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc other info that might help us debug this: Chain exists of: mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(kn->count#45); lock(slab_mutex); lock(kn->count#45); lock(mem_hotplug_lock.rw_sem); *** DEADLOCK *** 3 locks held by cat/5224: #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8 #1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0 #2: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 stack backtrace: Call trace: dump_backtrace+0x0/0x248 show_stack+0x20/0x2c dump_stack+0xd0/0x140 print_circular_bug+0x368/0x380 check_noncircular+0x248/0x250 validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc I think it is important to mention that this doesn't expose the show_slab_objects to use-after-free. There is only a single path that might really race here and that is the slab hotplug notifier callback __kmem_cache_shrink (via slab_mem_going_offline_callback) but that path doesn't really destroy kmem_cache_node data structures. [1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html [[email protected]: add comment explaining why we don't need mem_hotplug_lock] Link: http://lkml.kernel.org/r/[email protected] Fixes: 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") Fixes: 03afc0e ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}") Signed-off-by: Qian Cai <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 3, 2019
[ Upstream commit 2190168aaea42c31bff7b9a967e7b045f07df095 ] On excessive bit errors for the FCP channel ingress fibre path, the channel notifies us. Previously, we only emitted a kernel message and a trace record. Since performance can become suboptimal with I/O timeouts due to bit errors, we now stop using an FCP device by default on channel notification so multipath on top can timely failover to other paths. A new module parameter zfcp.ber_stop can be used to get zfcp old behavior. User explanation of new kernel message: * Description: * The FCP channel reported that its bit error threshold has been exceeded. * These errors might result from a problem with the physical components * of the local fibre link into the FCP channel. * The problem might be damage or malfunction of the cable or * cable connection between the FCP channel and * the adjacent fabric switch port or the point-to-point peer. * Find details about the errors in the HBA trace for the FCP device. * The zfcp device driver closed down the FCP device * to limit the performance impact from possible I/O command timeouts. * User action: * Check for problems on the local fibre link, ensure that fibre optics are * clean and functional, and all cables are properly plugged. * After the repair action, you can manually recover the FCP device by * writing "0" into its "failed" sysfs attribute. * If recovery through sysfs is not possible, set the CHPID of the device * offline and back online on the service element. Fixes: 1da177e ("Linux-2.6.12-rc2") Cc: <[email protected]> mylove90#2.6.30+ Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jens Remus <[email protected]> Reviewed-by: Benjamin Block <[email protected]> Signed-off-by: Steffen Maier <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 3, 2019
commit e4f8e513c3d353c134ad4eef9fd0bba12406c7c8 upstream. A long time ago we fixed a similar deadlock in show_slab_objects() [1]. However, it is apparently due to the commits like 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") and 03afc0e ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back by just reading files in /sys/kernel/slab which will generate a lockdep splat below. Since the "mem_hotplug_lock" here is only to obtain a stable online node mask while racing with NUMA node hotplug, in the worst case, the results may me miscalculated while doing NUMA node hotplug, but they shall be corrected by later reads of the same files. WARNING: possible circular locking dependency detected ------------------------------------------------------ cat/5224 is trying to acquire lock: ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at: show_slab_objects+0x94/0x3a8 but task is already holding lock: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> mylove90#2 (kn->count#45){++++}: lock_acquire+0x31c/0x360 __kernfs_remove+0x290/0x490 kernfs_remove+0x30/0x44 sysfs_remove_dir+0x70/0x88 kobject_del+0x50/0xb0 sysfs_slab_unlink+0x2c/0x38 shutdown_cache+0xa0/0xf0 kmemcg_cache_shutdown_fn+0x1c/0x34 kmemcg_workfn+0x44/0x64 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> mylove90#1 (slab_mutex){+.+.}: lock_acquire+0x31c/0x360 __mutex_lock_common+0x16c/0xf78 mutex_lock_nested+0x40/0x50 memcg_create_kmem_cache+0x38/0x16c memcg_kmem_cache_create_func+0x3c/0x70 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #0 (mem_hotplug_lock.rw_sem){++++}: validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc other info that might help us debug this: Chain exists of: mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(kn->count#45); lock(slab_mutex); lock(kn->count#45); lock(mem_hotplug_lock.rw_sem); *** DEADLOCK *** 3 locks held by cat/5224: #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8 mylove90#1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0 mylove90#2: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 stack backtrace: Call trace: dump_backtrace+0x0/0x248 show_stack+0x20/0x2c dump_stack+0xd0/0x140 print_circular_bug+0x368/0x380 check_noncircular+0x248/0x250 validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc I think it is important to mention that this doesn't expose the show_slab_objects to use-after-free. There is only a single path that might really race here and that is the slab hotplug notifier callback __kmem_cache_shrink (via slab_mem_going_offline_callback) but that path doesn't really destroy kmem_cache_node data structures. [1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html [[email protected]: add comment explaining why we don't need mem_hotplug_lock] Link: http://lkml.kernel.org/r/[email protected] Fixes: 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") Fixes: 03afc0e ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}") Signed-off-by: Qian Cai <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Nov 7, 2019
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ] This patch fixes the lock inversion complaint: ============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ #1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm] but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 #1: 000000001efd357b ((work_completion)(&work->work)#3){+.+.}, at: process_one_work+0x476/0xac0 #2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 This is not a bug as there are actually two lock classes here. Link: https://lore.kernel.org/r/[email protected] Fixes: de910bd ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Nov 7, 2019
commit 159d2c7d8106177bd9a986fd005a311fe0d11285 upstream. qdisc_root() use from netem_enqueue() triggers a lockdep warning. __dev_queue_xmit() uses rcu_read_lock_bh() which is not equivalent to rcu_read_lock() + local_bh_disable_bh as far as lockdep is concerned. WARNING: suspicious RCU usage 5.3.0-rc7+ #0 Not tainted ----------------------------- include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syz-executor427/8855: #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214 #1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804 #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline] #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838 stack backtrace: CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357 qdisc_root include/net/sch_generic.h:492 [inline] netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479 __dev_xmit_skb net/core/dev.c:3527 [inline] __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:308 [inline] __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417 dst_output include/net/dst.h:436 [inline] ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413 __do_sys_sendmmsg net/socket.c:2442 [inline] __se_sys_sendmmsg net/socket.c:2439 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 8, 2019
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ] This patch fixes the lock inversion complaint: ============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ mylove90#1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm] but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 mylove90#1: 000000001efd357b ((work_completion)(&work->work)mylove90#3){+.+.}, at: process_one_work+0x476/0xac0 mylove90#2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ mylove90#1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 This is not a bug as there are actually two lock classes here. Link: https://lore.kernel.org/r/[email protected] Fixes: de910bd ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 8, 2019
commit 159d2c7d8106177bd9a986fd005a311fe0d11285 upstream. qdisc_root() use from netem_enqueue() triggers a lockdep warning. __dev_queue_xmit() uses rcu_read_lock_bh() which is not equivalent to rcu_read_lock() + local_bh_disable_bh as far as lockdep is concerned. WARNING: suspicious RCU usage 5.3.0-rc7+ #0 Not tainted ----------------------------- include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syz-executor427/8855: #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214 mylove90#1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804 mylove90#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] mylove90#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline] mylove90#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838 stack backtrace: CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357 qdisc_root include/net/sch_generic.h:492 [inline] netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479 __dev_xmit_skb net/core/dev.c:3527 [inline] __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:308 [inline] __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417 dst_output include/net/dst.h:436 [inline] ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413 __do_sys_sendmmsg net/socket.c:2442 [inline] __se_sys_sendmmsg net/socket.c:2439 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 10, 2019
[ Upstream commit 551842446ed695641a00782cd118cbb064a416a1 ] ifmsh->csa is an RCU-protected pointer. The writer context in ieee80211_mesh_finish_csa() is already mutually exclusive with wdev->sdata.mtx, but the RCU checker did not know this. Use rcu_dereference_protected() to avoid a warning. fixes the following warning: [ 12.519089] ============================= [ 12.520042] WARNING: suspicious RCU usage [ 12.520652] 5.1.0-rc7-wt+ #16 Tainted: G W [ 12.521409] ----------------------------- [ 12.521972] net/mac80211/mesh.c:1223 suspicious rcu_dereference_check() usage! [ 12.522928] other info that might help us debug this: [ 12.523984] rcu_scheduler_active = 2, debug_locks = 1 [ 12.524855] 5 locks held by kworker/u8:2/152: [ 12.525438] #0: 00000000057be08c ((wq_completion)phy0){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.526607] mylove90#1: 0000000059c6b07a ((work_completion)(&sdata->csa_finalize_work)){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.528001] mylove90#2: 00000000f184ba7d (&wdev->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x2f/0x90 [ 12.529116] mylove90#3: 00000000831a1f54 (&local->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x47/0x90 [ 12.530233] mylove90#4: 00000000fd06f988 (&local->chanctx_mtx){+.+.}, at: ieee80211_csa_finalize_work+0x51/0x90 Signed-off-by: Thomas Pedersen <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 10, 2019
[ Upstream commit 6da9f775175e516fc7229ceaa9b54f8f56aa7924 ] When debugging options are turned on, the rcu_read_lock() function might not be inlined. This results in lockdep's print_lock() function printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller. For example: [ 10.579995] ============================= [ 10.584033] WARNING: suspicious RCU usage [ 10.588074] 4.18.0.memcg_v2+ mylove90#1 Not tainted [ 10.593162] ----------------------------- [ 10.597203] include/linux/rcupdate.h:281 Illegal context switch in RCU read-side critical section! [ 10.606220] [ 10.606220] other info that might help us debug this: [ 10.606220] [ 10.614280] [ 10.614280] rcu_scheduler_active = 2, debug_locks = 1 [ 10.620853] 3 locks held by systemd/1: [ 10.624632] #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70 [ 10.633232] mylove90#1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 [ 10.640954] mylove90#2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 These "rcu_read_lock+0x0/0x70" strings are not providing any useful information. This commit therefore forces inlining of the rcu_read_lock() function so that rcu_read_lock()'s caller is instead shown. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 10, 2019
[ Upstream commit 3f167e1921865b379a9becf03828e7202c7b4917 ] ipv4_pdp_add() is called in RCU read-side critical section. So GFP_KERNEL should not be used in the function. This patch make ipv4_pdp_add() to use GFP_ATOMIC instead of GFP_KERNEL. Test commands: gtp-link add gtp1 & gtp-tunnel add gtp1 v1 100 200 1.1.1.1 2.2.2.2 Splat looks like: [ 130.618881] ============================= [ 130.626382] WARNING: suspicious RCU usage [ 130.626994] 5.2.0-rc6+ #50 Not tainted [ 130.627622] ----------------------------- [ 130.628223] ./include/linux/rcupdate.h:266 Illegal context switch in RCU read-side critical section! [ 130.629684] [ 130.629684] other info that might help us debug this: [ 130.629684] [ 130.631022] [ 130.631022] rcu_scheduler_active = 2, debug_locks = 1 [ 130.632136] 4 locks held by gtp-tunnel/1025: [ 130.632925] #0: 000000002b93c8b7 (cb_lock){++++}, at: genl_rcv+0x15/0x40 [ 130.634159] mylove90#1: 00000000f17bc999 (genl_mutex){+.+.}, at: genl_rcv_msg+0xfb/0x130 [ 130.635487] mylove90#2: 00000000c644ed8e (rtnl_mutex){+.+.}, at: gtp_genl_new_pdp+0x18c/0x1150 [gtp] [ 130.636936] mylove90#3: 0000000007a1cde7 (rcu_read_lock){....}, at: gtp_genl_new_pdp+0x187/0x1150 [gtp] [ 130.638348] [ 130.638348] stack backtrace: [ 130.639062] CPU: 1 PID: 1025 Comm: gtp-tunnel Not tainted 5.2.0-rc6+ #50 [ 130.641318] Call Trace: [ 130.641707] dump_stack+0x7c/0xbb [ 130.642252] ___might_sleep+0x2c0/0x3b0 [ 130.642862] kmem_cache_alloc_trace+0x1cd/0x2b0 [ 130.643591] gtp_genl_new_pdp+0x6c5/0x1150 [gtp] [ 130.644371] genl_family_rcv_msg+0x63a/0x1030 [ 130.645074] ? mutex_lock_io_nested+0x1090/0x1090 [ 130.645845] ? genl_unregister_family+0x630/0x630 [ 130.646592] ? debug_show_all_locks+0x2d0/0x2d0 [ 130.647293] ? check_flags.part.40+0x440/0x440 [ 130.648099] genl_rcv_msg+0xa3/0x130 [ ... ] Fixes: 459aa66 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 10, 2019
[ Upstream commit 181fa434d0514e40ebf6e9721f2b72700287b6e2 ] According to the PCI Local Bus specification Revision 3.0, section 6.8.1.3 (Message Control for MSI), endpoints that are Multiple Message Capable as defined by bits [3:1] in the Message Control for MSI can request a number of vectors that is power of two aligned. As specified in section 6.8.1.6 "Message data for MSI", the Multiple Message Enable field (bits [6:4] of the Message Control register) defines the number of low order message data bits the function is permitted to modify to generate its system software allocated vectors. The MSI controller in the Xilinx NWL PCIe controller supports a number of MSI vectors specified through a bitmap and the hwirq number for an MSI, that is the value written in the MSI data TLP is determined by the bitmap allocation. For instance, in a situation where two endpoints sitting on the PCI bus request the following MSI configuration, with the current PCI Xilinx bitmap allocation code (that does not align MSI vector allocation on a power of two boundary): Endpoint mylove90#1: Requesting 1 MSI vector - allocated bitmap bits 0 Endpoint mylove90#2: Requesting 2 MSI vectors - allocated bitmap bits [1,2] The bitmap value(s) corresponds to the hwirq number that is programmed into the Message Data for MSI field in the endpoint MSI capability and is detected by the root complex to fire the corresponding MSI irqs. The value written in Message Data for MSI field corresponds to the first bit allocated in the bitmap for Multi MSI vectors. The current Xilinx NWL MSI allocation code allows a bitmap allocation that is not a power of two boundaries, so endpoint mylove90#2, is allowed to toggle Message Data bit[0] to differentiate between its two vectors (meaning that the MSI data will be respectively 0x0 and 0x1 for the two vectors allocated to endpoint mylove90#2). This clearly aliases with the Endpoint mylove90#1 vector allocation, resulting in a broken Multi MSI implementation. Update the code to allocate MSI bitmap ranges with a power of two alignment, fixing the bug. Fixes: ab597d3 ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller") Suggested-by: Marc Zyngier <[email protected]> Signed-off-by: Bharat Kumar Gogada <[email protected]> [[email protected]: updated commit log] Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 10, 2019
[ Upstream commit 80e5302e4bc85a6b685b7668c36c6487b5f90e9a ] An impending change to enable HAVE_C_RECORDMCOUNT on powerpc leads to warnings such as the following: # modprobe kprobe_example ftrace-powerpc: Not expected bl: opcode is 3c4c0001 WARNING: CPU: 0 PID: 227 at kernel/trace/ftrace.c:2001 ftrace_bug+0x90/0x318 Modules linked in: CPU: 0 PID: 227 Comm: modprobe Not tainted 5.2.0-rc6-00678-g1c329100b942 mylove90#2 NIP: c000000000264318 LR: c00000000025d694 CTR: c000000000f5cd30 REGS: c000000001f2b7b0 TRAP: 0700 Not tainted (5.2.0-rc6-00678-g1c329100b942) MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28228222 XER: 00000000 CFAR: c0000000002642fc IRQMASK: 0 <snip> NIP [c000000000264318] ftrace_bug+0x90/0x318 LR [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 Call Trace: [c000000001f2ba40] [0000000000000004] 0x4 (unreliable) [c000000001f2bad0] [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 [c000000001f2bb90] [c00000000020ff10] load_module+0x25b0/0x30c0 [c000000001f2bd00] [c000000000210cb0] sys_finit_module+0xc0/0x130 [c000000001f2be20] [c00000000000bda4] system_call+0x5c/0x70 Instruction dump: 419e0018 2f83ffff 419e00bc 2f83ffea 409e00cc 4800001c 0fe00000 3c62ff96 39000001 39400000 386386d0 480000c4 <0fe00000> 3ce20003 39000001 3c62ff96 ---[ end trace 4c438d5cebf78381 ]--- ftrace failed to modify [<c0080000012a0008>] 0xc0080000012a0008 actual: 01:00:4c:3c Initializing ftrace call sites ftrace record flags: 2000000 (0) expected tramp: c00000000006af4c Looking at the relocation records in __mcount_loc shows a few spurious entries: RELOCATION RECORDS FOR [__mcount_loc]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_ADDR64 .text.unlikely+0x0000000000000008 0000000000000008 R_PPC64_ADDR64 .text.unlikely+0x0000000000000014 0000000000000010 R_PPC64_ADDR64 .text.unlikely+0x0000000000000060 0000000000000018 R_PPC64_ADDR64 .text.unlikely+0x00000000000000b4 0000000000000020 R_PPC64_ADDR64 .init.text+0x0000000000000008 0000000000000028 R_PPC64_ADDR64 .init.text+0x0000000000000014 The first entry in each section is incorrect. Looking at the relocation records, the spurious entries correspond to the R_PPC64_ENTRY records: RELOCATION RECORDS FOR [.text.unlikely]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_REL64 .TOC.-0x0000000000000008 0000000000000008 R_PPC64_ENTRY *ABS* 0000000000000014 R_PPC64_REL24 _mcount <snip> The problem is that we are not validating the return value from get_mcountsym() in sift_rel_mcount(). With this entry, mcountsym is 0, but Elf_r_sym(relp) also ends up being 0. Fix this by ensuring mcountsym is valid before processing the entry. Signed-off-by: Naveen N. Rao <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Tested-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 10, 2019
[ Upstream commit 551842446ed695641a00782cd118cbb064a416a1 ] ifmsh->csa is an RCU-protected pointer. The writer context in ieee80211_mesh_finish_csa() is already mutually exclusive with wdev->sdata.mtx, but the RCU checker did not know this. Use rcu_dereference_protected() to avoid a warning. fixes the following warning: [ 12.519089] ============================= [ 12.520042] WARNING: suspicious RCU usage [ 12.520652] 5.1.0-rc7-wt+ #16 Tainted: G W [ 12.521409] ----------------------------- [ 12.521972] net/mac80211/mesh.c:1223 suspicious rcu_dereference_check() usage! [ 12.522928] other info that might help us debug this: [ 12.523984] rcu_scheduler_active = 2, debug_locks = 1 [ 12.524855] 5 locks held by kworker/u8:2/152: [ 12.525438] #0: 00000000057be08c ((wq_completion)phy0){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.526607] mylove90#1: 0000000059c6b07a ((work_completion)(&sdata->csa_finalize_work)){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.528001] mylove90#2: 00000000f184ba7d (&wdev->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x2f/0x90 [ 12.529116] mylove90#3: 00000000831a1f54 (&local->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x47/0x90 [ 12.530233] mylove90#4: 00000000fd06f988 (&local->chanctx_mtx){+.+.}, at: ieee80211_csa_finalize_work+0x51/0x90 Signed-off-by: Thomas Pedersen <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 13, 2019
[ Upstream commit 6da9f775175e516fc7229ceaa9b54f8f56aa7924 ] When debugging options are turned on, the rcu_read_lock() function might not be inlined. This results in lockdep's print_lock() function printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller. For example: [ 10.579995] ============================= [ 10.584033] WARNING: suspicious RCU usage [ 10.588074] 4.18.0.memcg_v2+ mylove90#1 Not tainted [ 10.593162] ----------------------------- [ 10.597203] include/linux/rcupdate.h:281 Illegal context switch in RCU read-side critical section! [ 10.606220] [ 10.606220] other info that might help us debug this: [ 10.606220] [ 10.614280] [ 10.614280] rcu_scheduler_active = 2, debug_locks = 1 [ 10.620853] 3 locks held by systemd/1: [ 10.624632] #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70 [ 10.633232] mylove90#1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 [ 10.640954] mylove90#2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 These "rcu_read_lock+0x0/0x70" strings are not providing any useful information. This commit therefore forces inlining of the rcu_read_lock() function so that rcu_read_lock()'s caller is instead shown. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 13, 2019
[ Upstream commit 3f167e1921865b379a9becf03828e7202c7b4917 ] ipv4_pdp_add() is called in RCU read-side critical section. So GFP_KERNEL should not be used in the function. This patch make ipv4_pdp_add() to use GFP_ATOMIC instead of GFP_KERNEL. Test commands: gtp-link add gtp1 & gtp-tunnel add gtp1 v1 100 200 1.1.1.1 2.2.2.2 Splat looks like: [ 130.618881] ============================= [ 130.626382] WARNING: suspicious RCU usage [ 130.626994] 5.2.0-rc6+ #50 Not tainted [ 130.627622] ----------------------------- [ 130.628223] ./include/linux/rcupdate.h:266 Illegal context switch in RCU read-side critical section! [ 130.629684] [ 130.629684] other info that might help us debug this: [ 130.629684] [ 130.631022] [ 130.631022] rcu_scheduler_active = 2, debug_locks = 1 [ 130.632136] 4 locks held by gtp-tunnel/1025: [ 130.632925] #0: 000000002b93c8b7 (cb_lock){++++}, at: genl_rcv+0x15/0x40 [ 130.634159] mylove90#1: 00000000f17bc999 (genl_mutex){+.+.}, at: genl_rcv_msg+0xfb/0x130 [ 130.635487] mylove90#2: 00000000c644ed8e (rtnl_mutex){+.+.}, at: gtp_genl_new_pdp+0x18c/0x1150 [gtp] [ 130.636936] mylove90#3: 0000000007a1cde7 (rcu_read_lock){....}, at: gtp_genl_new_pdp+0x187/0x1150 [gtp] [ 130.638348] [ 130.638348] stack backtrace: [ 130.639062] CPU: 1 PID: 1025 Comm: gtp-tunnel Not tainted 5.2.0-rc6+ #50 [ 130.641318] Call Trace: [ 130.641707] dump_stack+0x7c/0xbb [ 130.642252] ___might_sleep+0x2c0/0x3b0 [ 130.642862] kmem_cache_alloc_trace+0x1cd/0x2b0 [ 130.643591] gtp_genl_new_pdp+0x6c5/0x1150 [gtp] [ 130.644371] genl_family_rcv_msg+0x63a/0x1030 [ 130.645074] ? mutex_lock_io_nested+0x1090/0x1090 [ 130.645845] ? genl_unregister_family+0x630/0x630 [ 130.646592] ? debug_show_all_locks+0x2d0/0x2d0 [ 130.647293] ? check_flags.part.40+0x440/0x440 [ 130.648099] genl_rcv_msg+0xa3/0x130 [ ... ] Fixes: 459aa66 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 13, 2019
[ Upstream commit 181fa434d0514e40ebf6e9721f2b72700287b6e2 ] According to the PCI Local Bus specification Revision 3.0, section 6.8.1.3 (Message Control for MSI), endpoints that are Multiple Message Capable as defined by bits [3:1] in the Message Control for MSI can request a number of vectors that is power of two aligned. As specified in section 6.8.1.6 "Message data for MSI", the Multiple Message Enable field (bits [6:4] of the Message Control register) defines the number of low order message data bits the function is permitted to modify to generate its system software allocated vectors. The MSI controller in the Xilinx NWL PCIe controller supports a number of MSI vectors specified through a bitmap and the hwirq number for an MSI, that is the value written in the MSI data TLP is determined by the bitmap allocation. For instance, in a situation where two endpoints sitting on the PCI bus request the following MSI configuration, with the current PCI Xilinx bitmap allocation code (that does not align MSI vector allocation on a power of two boundary): Endpoint mylove90#1: Requesting 1 MSI vector - allocated bitmap bits 0 Endpoint mylove90#2: Requesting 2 MSI vectors - allocated bitmap bits [1,2] The bitmap value(s) corresponds to the hwirq number that is programmed into the Message Data for MSI field in the endpoint MSI capability and is detected by the root complex to fire the corresponding MSI irqs. The value written in Message Data for MSI field corresponds to the first bit allocated in the bitmap for Multi MSI vectors. The current Xilinx NWL MSI allocation code allows a bitmap allocation that is not a power of two boundaries, so endpoint mylove90#2, is allowed to toggle Message Data bit[0] to differentiate between its two vectors (meaning that the MSI data will be respectively 0x0 and 0x1 for the two vectors allocated to endpoint mylove90#2). This clearly aliases with the Endpoint mylove90#1 vector allocation, resulting in a broken Multi MSI implementation. Update the code to allocate MSI bitmap ranges with a power of two alignment, fixing the bug. Fixes: ab597d3 ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller") Suggested-by: Marc Zyngier <[email protected]> Signed-off-by: Bharat Kumar Gogada <[email protected]> [[email protected]: updated commit log] Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 13, 2019
[ Upstream commit 80e5302e4bc85a6b685b7668c36c6487b5f90e9a ] An impending change to enable HAVE_C_RECORDMCOUNT on powerpc leads to warnings such as the following: # modprobe kprobe_example ftrace-powerpc: Not expected bl: opcode is 3c4c0001 WARNING: CPU: 0 PID: 227 at kernel/trace/ftrace.c:2001 ftrace_bug+0x90/0x318 Modules linked in: CPU: 0 PID: 227 Comm: modprobe Not tainted 5.2.0-rc6-00678-g1c329100b942 mylove90#2 NIP: c000000000264318 LR: c00000000025d694 CTR: c000000000f5cd30 REGS: c000000001f2b7b0 TRAP: 0700 Not tainted (5.2.0-rc6-00678-g1c329100b942) MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28228222 XER: 00000000 CFAR: c0000000002642fc IRQMASK: 0 <snip> NIP [c000000000264318] ftrace_bug+0x90/0x318 LR [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 Call Trace: [c000000001f2ba40] [0000000000000004] 0x4 (unreliable) [c000000001f2bad0] [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 [c000000001f2bb90] [c00000000020ff10] load_module+0x25b0/0x30c0 [c000000001f2bd00] [c000000000210cb0] sys_finit_module+0xc0/0x130 [c000000001f2be20] [c00000000000bda4] system_call+0x5c/0x70 Instruction dump: 419e0018 2f83ffff 419e00bc 2f83ffea 409e00cc 4800001c 0fe00000 3c62ff96 39000001 39400000 386386d0 480000c4 <0fe00000> 3ce20003 39000001 3c62ff96 ---[ end trace 4c438d5cebf78381 ]--- ftrace failed to modify [<c0080000012a0008>] 0xc0080000012a0008 actual: 01:00:4c:3c Initializing ftrace call sites ftrace record flags: 2000000 (0) expected tramp: c00000000006af4c Looking at the relocation records in __mcount_loc shows a few spurious entries: RELOCATION RECORDS FOR [__mcount_loc]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_ADDR64 .text.unlikely+0x0000000000000008 0000000000000008 R_PPC64_ADDR64 .text.unlikely+0x0000000000000014 0000000000000010 R_PPC64_ADDR64 .text.unlikely+0x0000000000000060 0000000000000018 R_PPC64_ADDR64 .text.unlikely+0x00000000000000b4 0000000000000020 R_PPC64_ADDR64 .init.text+0x0000000000000008 0000000000000028 R_PPC64_ADDR64 .init.text+0x0000000000000014 The first entry in each section is incorrect. Looking at the relocation records, the spurious entries correspond to the R_PPC64_ENTRY records: RELOCATION RECORDS FOR [.text.unlikely]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_REL64 .TOC.-0x0000000000000008 0000000000000008 R_PPC64_ENTRY *ABS* 0000000000000014 R_PPC64_REL24 _mcount <snip> The problem is that we are not validating the return value from get_mcountsym() in sift_rel_mcount(). With this entry, mcountsym is 0, but Elf_r_sym(relp) also ends up being 0. Fix this by ensuring mcountsym is valid before processing the entry. Signed-off-by: Naveen N. Rao <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Tested-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 13, 2019
commit cf3591ef832915892f2499b7e54b51d4c578b28c upstream. Revert the commit bd293d071ffe65e645b4d8104f9d8fe15ea13862. The proper fix has been made available with commit d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread"). Note that the fix offered by commit bd293d071ffe doesn't really prevent the deadlock from occuring - if we look at the stacktrace reported by Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex - i.e. it has already successfully taken the mutex. Changing the mutex from mutex_lock to mutex_trylock won't help with deadlocks that happen afterwards. PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0" #0 [ffff8813dedfb938] __schedule at ffffffff8173f405 mylove90#1 [ffff8813dedfb990] schedule at ffffffff8173fa27 mylove90#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec mylove90#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186 mylove90#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f mylove90#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8 #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81 #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio] #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio] #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio] #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778 #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f #13 [ffff8813dedfbec0] kthread at ffffffff810a8428 #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242 Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Fixes: bd293d071ffe ("dm bufio: fix deadlock with loop device") Depends-on: d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread") Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 13, 2019
[ Upstream commit 86968ef21596515958d5f0a40233d02be78ecec0 ] Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/117. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress 3 locks held by fsstress/650: #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50 mylove90#1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0 mylove90#2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810 CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 __ceph_setxattr+0x2b4/0x810 __vfs_setxattr+0x66/0x80 __vfs_setxattr_noperm+0x59/0xf0 vfs_setxattr+0x81/0xa0 setxattr+0x115/0x230 ? filename_lookup+0xc9/0x140 ? rcu_read_lock_sched_held+0x74/0x80 ? rcu_sync_lockdep_assert+0x2e/0x60 ? __sb_start_write+0x142/0x1a0 ? mnt_want_write+0x20/0x50 path_setxattr+0xba/0xd0 __x64_sys_lsetxattr+0x24/0x30 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7ff23514359a Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 13, 2019
[ Upstream commit af8a85a41734f37b67ba8ce69d56b685bee4ac48 ] Calling ceph_buffer_put() in fill_inode() may result in freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/070. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 3852, name: kworker/0:4 6 locks held by kworker/0:4/3852: #0: 000000004270f6bb ((wq_completion)ceph-msgr){+.+.}, at: process_one_work+0x1b8/0x5f0 mylove90#1: 00000000eb420803 ((work_completion)(&(&con->work)->work)){+.+.}, at: process_one_work+0x1b8/0x5f0 mylove90#2: 00000000be1c53a4 (&s->s_mutex){+.+.}, at: dispatch+0x288/0x1476 mylove90#3: 00000000559cb958 (&mdsc->snap_rwsem){++++}, at: dispatch+0x2eb/0x1476 mylove90#4: 000000000d5ebbae (&req->r_fill_mutex){+.+.}, at: dispatch+0x2fc/0x1476 mylove90#5: 00000000a83d0514 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: fill_inode.isra.0+0xf8/0xf70 CPU: 0 PID: 3852 Comm: kworker/0:4 Not tainted 5.2.0+ #441 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Workqueue: ceph-msgr ceph_con_workfn Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 fill_inode.isra.0+0xa9b/0xf70 ceph_fill_trace+0x13b/0xc70 ? dispatch+0x2eb/0x1476 dispatch+0x320/0x1476 ? __mutex_unlock_slowpath+0x4d/0x2a0 ceph_con_workfn+0xc97/0x2ec0 ? process_one_work+0x1b8/0x5f0 process_one_work+0x244/0x5f0 worker_thread+0x4d/0x3e0 kthread+0x105/0x140 ? process_one_work+0x5f0/0x5f0 ? kthread_park+0x90/0x90 ret_from_fork+0x3a/0x50 Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 14, 2019
[ Upstream commit fe163e534e5eecdfd7b5920b0dfd24c458ee85d6 ] syzbot reported: BUG: KMSAN: uninit-value in capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700 CPU: 0 PID: 10025 Comm: syz-executor379 Not tainted 4.20.0-rc7+ mylove90#2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313 capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700 do_loop_readv_writev fs/read_write.c:703 [inline] do_iter_write+0x83e/0xd80 fs/read_write.c:961 vfs_writev fs/read_write.c:1004 [inline] do_writev+0x397/0x840 fs/read_write.c:1039 __do_sys_writev fs/read_write.c:1112 [inline] __se_sys_writev+0x9b/0xb0 fs/read_write.c:1109 __x64_sys_writev+0x4a/0x70 fs/read_write.c:1109 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 [...] The problem is that capi_write() is reading past the end of the message. Fix it by checking the message's length in the needed places. Reported-and-tested-by: [email protected] Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 14, 2019
[ Upstream commit 79c92ca42b5a3e0ea172ea2ce8df8e125af237da ] When receiving a deauthentication/disassociation frame from a TDLS peer, a station should not disconnect the current AP, but only disable the current TDLS link if it's enabled. Without this change, a TDLS issue can be reproduced by following the steps as below: 1. STA-1 and STA-2 are connected to AP, bidirection traffic is running between STA-1 and STA-2. 2. Set up TDLS link between STA-1 and STA-2, stay for a while, then teardown TDLS link. 3. Repeat step mylove90#2 and monitor the connection between STA and AP. During the test, one STA may send a deauthentication/disassociation frame to another, after TDLS teardown, with reason code 6/7, which means: Class 2/3 frame received from nonassociated STA. On receive this frame, the receiver STA will disconnect the current AP and then reconnect. It's not a expected behavior, purpose of this frame should be disabling the TDLS link, not the link with AP. Cc: [email protected] Signed-off-by: Yu Wang <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 14, 2019
commit f8659d68e2bee5b86a1beaf7be42d942e1fc81f4 upstream. Define the working variables to be unsigned long to be compatible with for_each_set_bit and change types as needed. While we are at it remove unused variables from a couple of functions. This was found because of the following KASAN warning: ================================================================== BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70 Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889 CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W 5.3.0-rc2-mm1+ mylove90#2 Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] Call Trace: dump_stack+0x9a/0xf0 ? find_first_bit+0x19/0x70 print_address_description+0x6c/0x332 ? find_first_bit+0x19/0x70 ? find_first_bit+0x19/0x70 __kasan_report.cold.6+0x1a/0x3b ? find_first_bit+0x19/0x70 kasan_report+0xe/0x12 find_first_bit+0x19/0x70 pma_get_opa_portstatus+0x5cc/0xa80 [hfi1] ? ret_from_fork+0x3a/0x50 ? pma_get_opa_port_ectrs+0x200/0x200 [hfi1] ? stack_trace_consume_entry+0x80/0x80 hfi1_process_mad+0x39b/0x26c0 [hfi1] ? __lock_acquire+0x65e/0x21b0 ? clear_linkup_counters+0xb0/0xb0 [hfi1] ? check_chain_key+0x1d7/0x2e0 ? lock_downgrade+0x3a0/0x3a0 ? match_held_lock+0x2e/0x250 ib_mad_recv_done+0x698/0x15e0 [ib_core] ? clear_linkup_counters+0xb0/0xb0 [hfi1] ? ib_mad_send_done+0xc80/0xc80 [ib_core] ? mark_held_locks+0x79/0xa0 ? _raw_spin_unlock_irqrestore+0x44/0x60 ? rvt_poll_cq+0x1e1/0x340 [rdmavt] __ib_process_cq+0x97/0x100 [ib_core] ib_cq_poll_work+0x31/0xb0 [ib_core] process_one_work+0x4ee/0xa00 ? pwq_dec_nr_in_flight+0x110/0x110 ? do_raw_spin_lock+0x113/0x1d0 worker_thread+0x57/0x5a0 ? process_one_work+0xa00/0xa00 kthread+0x1bb/0x1e0 ? kthread_create_on_node+0xc0/0xc0 ret_from_fork+0x3a/0x50 The buggy address belongs to the page: page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x17ffffc0000000() raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame: pma_get_opa_portstatus+0x0/0xa80 [hfi1] this frame has 1 object: [32, 36) 'vl_select_mask' Memory state around the buggy address: ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00 ^ ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 ================================================================== Cc: <[email protected]> Fixes: 7724105 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Marciniszyn <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Kaike Wan <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 16, 2019
commit ab5758848039de9a4b249d46e4ab591197eebaf2 upstream. ccw console is created early in start_kernel and used before css is initialized or ccw console subchannel is registered. Until then console subchannel does not have a parent. For that reason assume subchannels with no parent are not pseudo subchannels. This fixes the following kasan finding: BUG: KASAN: global-out-of-bounds in sch_is_pseudo_sch+0x8e/0x98 Read of size 8 at addr 00000000000005e8 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc8-07370-g6ac43dd12538 mylove90#2 Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) Call Trace: ([<000000000012cd76>] show_stack+0x14e/0x1e0) [<0000000001f7fb44>] dump_stack+0x1a4/0x1f8 [<00000000007d7afc>] print_address_description+0x64/0x3c8 [<00000000007d75f6>] __kasan_report+0x14e/0x180 [<00000000018a2986>] sch_is_pseudo_sch+0x8e/0x98 [<000000000189b950>] cio_enable_subchannel+0x1d0/0x510 [<00000000018cac7c>] ccw_device_recognition+0x12c/0x188 [<0000000002ceb1a8>] ccw_device_enable_console+0x138/0x340 [<0000000002cf1cbe>] con3215_init+0x25e/0x300 [<0000000002c8770a>] console_init+0x68a/0x9b8 [<0000000002c6a3d6>] start_kernel+0x4fe/0x728 [<0000000000100070>] startup_continue+0x70/0xd0 Cc: [email protected] Reviewed-by: Sebastian Ott <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 16, 2019
[ Upstream commit 0216234c2eed1367a318daeb9f4a97d8217412a0 ] We release wrong pointer on error path in cpu_cache_level__read function, leading to segfault: (gdb) r record ls Starting program: /root/perf/tools/perf/perf record ls ... [ perf record: Woken up 1 times to write data ] double free or corruption (out) Thread 1 "perf" received signal SIGABRT, Aborted. 0x00007ffff7463798 in raise () from /lib64/power9/libc.so.6 (gdb) bt #0 0x00007ffff7463798 in raise () from /lib64/power9/libc.so.6 mylove90#1 0x00007ffff7443bac in abort () from /lib64/power9/libc.so.6 mylove90#2 0x00007ffff74af8bc in __libc_message () from /lib64/power9/libc.so.6 mylove90#3 0x00007ffff74b92b8 in malloc_printerr () from /lib64/power9/libc.so.6 mylove90#4 0x00007ffff74bb874 in _int_free () from /lib64/power9/libc.so.6 mylove90#5 0x0000000010271260 in __zfree (ptr=0x7fffffffa0b0) at ../../lib/zalloc.. #6 0x0000000010139340 in cpu_cache_level__read (cache=0x7fffffffa090, cac.. #7 0x0000000010143c90 in build_caches (cntp=0x7fffffffa118, size=<optimiz.. ... Releasing the proper pointer. Fixes: 720e98b ("perf tools: Add perf data cache feature") Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected]: # v4.6+ Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 16, 2019
[ Upstream commit 443f2d5ba13d65ccfd879460f77941875159d154 ] Observe a segmentation fault when 'perf stat' is asked to repeat forever with the interval option. Without fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.000211692 3,13,89,82,34,157 cycles 10.000380119 1,53,98,52,22,294 cycles 10.040467280 17,16,79,265 cycles Segmentation fault This problem was only observed when we use forever option aka -r 0 and works with limited repeats. Calling print_counter with ts being set to NULL, is not a correct option when interval is set. Hence avoid print_counter(NULL,..) if interval is set. With fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.019866622 3,15,14,43,08,697 cycles 10.039865756 3,15,16,31,95,261 cycles 10.059950628 1,26,05,47,158 cycles 5.009902655 3,14,52,62,33,932 cycles 10.019880228 3,14,52,22,89,154 cycles 10.030543876 66,90,18,333 cycles 5.009848281 3,14,51,98,25,437 cycles 10.029854402 3,15,14,93,04,918 cycles 5.009834177 3,14,51,95,92,316 cycles Committer notes: Did the 'git bisect' to find the cset introducing the problem to add the Fixes tag below, and at that time the problem reproduced as: (gdb) run stat -r0 -I500 sleep 1 <SNIP> Program received signal SIGSEGV, Segmentation fault. print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); (gdb) bt #0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 mylove90#1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938 mylove90#2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411 mylove90#3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370 mylove90#4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429 mylove90#5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473 #6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588 (gdb) Mostly the same as just before this patch: Program received signal SIGSEGV, Segmentation fault. 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep); (gdb) bt #0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 mylove90#1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670) at util/stat-display.c:1172 mylove90#2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656 mylove90#3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960 mylove90#4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310 mylove90#5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362 #6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406 #7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531 (gdb) Fixes: d4f63a4 ("perf stat: Introduce print_counters function") Signed-off-by: Srikar Dronamraju <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: [email protected] # v4.2+ Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 16, 2019
[ Upstream commit 2190168aaea42c31bff7b9a967e7b045f07df095 ] On excessive bit errors for the FCP channel ingress fibre path, the channel notifies us. Previously, we only emitted a kernel message and a trace record. Since performance can become suboptimal with I/O timeouts due to bit errors, we now stop using an FCP device by default on channel notification so multipath on top can timely failover to other paths. A new module parameter zfcp.ber_stop can be used to get zfcp old behavior. User explanation of new kernel message: * Description: * The FCP channel reported that its bit error threshold has been exceeded. * These errors might result from a problem with the physical components * of the local fibre link into the FCP channel. * The problem might be damage or malfunction of the cable or * cable connection between the FCP channel and * the adjacent fabric switch port or the point-to-point peer. * Find details about the errors in the HBA trace for the FCP device. * The zfcp device driver closed down the FCP device * to limit the performance impact from possible I/O command timeouts. * User action: * Check for problems on the local fibre link, ensure that fibre optics are * clean and functional, and all cables are properly plugged. * After the repair action, you can manually recover the FCP device by * writing "0" into its "failed" sysfs attribute. * If recovery through sysfs is not possible, set the CHPID of the device * offline and back online on the service element. Fixes: 1da177e ("Linux-2.6.12-rc2") Cc: <[email protected]> mylove90#2.6.30+ Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jens Remus <[email protected]> Reviewed-by: Benjamin Block <[email protected]> Signed-off-by: Steffen Maier <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 16, 2019
commit e4f8e513c3d353c134ad4eef9fd0bba12406c7c8 upstream. A long time ago we fixed a similar deadlock in show_slab_objects() [1]. However, it is apparently due to the commits like 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") and 03afc0e ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back by just reading files in /sys/kernel/slab which will generate a lockdep splat below. Since the "mem_hotplug_lock" here is only to obtain a stable online node mask while racing with NUMA node hotplug, in the worst case, the results may me miscalculated while doing NUMA node hotplug, but they shall be corrected by later reads of the same files. WARNING: possible circular locking dependency detected ------------------------------------------------------ cat/5224 is trying to acquire lock: ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at: show_slab_objects+0x94/0x3a8 but task is already holding lock: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> mylove90#2 (kn->count#45){++++}: lock_acquire+0x31c/0x360 __kernfs_remove+0x290/0x490 kernfs_remove+0x30/0x44 sysfs_remove_dir+0x70/0x88 kobject_del+0x50/0xb0 sysfs_slab_unlink+0x2c/0x38 shutdown_cache+0xa0/0xf0 kmemcg_cache_shutdown_fn+0x1c/0x34 kmemcg_workfn+0x44/0x64 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> mylove90#1 (slab_mutex){+.+.}: lock_acquire+0x31c/0x360 __mutex_lock_common+0x16c/0xf78 mutex_lock_nested+0x40/0x50 memcg_create_kmem_cache+0x38/0x16c memcg_kmem_cache_create_func+0x3c/0x70 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #0 (mem_hotplug_lock.rw_sem){++++}: validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc other info that might help us debug this: Chain exists of: mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(kn->count#45); lock(slab_mutex); lock(kn->count#45); lock(mem_hotplug_lock.rw_sem); *** DEADLOCK *** 3 locks held by cat/5224: #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8 mylove90#1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0 mylove90#2: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 stack backtrace: Call trace: dump_backtrace+0x0/0x248 show_stack+0x20/0x2c dump_stack+0xd0/0x140 print_circular_bug+0x368/0x380 check_noncircular+0x248/0x250 validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc I think it is important to mention that this doesn't expose the show_slab_objects to use-after-free. There is only a single path that might really race here and that is the slab hotplug notifier callback __kmem_cache_shrink (via slab_mem_going_offline_callback) but that path doesn't really destroy kmem_cache_node data structures. [1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html [[email protected]: add comment explaining why we don't need mem_hotplug_lock] Link: http://lkml.kernel.org/r/[email protected] Fixes: 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") Fixes: 03afc0e ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}") Signed-off-by: Qian Cai <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 16, 2019
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ] This patch fixes the lock inversion complaint: ============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ mylove90#1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm] but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 mylove90#1: 000000001efd357b ((work_completion)(&work->work)mylove90#3){+.+.}, at: process_one_work+0x476/0xac0 mylove90#2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ mylove90#1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 This is not a bug as there are actually two lock classes here. Link: https://lore.kernel.org/r/[email protected] Fixes: de910bd ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
AbhinandAK350
referenced
this issue
in AbhinandAK350/neon_kernel_onclite
Nov 16, 2019
commit 159d2c7d8106177bd9a986fd005a311fe0d11285 upstream. qdisc_root() use from netem_enqueue() triggers a lockdep warning. __dev_queue_xmit() uses rcu_read_lock_bh() which is not equivalent to rcu_read_lock() + local_bh_disable_bh as far as lockdep is concerned. WARNING: suspicious RCU usage 5.3.0-rc7+ #0 Not tainted ----------------------------- include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syz-executor427/8855: #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214 mylove90#1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804 mylove90#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] mylove90#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline] mylove90#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838 stack backtrace: CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357 qdisc_root include/net/sch_generic.h:492 [inline] netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479 __dev_xmit_skb net/core/dev.c:3527 [inline] __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:308 [inline] __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417 dst_output include/net/dst.h:436 [inline] ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413 __do_sys_sendmmsg net/socket.c:2442 [inline] __se_sys_sendmmsg net/socket.c:2439 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Nov 18, 2019
[ Upstream commit 6da9f775175e516fc7229ceaa9b54f8f56aa7924 ] When debugging options are turned on, the rcu_read_lock() function might not be inlined. This results in lockdep's print_lock() function printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller. For example: [ 10.579995] ============================= [ 10.584033] WARNING: suspicious RCU usage [ 10.588074] 4.18.0.memcg_v2+ #1 Not tainted [ 10.593162] ----------------------------- [ 10.597203] include/linux/rcupdate.h:281 Illegal context switch in RCU read-side critical section! [ 10.606220] [ 10.606220] other info that might help us debug this: [ 10.606220] [ 10.614280] [ 10.614280] rcu_scheduler_active = 2, debug_locks = 1 [ 10.620853] 3 locks held by systemd/1: [ 10.624632] #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70 [ 10.633232] #1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 [ 10.640954] #2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 These "rcu_read_lock+0x0/0x70" strings are not providing any useful information. This commit therefore forces inlining of the rcu_read_lock() function so that rcu_read_lock()'s caller is instead shown. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Nov 18, 2019
[ Upstream commit 3f167e1921865b379a9becf03828e7202c7b4917 ] ipv4_pdp_add() is called in RCU read-side critical section. So GFP_KERNEL should not be used in the function. This patch make ipv4_pdp_add() to use GFP_ATOMIC instead of GFP_KERNEL. Test commands: gtp-link add gtp1 & gtp-tunnel add gtp1 v1 100 200 1.1.1.1 2.2.2.2 Splat looks like: [ 130.618881] ============================= [ 130.626382] WARNING: suspicious RCU usage [ 130.626994] 5.2.0-rc6+ #50 Not tainted [ 130.627622] ----------------------------- [ 130.628223] ./include/linux/rcupdate.h:266 Illegal context switch in RCU read-side critical section! [ 130.629684] [ 130.629684] other info that might help us debug this: [ 130.629684] [ 130.631022] [ 130.631022] rcu_scheduler_active = 2, debug_locks = 1 [ 130.632136] 4 locks held by gtp-tunnel/1025: [ 130.632925] #0: 000000002b93c8b7 (cb_lock){++++}, at: genl_rcv+0x15/0x40 [ 130.634159] #1: 00000000f17bc999 (genl_mutex){+.+.}, at: genl_rcv_msg+0xfb/0x130 [ 130.635487] #2: 00000000c644ed8e (rtnl_mutex){+.+.}, at: gtp_genl_new_pdp+0x18c/0x1150 [gtp] [ 130.636936] #3: 0000000007a1cde7 (rcu_read_lock){....}, at: gtp_genl_new_pdp+0x187/0x1150 [gtp] [ 130.638348] [ 130.638348] stack backtrace: [ 130.639062] CPU: 1 PID: 1025 Comm: gtp-tunnel Not tainted 5.2.0-rc6+ #50 [ 130.641318] Call Trace: [ 130.641707] dump_stack+0x7c/0xbb [ 130.642252] ___might_sleep+0x2c0/0x3b0 [ 130.642862] kmem_cache_alloc_trace+0x1cd/0x2b0 [ 130.643591] gtp_genl_new_pdp+0x6c5/0x1150 [gtp] [ 130.644371] genl_family_rcv_msg+0x63a/0x1030 [ 130.645074] ? mutex_lock_io_nested+0x1090/0x1090 [ 130.645845] ? genl_unregister_family+0x630/0x630 [ 130.646592] ? debug_show_all_locks+0x2d0/0x2d0 [ 130.647293] ? check_flags.part.40+0x440/0x440 [ 130.648099] genl_rcv_msg+0xa3/0x130 [ ... ] Fixes: 459aa66 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Nov 18, 2019
[ Upstream commit 181fa434d0514e40ebf6e9721f2b72700287b6e2 ] According to the PCI Local Bus specification Revision 3.0, section 6.8.1.3 (Message Control for MSI), endpoints that are Multiple Message Capable as defined by bits [3:1] in the Message Control for MSI can request a number of vectors that is power of two aligned. As specified in section 6.8.1.6 "Message data for MSI", the Multiple Message Enable field (bits [6:4] of the Message Control register) defines the number of low order message data bits the function is permitted to modify to generate its system software allocated vectors. The MSI controller in the Xilinx NWL PCIe controller supports a number of MSI vectors specified through a bitmap and the hwirq number for an MSI, that is the value written in the MSI data TLP is determined by the bitmap allocation. For instance, in a situation where two endpoints sitting on the PCI bus request the following MSI configuration, with the current PCI Xilinx bitmap allocation code (that does not align MSI vector allocation on a power of two boundary): Endpoint #1: Requesting 1 MSI vector - allocated bitmap bits 0 Endpoint #2: Requesting 2 MSI vectors - allocated bitmap bits [1,2] The bitmap value(s) corresponds to the hwirq number that is programmed into the Message Data for MSI field in the endpoint MSI capability and is detected by the root complex to fire the corresponding MSI irqs. The value written in Message Data for MSI field corresponds to the first bit allocated in the bitmap for Multi MSI vectors. The current Xilinx NWL MSI allocation code allows a bitmap allocation that is not a power of two boundaries, so endpoint #2, is allowed to toggle Message Data bit[0] to differentiate between its two vectors (meaning that the MSI data will be respectively 0x0 and 0x1 for the two vectors allocated to endpoint #2). This clearly aliases with the Endpoint #1 vector allocation, resulting in a broken Multi MSI implementation. Update the code to allocate MSI bitmap ranges with a power of two alignment, fixing the bug. Fixes: ab597d3 ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller") Suggested-by: Marc Zyngier <[email protected]> Signed-off-by: Bharat Kumar Gogada <[email protected]> [[email protected]: updated commit log] Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mylove90
referenced
this issue
in mylove90/android_kernel_xiaomi_onc
Nov 18, 2019
[ Upstream commit 80e5302e4bc85a6b685b7668c36c6487b5f90e9a ] An impending change to enable HAVE_C_RECORDMCOUNT on powerpc leads to warnings such as the following: # modprobe kprobe_example ftrace-powerpc: Not expected bl: opcode is 3c4c0001 WARNING: CPU: 0 PID: 227 at kernel/trace/ftrace.c:2001 ftrace_bug+0x90/0x318 Modules linked in: CPU: 0 PID: 227 Comm: modprobe Not tainted 5.2.0-rc6-00678-g1c329100b942 #2 NIP: c000000000264318 LR: c00000000025d694 CTR: c000000000f5cd30 REGS: c000000001f2b7b0 TRAP: 0700 Not tainted (5.2.0-rc6-00678-g1c329100b942) MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28228222 XER: 00000000 CFAR: c0000000002642fc IRQMASK: 0 <snip> NIP [c000000000264318] ftrace_bug+0x90/0x318 LR [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 Call Trace: [c000000001f2ba40] [0000000000000004] 0x4 (unreliable) [c000000001f2bad0] [c00000000025d694] ftrace_process_locs+0x4f4/0x5e0 [c000000001f2bb90] [c00000000020ff10] load_module+0x25b0/0x30c0 [c000000001f2bd00] [c000000000210cb0] sys_finit_module+0xc0/0x130 [c000000001f2be20] [c00000000000bda4] system_call+0x5c/0x70 Instruction dump: 419e0018 2f83ffff 419e00bc 2f83ffea 409e00cc 4800001c 0fe00000 3c62ff96 39000001 39400000 386386d0 480000c4 <0fe00000> 3ce20003 39000001 3c62ff96 ---[ end trace 4c438d5cebf78381 ]--- ftrace failed to modify [<c0080000012a0008>] 0xc0080000012a0008 actual: 01:00:4c:3c Initializing ftrace call sites ftrace record flags: 2000000 (0) expected tramp: c00000000006af4c Looking at the relocation records in __mcount_loc shows a few spurious entries: RELOCATION RECORDS FOR [__mcount_loc]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_ADDR64 .text.unlikely+0x0000000000000008 0000000000000008 R_PPC64_ADDR64 .text.unlikely+0x0000000000000014 0000000000000010 R_PPC64_ADDR64 .text.unlikely+0x0000000000000060 0000000000000018 R_PPC64_ADDR64 .text.unlikely+0x00000000000000b4 0000000000000020 R_PPC64_ADDR64 .init.text+0x0000000000000008 0000000000000028 R_PPC64_ADDR64 .init.text+0x0000000000000014 The first entry in each section is incorrect. Looking at the relocation records, the spurious entries correspond to the R_PPC64_ENTRY records: RELOCATION RECORDS FOR [.text.unlikely]: OFFSET TYPE VALUE 0000000000000000 R_PPC64_REL64 .TOC.-0x0000000000000008 0000000000000008 R_PPC64_ENTRY *ABS* 0000000000000014 R_PPC64_REL24 _mcount <snip> The problem is that we are not validating the return value from get_mcountsym() in sift_rel_mcount(). With this entry, mcountsym is 0, but Elf_r_sym(relp) also ends up being 0. Fix this by ensuring mcountsym is valid before processing the entry. Signed-off-by: Naveen N. Rao <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Tested-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I got this error while building this kernel
Error:
[1/1] /home/Dhina_17/android/line.../out/soong/.bootstrap/build.ninja[55/56] glob prebuilts/ndk/cpufeatures.bp [80/80] /home/Dhina_17/android/li...oid/lineage/out/soong/build.ninjaFAILED: /home/Dhina_17/android/lineage/out/soong/build.ninja
/home/Dhina_17/android/lineage/out/soong/.bootstrap/bin/soong_build -t -l /home/Dhina_17/android/lineage/out/.module_paths/Android.bp.list -b /home/Dhina_17/android/lineage/out/soong -n /home/Dhina_17/android/lineage/out -d /home/Dhina_17/android/lineage/out/soong/build.ninja.d -o /home/Dhina_17/android/lineage/out/soong/build.ninja Android.bperror: kernel/xiaomi/onclite/scripts/dtc-aosp/dtc/Android.bp:15:9: unrecognized property "dist"
error: kernel/xiaomi/onclite/scripts/dtc-aosp/dtc/libfdt/Android.bp:1:1: module "libfdt" already defined
external/dtc/libfdt/Android.bp:1:1 <-- previous definition here
ninja: build stopped: subcommand failed.
12:42:36 soong bootstrap failed with: exit status 1
failed to build some targets (12 seconds)
Dhina_17@instance-1:~/android/lineage$
If i delete the dtc-aosp , it fixed.. how to build with dtc-aosp ? How to fix
The text was updated successfully, but these errors were encountered: