-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rollup patch to add LXC support #1
base: ElementalX-1.00
Are you sure you want to change the base?
Conversation
Incidentally, this should "properly" be committed as at least two patches--one to update the UID/GID sites, and a second to add the defconfig. My availability is pretty limited until the end of January though. Happy to refactor if you (1) care and (2) get the chance. Also also, this should probably be upstreamed to AOSP Gerrit--I'm happy to go that route, it's just not something I can do quickly right now. |
(cf lxc/lxc#380) |
This patch introduces an i_dir_level field to support large directory. Previously, f2fs maintains multi-level hash tables to find a dentry quickly from a bunch of chiild dentries in a directory, and the hash tables consist of the following tree structure as below. In Documentation/filesystems/f2fs.txt, ---------------------- A : bucket B : block N : MAX_DIR_HASH_DEPTH ---------------------- level #0 | A(2B) | level #1 | A(2B) - A(2B) | level #2 | A(2B) - A(2B) - A(2B) - A(2B) . | . . . . level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . | . . . . level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B) But, if we can guess that a directory will handle a number of child files, we don't need to traverse the tree from level #0 to #N all the time. Since the lower level tables contain relatively small number of dentries, the miss ratio of the target dentry is likely to be high. In order to avoid that, we can configure the hash tables sparsely from level #0 like this. level #0 | A(2B) - A(2B) - A(2B) - A(2B) level #1 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . | . . . . level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . | . . . . level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B) With this structure, we can skip the ineffective tree searches in lower level hash tables. This patch adds just a facility for this by introducing i_dir_level in f2fs_inode. Signed-off-by: Jaegeuk Kim <[email protected]>
Movable allocation requests can get CMA memory. CMA memory is mapped as non-cached normal memory. f2fs code can perform test_bit operations on the memory allocated, which inturn uses ldr/str exclusive instuctions on ARM. ldr/str exclusive instructions casue unhandled exceptions and kernel panics. [ 6934.858377] Unhandled fault: unknown 53 (0x96000035) at 0xffffffc0699c1000 [ 6934.866608] Internal error: : 96000035 [#1] PREEMPT SMP [ 6934.871883] CPU: 1 PID: 666 Comm: LazyTaskWriterT Tainted: G W 3.10.40-g3bdd559-dirty #39 [ 6934.880975] task: ffffffc06079a040 ti: ffffffc02a7bc000 task.ti: ffffffc02a7bc000 [ 6934.888565] PC is at test_and_clear_bit+0x14/0x40 [ 6934.893304] LR is at f2fs_delete_entry+0xd8/0x28c Bug 1550455 Change-Id: I9645296a052301820063b9737bf06c8c9e059986 Signed-off-by: Krishna Reddy <[email protected]>
We can summarize the roll forward recovery scenarios as follows. [Term] F: fsync_mark, D: dentry_mark 1. inode(x) | CP | inode(x) | dnode(F) -> Update the latest inode(x). 2. inode(x) | CP | inode(F) | dnode(F) -> No problem. 3. inode(x) | CP | dnode(F) | inode(x) -> Recover to the latest dnode(F), and drop the last inode(x) 4. inode(x) | CP | dnode(F) | inode(F) -> No problem. 5. CP | inode(x) | dnode(F) -> The inode(DF) was missing. Should drop this dnode(F). 6. CP | inode(DF) | dnode(F) -> No problem. 7. CP | dnode(F) | inode(DF) -> If f2fs_iget fails, then goto next to find inode(DF). 8. CP | dnode(F) | inode(x) -> If f2fs_iget fails, then goto next to find inode(DF). But it will fail due to no inode(DF). So, this patch adds some missing points such as #1, #5, #7, and #8. Signed-off-by: Huang Ying <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Conflicts: fs/f2fs/recovery.c Change-Id: I7ef859c0de58cdce90528c556958aca575360010
An issue was observed when a userspace task exits. The page which hits error here is the zero page. In binder mmap, the whole of vma is not mapped. On a task crash, when debuggerd reads the binder regions, the unmapped areas fall to do_anonymous_page in handle_pte_fault, due to the absence of a vm_fault handler. This results in zero page being mapped. Later in zap_pte_range, vm_normal_page returns zero page in the case of VM_MIXEDMAP and it results in the error. BUG: Bad page map in process mediaserver pte:9dff379f pmd:9bfbd831 page:c0ed8e60 count:1 mapcount:-1 mapping: (null) index:0x0 page flags: 0x404(referenced|reserved) addr:40c3f000 vm_flags:10220051 anon_vma: (null) mapping:d9fe0764 index:fd vma->vm_ops->fault: (null) vma->vm_file->f_op->mmap: binder_mmap+0x0/0x274 CPU: 0 PID: 1463 Comm: mediaserver Tainted: G W 3.10.17+ #1 [<c001549c>] (unwind_backtrace+0x0/0x11c) from [<c001200c>] (show_stack+0x10/0x14) [<c001200c>] (show_stack+0x10/0x14) from [<c0103d78>] (print_bad_pte+0x158/0x190) [<c0103d78>] (print_bad_pte+0x158/0x190) from [<c01055f0>] (unmap_single_vma+0x2e4/0x598) [<c01055f0>] (unmap_single_vma+0x2e4/0x598) from [<c010618c>] (unmap_vmas+0x34/0x50) [<c010618c>] (unmap_vmas+0x34/0x50) from [<c010a9e4>] (exit_mmap+0xc8/0x1e8) [<c010a9e4>] (exit_mmap+0xc8/0x1e8) from [<c00520f0>] (mmput+0x54/0xd0) [<c00520f0>] (mmput+0x54/0xd0) from [<c005972c>] (do_exit+0x360/0x990) [<c005972c>] (do_exit+0x360/0x990) from [<c0059ef0>] (do_group_exit+0x84/0xc0) [<c0059ef0>] (do_group_exit+0x84/0xc0) from [<c0066de0>] (get_signal_to_deliver+0x4d4/0x548) [<c0066de0>] (get_signal_to_deliver+0x4d4/0x548) from [<c0011500>] (do_signal+0xa8/0x3b8) Add a vm_fault handler which returns VM_FAULT_SIGBUS, and prevents the wrong fallback to do_anonymous_page. Change-Id: I43c227e489f74f4907f199caf99f571b61883064 Signed-off-by: Vinayak Menon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
While disabling ConfigFS Android gadget, android_disconnect() calls kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free the registered HIDs without checking whether the USB accessory device really exist or not. If USB accessory device doesn't exist then we run into following kernel panic: ----8<---- [ 136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064 [ 136.724809] pgd = c0204000 [ 136.731924] [00000064] *pgd=00000000 [ 136.737830] Internal error: Oops: 5 [#1] SMP ARM [ 136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty #76 [ 136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000 [ 136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60 [ 136.756246] LR is at kill_all_hid_devices+0x24/0x114 ---->8---- This patch adds a test to check if USB Accessory device exists before freeing HIDs. Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b Signed-off-by: Amit Pundir <[email protected]> (cherry picked from commit 32a71bce154cb89a549b9b7d28e8cf03b889d849)
[ Upstream commit eed4d839b0cdf9d84b0a9bc63de90fd5e1e886fb ] Use dst_entry held by sk_dst_get() to retrieve tunnel's PMTU. The dst_mtu(__sk_dst_get(tunnel->sock)) call was racy. __sk_dst_get() could return NULL if tunnel->sock->sk_dst_cache was reset just before the call, thus making dst_mtu() dereference a NULL pointer: [ 1937.661598] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 1937.664005] IP: [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] PGD daf0c067 PUD d9f93067 PMD 0 [ 1937.664005] Oops: 0000 [#1] SMP [ 1937.664005] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables udp_tunnel pppoe pppox ppp_generic slhc deflate ctr twofish_generic twofish_x86_64_3way xts lrw gf128mul glue_helper twofish_x86_64 twofish_common blowfish_generic blowfish_x86_64 blowfish_common des_generic cbc xcbc rmd160 sha512_generic hmac crypto_null af_key xfrm_algo 8021q garp bridge stp llc tun atmtcp clip atm ext3 mbcache jbd iTCO_wdt coretemp kvm_intel iTCO_vendor_support kvm pcspkr evdev ehci_pci lpc_ich mfd_core i5400_edac edac_core i5k_amb shpchp button processor thermal_sys xfs crc32c_generic libcrc32c dm_mod usbhid sg hid sr_mod sd_mod cdrom crc_t10dif crct10dif_common ata_generic ahci ata_piix tg3 libahci libata uhci_hcd ptp ehci_hcd pps_core usbcore scsi_mod libphy usb_common [last unloaded: l2tp_core] [ 1937.664005] CPU: 0 PID: 10022 Comm: l2tpstress Tainted: G O 3.17.0-rc1 #1 [ 1937.664005] Hardware name: HP ProLiant DL160 G5, BIOS O12 08/22/2008 [ 1937.664005] task: ffff8800d8fda790 ti: ffff8800c43c4000 task.ti: ffff8800c43c4000 [ 1937.664005] RIP: 0010:[<ffffffffa049db88>] [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] RSP: 0018:ffff8800c43c7de8 EFLAGS: 00010282 [ 1937.664005] RAX: ffff8800da8a7240 RBX: ffff8800d8c64600 RCX: 000001c325a137b5 [ 1937.664005] RDX: 8c6318c6318c6320 RSI: 000000000000010c RDI: 0000000000000000 [ 1937.664005] RBP: ffff8800c43c7ea8 R08: 0000000000000000 R09: 0000000000000000 [ 1937.664005] R10: ffffffffa048e2c0 R11: ffff8800d8c64600 R12: ffff8800ca7a5000 [ 1937.664005] R13: ffff8800c439bf40 R14: 000000000000000c R15: 0000000000000009 [ 1937.664005] FS: 00007fd7f610f700(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 [ 1937.664005] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1937.664005] CR2: 0000000000000020 CR3: 00000000d9d75000 CR4: 00000000000027e0 [ 1937.664005] Stack: [ 1937.664005] ffffffffa049da80 ffff8800d8fda790 000000000000005b ffff880000000009 [ 1937.664005] ffff8800daf3f200 0000000000000003 ffff8800c43c7e48 ffffffff81109b57 [ 1937.664005] ffffffff81109b0e ffffffff8114c566 0000000000000000 0000000000000000 [ 1937.664005] Call Trace: [ 1937.664005] [<ffffffffa049da80>] ? pppol2tp_connect+0x235/0x41e [l2tp_ppp] [ 1937.664005] [<ffffffff81109b57>] ? might_fault+0x9e/0xa5 [ 1937.664005] [<ffffffff81109b0e>] ? might_fault+0x55/0xa5 [ 1937.664005] [<ffffffff8114c566>] ? rcu_read_unlock+0x1c/0x26 [ 1937.664005] [<ffffffff81309196>] SYSC_connect+0x87/0xb1 [ 1937.664005] [<ffffffff813e56f7>] ? sysret_check+0x1b/0x56 [ 1937.664005] [<ffffffff8107590d>] ? trace_hardirqs_on_caller+0x145/0x1a1 [ 1937.664005] [<ffffffff81213dee>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 1937.664005] [<ffffffff8114c262>] ? spin_lock+0x9/0xb [ 1937.664005] [<ffffffff813092b4>] SyS_connect+0x9/0xb [ 1937.664005] [<ffffffff813e56d2>] system_call_fastpath+0x16/0x1b [ 1937.664005] Code: 10 2a 84 81 e8 65 76 bd e0 65 ff 0c 25 10 bb 00 00 4d 85 ed 74 37 48 8b 85 60 ff ff ff 48 8b 80 88 01 00 00 48 8b b8 10 02 00 00 <48> 8b 47 20 ff 50 20 85 c0 74 0f 83 e8 28 89 83 10 01 00 00 89 [ 1937.664005] RIP [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] RSP <ffff8800c43c7de8> [ 1937.664005] CR2: 0000000000000020 [ 1939.559375] ---[ end trace 82d44500f28f8708 ]--- Fixes: f34c4a35d879 ("l2tp: take PMTU from tunnel UDP socket") Signed-off-by: Guillaume Nault <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 35425ea2492175fd39f6116481fe98b2b3ddd4ca upstream. Christopher Head 2014-06-28 05:26:20 UTC described: "I tried to reproduce this on 3.12.21. Instead, when I do "echo hello > foo" in an ecryptfs mount with ecryptfs_xattr specified, I get a kernel crash: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61 PGD d7840067 PUD b2c3c067 PMD 0 Oops: 0002 [#1] SMP Modules linked in: nvidia(PO) CPU: 3 PID: 3566 Comm: bash Tainted: P O 3.12.21-gentoo-r1 #2 Hardware name: ASUSTek Computer Inc. G60JX/G60JX, BIOS 206 03/15/2010 task: ffff8801948944c0 ti: ffff8800bad70000 task.ti: ffff8800bad70000 RIP: 0010:[<ffffffff8110eb39>] [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61 RSP: 0018:ffff8800bad71c10 EFLAGS: 00010246 RAX: 00000000000181a4 RBX: ffff880198648480 RCX: 0000000000000000 RDX: 0000000000000004 RSI: ffff880172010450 RDI: 0000000000000000 RBP: ffff880198490e40 R08: 0000000000000000 R09: 0000000000000000 R10: ffff880172010450 R11: ffffea0002c51e80 R12: 0000000000002000 R13: 000000000000001a R14: 0000000000000000 R15: ffff880198490e40 FS: 00007ff224caa700(0000) GS:ffff88019fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000bb07f000 CR4: 00000000000007e0 Stack: ffffffff811826e8 ffff8800a39d8000 0000000000000000 000000000000001a ffff8800a01d0000 ffff8800a39d8000 ffffffff81185fd5 ffffffff81082c2c 00000001a39d8000 53d0abbc98490e40 0000000000000037 ffff8800a39d8220 Call Trace: [<ffffffff811826e8>] ? ecryptfs_setxattr+0x40/0x52 [<ffffffff81185fd5>] ? ecryptfs_write_metadata+0x1b3/0x223 [<ffffffff81082c2c>] ? should_resched+0x5/0x23 [<ffffffff8118322b>] ? ecryptfs_initialize_file+0xaf/0xd4 [<ffffffff81183344>] ? ecryptfs_create+0xf4/0x142 [<ffffffff810f8c0d>] ? vfs_create+0x48/0x71 [<ffffffff810f9c86>] ? do_last.isra.68+0x559/0x952 [<ffffffff810f7ce7>] ? link_path_walk+0xbd/0x458 [<ffffffff810fa2a3>] ? path_openat+0x224/0x472 [<ffffffff810fa7bd>] ? do_filp_open+0x2b/0x6f [<ffffffff81103606>] ? __alloc_fd+0xd6/0xe7 [<ffffffff810ee6ab>] ? do_sys_open+0x65/0xe9 [<ffffffff8157d022>] ? system_call_fastpath+0x16/0x1b RIP [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61 RSP <ffff8800bad71c10> CR2: 0000000000000000 ---[ end trace df9dba5f1ddb8565 ]---" If we create a file when we mount with ecryptfs_xattr_metadata option, we will encounter a crash in this path: ->ecryptfs_create ->ecryptfs_initialize_file ->ecryptfs_write_metadata ->ecryptfs_write_metadata_to_xattr ->ecryptfs_setxattr ->fsstack_copy_attr_all It's because our dentry->d_inode used in fsstack_copy_attr_all is NULL, and it will be initialized when ecryptfs_initialize_file finish. So we should skip copying attr from lower inode when the value of ->d_inode is invalid. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Tyler Hicks <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 086ba77a6db00ed858ff07451bedee197df868c9 upstream. ARM has some private syscalls (for example, set_tls(2)) which lie outside the range of NR_syscalls. If any of these are called while syscall tracing is being performed, out-of-bounds array access will occur in the ftrace and perf sys_{enter,exit} handlers. # trace-cmd record -e raw_syscalls:* true && trace-cmd report ... true-653 [000] 384.675777: sys_enter: NR 192 (0, 1000, 3, 4000022, ffffffff, 0) true-653 [000] 384.675812: sys_exit: NR 192 = 1995915264 true-653 [000] 384.675971: sys_enter: NR 983045 (76f74480, 76f74000, 76f74b28, 76f74480, 76f76f74, 1) true-653 [000] 384.675988: sys_exit: NR 983045 = 0 ... # trace-cmd record -e syscalls:* true [ 17.289329] Unable to handle kernel paging request at virtual address aaaaaace [ 17.289590] pgd = 9e71c000 [ 17.289696] [aaaaaace] *pgd=00000000 [ 17.289985] Internal error: Oops: 5 [#1] PREEMPT SMP ARM [ 17.290169] Modules linked in: [ 17.290391] CPU: 0 PID: 704 Comm: true Not tainted 3.18.0-rc2+ #21 [ 17.290585] task: 9f4dab00 ti: 9e710000 task.ti: 9e710000 [ 17.290747] PC is at ftrace_syscall_enter+0x48/0x1f8 [ 17.290866] LR is at syscall_trace_enter+0x124/0x184 Fix this by ignoring out-of-NR_syscalls-bounds syscall numbers. Commit cd0980f "tracing: Check invalid syscall nr while tracing syscalls" added the check for less than zero, but it should have also checked for greater than NR_syscalls. Link: http://lkml.kernel.org/p/[email protected] Fixes: cd0980f "tracing: Check invalid syscall nr while tracing syscalls" Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 3b1deef6b1289a99505858a3b212c5b50adf0c2f upstream. evm_inode_setxattr() can be called with no value. The function does not check the length so that following command can be used to produce the kernel oops: setfattr -n security.evm FOO. This patch fixes it. Changes in v3: * there is no reason to return different error codes for EVM_XATTR_HMAC and non EVM_XATTR_HMAC. Remove unnecessary test then. Changes in v2: * testing for validity of xattr type [ 1106.396921] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1106.398192] IP: [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48 [ 1106.399244] PGD 29048067 PUD 290d7067 PMD 0 [ 1106.399953] Oops: 0000 [#1] SMP [ 1106.400020] Modules linked in: bridge stp llc evdev serio_raw i2c_piix4 button fuse [ 1106.400020] CPU: 0 PID: 3635 Comm: setxattr Not tainted 3.16.0-kds+ #2936 [ 1106.400020] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 1106.400020] task: ffff8800291a0000 ti: ffff88002917c000 task.ti: ffff88002917c000 [ 1106.400020] RIP: 0010:[<ffffffff812af7b8>] [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48 [ 1106.400020] RSP: 0018:ffff88002917fd50 EFLAGS: 00010246 [ 1106.400020] RAX: 0000000000000000 RBX: ffff88002917fdf8 RCX: 0000000000000000 [ 1106.400020] RDX: 0000000000000000 RSI: ffffffff818136d3 RDI: ffff88002917fdf8 [ 1106.400020] RBP: ffff88002917fd68 R08: 0000000000000000 R09: 00000000003ec1df [ 1106.400020] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800438a0a00 [ 1106.400020] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 1106.400020] FS: 00007f7dfa7d7740(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000 [ 1106.400020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1106.400020] CR2: 0000000000000000 CR3: 000000003763e000 CR4: 00000000000006f0 [ 1106.400020] Stack: [ 1106.400020] ffff8800438a0a00 ffff88002917fdf8 0000000000000000 ffff88002917fd98 [ 1106.400020] ffffffff812a1030 ffff8800438a0a00 ffff88002917fdf8 0000000000000000 [ 1106.400020] 0000000000000000 ffff88002917fde0 ffffffff8116d08a ffff88002917fdc8 [ 1106.400020] Call Trace: [ 1106.400020] [<ffffffff812a1030>] security_inode_setxattr+0x5d/0x6a [ 1106.400020] [<ffffffff8116d08a>] vfs_setxattr+0x6b/0x9f [ 1106.400020] [<ffffffff8116d1e0>] setxattr+0x122/0x16c [ 1106.400020] [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45 [ 1106.400020] [<ffffffff8114d011>] ? __sb_start_write+0x10f/0x143 [ 1106.400020] [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45 [ 1106.400020] [<ffffffff811687c0>] ? __mnt_want_write+0x48/0x4f [ 1106.400020] [<ffffffff8116d3e6>] SyS_setxattr+0x6e/0xb0 [ 1106.400020] [<ffffffff81529da9>] system_call_fastpath+0x16/0x1b [ 1106.400020] Code: c3 0f 1f 44 00 00 55 48 89 e5 41 55 49 89 d5 41 54 49 89 fc 53 48 89 f3 48 c7 c6 d3 36 81 81 48 89 df e8 18 22 04 00 85 c0 75 07 <41> 80 7d 00 02 74 0d 48 89 de 4c 89 e7 e8 5a fe ff ff eb 03 83 [ 1106.400020] RIP [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48 [ 1106.400020] RSP <ffff88002917fd50> [ 1106.400020] CR2: 0000000000000000 [ 1106.428061] ---[ end trace ae08331628ba3050 ]--- Reported-by: Jan Kara <[email protected]> Signed-off-by: Dmitry Kasatkin <[email protected]> Signed-off-by: Mimi Zohar <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 1bf1890e86869032099b539bc83b098be12fc5a7 upstream. I ran into this error after a ubiupdatevol, because I forgot to backport e9110361a9a4 UBI: fix the volumes tree sorting criteria. UBI error: process_pool_aeb: orphaned volume in fastmap pool UBI error: ubi_scan_fastmap: Attach by fastmap failed, doing a full scan! kmem_cache_destroy ubi_ainf_peb_slab: Slab cache still has objects CPU: 0 PID: 1 Comm: swapper Not tainted 3.14.18-00053-gf05cac8dbf85 #1 [<c000d298>] (unwind_backtrace) from [<c000baa8>] (show_stack+0x10/0x14) [<c000baa8>] (show_stack) from [<c01b7a68>] (destroy_ai+0x230/0x244) [<c01b7a68>] (destroy_ai) from [<c01b8fd4>] (ubi_attach+0x98/0x1ec) [<c01b8fd4>] (ubi_attach) from [<c01ade90>] (ubi_attach_mtd_dev+0x2b8/0x868) [<c01ade90>] (ubi_attach_mtd_dev) from [<c038b510>] (ubi_init+0x1dc/0x2ac) [<c038b510>] (ubi_init) from [<c0008860>] (do_one_initcall+0x94/0x140) [<c0008860>] (do_one_initcall) from [<c037aadc>] (kernel_init_freeable+0xe8/0x1b0) [<c037aadc>] (kernel_init_freeable) from [<c02730ac>] (kernel_init+0x8/0xe4) [<c02730ac>] (kernel_init) from [<c00093f0>] (ret_from_fork+0x14/0x24) UBI: scanning is finished Freeing the cache in the error path fixes the Slab error. Tested on at91sam9g35 (3.14.18+fastmap backports) Signed-off-by: Richard Genoud <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit d3051b489aa81ca9ba62af366149ef42b8dae97c upstream. A panic was seen in the following sitation. There are two threads running on the system. The first thread is a system monitoring thread that is reading /proc/modules. The second thread is loading and unloading a module (in this example I'm using my simple dummy-module.ko). Note, in the "real world" this occurred with the qlogic driver module. When doing this, the following panic occurred: ------------[ cut here ]------------ kernel BUG at kernel/module.c:3739! invalid opcode: 0000 [#1] SMP Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module] CPU: 37 PID: 186343 Comm: cat Tainted: GF O-------------- 3.10.0+ #7 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013 task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000 RIP: 0010:[<ffffffff810d64c5>] [<ffffffff810d64c5>] module_flags+0xb5/0xc0 RSP: 0018:ffff88080fa7fe18 EFLAGS: 00010246 RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000 RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000 RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000 R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000 R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000 FS: 00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00 ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0 Call Trace: [<ffffffff810d666c>] m_show+0x19c/0x1e0 [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0 [<ffffffff812281ed>] proc_reg_read+0x3d/0x80 [<ffffffff811c0f2c>] vfs_read+0x9c/0x170 [<ffffffff811c1a58>] SyS_read+0x58/0xb0 [<ffffffff81605829>] system_call_fastpath+0x16/0x1b Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 RIP [<ffffffff810d64c5>] module_flags+0xb5/0xc0 RSP <ffff88080fa7fe18> Consider the two processes running on the system. CPU 0 (/proc/modules reader) CPU 1 (loading/unloading module) CPU 0 opens /proc/modules, and starts displaying data for each module by traversing the modules list via fs/seq_file.c:seq_open() and fs/seq_file.c:seq_read(). For each module in the modules list, seq_read does op->start() <-- this is a pointer to m_start() op->show() <- this is a pointer to m_show() op->stop() <-- this is a pointer to m_stop() The m_start(), m_show(), and m_stop() module functions are defined in kernel/module.c. The m_start() and m_stop() functions acquire and release the module_mutex respectively. ie) When reading /proc/modules, the module_mutex is acquired and released for each module. m_show() is called with the module_mutex held. It accesses the module struct data and attempts to write out module data. It is in this code path that the above BUG_ON() warning is encountered, specifically m_show() calls static char *module_flags(struct module *mod, char *buf) { int bx = 0; BUG_ON(mod->state == MODULE_STATE_UNFORMED); ... The other thread, CPU 1, in unloading the module calls the syscall delete_module() defined in kernel/module.c. The module_mutex is acquired for a short time, and then released. free_module() is called without the module_mutex. free_module() then sets mod->state = MODULE_STATE_UNFORMED, also without the module_mutex. Some additional code is called and then the module_mutex is reacquired to remove the module from the modules list: /* Now we can delete it from the lists */ mutex_lock(&module_mutex); stop_machine(__unlink_module, mod, NULL); mutex_unlock(&module_mutex); This is the sequence of events that leads to the panic. CPU 1 is removing dummy_module via delete_module(). It acquires the module_mutex, and then releases it. CPU 1 has NOT set dummy_module->state to MODULE_STATE_UNFORMED yet. CPU 0, which is reading the /proc/modules, acquires the module_mutex and acquires a pointer to the dummy_module which is still in the modules list. CPU 0 calls m_show for dummy_module. The check in m_show() for MODULE_STATE_UNFORMED passed for dummy_module even though it is being torn down. Meanwhile CPU 1, which has been continuing to remove dummy_module without holding the module_mutex, now calls free_module() and sets dummy_module->state to MODULE_STATE_UNFORMED. CPU 0 now calls module_flags() with dummy_module and ... static char *module_flags(struct module *mod, char *buf) { int bx = 0; BUG_ON(mod->state == MODULE_STATE_UNFORMED); and BOOM. Acquire and release the module_mutex lock around the setting of MODULE_STATE_UNFORMED in the teardown path, which should resolve the problem. Testing: In the unpatched kernel I can panic the system within 1 minute by doing while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done and while (true) do cat /proc/modules; done in separate terminals. In the patched kernel I was able to run just over one hour without seeing any issues. I also verified the output of panic via sysrq-c and the output of /proc/modules looks correct for all three states for the dummy_module. dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-) dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE) dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+) Signed-off-by: Prarit Bhargava <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Signed-off-by: Rusty Russell <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 824269c5868d2a7a26417e5ef3841a27d42c6139 upstream. In older versions of the IIO framework it was possible to pass a completely different set of channels to iio_buffer_register() as the one that is assigned to the IIO device. Commit 959d295 ("staging:iio: make iio_sw_buffer_preenable much more general.") introduced a restriction that requires that the set of channels that is passed to iio_buffer_register() is a subset of the channels assigned to the IIO device as the IIO core will use the list of channels that is assigned to the device to lookup a channel by scan index in iio_compute_scan_bytes(). If it can not find the channel the function will crash. This patch fixes the issue by making sure that the same set of channels is assigned to the IIO device and passed to iio_buffer_register(). Fixes the follow NULL pointer derefernce kernel crash: Unable to handle kernel NULL pointer dereference at virtual address 00000016 pgd = d53d0000 [00000016] *pgd=1534e831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM Modules linked in: CPU: 1 PID: 1626 Comm: bash Not tainted 3.15.0-19969-g2a180eb-dirty #9545 task: d6c124c0 ti: d539a000 task.ti: d539a000 PC is at iio_compute_scan_bytes+0x34/0xa8 LR is at iio_compute_scan_bytes+0x34/0xa8 pc : [<c03052e4>] lr : [<c03052e4>] psr: 60070013 sp : d539beb8 ip : 00000001 fp : 00000000 r10: 00000002 r9 : 00000000 r8 : 00000001 r7 : 00000000 r6 : d6dc8800 r5 : d7571000 r4 : 00000002 r3 : d7571000 r2 : 00000044 r1 : 00000001 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 18c5387d Table: 153d004a DAC: 00000015 Process bash (pid: 1626, stack limit = 0xd539a240) Stack: (0xd539beb8 to 0xd539c000) bea0: c02fc0e4 d7571000 bec0: d76c1640 d6dc8800 d757117c 00000000 d757112c c0305b04 d76c1690 d76c1640 bee0: d7571188 00000002 00000000 d7571000 d539a000 00000000 000dd1c8 c0305d54 bf00: d7571010 0160b868 00000002 c69d3900 d7573278 d7573308 c69d3900 c01ece90 bf20: 00000002 c0103fac c0103f6c d539bf88 00000002 c69d3b00 c69d3b0c c0103468 bf40: 00000000 00000000 d7694a00 00000002 000af408 d539bf88 c000dd84 c00b2f94 bf60: d7694a00 000af408 00000002 d7694a00 d7694a00 00000002 000af408 c000dd84 bf80: 00000000 c00b32d0 00000000 00000000 00000002 b6f1aa78 00000002 000af408 bfa0: 00000004 c000dc00 b6f1aa78 00000002 00000001 000af408 00000002 00000000 bfc0: b6f1aa78 00000002 000af408 00000004 be806a4c 000a6094 00000000 000dd1c8 bfe0: 00000000 be8069cc b6e8ab77 b6ec125c 40070010 00000001 22940489 154a5007 [<c03052e4>] (iio_compute_scan_bytes) from [<c0305b04>] (__iio_update_buffers+0x248/0x438) [<c0305b04>] (__iio_update_buffers) from [<c0305d54>] (iio_buffer_store_enable+0x60/0x7c) [<c0305d54>] (iio_buffer_store_enable) from [<c01ece90>] (dev_attr_store+0x18/0x24) [<c01ece90>] (dev_attr_store) from [<c0103fac>] (sysfs_kf_write+0x40/0x4c) [<c0103fac>] (sysfs_kf_write) from [<c0103468>] (kernfs_fop_write+0x110/0x154) [<c0103468>] (kernfs_fop_write) from [<c00b2f94>] (vfs_write+0xd0/0x160) [<c00b2f94>] (vfs_write) from [<c00b32d0>] (SyS_write+0x40/0x78) [<c00b32d0>] (SyS_write) from [<c000dc00>] (ret_fast_syscall+0x0/0x30) Code: ea00000e e1a01008 e1a00005 ebfff6fc (e5d0a016) Fixes: 959d295 ("staging:iio: make iio_sw_buffer_preenable much more general.") Signed-off-by: Lars-Peter Clausen <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit e10554738cab4224e097c2f9d975ea781a4fcde4 upstream. In older versions of the IIO framework it was possible to pass a completely different set of channels to iio_buffer_register() as the one that is assigned to the IIO device. Commit 959d295 ("staging:iio: make iio_sw_buffer_preenable much more general.") introduced a restriction that requires that the set of channels that is passed to iio_buffer_register() is a subset of the channels assigned to the IIO device as the IIO core will use the list of channels that is assigned to the device to lookup a channel by scan index in iio_compute_scan_bytes(). If it can not find the channel the function will crash. This patch fixes the issue by making sure that the same set of channels is assigned to the IIO device and passed to iio_buffer_register(). Note that we need to remove the IIO_CHAN_INFO_RAW and IIO_CHAN_INFO_SCALE info attributes from the channels since we don't actually want those to be registered. Fixes the following crash: Unable to handle kernel NULL pointer dereference at virtual address 00000016 pgd = d2094000 [00000016] *pgd=16e39831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM Modules linked in: CPU: 1 PID: 1695 Comm: bash Not tainted 3.17.0-06329-g29461ee #9686 task: d7768040 ti: d5bd4000 task.ti: d5bd4000 PC is at iio_compute_scan_bytes+0x38/0xc0 LR is at iio_compute_scan_bytes+0x34/0xc0 pc : [<c0316de8>] lr : [<c0316de4>] psr: 60070013 sp : d5bd5ec0 ip : 00000000 fp : 00000000 r10: d769f934 r9 : 00000000 r8 : 00000001 r7 : 00000000 r6 : c8fc6240 r5 : d769f800 r4 : 00000000 r3 : d769f800 r2 : 00000000 r1 : ffffffff r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 18c5387d Table: 1209404a DAC: 00000015 Process bash (pid: 1695, stack limit = 0xd5bd4240) Stack: (0xd5bd5ec0 to 0xd5bd6000) 5ec0: d769f800 d7435640 c8fc6240 d769f984 00000000 c03175a4 d7435690 d7435640 5ee0: d769f990 00000002 00000000 d769f800 d5bd4000 00000000 000b43a8 c03177f4 5f00: d769f810 0162b8c8 00000002 c8fc7e00 d77f1d08 d77f1da8 c8fc7e00 c01faf1c 5f20: 00000002 c010694c c010690c d5bd5f88 00000002 c8fc6840 c8fc684c c0105e08 5f40: 00000000 00000000 d20d1580 00000002 000af408 d5bd5f88 c000de84 c00b76d4 5f60: d20d1580 000af408 00000002 d20d1580 d20d1580 00000002 000af408 c000de84 5f80: 00000000 c00b7a44 00000000 00000000 00000002 b6ebea78 00000002 000af408 5fa0: 00000004 c000dd00 b6ebea78 00000002 00000001 000af408 00000002 00000000 5fc0: b6ebea78 00000002 000af408 00000004 bee96a4c 000a6094 00000000 000b43a8 5fe0: 00000000 bee969cc b6e2eb77 b6e6525c 40070010 00000001 00000000 00000000 [<c0316de8>] (iio_compute_scan_bytes) from [<c03175a4>] (__iio_update_buffers+0x248/0x438) [<c03175a4>] (__iio_update_buffers) from [<c03177f4>] (iio_buffer_store_enable+0x60/0x7c) [<c03177f4>] (iio_buffer_store_enable) from [<c01faf1c>] (dev_attr_store+0x18/0x24) [<c01faf1c>] (dev_attr_store) from [<c010694c>] (sysfs_kf_write+0x40/0x4c) [<c010694c>] (sysfs_kf_write) from [<c0105e08>] (kernfs_fop_write+0x110/0x154) [<c0105e08>] (kernfs_fop_write) from [<c00b76d4>] (vfs_write+0xbc/0x170) [<c00b76d4>] (vfs_write) from [<c00b7a44>] (SyS_write+0x40/0x78) [<c00b7a44>] (SyS_write) from [<c000dd00>] (ret_fast_syscall+0x0/0x30) Fixes: 959d295 ("staging:iio: make iio_sw_buffer_preenable much more general.") Signed-off-by: Lars-Peter Clausen <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit bfa6b18c680450c17512c741ed1d818695747621 ] Currently, there's no guarantee that udc->driver will be valid when using soft_connect sysfs interface. In fact, we can very easily trigger a NULL pointer dereference by trying to disconnect when a gadget driver isn't loaded. Fix this bug: ~# echo disconnect > soft_connect [ 33.685743] Unable to handle kernel NULL pointer dereference at virtual address 00000014 [ 33.694221] pgd = ed0cc000 [ 33.697174] [00000014] *pgd=ae351831, *pte=00000000, *ppte=00000000 [ 33.703766] Internal error: Oops: 17 [#1] SMP ARM [ 33.708697] Modules linked in: xhci_plat_hcd xhci_hcd snd_soc_davinci_mcasp snd_soc_tlv320aic3x snd_soc_edma snd_soc_omap snd_soc_evm snd_soc_core dwc3 snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd lis3lv02d_i2c matrix_keypad lis3lv02d dwc3_omap input_polldev soundcore [ 33.734372] CPU: 0 PID: 1457 Comm: bash Not tainted 3.17.0-09740-ga93416e-dirty #345 [ 33.742457] task: ee71ce00 ti: ee68a000 task.ti: ee68a000 [ 33.748116] PC is at usb_udc_softconn_store+0xa4/0xec [ 33.753416] LR is at mark_held_locks+0x78/0x90 [ 33.758057] pc : [<c04df128>] lr : [<c00896a4>] psr: 20000013 [ 33.758057] sp : ee68bec8 ip : c0c00008 fp : ee68bee4 [ 33.770050] r10: ee6b394c r9 : ee68bf80 r8 : ee6062c0 [ 33.775508] r7 : 00000000 r6 : ee6062c0 r5 : 0000000b r4 : ee739408 [ 33.782346] r3 : 00000000 r2 : 00000000 r1 : ee71d390 r0 : ee664170 [ 33.789168] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 33.796636] Control: 10c5387d Table: ad0cc059 DAC: 00000015 [ 33.802638] Process bash (pid: 1457, stack limit = 0xee68a248) [ 33.808740] Stack: (0xee68bec8 to 0xee68c000) [ 33.813299] bec0: 0000000b c0411284 ee6062c0 00000000 ee68bef4 ee68bee8 [ 33.821862] bee0: c04112ac c04df090 ee68bf14 ee68bef8 c01c2868 c0411290 0000000b ee6b3940 [ 33.830419] bf00: 00000000 00000000 ee68bf4c ee68bf18 c01c1a24 c01c2818 00000000 00000000 [ 33.838990] bf20: ee61b940 ee2f47c0 0000000b 000ce408 ee68bf80 c000f304 ee68a000 00000000 [ 33.847544] bf40: ee68bf7c ee68bf50 c0152dd8 c01c1960 ee68bf7c c0170af8 ee68bf7c ee2f47c0 [ 33.856099] bf60: ee2f47c0 000ce408 0000000b c000f304 ee68bfa4 ee68bf80 c0153330 c0152d34 [ 33.864653] bf80: 00000000 00000000 0000000b 000ce408 b6e7fb50 00000004 00000000 ee68bfa8 [ 33.873204] bfa0: c000f080 c01532e8 0000000b 000ce408 00000001 000ce408 0000000b 00000000 [ 33.881763] bfc0: 0000000b 000ce408 b6e7fb50 00000004 0000000b 00000000 000c5758 00000000 [ 33.890319] bfe0: 00000000 bec2c924 b6de422d b6e1d226 40000030 00000001 75716d2f 00657565 [ 33.898890] [<c04df128>] (usb_udc_softconn_store) from [<c04112ac>] (dev_attr_store+0x28/0x34) [ 33.907920] [<c04112ac>] (dev_attr_store) from [<c01c2868>] (sysfs_kf_write+0x5c/0x60) [ 33.916200] [<c01c2868>] (sysfs_kf_write) from [<c01c1a24>] (kernfs_fop_write+0xd0/0x194) [ 33.924773] [<c01c1a24>] (kernfs_fop_write) from [<c0152dd8>] (vfs_write+0xb0/0x1bc) [ 33.932874] [<c0152dd8>] (vfs_write) from [<c0153330>] (SyS_write+0x54/0xb0) [ 33.940247] [<c0153330>] (SyS_write) from [<c000f080>] (ret_fast_syscall+0x0/0x48) [ 33.948160] Code: e1a01007 e12fff33 e5140004 e5143008 (e5933014) [ 33.954625] ---[ end trace f849bead94eab7ea ]--- Fixes: 2ccea03 (usb: gadget: introduce UDC Class) Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit e4a60d139060975eb956717e4f63ae348d4d8cc5 upstream. There is a race condition when removing glue directory. It can be reproduced in following test: path 1: Add first child device device_add() get_device_parent() /*find parent from glue_dirs.list*/ list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry) if (k->parent == parent_kobj) { kobj = kobject_get(k); break; } .... class_dir_create_and_add() path2: Remove last child device under glue dir device_del() cleanup_device_parent() cleanup_glue_dir() kobject_put(glue_dir); If path2 has been called cleanup_glue_dir(), but not call kobject_put(glue_dir), the glue dir is still in parent's kset list. Meanwhile, path1 find the glue dir from the glue_dirs.list. Path2 may release glue dir before path1 call kobject_get(). So kernel will report the warning and bug_on. This is a "classic" problem we have of a kref in a list that can be found while the last instance could be removed at the same time. This patch reuse gdp_mutex to fix this race condition. The following calltrace is captured in kernel 3.4, but the latest kernel still has this bug. ----------------------------------------------------- <4>[ 3965.441471] WARNING: at ...include/linux/kref.h:41 kobject_get+0x33/0x40() <4>[ 3965.441474] Hardware name: Romley <4>[ 3965.441475] Modules linked in: isd_iop(O) isd_xda(O)... ... <4>[ 3965.441605] Call Trace: <4>[ 3965.441611] [<ffffffff8103717a>] warn_slowpath_common+0x7a/0xb0 <4>[ 3965.441615] [<ffffffff810371c5>] warn_slowpath_null+0x15/0x20 <4>[ 3965.441618] [<ffffffff81215963>] kobject_get+0x33/0x40 <4>[ 3965.441624] [<ffffffff812d1e45>] get_device_parent.isra.11+0x135/0x1f0 <4>[ 3965.441627] [<ffffffff812d22d4>] device_add+0xd4/0x6d0 <4>[ 3965.441631] [<ffffffff812d0dbc>] ? dev_set_name+0x3c/0x40 .... <2>[ 3965.441912] kernel BUG at ..../fs/sysfs/group.c:65! <4>[ 3965.441915] invalid opcode: 0000 [#1] SMP ... <4>[ 3965.686743] [<ffffffff811a677e>] sysfs_create_group+0xe/0x10 <4>[ 3965.686748] [<ffffffff810cfb04>] blk_trace_init_sysfs+0x14/0x20 <4>[ 3965.686753] [<ffffffff811fcabb>] blk_register_queue+0x3b/0x120 <4>[ 3965.686756] [<ffffffff812030bc>] add_disk+0x1cc/0x490 .... ------------------------------------------------------- Signed-off-by: Yijing Wang <[email protected]> Signed-off-by: Weng Meiling <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
…ction (cherry-pick from commit 81da9b1) There is no point in overriding the size class below. It causes fatal corruption on the next chunk on the 3264-bytes size class, which is the last size class that is not huge. For example, if the requested size was exactly 3264 bytes, current zsmalloc allocates and returns a chunk from the size class of 3264 bytes, not 4096. User access to this chunk may overwrite head of the next adjacent chunk. Here is the panic log captured when freelist was corrupted due to this: Kernel BUG at ffffffc00030659c [verbose debug info unavailable] Internal error: Oops - BUG: 96000006 [#1] PREEMPT SMP Modules linked in: exynos-snapshot: core register saved(CPU:5) CPUMERRSR: 0000000000000000, L2MERRSR: 0000000000000000 exynos-snapshot: context saved(CPU:5) exynos-snapshot: item - log_kevents is disabled CPU: 5 PID: 898 Comm: kswapd0 Not tainted 3.10.61-4497415-eng #1 task: ffffffc0b8783d80 ti: ffffffc0b71e8000 task.ti: ffffffc0b71e8000 PC is at obj_idx_to_offset+0x0/0x1c LR is at obj_malloc+0x44/0xe8 pc : [<ffffffc00030659c>] lr : [<ffffffc000306604>] pstate: a0000045 sp : ffffffc0b71eb790 x29: ffffffc0b71eb790 x28: ffffffc00204c000 x27: 000000000001d96f x26: 0000000000000000 x25: ffffffc098cc3500 x24: ffffffc0a13f2810 x23: ffffffc098cc3501 x22: ffffffc0a13f2800 x21: 000011e1a02006e3 x20: ffffffc0a13f2800 x19: ffffffbc02a7e000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000feb x15: 0000000000000000 x14: 00000000a01003e3 x13: 0000000000000020 x12: fffffffffffffff0 x11: ffffffc08b264000 x10: 00000000e3a01004 x9 : ffffffc08b263fea x8 : ffffffc0b1e611c0 x7 : ffffffc000307d24 x6 : 0000000000000000 x5 : 0000000000000038 x4 : 000000000000011e x3 : ffffffbc00003e90 x2 : 0000000000000cc0 x1 : 00000000d0100371 x0 : ffffffbc00003e90 Bug: 25951511 Change-Id: I0c82f61aa779ddf906212ab6e47e16c088fe683c Reported-by: Sooyong Suk <[email protected]> Signed-off-by: Heesub Shin <[email protected]> Tested-by: Sooyong Suk <[email protected]> Acked-by: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
commit 3017f07 upstream. When walk_page_range walk a memory map's page tables, it'll skip VM_PFNMAP area, then variable 'next' will to assign to vma->vm_end, it maybe larger than 'end'. In next loop, 'addr' will be larger than 'next'. Then in /proc/XXXX/pagemap file reading procedure, the 'addr' will growing forever in pagemap_pte_range, pte_to_pagemap_entry will access the wrong pte. BUG: Bad page map in process procrank pte:8437526f pmd:785de067 addr:9108d000 vm_flags:00200073 anon_vma:f0d99020 mapping: (null) index:9108d CPU: 1 PID: 4974 Comm: procrank Tainted: G B W O 3.10.1+ #1 Call Trace: dump_stack+0x16/0x18 print_bad_pte+0x114/0x1b0 vm_normal_page+0x56/0x60 pagemap_pte_range+0x17a/0x1d0 walk_page_range+0x19e/0x2c0 pagemap_read+0x16e/0x200 vfs_read+0x84/0x150 SyS_read+0x4a/0x80 syscall_call+0x7/0xb Signed-off-by: Liu ShuoX <[email protected]> Signed-off-by: Chen LinX <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Reviewed-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit d6785d9152147596f60234157da2b02540c3e60f upstream. Running the following command: busybox cat /sys/kernel/debug/tracing/trace_pipe > /dev/null with any tracing enabled pretty very quickly leads to various NULL pointer dereferences and VM BUG_ON()s, such as these: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffff8119df6c>] generic_pipe_buf_release+0xc/0x40 Call Trace: [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0 [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10 [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0 [<ffffffff81196869>] do_sendfile+0x199/0x380 [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0 [<ffffffff8192cbee>] entry_SYSCALL_64_fastpath+0x12/0x6d page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0) kernel BUG at include/linux/mm.h:367! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC RIP: [<ffffffff8119df9c>] generic_pipe_buf_release+0x3c/0x40 Call Trace: [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0 [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10 [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0 [<ffffffff81196869>] do_sendfile+0x199/0x380 [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0 [<ffffffff8192cd1e>] tracesys_phase2+0x84/0x89 (busybox's cat uses sendfile(2), unlike the coreutils version) This is because tracing_splice_read_pipe() can call splice_to_pipe() with spd->nr_pages == 0. spd_pages underflows in splice_to_pipe() and we fill the page pointers and the other fields of the pipe_buffers with garbage. All other callers of splice_to_pipe() avoid calling it when nr_pages == 0, and we could make tracing_splice_read_pipe() do that too, but it seems reasonable to have splice_to_page() handle this condition gracefully. Cc: [email protected] Signed-off-by: Rabin Vincent <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
…antiated commit 3c2e2266a5bd2d1cef258e6e54dca1d99946379f upstream. arm:pxa_defconfig can result in the following crash if the max1111 driver is not instantiated. Unhandled fault: page domain fault (0x01b) at 0x00000000 pgd = c0004000 [00000000] *pgd=00000000 Internal error: : 1b [#1] PREEMPT ARM Modules linked in: CPU: 0 PID: 300 Comm: kworker/0:1 Not tainted 4.5.0-01301-g1701f680407c #10 Hardware name: SHARP Akita Workqueue: events sharpsl_charge_toggle task: c390a000 ti: c391e000 task.ti: c391e000 PC is at max1111_read_channel+0x20/0x30 LR is at sharpsl_pm_pxa_read_max1111+0x2c/0x3c pc : [<c03aaab0>] lr : [<c0024b50>] psr: 20000013 ... [<c03aaab0>] (max1111_read_channel) from [<c0024b50>] (sharpsl_pm_pxa_read_max1111+0x2c/0x3c) [<c0024b50>] (sharpsl_pm_pxa_read_max1111) from [<c00262e0>] (spitzpm_read_devdata+0x5c/0xc4) [<c00262e0>] (spitzpm_read_devdata) from [<c0024094>] (sharpsl_check_battery_temp+0x78/0x110) [<c0024094>] (sharpsl_check_battery_temp) from [<c0024f9c>] (sharpsl_charge_toggle+0x48/0x110) [<c0024f9c>] (sharpsl_charge_toggle) from [<c004429c>] (process_one_work+0x14c/0x48c) [<c004429c>] (process_one_work) from [<c0044618>] (worker_thread+0x3c/0x5d4) [<c0044618>] (worker_thread) from [<c004a238>] (kthread+0xd0/0xec) [<c004a238>] (kthread) from [<c000a670>] (ret_from_fork+0x14/0x24) This can occur because the SPI controller driver (SPI_PXA2XX) is built as module and thus not necessarily loaded. While building SPI_PXA2XX into the kernel would make the problem disappear, it appears prudent to ensure that the driver is instantiated before accessing its data structures. Cc: Arnd Bergmann <[email protected]> Cc: [email protected] Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
…er() commit 894f2fc44f2f3f48c36c973b1123f6ab298be160 upstream. When unexpected situation happened (e.g. tx/rx irq happened while DMAC is used), the usbhsf_pkt_handler() was possible to cause NULL pointer dereference like the followings: Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c0004000 [00000000] *pgd=00000000 Internal error: Oops: 80000007 [#1] SMP ARM Modules linked in: usb_f_acm u_serial g_serial libcomposite CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.5.0-rc6-00842-gac57066-dirty #63 Hardware name: Generic R8A7790 (Flattened Device Tree) task: c0729c00 ti: c0724000 task.ti: c0724000 PC is at 0x0 LR is at usbhsf_pkt_handler+0xac/0x118 pc : [<00000000>] lr : [<c03257e0>] psr: 60000193 sp : c0725db8 ip : 00000000 fp : c0725df4 r10: 00000001 r9 : 00000193 r8 : ef3ccab4 r7 : ef3cca10 r6 : eea4586c r5 : 00000000 r4 : ef19ceb4 r3 : 00000000 r2 : 0000009c r1 : c0725dc4 r0 : ef19ceb4 This patch adds a condition to avoid the dereference. Fixes: e73a989 ("usb: renesas_usbhs: add DMAEngine support") Cc: <[email protected]> # v3.1+ Signed-off-by: Yoshihiro Shimoda <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
commit 6ae645d5fa385f3787bf1723639cd907fe5865e7 upstream. NULL pointer derefence happens when booting with DTB because the platform data for haptic device is not set in supplied data from parent MFD device. The MFD device creates only platform data (from Device Tree) for itself, not for haptic child. Unable to handle kernel NULL pointer dereference at virtual address 0000009c pgd = c0004000 [0000009c] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM (max8997_haptic_probe) from [<c03f9cec>] (platform_drv_probe+0x4c/0xb0) (platform_drv_probe) from [<c03f8440>] (driver_probe_device+0x214/0x2c0) (driver_probe_device) from [<c03f8598>] (__driver_attach+0xac/0xb0) (__driver_attach) from [<c03f67ac>] (bus_for_each_dev+0x68/0x9c) (bus_for_each_dev) from [<c03f7a38>] (bus_add_driver+0x1a0/0x218) (bus_add_driver) from [<c03f8db0>] (driver_register+0x78/0xf8) (driver_register) from [<c0101774>] (do_one_initcall+0x90/0x1d8) (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc) (kernel_init_freeable) from [<c06bb5b4>] (kernel_init+0x8/0x114) (kernel_init) from [<c0107938>] (ret_from_fork+0x14/0x3c) Signed-off-by: Marek Szyprowski <[email protected]> Cc: <[email protected]> Fixes: 104594b ("Input: add driver support for MAX8997-haptic") [k.kozlowski: Write commit message, add CC-stable] Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
commit d8a5094 upstream. We get a NULL pointer dereference on omap3 for thumb2 compiled kernels: Internal error: Oops: 80000005 [#1] SMP THUMB2 ... [<c046497b>] (_raw_spin_unlock_irqrestore) from [<c0024375>] (omap3_enter_idle_bm+0xc5/0x178) [<c0024375>] (omap3_enter_idle_bm) from [<c0374e63>] (cpuidle_enter_state+0x77/0x27c) [<c0374e63>] (cpuidle_enter_state) from [<c00627f1>] (cpu_startup_entry+0x155/0x23c) [<c00627f1>] (cpu_startup_entry) from [<c06b9a47>] (start_kernel+0x32f/0x338) [<c06b9a47>] (start_kernel) from [<8000807f>] (0x8000807f) The power management related assembly on omaps needs to interact with ARM mode bootrom code, so we need to keep most of the related assembly in ARM mode. Turns out this error is because of missing ENDPROC for assembly code as suggested by Stephen Boyd <[email protected]>. Let's fix the problem by adding ENDPROC in two places to sleep34xx.S. Let's also remove the now duplicate custom code for mode switching. This has been unnecessary since commit 6ebbf2c ("ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+"). And let's also remove the comments about local variables, they are now just confusing after the ENDPROC. The reason why ENDPROC makes a difference is it sets .type and then the compiler knows what to do with the thumb bit as explained at: https://wiki.ubuntu.com/ARM/Thumb2PortingHowto Reported-by: Kevin Hilman <[email protected]> Tested-by: Kevin Hilman <[email protected]> Signed-off-by: Tony Lindgren <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
commit b49b927f16acee626c56a1af4ab4cb062f75b5df upstream. We shouldn't be calling clk_prepare_enable()/clk_prepare_disable() in an atomic context. Fixes the following issue: [ 5.830970] ehci-omap: OMAP-EHCI Host Controller driver [ 5.830974] driver_register 'ehci-omap' [ 5.895849] driver_register 'wl1271_sdio' [ 5.896870] BUG: scheduling while atomic: udevd/994/0x00000002 [ 5.896876] 4 locks held by udevd/994: [ 5.896904] #0: (&dev->mutex){......}, at: [<c049597c>] __driver_attach+0x60/0xac [ 5.896923] #1: (&dev->mutex){......}, at: [<c049598c>] __driver_attach+0x70/0xac [ 5.896946] #2: (tll_lock){+.+...}, at: [<c04c2630>] omap_tll_enable+0x2c/0xd0 [ 5.896966] #3: (prepare_lock){+.+...}, at: [<c05ce9c8>] clk_prepare_lock+0x48/0xe0 [ 5.897042] Modules linked in: wlcore_sdio(+) ehci_omap(+) dwc3_omap snd_soc_ts3a225e leds_is31fl319x bq27xxx_battery_i2c tsc2007 bq27xxx_battery bq2429x_charger ina2xx tca8418_keypad as5013 leds_tca6507 twl6040_vibra gpio_twl6040 bmp085_i2c(+) palmas_gpadc usb3503 palmas_pwrbutton bmg160_i2c(+) bmp085 bma150(+) bmg160_core bmp280 input_polldev snd_soc_omap_mcbsp snd_soc_omap_mcpdm snd_soc_omap snd_pcm_dmaengine [ 5.897048] Preemption disabled at:[< (null)>] (null) [ 5.897051] [ 5.897059] CPU: 0 PID: 994 Comm: udevd Not tainted 4.6.0-rc5-letux+ #233 [ 5.897062] Hardware name: Generic OMAP5 (Flattened Device Tree) [ 5.897076] [<c010e714>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14) [ 5.897087] [<c010af34>] (show_stack) from [<c040aa7c>] (dump_stack+0x88/0xc0) [ 5.897099] [<c040aa7c>] (dump_stack) from [<c020c558>] (__schedule_bug+0xac/0xd0) [ 5.897111] [<c020c558>] (__schedule_bug) from [<c06f3d44>] (__schedule+0x88/0x7e4) [ 5.897120] [<c06f3d44>] (__schedule) from [<c06f46d8>] (schedule+0x9c/0xc0) [ 5.897129] [<c06f46d8>] (schedule) from [<c06f4904>] (schedule_preempt_disabled+0x14/0x20) [ 5.897140] [<c06f4904>] (schedule_preempt_disabled) from [<c06f64e4>] (mutex_lock_nested+0x258/0x43c) [ 5.897150] [<c06f64e4>] (mutex_lock_nested) from [<c05ce9c8>] (clk_prepare_lock+0x48/0xe0) [ 5.897160] [<c05ce9c8>] (clk_prepare_lock) from [<c05d0e7c>] (clk_prepare+0x10/0x28) [ 5.897169] [<c05d0e7c>] (clk_prepare) from [<c04c2668>] (omap_tll_enable+0x64/0xd0) [ 5.897180] [<c04c2668>] (omap_tll_enable) from [<c04c1728>] (usbhs_runtime_resume+0x18/0x17c) [ 5.897192] [<c04c1728>] (usbhs_runtime_resume) from [<c049d404>] (pm_generic_runtime_resume+0x2c/0x40) [ 5.897202] [<c049d404>] (pm_generic_runtime_resume) from [<c049f180>] (__rpm_callback+0x38/0x68) [ 5.897210] [<c049f180>] (__rpm_callback) from [<c049f220>] (rpm_callback+0x70/0x88) [ 5.897218] [<c049f220>] (rpm_callback) from [<c04a0a00>] (rpm_resume+0x4ec/0x7ec) [ 5.897227] [<c04a0a00>] (rpm_resume) from [<c04a0f48>] (__pm_runtime_resume+0x4c/0x64) [ 5.897236] [<c04a0f48>] (__pm_runtime_resume) from [<c04958dc>] (driver_probe_device+0x30/0x70) [ 5.897246] [<c04958dc>] (driver_probe_device) from [<c04959a4>] (__driver_attach+0x88/0xac) [ 5.897256] [<c04959a4>] (__driver_attach) from [<c04940f8>] (bus_for_each_dev+0x50/0x84) [ 5.897267] [<c04940f8>] (bus_for_each_dev) from [<c0494e40>] (bus_add_driver+0xcc/0x1e4) [ 5.897276] [<c0494e40>] (bus_add_driver) from [<c0496914>] (driver_register+0xac/0xf4) [ 5.897286] [<c0496914>] (driver_register) from [<c01018e0>] (do_one_initcall+0x100/0x1b8) [ 5.897296] [<c01018e0>] (do_one_initcall) from [<c01c7a54>] (do_init_module+0x58/0x1c0) [ 5.897304] [<c01c7a54>] (do_init_module) from [<c01c8a3c>] (SyS_finit_module+0x88/0x90) [ 5.897313] [<c01c8a3c>] (SyS_finit_module) from [<c0107120>] (ret_fast_syscall+0x0/0x1c) [ 5.912697] ------------[ cut here ]------------ [ 5.912711] WARNING: CPU: 0 PID: 994 at kernel/sched/core.c:2996 _raw_spin_unlock+0x28/0x58 [ 5.912717] DEBUG_LOCKS_WARN_ON(val > preempt_count()) Reported-by: H. Nikolaus Schaller <[email protected]> Tested-by: H. Nikolaus Schaller <[email protected]> Signed-off-by: Roger Quadros <[email protected]> Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
commit 8e96a87c5431c256feb65bcfc5aec92d9f7839b6 upstream. Userspace can quite legitimately perform an exec() syscall with a suspended transaction. exec() does not return to the old process, rather it load a new one and starts that, the expectation therefore is that the new process starts not in a transaction. Currently exec() is not treated any differently to any other syscall which creates problems. Firstly it could allow a new process to start with a suspended transaction for a binary that no longer exists. This means that the checkpointed state won't be valid and if the suspended transaction were ever to be resumed and subsequently aborted (a possibility which is exceedingly likely as exec()ing will likely doom the transaction) the new process will jump to invalid state. Secondly the incorrect attempt to keep the transactional state while still zeroing state for the new process creates at least two TM Bad Things. The first triggers on the rfid to return to userspace as start_thread() has given the new process a 'clean' MSR but the suspend will still be set in the hardware MSR. The second TM Bad Thing triggers in __switch_to() as the processor is still transactionally suspended but __switch_to() wants to zero the TM sprs for the new process. This is an example of the outcome of calling exec() with a suspended transaction. Note the first 700 is likely the first TM bad thing decsribed earlier only the kernel can't report it as we've loaded userspace registers. c000000000009980 is the rfid in fast_exception_return() Bad kernel stack pointer 3fffcfa1a370 at c000000000009980 Oops: Bad kernel stack pointer, sig: 6 [#1] CPU: 0 PID: 2006 Comm: tm-execed Not tainted NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000 REGS: c00000003ffefd40 TRAP: 0700 Not tainted MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]> CR: 00000000 XER: 00000000 CFAR: c0000000000098b4 SOFTE: 0 PACATMSCRATCH: b00000010000d033 GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000 GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000 NIP [c000000000009980] fast_exception_return+0xb0/0xb8 LR [0000000000000000] (null) Call Trace: Instruction dump: f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070 e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b Kernel BUG at c000000000043e80 [verbose debug info unavailable] Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033) Oops: Unrecoverable exception, sig: 6 [#2] CPU: 0 PID: 2006 Comm: tm-execed Tainted: G D task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000 NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000 REGS: c00000003ffef7e0 TRAP: 0700 Tainted: G D MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]> CR: 28002828 XER: 00000000 CFAR: c000000000015a20 SOFTE: 0 PACATMSCRATCH: b00000010000d033 GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000 GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000 GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004 GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000 GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000 GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80 NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c LR [c000000000015a24] __switch_to+0x1f4/0x420 Call Trace: Instruction dump: 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020 This fixes CVE-2016-5828. Fixes: bc2a940 ("powerpc: Hook in new transactional memory code") Cc: [email protected] # v3.9+ Signed-off-by: Cyril Bur <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
commit d14bdb553f9196169f003058ae1cdabe514470e6 upstream. MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS time, and the next KVM_RUN oopses: general protection fault: 0000 [#1] SMP CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 [...] Call Trace: [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm] [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480 [<ffffffff812418a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71 Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 RIP [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40 RSP <ffff88005836bd50> Testcase (beautified/reduced from syzkaller output): #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[8]; int main() { struct kvm_debugregs dr = { 0 }; r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); memcpy(&dr, "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72" "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8" "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9" "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb", 48); r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr); r[6] = ioctl(r[4], KVM_RUN, 0); } Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Signed-off-by: Radim Kr�má� <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
commit 8b78f260887df532da529f225c49195d18fef36b upstream. One of the debian buildd servers had this crash in the syslog without any other information: Unaligned handler failed, ret = -2 clock_adjtime (pid 22578): Unaligned data reference (code 28) CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1 task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111100000001111 Tainted: G E r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0 r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4 r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218 r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000 r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0 r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218 sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88 IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff ORIG_R28: 00000002369fe628 IAOQ[0]: compat_get_timex+0x2dc/0x3c0 IAOQ[1]: compat_get_timex+0x2e0/0x3c0 RP(r2): compat_get_timex+0x40/0x3c0 Backtrace: [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0 [<0000000040205024>] syscall_exit+0x0/0x14 This means the userspace program clock_adjtime called the clock_adjtime() syscall and then crashed inside the compat_get_timex() function. Syscalls should never crash programs, but instead return EFAULT. The IIR register contains the executed instruction, which disassebles into "ldw 0(sr3,r5),r9". This load-word instruction is part of __get_user() which tried to read the word at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The unaligned handler is able to emulate all ldw instructions, but it fails if it fails to read the source e.g. because of page fault. The following program reproduces the problem: #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/mman.h> int main(void) { /* allocate 8k */ char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); /* free second half (upper 4k) and make it invalid. */ munmap(ptr+4096, 4096); /* syscall where first int is unaligned and clobbers into invalid memory region */ /* syscall should return EFAULT */ return syscall(__NR_clock_adjtime, 0, ptr+4095); } To fix this issue we simply need to check if the faulting instruction address is in the exception fixup table when the unaligned handler failed. If it is, call the fixup routine instead of crashing. While looking at the unaligned handler I found another issue as well: The target register should not be modified if the handler was unsuccessful. Signed-off-by: Helge Deller <[email protected]> Cc: [email protected] Signed-off-by: Willy Tarreau <[email protected]>
commit fe7a7c57629e8dcbc0e297363a9b2366d67a6dc5 upstream. Currently, the mesh paths associated with a nexthop station are cleaned up in the following code path: __sta_info_destroy_part1 synchronize_net() __sta_info_destroy_part2 -> cleanup_single_sta -> mesh_sta_cleanup -> mesh_plink_deactivate -> mesh_path_flush_by_nexthop However, there are a couple of problems here: 1) the paths aren't flushed at all if the MPM is running in userspace (e.g. when using wpa_supplicant or authsae) 2) there is no synchronize_rcu between removing the path and readers accessing the nexthop, which means the following race is possible: CPU0 CPU1 ~~~~ ~~~~ sta_info_destroy_part1() synchronize_net() rcu_read_lock() mesh_nexthop_resolve() mpath = mesh_path_lookup() [...] -> mesh_path_flush_by_nexthop() sta = rcu_dereference( mpath->next_hop) kfree(sta) access sta <-- CRASH Fix both of these by unconditionally flushing paths before destroying the sta, and by adding a synchronize_net() after path flush to ensure no active readers can still dereference the sta. Fixes this crash: [ 348.529295] BUG: unable to handle kernel paging request at 00020040 [ 348.530014] IP: [<f929245d>] ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] [ 348.530014] *pde = 00000000 [ 348.530014] Oops: 0000 [#1] PREEMPT [ 348.530014] Modules linked in: drbg ansi_cprng ctr ccm ppp_generic slhc ipt_MASQUERADE nf_nat_masquerade_ipv4 8021q ] [ 348.530014] CPU: 0 PID: 20597 Comm: wget Tainted: G O 4.6.0-rc5-wt=V1 #1 [ 348.530014] Hardware name: To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080016 11/07/2014 [ 348.530014] task: f64fa280 ti: f4f9c000 task.ti: f4f9c000 [ 348.530014] EIP: 0060:[<f929245d>] EFLAGS: 00010246 CPU: 0 [ 348.530014] EIP is at ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] [ 348.530014] EAX: f4ce63e0 EBX: 00000088 ECX: f3788416 EDX: 00020008 [ 348.530014] ESI: 00000000 EDI: 00000088 EBP: f6409a4c ESP: f6409a40 [ 348.530014] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 [ 348.530014] CR0: 80050033 CR2: 00020040 CR3: 33190000 CR4: 00000690 [ 348.530014] Stack: [ 348.530014] 00000000 f4ce63e0 f5f9bd80 f6409a64 f9291d80 0000ce67 f5d51e00 f4ce63e0 [ 348.530014] f3788416 f6409a80 f9291dc1 f4ce8320 f4ce63e0 f5d51e00 f4ce63e0 f4ce8320 [ 348.530014] f6409a98 f9277f6f 00000000 00000000 0000007c 00000000 f6409b2c f9278dd1 [ 348.530014] Call Trace: [ 348.530014] [<f9291d80>] mesh_nexthop_lookup+0xbb/0xc8 [mac80211] [ 348.530014] [<f9291dc1>] mesh_nexthop_resolve+0x34/0xd8 [mac80211] [ 348.530014] [<f9277f6f>] ieee80211_xmit+0x92/0xc1 [mac80211] [ 348.530014] [<f9278dd1>] __ieee80211_subif_start_xmit+0x807/0x83c [mac80211] [ 348.530014] [<c04df012>] ? sch_direct_xmit+0xd7/0x1b3 [ 348.530014] [<c022a8c6>] ? __local_bh_enable_ip+0x5d/0x7b [ 348.530014] [<f956870c>] ? nf_nat_ipv4_out+0x4c/0xd0 [nf_nat_ipv4] [ 348.530014] [<f957e036>] ? iptable_nat_ipv4_fn+0xf/0xf [iptable_nat] [ 348.530014] [<c04c6f45>] ? netif_skb_features+0x14d/0x30a [ 348.530014] [<f9278e10>] ieee80211_subif_start_xmit+0xa/0xe [mac80211] [ 348.530014] [<c04c769c>] dev_hard_start_xmit+0x1f8/0x267 [ 348.530014] [<c04c7261>] ? validate_xmit_skb.isra.120.part.121+0x10/0x253 [ 348.530014] [<c04defc6>] sch_direct_xmit+0x8b/0x1b3 [ 348.530014] [<c04c7a9c>] __dev_queue_xmit+0x2c8/0x513 [ 348.530014] [<c04c7cfb>] dev_queue_xmit+0xa/0xc [ 348.530014] [<f91bfc7a>] batadv_send_skb_packet+0xd6/0xec [batman_adv] [ 348.530014] [<f91bfdc4>] batadv_send_unicast_skb+0x15/0x4a [batman_adv] [ 348.530014] [<f91b5938>] batadv_dat_send_data+0x27e/0x310 [batman_adv] [ 348.530014] [<f91c30b5>] ? batadv_tt_global_hash_find.isra.11+0x8/0xa [batman_adv] [ 348.530014] [<f91b63f3>] batadv_dat_snoop_outgoing_arp_request+0x208/0x23d [batman_adv] [ 348.530014] [<f91c0cd9>] batadv_interface_tx+0x206/0x385 [batman_adv] [ 348.530014] [<c04c769c>] dev_hard_start_xmit+0x1f8/0x267 [ 348.530014] [<c04c7261>] ? validate_xmit_skb.isra.120.part.121+0x10/0x253 [ 348.530014] [<c04defc6>] sch_direct_xmit+0x8b/0x1b3 [ 348.530014] [<c04c7a9c>] __dev_queue_xmit+0x2c8/0x513 [ 348.530014] [<f80cbd2a>] ? igb_xmit_frame+0x57/0x72 [igb] [ 348.530014] [<c04c7cfb>] dev_queue_xmit+0xa/0xc [ 348.530014] [<f843a326>] br_dev_queue_push_xmit+0xeb/0xfb [bridge] [ 348.530014] [<f843a35f>] br_forward_finish+0x29/0x74 [bridge] [ 348.530014] [<f843a23b>] ? deliver_clone+0x3b/0x3b [bridge] [ 348.530014] [<f843a714>] __br_forward+0x89/0xe7 [bridge] [ 348.530014] [<f843a336>] ? br_dev_queue_push_xmit+0xfb/0xfb [bridge] [ 348.530014] [<f843a234>] deliver_clone+0x34/0x3b [bridge] [ 348.530014] [<f843a68b>] ? br_flood+0x95/0x95 [bridge] [ 348.530014] [<f843a66d>] br_flood+0x77/0x95 [bridge] [ 348.530014] [<f843a809>] br_flood_forward+0x13/0x1a [bridge] [ 348.530014] [<f843a68b>] ? br_flood+0x95/0x95 [bridge] [ 348.530014] [<f843b877>] br_handle_frame_finish+0x392/0x3db [bridge] [ 348.530014] [<c04e9b2b>] ? nf_iterate+0x2b/0x6b [ 348.530014] [<f843baa6>] br_handle_frame+0x1e6/0x240 [bridge] [ 348.530014] [<f843b4e5>] ? br_handle_local_finish+0x6a/0x6a [bridge] [ 348.530014] [<c04c4ba0>] __netif_receive_skb_core+0x43a/0x66b [ 348.530014] [<f843b8c0>] ? br_handle_frame_finish+0x3db/0x3db [bridge] [ 348.530014] [<c023cea4>] ? resched_curr+0x19/0x37 [ 348.530014] [<c0240707>] ? check_preempt_wakeup+0xbf/0xfe [ 348.530014] [<c0255dec>] ? ktime_get_with_offset+0x5c/0xfc [ 348.530014] [<c04c4fc1>] __netif_receive_skb+0x47/0x55 [ 348.530014] [<c04c57ba>] netif_receive_skb_internal+0x40/0x5a [ 348.530014] [<c04c61ef>] napi_gro_receive+0x3a/0x94 [ 348.530014] [<f80ce8d5>] igb_poll+0x6fd/0x9ad [igb] [ 348.530014] [<c0242bd8>] ? swake_up_locked+0x14/0x26 [ 348.530014] [<c04c5d29>] net_rx_action+0xde/0x250 [ 348.530014] [<c022a743>] __do_softirq+0x8a/0x163 [ 348.530014] [<c022a6b9>] ? __hrtimer_tasklet_trampoline+0x19/0x19 [ 348.530014] [<c021100f>] do_softirq_own_stack+0x26/0x2c [ 348.530014] <IRQ> [ 348.530014] [<c022a957>] irq_exit+0x31/0x6f [ 348.530014] [<c0210eb2>] do_IRQ+0x8d/0xa0 [ 348.530014] [<c058152c>] common_interrupt+0x2c/0x40 [ 348.530014] Code: e7 8c 00 66 81 ff 88 00 75 12 85 d2 75 0e b2 c3 b8 83 e9 29 f9 e8 a7 5f f9 c6 eb 74 66 81 e3 8c 005 [ 348.530014] EIP: [<f929245d>] ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] SS:ESP 0068:f6409a40 [ 348.530014] CR2: 0000000000020040 [ 348.530014] ---[ end trace 48556ac26779732e ]--- [ 348.530014] Kernel panic - not syncing: Fatal exception in interrupt [ 348.530014] Kernel Offset: disabled Reported-by: Fred Veldini <[email protected]> Tested-by: Fred Veldini <[email protected]> Signed-off-by: Bob Copeland <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
…ut event commit 635682a upstream. A case can occur when sctp_accept() is called by the user during a heartbeat timeout event after the 4-way handshake. Since sctp_assoc_migrate() changes both assoc->base.sk and assoc->ep, the bh_sock_lock in sctp_generate_heartbeat_event() will be taken with the listening socket but released with the new association socket. The result is a deadlock on any future attempts to take the listening socket lock. Note that this race can occur with other SCTP timeouts that take the bh_lock_sock() in the event sctp_accept() is called. BUG: soft lockup - CPU#9 stuck for 67s! [swapper:0] ... RIP: 0010:[<ffffffff8152d48e>] [<ffffffff8152d48e>] _spin_lock+0x1e/0x30 RSP: 0018:ffff880028323b20 EFLAGS: 00000206 RAX: 0000000000000002 RBX: ffff880028323b20 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff880028323be0 RDI: ffff8804632c4b48 RBP: ffffffff8100bb93 R08: 0000000000000000 R09: 0000000000000000 R10: ffff880610662280 R11: 0000000000000100 R12: ffff880028323aa0 R13: ffff8804383c3880 R14: ffff880028323a90 R15: ffffffff81534225 FS: 0000000000000000(0000) GS:ffff880028320000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000006df528 CR3: 0000000001a85000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffff880616b70000, task ffff880616b6cab0) Stack: ffff880028323c40 ffffffffa01c2582 ffff880614cfb020 0000000000000000 <d> 0100000000000000 00000014383a6c44 ffff8804383c3880 ffff880614e93c00 <d> ffff880614e93c00 0000000000000000 ffff8804632c4b00 ffff8804383c38b8 Call Trace: <IRQ> [<ffffffffa01c2582>] ? sctp_rcv+0x492/0xa10 [sctp] [<ffffffff8148c559>] ? nf_iterate+0x69/0xb0 [<ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0 [<ffffffff8148c716>] ? nf_hook_slow+0x76/0x120 [<ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0 [<ffffffff8149757d>] ? ip_local_deliver_finish+0xdd/0x2d0 [<ffffffff81497808>] ? ip_local_deliver+0x98/0xa0 [<ffffffff81496ccd>] ? ip_rcv_finish+0x12d/0x440 [<ffffffff81497255>] ? ip_rcv+0x275/0x350 [<ffffffff8145cfeb>] ? __netif_receive_skb+0x4ab/0x750 ... With lockdep debugging: ===================================== [ BUG: bad unlock balance detected! ] ------------------------------------- CslRx/12087 is trying to release lock (slock-AF_INET) at: [<ffffffffa01bcae0>] sctp_generate_timeout_event+0x40/0xe0 [sctp] but there are no more locks to release! other info that might help us debug this: 2 locks held by CslRx/12087: #0: (&asoc->timers[i]){+.-...}, at: [<ffffffff8108ce1f>] run_timer_softirq+0x16f/0x3e0 #1: (slock-AF_INET){+.-...}, at: [<ffffffffa01bcac3>] sctp_generate_timeout_event+0x23/0xe0 [sctp] Ensure the socket taken is also the same one that is released by saving a copy of the socket before entering the timeout event critical section. Signed-off-by: Karl Heiss <[email protected]> Signed-off-by: David S. Miller <[email protected]> [wt: adjusted, 3.10 uses sctp_bh_unlock_sock() instead of bh_lock_sock()] Signed-off-by: Willy Tarreau <[email protected]>
commit d3e6952cfb7ba5f4bfa29d4803ba91f96ce1204d upstream. I ran into this: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000 RIP: 0010:[<ffffffff82bbf066>] [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710 RSP: 0018:ffff880111747bb8 EFLAGS: 00010286 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358 RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048 RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000 R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000 FS: 00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0 Stack: 0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220 ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232 ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000 Call Trace: [<ffffffff82bca542>] irda_connect+0x562/0x1190 [<ffffffff825ae582>] SYSC_connect+0x202/0x2a0 [<ffffffff825b4489>] SyS_connect+0x9/0x10 [<ffffffff8100334c>] do_syscall_64+0x19c/0x410 [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25 Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74 RIP [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710 RSP <ffff880111747bb8> ---[ end trace 4cda2588bc055b30 ]--- The problem is that irda_open_tsap() can fail and leave self->tsap = NULL, and then irttp_connect_request() almost immediately dereferences it. Cc: [email protected] Signed-off-by: Vegard Nossum <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>
commit 9f2dfda2f2f1c6181c3732c16b85c59ab2d195e0 upstream. An inverted return value check in hostfs_mknod() caused the function to return success after handling it as an error (and cleaning up). It resulted in the following segfault when trying to bind() a named unix socket: Pid: 198, comm: a.out Not tainted 4.4.0-rc4 RIP: 0033:[<0000000061077df6>] RSP: 00000000daae5d60 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 000000006092a460 RCX: 00000000dfc54208 RDX: 0000000061073ef1 RSI: 0000000000000070 RDI: 00000000e027d600 RBP: 00000000daae5de0 R08: 00000000da980ac0 R09: 0000000000000000 R10: 0000000000000003 R11: 00007fb1ae08f72a R12: 0000000000000000 R13: 000000006092a460 R14: 00000000daaa97c0 R15: 00000000daaa9a88 Kernel panic - not syncing: Kernel mode fault at addr 0x40, ip 0x61077df6 CPU: 0 PID: 198 Comm: a.out Not tainted 4.4.0-rc4 #1 Stack: e027d620 dfc54208 0000006f da981398 61bee000 0000c1ed daae5de0 0000006e e027d620 dfcd4208 00000005 6092a460 Call Trace: [<60dedc67>] SyS_bind+0xf7/0x110 [<600587be>] handle_syscall+0x7e/0x80 [<60066ad7>] userspace+0x3e7/0x4e0 [<6006321f>] ? save_registers+0x1f/0x40 [<6006c88e>] ? arch_prctl+0x1be/0x1f0 [<60054985>] fork_handler+0x85/0x90 Let's also get rid of the "cosmic ray protection" while we're at it. Fixes: e919305 "hostfs: fix races in dentry_name() and inode_name()" Signed-off-by: Vegard Nossum <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit dcc7fdbec53a960588f2c40232db2c6466c09917 upstream. v4l2-compliance sends a zeroed struct v4l2_streamparm in v4l2-test-formats.cpp::testParmType(), and this results in a division by 0 in some gspca subdrivers: divide error: 0000 [#1] SMP Modules linked in: gspca_ov534 gspca_main ... CPU: 0 PID: 17201 Comm: v4l2-compliance Not tainted 4.3.0-rc2-ao2 #1 Hardware name: System manufacturer System Product Name/M2N-E SLI, BIOS ASUS M2N-E SLI ACPI BIOS Revision 1301 09/16/2010 task: ffff8800818306c0 ti: ffff880095c4c000 task.ti: ffff880095c4c000 RIP: 0010:[<ffffffffa079bd62>] [<ffffffffa079bd62>] sd_set_streamparm+0x12/0x60 [gspca_ov534] RSP: 0018:ffff880095c4fce8 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffff8800c9522000 RCX: ffffffffa077a140 RDX: 0000000000000000 RSI: ffff880095e0c100 RDI: ffff8800c9522000 RBP: ffff880095e0c100 R08: ffffffffa077a100 R09: 00000000000000cc R10: ffff880067ec7740 R11: 0000000000000016 R12: ffffffffa07bb400 R13: 0000000000000000 R14: ffff880081b6a800 R15: 0000000000000000 FS: 00007fda0de78740(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000014630f8 CR3: 00000000cf349000 CR4: 00000000000006f0 Stack: ffffffffa07a6431 ffff8800c9522000 ffffffffa077656e 00000000c0cc5616 ffff8800c9522000 ffffffffa07a5e20 ffff880095e0c100 0000000000000000 ffff880067ec7740 ffffffffa077a140 ffff880067ec7740 0000000000000016 Call Trace: [<ffffffffa07a6431>] ? v4l_s_parm+0x21/0x50 [videodev] [<ffffffffa077656e>] ? vidioc_s_parm+0x4e/0x60 [gspca_main] [<ffffffffa07a5e20>] ? __video_do_ioctl+0x280/0x2f0 [videodev] [<ffffffffa07a5ba0>] ? video_ioctl2+0x20/0x20 [videodev] [<ffffffffa07a59b9>] ? video_usercopy+0x319/0x4e0 [videodev] [<ffffffff81182dc1>] ? page_add_new_anon_rmap+0x71/0xa0 [<ffffffff811afb92>] ? mem_cgroup_commit_charge+0x52/0x90 [<ffffffff81179b18>] ? handle_mm_fault+0xc18/0x1680 [<ffffffffa07a15cc>] ? v4l2_ioctl+0xac/0xd0 [videodev] [<ffffffff811c846f>] ? do_vfs_ioctl+0x28f/0x480 [<ffffffff811c86d4>] ? SyS_ioctl+0x74/0x80 [<ffffffff8154a8b6>] ? entry_SYSCALL_64_fastpath+0x16/0x75 Code: c7 93 d9 79 a0 5b 5d e9 f1 f3 9a e0 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 31 d2 48 89 fb 48 83 ec 08 8b 46 10 <f7> 76 0c 80 bf ac 0c 00 00 00 88 87 4e 0e 00 00 74 09 80 bf 4f RIP [<ffffffffa079bd62>] sd_set_streamparm+0x12/0x60 [gspca_ov534] RSP <ffff880095c4fce8> ---[ end trace 279710c2c6c72080 ]--- Following what the doc says about a zeroed timeperframe (see http://www.linuxtv.org/downloads/v4l-dvb-apis/vidioc-g-parm.html): ... To reset manually applications can just set this field to zero. fix the issue by resetting the frame rate to a default value in case of an unusable timeperframe. The fix is done in the subdrivers instead of gspca.c because only the subdrivers have notion of a default frame rate to reset the camera to. Signed-off-by: Antonio Ospite <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 8eee1d3ed5b6fc8e14389567c9a6f53f82bb7224 upstream. The bulk of ATA host state machine is implemented by ata_sff_hsm_move(). The function is called from either the interrupt handler or, if polling, a work item. Unlike from the interrupt path, the polling path calls the function without holding the host lock and ata_sff_hsm_move() selectively grabs the lock. This is completely broken. If an IRQ triggers while polling is in progress, the two can easily race and end up accessing the hardware and updating state machine state at the same time. This can put the state machine in an illegal state and lead to a crash like the following. kernel BUG at drivers/ata/libata-sff.c:1302! invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN Modules linked in: CPU: 1 PID: 10679 Comm: syz-executor Not tainted 4.5.0-rc1+ #300 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88002bd00000 ti: ffff88002e048000 task.ti: ffff88002e048000 RIP: 0010:[<ffffffff83a83409>] [<ffffffff83a83409>] ata_sff_hsm_move+0x619/0x1c60 ... Call Trace: <IRQ> [<ffffffff83a84c31>] __ata_sff_port_intr+0x1e1/0x3a0 drivers/ata/libata-sff.c:1584 [<ffffffff83a85611>] ata_bmdma_port_intr+0x71/0x400 drivers/ata/libata-sff.c:2877 [< inline >] __ata_sff_interrupt drivers/ata/libata-sff.c:1629 [<ffffffff83a85bf3>] ata_bmdma_interrupt+0x253/0x580 drivers/ata/libata-sff.c:2902 [<ffffffff81479f98>] handle_irq_event_percpu+0x108/0x7e0 kernel/irq/handle.c:157 [<ffffffff8147a717>] handle_irq_event+0xa7/0x140 kernel/irq/handle.c:205 [<ffffffff81484573>] handle_edge_irq+0x1e3/0x8d0 kernel/irq/chip.c:623 [< inline >] generic_handle_irq_desc include/linux/irqdesc.h:146 [<ffffffff811a92bc>] handle_irq+0x10c/0x2a0 arch/x86/kernel/irq_64.c:78 [<ffffffff811a7e4d>] do_IRQ+0x7d/0x1a0 arch/x86/kernel/irq.c:240 [<ffffffff86653d4c>] common_interrupt+0x8c/0x8c arch/x86/entry/entry_64.S:520 <EOI> [< inline >] rcu_lock_acquire include/linux/rcupdate.h:490 [< inline >] rcu_read_lock include/linux/rcupdate.h:874 [<ffffffff8164b4a1>] filemap_map_pages+0x131/0xba0 mm/filemap.c:2145 [< inline >] do_fault_around mm/memory.c:2943 [< inline >] do_read_fault mm/memory.c:2962 [< inline >] do_fault mm/memory.c:3133 [< inline >] handle_pte_fault mm/memory.c:3308 [< inline >] __handle_mm_fault mm/memory.c:3418 [<ffffffff816efb16>] handle_mm_fault+0x2516/0x49a0 mm/memory.c:3447 [<ffffffff8127dc16>] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238 [<ffffffff8127e358>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331 [<ffffffff8126f514>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264 [<ffffffff86655578>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986 Fix it by ensuring that the polling path is holding the host lock before entering ata_sff_hsm_move() so that all hardware accesses and state updates are performed under the host lock. Signed-off-by: Tejun Heo <[email protected]> Reported-and-tested-by: Dmitry Vyukov <[email protected]> Link: http://lkml.kernel.org/g/CACT4Y+b_JsOxJu2EZyEf+mOXORc_zid5V1-pLZSroJVxyWdSpw@mail.gmail.com Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 12e26969b32c79018165d52caff3762135614aa1 upstream. I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ #48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Fixes: 7a623c0 ("edac: rewrite the sysfs code to use struct device") Signed-off-by: Greg Kroah-Hartman <[email protected]>
Once we failed to merge inline data into inode page during flushing inline inode, we will skip invoking inode_dec_dirty_pages, which makes dirty page count incorrect, result in panic in ->evict_inode, Fix it. ------------[ cut here ]------------ kernel BUG at /home/yuchao/git/devf2fs/inode.c:336! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 3 PID: 10004 Comm: umount Tainted: G O 4.6.0-rc5+ #17 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 task: f0c33000 ti: c5212000 task.ti: c5212000 EIP: 0060:[<f89aacb5>] EFLAGS: 00010202 CPU: 3 EIP is at f2fs_evict_inode+0x85/0x490 [f2fs] EAX: 00000001 EBX: c4529ea0 ECX: 00000001 EDX: 00000000 ESI: c0131000 EDI: f89dd0a0 EBP: c5213e9c ESP: c5213e78 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 80050033 CR2: b75878c0 CR3: 1a36a700 CR4: 000406f0 Stack: c4529ea0 c4529ef4 c5213e8c c176d45c c4529ef4 00000000 c4529ea0 c4529fac f89dd0a0 c5213eb0 c1204a68 c5213ed8 c452a2b4 c6680930 c5213ec0 c1204b64 c6680d44 c6680620 c5213eec c120588d ee84b000 ee84b5c0 c5214000 ee84b5e0 Call Trace: [<c176d45c>] ? _raw_spin_unlock+0x2c/0x50 [<c1204a68>] evict+0xa8/0x170 [<c1204b64>] dispose_list+0x34/0x50 [<c120588d>] evict_inodes+0x10d/0x130 [<c11ea941>] generic_shutdown_super+0x41/0xe0 [<c1185190>] ? unregister_shrinker+0x40/0x50 [<c1185190>] ? unregister_shrinker+0x40/0x50 [<c11eac52>] kill_block_super+0x22/0x70 [<f89af23e>] kill_f2fs_super+0x1e/0x20 [f2fs] [<c11eae1d>] deactivate_locked_super+0x3d/0x70 [<c11eb383>] deactivate_super+0x43/0x60 [<c1208ec9>] cleanup_mnt+0x39/0x80 [<c1208f50>] __cleanup_mnt+0x10/0x20 [<c107d091>] task_work_run+0x71/0x90 [<c105725a>] exit_to_usermode_loop+0x72/0x9e [<c1001c7c>] do_fast_syscall_32+0x19c/0x1c0 [<c176dd48>] sysenter_past_esp+0x45/0x74 EIP: [<f89aacb5>] f2fs_evict_inode+0x85/0x490 [f2fs] SS:ESP 0068:c5213e78 ---[ end trace d30536330b7fdc58 ]--- Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset at any time like below. Thread #1 Thread #2 - lock_page(ipage) - update i_fields - update i_size/i_blocks/and so on - set FI_DIRTY_INODE - reset FI_DIRTY_INODE - set_page_dirty(ipage) In this case, we can lose the latest i_field information. Signed-off-by: Jaegeuk Kim <[email protected]>
This patch enhances the xattr consistency of dirs from suddern power-cuts. Possible scenario would be: 1. dir->setxattr used by per-file encryption 2. file->setxattr goes into inline_xattr 3. file->fsync In that case, we should do checkpoint for #1. Otherwise we'd lose dir's key information for the file given #2. Signed-off-by: Jaegeuk Kim <[email protected]>
tests/generic/251 of fstest suit complains us with below message: ------------[ cut here ]------------ invalid opcode: 0000 [#1] PREEMPT SMP CPU: 2 PID: 7698 Comm: fstrim Tainted: G O 4.7.0+ #21 task: e9f4e000 task.stack: e7262000 EIP: 0060:[<f89fcefe>] EFLAGS: 00010202 CPU: 2 EIP is at write_checkpoint+0xfde/0x1020 [f2fs] EAX: f33eb300 EBX: eecac310 ECX: 00000001 EDX: ffff0001 ESI: eecac000 EDI: eecac5f0 EBP: e7263dec ESP: e7263d18 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 80050033 CR2: b76ab01c CR3: 2eb89de0 CR4: 000406f0 Stack: 00000001 a220fb7b e9f4e000 00000002 419ff2d3 b3a05151 00000002 e9f4e5d8 e9f4e000 419ff2d3 b3a05151 eecac310 c10b8154 b3a05151 419ff2d3 c10b78bd e9f4e000 e9f4e000 e9f4e5d8 00000001 e9f4e000 ec409000 eecac2cc eecac288 Call Trace: [<c10b8154>] ? __lock_acquire+0x3c4/0x760 [<c10b78bd>] ? mark_held_locks+0x5d/0x80 [<f8a10632>] f2fs_trim_fs+0x1c2/0x2e0 [f2fs] [<f89e9f56>] f2fs_ioctl+0x6b6/0x10b0 [f2fs] [<c13d51df>] ? __this_cpu_preempt_check+0xf/0x20 [<c10b4281>] ? trace_hardirqs_off_caller+0x91/0x120 [<f89e98a0>] ? __exchange_data_block+0xd30/0xd30 [f2fs] [<c120b2e1>] do_vfs_ioctl+0x81/0x7f0 [<c11d57c5>] ? kmem_cache_free+0x245/0x2e0 [<c1217840>] ? get_unused_fd_flags+0x40/0x40 [<c1206eec>] ? putname+0x4c/0x50 [<c11f631e>] ? do_sys_open+0x16e/0x1d0 [<c1001990>] ? do_fast_syscall_32+0x30/0x1c0 [<c13d51df>] ? __this_cpu_preempt_check+0xf/0x20 [<c120baa8>] SyS_ioctl+0x58/0x80 [<c1001a01>] do_fast_syscall_32+0xa1/0x1c0 [<c178cc54>] sysenter_past_esp+0x45/0x74 EIP: [<f89fcefe>] write_checkpoint+0xfde/0x1020 [f2fs] SS:ESP 0068:e7263d18 ---[ end trace 4de95d7e6b3aa7c6 ]--- The reason is: with below call stack, we will encounter BUG_ON during doing fstrim. Thread A Thread B - write_checkpoint - do_checkpoint - f2fs_write_inode - update_inode_page - update_inode - set_page_dirty - f2fs_set_node_page_dirty - inc_page_count - percpu_counter_inc - set_sbi_flag(SBI_IS_DIRTY) - clear_sbi_flag(SBI_IS_DIRTY) Thread C Thread D - f2fs_write_node_page - set_node_addr - __set_nat_cache_dirty - nm_i->dirty_nat_cnt++ - do_vfs_ioctl - f2fs_ioctl - f2fs_trim_fs - write_checkpoint - f2fs_bug_on(nm_i->dirty_nat_cnt) Fix it by setting superblock dirty correctly in do_checkpoint and f2fs_write_node_page. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Previously, f2fs_write_begin sets PageUptodate all the time. But, when user tries to update the entire page (i.e., len == PAGE_SIZE), we need to consider that the page is able to be copied partially afterwards. In such the case, we will lose the remaing region in the page. This patch fixes this by setting PageUptodate in f2fs_write_end as given copied result. In the short copy case, it returns zero to let generic_perform_write retry copying user data again. As a result, f2fs_write_end() works: PageUptodate len copied return retry 1. no 4096 4096 4096 false -> return 4096 2. no 4096 1024 0 true -> goto #1 case 3. yes 2048 2048 2048 false -> return 2048 4. yes 2048 1024 1024 false -> return 1024 Suggested-by: Al Viro <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Protocol sockets (struct sock) don't have UIDs, but most of the time, they map 1:1 to userspace sockets (struct socket) which do. Various operations such as the iptables xt_owner match need access to the "UID of a socket", and do so by following the backpointer to the struct socket. This involves taking sk_callback_lock and doesn't work when there is no socket because userspace has already called close(). Simplify this by adding a sk_uid field to struct sock whose value matches the UID of the corresponding struct socket. The semantics are as follows: 1. Whenever sk_socket is non-null: sk_uid is the same as the UID in sk_socket, i.e., matches the return value of sock_i_uid. Specifically, the UID is set when userspace calls socket(), fchown(), or accept(). 2. When sk_socket is NULL, sk_uid is defined as follows: - For a socket that no longer has a sk_socket because userspace has called close(): the previous UID. - For a cloned socket (e.g., an incoming connection that is established but on which userspace has not yet called accept): the UID of the socket it was cloned from. - For a socket that has never had an sk_socket: UID 0 inside the user namespace corresponding to the network namespace the socket belongs to. Kernel sockets created by sock_create_kern are a special case of #1 and sk_uid is the user that created them. For kernel sockets created at network namespace creation time, such as the per-processor ICMP and TCP sockets, this is the user that created the network namespace. [Backport of net-next 86741ec25462e4c8cdce6df2f41ead05568c7d5e] Bug: 16355602 Change-Id: Idbc3e9a0cec91c4c6e01916b967b6237645ebe59 Signed-off-by: Lorenzo Colitti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit 3d5fe03) We can end up allocating a new compression stream with GFP_KERNEL from within the IO path, which may result is nested (recursive) IO operations. That can introduce problems if the IO path in question is a reclaimer, holding some locks that will deadlock nested IOs. Allocate streams and working memory using GFP_NOIO flag, forbidding recursive IO and FS operations. An example: inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage. git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes: (jbd2_handle){+.+.?.}, at: start_this_handle+0x4ca/0x555 {IN-RECLAIM_FS-W} state was registered at: __lock_acquire+0x8da/0x117b lock_acquire+0x10c/0x1a7 start_this_handle+0x52d/0x555 jbd2__journal_start+0xb4/0x237 __ext4_journal_start_sb+0x108/0x17e ext4_dirty_inode+0x32/0x61 __mark_inode_dirty+0x16b/0x60c iput+0x11e/0x274 __dentry_kill+0x148/0x1b8 shrink_dentry_list+0x274/0x44a prune_dcache_sb+0x4a/0x55 super_cache_scan+0xfc/0x176 shrink_slab.part.14.constprop.25+0x2a2/0x4d3 shrink_zone+0x74/0x140 kswapd+0x6b7/0x930 kthread+0x107/0x10f ret_from_fork+0x3f/0x70 irq event stamp: 138297 hardirqs last enabled at (138297): debug_check_no_locks_freed+0x113/0x12f hardirqs last disabled at (138296): debug_check_no_locks_freed+0x33/0x12f softirqs last enabled at (137818): __do_softirq+0x2d3/0x3e9 softirqs last disabled at (137813): irq_exit+0x41/0x95 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(jbd2_handle); <Interrupt> lock(jbd2_handle); *** DEADLOCK *** 5 locks held by git/20158: #0: (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b #1: (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3 #2: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b #3: (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b #4: (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 stack backtrace: CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211 Call Trace: dump_stack+0x4c/0x6e mark_lock+0x384/0x56d mark_held_locks+0x5f/0x76 lockdep_trace_alloc+0xb2/0xb5 kmem_cache_alloc_trace+0x32/0x1e2 zcomp_strm_alloc+0x25/0x73 [zram] zcomp_strm_multi_find+0xe7/0x173 [zram] zcomp_strm_find+0xc/0xe [zram] zram_bvec_rw+0x2ca/0x7e0 [zram] zram_make_request+0x1fa/0x301 [zram] generic_make_request+0x9c/0xdb submit_bio+0xf7/0x120 ext4_io_submit+0x2e/0x43 ext4_bio_write_page+0x1b7/0x300 mpage_submit_page+0x60/0x77 mpage_map_and_submit_buffers+0x10f/0x21d ext4_writepages+0xc8c/0xe1b do_writepages+0x23/0x2c __filemap_fdatawrite_range+0x84/0x8b filemap_flush+0x1c/0x1e ext4_alloc_da_blocks+0xb8/0x117 ext4_rename+0x132/0x6dc ? mark_held_locks+0x5f/0x76 ext4_rename2+0x29/0x2b vfs_rename+0x540/0x636 SyS_renameat2+0x359/0x44d SyS_rename+0x1e/0x20 entry_SYSCALL_64_fastpath+0x12/0x6f [[email protected]: add stable mark] Signed-off-by: Sergey Senozhatsky <[email protected]> Acked-by: Minchan Kim <[email protected]> Cc: Kyeongdon Kim <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
commit e9b736d88af1a143530565929390cadf036dc799 upstream. The class of 4 n_hdls buf locks is the same because a single function n_hdlc_buf_list_init is used to init all the locks. But since flush_tx_queue takes n_hdlc->tx_buf_list.spinlock and then calls n_hdlc_buf_put which takes n_hdlc->tx_free_buf_list.spinlock, lockdep emits a warning: ============================================= [ INFO: possible recursive locking detected ] 4.3.0-25.g91e30a7-default #1 Not tainted --------------------------------------------- a.out/1248 is trying to acquire lock: (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fd020>] n_hdlc_buf_put+0x20/0x60 [n_hdlc] but task is already holding lock: (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fdc07>] n_hdlc_tty_ioctl+0x127/0x1d0 [n_hdlc] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&list->spinlock)->rlock); lock(&(&list->spinlock)->rlock); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by a.out/1248: #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff814c9eb0>] tty_ldisc_ref_wait+0x20/0x50 #1: (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fdc07>] n_hdlc_tty_ioctl+0x127/0x1d0 [n_hdlc] ... Call Trace: ... [<ffffffff81738fd0>] _raw_spin_lock_irqsave+0x50/0x70 [<ffffffffa01fd020>] n_hdlc_buf_put+0x20/0x60 [n_hdlc] [<ffffffffa01fdc24>] n_hdlc_tty_ioctl+0x144/0x1d0 [n_hdlc] [<ffffffff814c25c1>] tty_ioctl+0x3f1/0xe40 ... Fix it by initializing the spin_locks separately. This removes also reduntand memset of a freshly kzallocated space. Signed-off-by: Jiri Slaby <[email protected]> Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Jiri Slaby <[email protected]> (cherry picked from commit a1b2f63f3024a235216883ab0c90b81bcedfc264) Signed-off-by: Jerry Zhang <[email protected]> Signed-off-by: Jerry Zhang <[email protected]>
The logic in __ip6_append_data() assumes that the MTU is at least large enough for the headers. A device's MTU may be adjusted after being added while sendmsg() is processing data, resulting in __ip6_append_data() seeing any MTU. For an mtu smaller than the size of the fragmentation header, the math results in a negative 'maxfraglen', which causes problems when refragmenting any previous skb in the skb_write_queue, leaving it possibly malformed. Instead sendmsg returns EINVAL when the mtu is calculated to be less than IPV6_MIN_MTU. Found by syzkaller: kernel BUG at ./include/linux/skbuff.h:2064! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 14216 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff8801d0b68580 task.stack: ffff8801ac6b8000 RIP: 0010:__skb_pull include/linux/skbuff.h:2064 [inline] RIP: 0010:__ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: 0018:ffff8801ac6bf570 EFLAGS: 00010216 RAX: 0000000000010000 RBX: 0000000000000028 RCX: ffffc90003cce000 RDX: 00000000000001b8 RSI: ffffffff839df06f RDI: ffff8801d9478ca0 RBP: ffff8801ac6bf780 R08: ffff8801cc3f1dbc R09: 0000000000000000 R10: ffff8801ac6bf7a0 R11: 43cb4b7b1948a9e7 R12: ffff8801cc3f1dc8 R13: ffff8801cc3f1d40 R14: 0000000000001036 R15: dffffc0000000000 FS: 00007f43d740c700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7834984000 CR3: 00000001d79b9000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip6_finish_skb include/net/ipv6.h:911 [inline] udp_v6_push_pending_frames+0x255/0x390 net/ipv6/udp.c:1093 udpv6_sendmsg+0x280d/0x31a0 net/ipv6/udp.c:1363 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x352/0x5a0 net/socket.c:1750 SyS_sendto+0x40/0x50 net/socket.c:1718 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x4512e9 RSP: 002b:00007f43d740bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000007180a8 RCX: 00000000004512e9 RDX: 000000000000002e RSI: 0000000020d08000 RDI: 0000000000000005 RBP: 0000000000000086 R08: 00000000209c1000 R09: 000000000000001c R10: 0000000000040800 R11: 0000000000000216 R12: 00000000004b9c69 R13: 00000000ffffffff R14: 0000000000000005 R15: 00000000202c2000 Code: 9e 01 fe e9 c5 e8 ff ff e8 7f 9e 01 fe e9 4a ea ff ff 48 89 f7 e8 52 9e 01 fe e9 aa eb ff ff e8 a8 b6 cf fd 0f 0b e8 a1 b6 cf fd <0f> 0b 49 8d 45 78 4d 8d 45 7c 48 89 85 78 fe ff ff 49 8d 85 ba RIP: __skb_pull include/linux/skbuff.h:2064 [inline] RSP: ffff8801ac6bf570 RIP: __ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: ffff8801ac6bf570 Reported-by: syzbot <[email protected]> Signed-off-by: Mike Maloney <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> (cherry picked from commit 749439bfac6e1a2932c582e2699f91d329658196) Bug: 65023306 Change-Id: I3b713621c749b7fd3a070116be8996ae2e2dd6e8 Signed-off-by: Greg Hackmann <[email protected]>
This is as much for the purpose of getting this patch out somewhere as actually requesting that you pull it. This is enough to support LXC on a Nexus 9, but there's still problems which mean that I don't have it fully working yet--you can't mount more than one cgroup on the same mount point for some reason, which breaks the LXC Android wrapper script.
Anyways, in the off chance that you're interested, I built it on your kernel sources, so I figured I'd make a pull request.
Nearly every change is a result of older code which expects
int
s for UID or GID. However, with namespaces, you need to usekgid_t
orkuid_t
structures, respectively.