Skip to content

Conversation

@mmirg
Copy link

@mmirg mmirg commented May 6, 2013

This applies a patch written by Dirk Hohndel to convert the right alt
key to a Fn key. This allows functions such as Pg Up to be mapped to
right alt + up.

This applies a patch written by Dirk Hohndel to convert the right alt
key to a Fn key.  This allows functions such as Pg Up to be mapped to
right alt + up.
@brocktice
Copy link
Owner

I have already pulled this in locally and rebuilt, am going to test later today and then will merge. It might be nice to have a kernel config option for this so that it can be merged without permanently changing the behavior.

Reversed Dirk Hohndel keyboard patch for now.  I folded in patches to
the proximity sensor that add features like IR sensing.  Set
CONFIG_SENSORS_ISL29018 in kernel config to compile the driver.  The
accompanying documentation explains where to find the device in sysfs.
@mmirg
Copy link
Author

mmirg commented May 8, 2013

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Fair enough. I'm also as of yet unconvinced that this is the best
solution, at least one concern that I have is that it seems like
people using non-English keyboard layouts may lose access to AltGr if
Alt_R is forcibly mapped to Fn. There was a somewhat spirited
discussion on Google+ between Dirk and Dimitry Torokhov about the
merits of different approaches for handling the keyboard.

I reversed this patch on my machine and have gone back to fiddling
with xmodmap to setup my keyboard as I like.

On 05/06/2013 11:29 AM, brocktice wrote:

I am going to test this later today and then will merge. It might
be nice to have a kernel config option for this so that it can be
merged without permanently changing the behavior.

— Reply to this email directly or view it on GitHub
#2 (comment).

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)

iEYEARECAAYFAlGJyWUACgkQ2bvmSlxv+a5kIQCg22a25fYn5dQwnQRu2pza5Ad5
QsEAnRMQ41IzSkGOsg6giTNBKVCq4QL8
=HhDB
-----END PGP SIGNATURE-----

brocktice pushed a commit that referenced this pull request May 18, 2013
Commit 2353f2b ("HID: protect hid_debug_list") introduced mutex
locking around debug_list access to prevent SMP races when debugfs
nodes are being operated upon by multiple userspace processess.

mutex is not a proper synchronization primitive though, as the hid-debug
callbacks are being called from atomic contexts.

We also have to be careful about disabling IRQs when taking the lock
to prevent deadlock against IRQ handlers.

Benjamin reports this has also been reported in RH bugzilla as bug #958935.

 ===============================
 [ INFO: suspicious RCU usage. ]
 3.9.0+ #94 Not tainted
 -------------------------------
 include/linux/rcupdate.h:476 Illegal context switch in RCU read-side critical section!

 other info that might help us debug this:

 rcu_scheduler_active = 1, debug_locks = 0
 4 locks held by Xorg/5502:
  #0:  (&evdev->mutex){+.+...}, at: [<ffffffff81512c3d>] evdev_write+0x6d/0x160
  #1:  (&(&dev->event_lock)->rlock#2){-.-...}, at: [<ffffffff8150dd9b>] input_inject_event+0x5b/0x230
  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff8150dd82>] input_inject_event+0x42/0x230
  #3:  (&(&usbhid->lock)->rlock){-.....}, at: [<ffffffff81565289>] usb_hidinput_input_event+0x89/0x120

 stack backtrace:
 CPU: 0 PID: 5502 Comm: Xorg Not tainted 3.9.0+ #94
 Hardware name: Dell Inc. OptiPlex 390/0M5DCD, BIOS A09 07/24/2012
  0000000000000001 ffff8800689c7c38 ffffffff816f249f ffff8800689c7c68
  ffffffff810acb1d 0000000000000000 ffffffff81a03ac7 000000000000019d
  0000000000000000 ffff8800689c7c90 ffffffff8107cda7 0000000000000000
 Call Trace:
  [<ffffffff816f249f>] dump_stack+0x19/0x1b
  [<ffffffff810acb1d>] lockdep_rcu_suspicious+0xfd/0x130
  [<ffffffff8107cda7>] __might_sleep+0xc7/0x230
  [<ffffffff816f7770>] mutex_lock_nested+0x40/0x3a0
  [<ffffffff81312ac4>] ? vsnprintf+0x354/0x640
  [<ffffffff81553cc4>] hid_debug_event+0x34/0x100
  [<ffffffff81554197>] hid_dump_input+0x67/0xa0
  [<ffffffff81556430>] hid_set_field+0x50/0x120
  [<ffffffff8156529a>] usb_hidinput_input_event+0x9a/0x120
  [<ffffffff8150d89e>] input_handle_event+0x8e/0x530
  [<ffffffff8150df10>] input_inject_event+0x1d0/0x230
  [<ffffffff8150dd82>] ? input_inject_event+0x42/0x230
  [<ffffffff81512cae>] evdev_write+0xde/0x160
  [<ffffffff81185038>] vfs_write+0xc8/0x1f0
  [<ffffffff81185535>] SyS_write+0x55/0xa0
  [<ffffffff81704482>] system_call_fastpath+0x16/0x1b
 BUG: sleeping function called from invalid context at kernel/mutex.c:413
 in_atomic(): 1, irqs_disabled(): 1, pid: 5502, name: Xorg
 INFO: lockdep is turned off.
 irq event stamp: 1098574
 hardirqs last  enabled at (1098573): [<ffffffff816fb53f>] _raw_spin_unlock_irqrestore+0x3f/0x70
 hardirqs last disabled at (1098574): [<ffffffff816faaf5>] _raw_spin_lock_irqsave+0x25/0xa0
 softirqs last  enabled at (1098306): [<ffffffff8104971f>] __do_softirq+0x18f/0x3c0
 softirqs last disabled at (1097867): [<ffffffff81049ad5>] irq_exit+0xa5/0xb0
 CPU: 0 PID: 5502 Comm: Xorg Not tainted 3.9.0+ #94
 Hardware name: Dell Inc. OptiPlex 390/0M5DCD, BIOS A09 07/24/2012
  ffffffff81a03ac7 ffff8800689c7c68 ffffffff816f249f ffff8800689c7c90
  ffffffff8107ce60 0000000000000000 ffff8800689c7fd8 ffff88006a62c800
  ffff8800689c7d10 ffffffff816f7770 ffff8800689c7d00 ffffffff81312ac4
 Call Trace:
  [<ffffffff816f249f>] dump_stack+0x19/0x1b
  [<ffffffff8107ce60>] __might_sleep+0x180/0x230
  [<ffffffff816f7770>] mutex_lock_nested+0x40/0x3a0
  [<ffffffff81312ac4>] ? vsnprintf+0x354/0x640
  [<ffffffff81553cc4>] hid_debug_event+0x34/0x100
  [<ffffffff81554197>] hid_dump_input+0x67/0xa0
  [<ffffffff81556430>] hid_set_field+0x50/0x120
  [<ffffffff8156529a>] usb_hidinput_input_event+0x9a/0x120
  [<ffffffff8150d89e>] input_handle_event+0x8e/0x530
  [<ffffffff8150df10>] input_inject_event+0x1d0/0x230
  [<ffffffff8150dd82>] ? input_inject_event+0x42/0x230
  [<ffffffff81512cae>] evdev_write+0xde/0x160
  [<ffffffff81185038>] vfs_write+0xc8/0x1f0
  [<ffffffff81185535>] SyS_write+0x55/0xa0
  [<ffffffff81704482>] system_call_fastpath+0x16/0x1b

Reported-by: majianpeng <majianpeng@gmail.com>
Reported-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
brocktice pushed a commit that referenced this pull request May 18, 2013
Older linux clients match the 'sec=' mount option flavor against the server's
flavor list (if available) and return EPERM if the specified flavor or AUTH_NULL
(which "matches" any flavor) is not found.

Recent changes skip this step and allow the vfs mount even though no operations
will succeed, creating a 'dud' mount.

This patch reverts back to the old behavior of matching specified flavors
against the server list and also returns EPERM when no sec= is specified and
none of the flavors returned by the server are supported by the client.

Example of behavior change:

the server's /etc/exports:

/export/krb5      *(sec=krb5,rw,no_root_squash)

old client behavior:

$ uname -a
Linux one.apikia.fake 3.8.8-202.fc18.x86_64 #1 SMP Wed Apr 17 23:25:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ sudo mount -v -o sec=sys,vers=3 zero:/export/krb5 /mnt
mount.nfs: timeout set for Sun May  5 17:32:04 2013
mount.nfs: trying text-based options 'sec=sys,vers=3,addr=192.168.100.10'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 192.168.100.10 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 192.168.100.10 prog 100005 vers 3 prot UDP port 20048
mount.nfs: mount(2): Permission denied
mount.nfs: access denied by server while mounting zero:/export/krb5

recently changed behavior:

$ uname -a
Linux one.apikia.fake 3.9.0-testing+ #2 SMP Fri May 3 20:29:32 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
$ sudo mount -v -o sec=sys,vers=3 zero:/export/krb5 /mnt
mount.nfs: timeout set for Sun May  5 17:37:17 2013
mount.nfs: trying text-based options 'sec=sys,vers=3,addr=192.168.100.10'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 192.168.100.10 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 192.168.100.10 prog 100005 vers 3 prot UDP port 20048
$ ls /mnt
ls: cannot open directory /mnt: Permission denied
$ sudo ls /mnt
ls: cannot open directory /mnt: Permission denied
$ sudo df /mnt
df: ‘/mnt’: Permission denied
df: no file systems processed
$ sudo umount /mnt
$

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
brocktice pushed a commit that referenced this pull request May 18, 2013
o Deadlock case #1

Thread 1:
- writeback_sb_inodes
 - do_writepages
  - f2fs_write_data_pages
   - write_cache_pages
    - f2fs_write_data_page
     - f2fs_balance_fs
      - wait mutex_lock(gc_mutex)

Thread 2:
- f2fs_balance_fs
 - mutex_lock(gc_mutex)
 - f2fs_gc
  - f2fs_iget
   - wait iget_locked(inode->i_lock)

Thread 3:
- do_unlinkat
 - iput
  - lock(inode->i_lock)
   - evict
    - inode_wait_for_writeback

o Deadlock case #2

Thread 1:
- __writeback_single_inode
 : set I_SYNC
  - do_writepages
   - f2fs_write_data_page
    - f2fs_balance_fs
     - f2fs_gc
      - iput
       - evict
        - inode_wait_for_writeback(I_SYNC)

In order to avoid this, even though iput is called with the zero-reference
count, we need to stop the eviction procedure if the inode is on writeback.
So this patch links f2fs_drop_inode which checks the I_SYNC flag.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
brocktice pushed a commit that referenced this pull request May 18, 2013
Pull xfs update (#2) from Ben Myers:

 - add CONFIG_XFS_WARN, a step between zero debugging and
   CONFIG_XFS_DEBUG.

 - fix attrmulti and attrlist to fall back to vmalloc when kmalloc
   fails.

* tag 'for-linus-v3.10-rc1-2' of git://oss.sgi.com/xfs/xfs:
  xfs: fallback to vmalloc for large buffers in xfs_compat_attrlist_by_handle
  xfs: fallback to vmalloc for large buffers in xfs_attrlist_by_handle
  xfs: introduce CONFIG_XFS_WARN
jsgf pushed a commit to jsgf/pixel_linux that referenced this pull request Jun 7, 2013
An inactive timer's base can refer to a offline cpu's base.

In the current code, cpu_base's lock is blindly reinitialized each
time a CPU is brought up. If a CPU is brought online during the period
that another thread is trying to modify an inactive timer on that CPU
with holding its timer base lock, then the lock will be reinitialized
under its feet. This leads to following SPIN_BUG().

<0> BUG: spinlock already unlocked on CPU#3, kworker/u:3/1466
<0> lock: 0xe3ebe000, .magic: dead4ead, .owner: kworker/u:3/1466, .owner_cpu: 1
<4> [<c0013dc4>] (unwind_backtrace+0x0/0x11c) from [<c026e794>] (do_raw_spin_unlock+0x40/0xcc)
<4> [<c026e794>] (do_raw_spin_unlock+0x40/0xcc) from [<c076c160>] (_raw_spin_unlock+0x8/0x30)
<4> [<c076c160>] (_raw_spin_unlock+0x8/0x30) from [<c009b858>] (mod_timer+0x294/0x310)
<4> [<c009b858>] (mod_timer+0x294/0x310) from [<c00a5e04>] (queue_delayed_work_on+0x104/0x120)
<4> [<c00a5e04>] (queue_delayed_work_on+0x104/0x120) from [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c)
<4> [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c) from [<c04d8780>] (sdhci_disable+0x40/0x48)
<4> [<c04d8780>] (sdhci_disable+0x40/0x48) from [<c04bf300>] (mmc_release_host+0x4c/0xb0)
<4> [<c04bf300>] (mmc_release_host+0x4c/0xb0) from [<c04c7aac>] (mmc_sd_detect+0x90/0xfc)
<4> [<c04c7aac>] (mmc_sd_detect+0x90/0xfc) from [<c04c2504>] (mmc_rescan+0x7c/0x2c4)
<4> [<c04c2504>] (mmc_rescan+0x7c/0x2c4) from [<c00a6a7c>] (process_one_work+0x27c/0x484)
<4> [<c00a6a7c>] (process_one_work+0x27c/0x484) from [<c00a6e94>] (worker_thread+0x210/0x3b0)
<4> [<c00a6e94>] (worker_thread+0x210/0x3b0) from [<c00aad9c>] (kthread+0x80/0x8c)
<4> [<c00aad9c>] (kthread+0x80/0x8c) from [<c000ea80>] (kernel_thread_exit+0x0/0x8)

As an example, this particular crash occurred when CPU brocktice#3 is executing
mod_timer() on an inactive timer whose base is refered to offlined CPU
brocktice#2.  The code locked the timer_base corresponding to CPU brocktice#2. Before it
could proceed, CPU brocktice#2 came online and reinitialized the spinlock
corresponding to its base. Thus now CPU brocktice#3 held a lock which was
reinitialized. When CPU brocktice#3 finally ended up unlocking the old cpu_base
corresponding to CPU brocktice#2, we hit the above SPIN_BUG().

CPU #0		CPU brocktice#3				       CPU brocktice#2
------		-------				       -------
.....		 ......				      <Offline>
		mod_timer()
		 lock_timer_base
		   spin_lock_irqsave(&base->lock)

cpu_up(2)	 .....				        ......
							init_timers_cpu()
....		 .....				    	spin_lock_init(&base->lock)
.....		   spin_unlock_irqrestore(&base->lock)  ......
		   <spin_bug>

Allocation of per_cpu timer vector bases is done only once under
"tvec_base_done[]" check. In the current code, spinlock_initialization
of base->lock isn't under this check. When a CPU is up each time the
base lock is reinitialized. Move base spinlock initialization under
the check.

Signed-off-by: Tirupathi Reddy <tirupath@codeaurora.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1368520142-4136-1-git-send-email-tirupath@codeaurora.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
jsgf pushed a commit to jsgf/pixel_linux that referenced this pull request Jun 7, 2013
_enable_preprogram is marked as __init, but is called from _enable
which is not. Without this patch, the board oopses after init. Tested
on custom hardware and on beagle board xM. Otherwise we can get:

Unable to handle kernel paging request at virtual address 000b0012
pgd = cf968000
*pgd=8fb06831, *pte=00000000, *ppte=00000000
PREEMPT ARM
Modules linked in:
CPU: 0    Not tainted  (3.9.0 brocktice#2)
PC is at _enable_preprogram+0x1c/0x24
LR is at omap_hwmod_enable+0x34/0x60
   psr: 80000093
sp : cf95de08  ip : 00002de5  fp : bec33d4c
r10: 00000000  r9 : 00000002  r8 : b6dd2c78
r7 : 00000004  r6 : 00000000  r5 : a0000013  r4 : cf95c000
r3 : 00000000  r2 : b6dd2c7c  r1 : 00000000  r0 : 000b0012
Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 8f968019  DAC: 00000015
Process otpcmd (pid: 607, stack limit = 0xcf95c230)
Stack: (0xcf95de08 to 0xcf95e000)
de00:                   00000001 cf91f840 00000000 c001d6fc 00000002 cf91f840
de20: cf8f7e10 c001de54 cf8f7e10 c001de78 c001de68 c01d5e80 00000000 cf8f7e10
de40: cf8f7e10 c01d5f28 cf8f7e10 c0530d30 00000000 c01d6f28 00000000 c0088664
de60: b6ea1000 cfb05284 cf95c000 00000001 cf95c000 60000013 00000001 cf95dee4
de80: cf870050 c01d7308 cf870010 cf870050 00000001 c0278b14 c0526f28 00000000
dea0: cf870050 ffff8e18 00000001 cf95dee4 00000000 c0274f7c cf870050 00000001
dec0: cf95dee4 cf1d8484 000000e0 c0276464 00000008 cf9c0000 00000007 c0276980
dee0: cf9c0000 00000064 00000008 cf1d8404 cf1d8400 c01cc05c 0000270a cf1d8504
df00: 00000023 cf1d8484 00000007 c01cc670 00000bdd 00000001 00000000 cf449e60
df20: cf1dde70 cf1d8400 bec33d18 cf1d8504 c0246f00 00000003 cf95c000 00000000
df40: bec33d4c c01cd078 00000003 cf1d8504 00000081 c01cbcb8 bec33d18 00000003
df60: bec33d18 c00a9034 00002000 c00a9c68 cf92fe00 00000003 c0246f00 cf92fe00
df80: 00000000 c00a9cb0 00000003 00000000 00008e70 00000000 b6f17000 00000036
dfa0: c000e484 c000e300 00008e70 00000000 00000003 c0246f00 bec33d18 bec33d18
dfc0: 00008e70 00000000 b6f17000 00000036 00000000 00000000 b6f6d000 bec33d4c
dfe0: b6ea1bd0 bec33d0c 00008c9c b6ea1bdc 60000010 00000003 00000000 00000000
(_omap_device_enable_hwmods+0x20/0x34)
(omap_device_enable+0x3c/0x50)
(_od_runtime_resume+0x10/0x1c)
(__rpm_callback+0x54/0x98)
(rpm_callback+0x64/0x7c)
(rpm_resume+0x434/0x554)
(__pm_runtime_resume+0x48/0x74)
(omap_i2c_xfer+0x28/0xe8)
(__i2c_transfer+0x3c/0x78)
(i2c_transfer+0x6c/0xc0)
(i2c_master_send+0x38/0x48)
(sha204p_send_command+0x60/0x9c)
(sha204c_send_and_receive+0x5c/0x1e0)
(sha204m_read+0x94/0xa0)
(otp_do_read+0x50/0xa4)
(vfs_ioctl+0x24/0x40)
(do_vfs_ioctl+0x1b0/0x1c0)
(sys_ioctl+0x38/0x54)
(ret_fast_syscall+0x0/0x30)
Code: e1a08002 ea000009 e598003c e592c05c (e7904003)

Cc: stable@vger.kernel.org
Signed-off-by: Jean-Philippe Fran=C3=A7ois <jp.francois@cynove.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
[tony@atomide.com: updated description with oops]
Signed-off-by: Tony Lindgren <tony@atomide.com>
jsgf pushed a commit to jsgf/pixel_linux that referenced this pull request Jun 7, 2013
i2c: suppress lockdep warning on delete_device

Since commit 846f997 the following lockdep
warning is thrown in case i2c device is removed (via delete_device sysfs
attribute) which contains subdevices (e.g. i2c multiplexer):

=============================================
[ INFO: possible recursive locking detected ]
3.8.7-0-sampleversion-fct #8 Tainted: G           O
---------------------------------------------
bash/3743 is trying to acquire lock:
  (s_active#110){++++.+}, at: [<ffffffff802b3048>] sysfs_hash_and_remove+0x58/0xc8

but task is already holding lock:
  (s_active#110){++++.+}, at: [<ffffffff802b3cb8>] sysfs_write_file+0xc8/0x208

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(s_active#110);
   lock(s_active#110);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

4 locks held by bash/3743:
  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff802b3c3c>] sysfs_write_file+0x4c/0x208
  brocktice#1:  (s_active#110){++++.+}, at: [<ffffffff802b3cb8>] sysfs_write_file+0xc8/0x208
  brocktice#2:  (&adap->userspace_clients_lock/1){+.+.+.}, at: [<ffffffff80454a18>] i2c_sysfs_delete_device+0x90/0x238
  brocktice#3:  (&__lockdep_no_validate__){......}, at: [<ffffffff803dcc24>] device_release_driver+0x24/0x48

stack backtrace:
Call Trace:
[<ffffffff80575cc8>] dump_stack+0x8/0x34
[<ffffffff801b50fc>] __lock_acquire+0x161c/0x2110
[<ffffffff801b5c3c>] lock_acquire+0x4c/0x70
[<ffffffff802b60cc>] sysfs_addrm_finish+0x19c/0x1e0
[<ffffffff802b3048>] sysfs_hash_and_remove+0x58/0xc8
[<ffffffff802b7d8c>] sysfs_remove_group+0x64/0x148
[<ffffffff803d990c>] device_remove_attrs+0x9c/0x1a8
[<ffffffff803d9b1c>] device_del+0x104/0x1d8
[<ffffffff803d9c18>] device_unregister+0x28/0x70
[<ffffffff8045505c>] i2c_del_adapter+0x1cc/0x328
[<ffffffff8045802c>] i2c_del_mux_adapter+0x14/0x38
[<ffffffffc025c108>] pca954x_remove+0x90/0xe0 [pca954x]
[<ffffffff804542f8>] i2c_device_remove+0x80/0xe8
[<ffffffff803dca9c>] __device_release_driver+0x74/0xf8
[<ffffffff803dcc2c>] device_release_driver+0x2c/0x48
[<ffffffff803dbc14>] bus_remove_device+0x13c/0x1d8
[<ffffffff803d9b24>] device_del+0x10c/0x1d8
[<ffffffff803d9c18>] device_unregister+0x28/0x70
[<ffffffff80454b08>] i2c_sysfs_delete_device+0x180/0x238
[<ffffffff802b3cd4>] sysfs_write_file+0xe4/0x208
[<ffffffff8023ddc4>] vfs_write+0xbc/0x160
[<ffffffff8023df6c>] SyS_write+0x54/0xd8
[<ffffffff8013d424>] handle_sys64+0x44/0x64

The problem is already known for USB and PCI subsystems. The reason is that
delete_device attribute is defined statically in i2c-core.c and used for all
devices in i2c subsystem.

Discussion of original USB problem:
http://lkml.indiana.edu/hypermail/linux/kernel/1204.3/01160.html

Commit 356c05d introduced new macro to suppress
lockdep warnings for this special case and included workaround for USB code.

LKML discussion of the workaround:
http://lkml.indiana.edu/hypermail/linux/kernel/1205.1/03634.html

As i2c case is in principle the same, the same workaround could be used here.

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nsn.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
jsgf pushed a commit to jsgf/pixel_linux that referenced this pull request Jun 7, 2013
This manifested as grep failing psuedo-randomly:

-------------->8---------------------
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$
[ARCLinux]$ ip address show lo | grep inet
    inet 127.0.0.1/8 scope host lo
-------------->8---------------------

ARC700 MMU provides fully orthogonal permission bits per page:
Ur, Uw, Ux, Kr, Kw, Kx

The user mode page permission templates used to have all Kernel mode
access bits enabled.
This caused a tricky race condition observed with uClibc buffered file
read and UNIX pipes.

1. Read access to an anon mapped page in libc .bss: write-protected
   zero_page mapped: TLB Entry installed with Ur + K[rwx]

2. grep calls libc:getc() -> buffered read layer calls read(2) with the
   internal read buffer in same .bss page.
   The read() call is on STDIN which has been redirected to a pipe.
   read(2) => sys_read() => pipe_read() => copy_to_user()

3. Since page has Kernel-write permission (despite being user-mode
   write-protected), copy_to_user() suceeds w/o taking a MMU TLB-Miss
   Exception (page-fault for ARC). core-MM is unaware that kernel
   erroneously wrote to the reserved read-only zero-page (BUG brocktice#1)

4. Control returns to userspace which now does a write to same .bss page
   Since Linux MM is not aware that page has been modified by kernel, it
   simply reassigns a new writable zero-init page to mapping, loosing the
   prior write by kernel - effectively zero'ing out the libc read buffer
   under the hood - hence grep doesn't see right data (BUG brocktice#2)

The fix is to make all kernel-mode access permissions mirror the
user-mode ones. Note that the kernel still has full access to pages,
when accessed directly (w/o MMU) - this fix ensures that kernel-mode
access in copy_to_from() path uses the same faulting access model as for
pure user accesses to keep MM fully aware of page state.

The issue is peudo-random because it only shows up if the TLB entry
installed in brocktice#1 is present at the time of brocktice#3. If it is evicted out, due
to TLB pressure or some-such, then copy_to_user() does take a TLB Miss
Exception, with a routine write-to-anon COW processing installing a
fresh page for kernel writes and also usable as it is in userspace.

Further the issue was dormant for so long as it depends on where the
libc internal read buffer (in .bss) is mapped at runtime.
If it happens to reside in file-backed data mapping of libc (in the
page-aligned slack space trailing the file backed data), loader zero
padding the slack space, does the early cow page replacement, setting
things up at the very beginning itself.

With gcc 4.8 based builds, the libc buffer got pushed out to a real
anon mapping which triggers the issue.

Reported-by: Anton Kolesov <akolesov@synopsys.com>
Cc: <stable@vger.kernel.org> # 3.9
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
brocktice pushed a commit that referenced this pull request Jul 23, 2013
Sasha Levin noticed that the warning introduced by commit 6286ae9
("slab: Return NULL for oversized allocations) is being triggered:

  WARNING: CPU: 15 PID: 21519 at mm/slab_common.c:376 kmalloc_slab+0x2f/0xb0()
  can: request_module (can-proto-4) failed.
  mpoa: proc_mpc_write: could not parse ''
  Modules linked in:
  CPU: 15 PID: 21519 Comm: trinity-child15 Tainted: G W    3.10.0-rc4-next-20130607-sasha-00011-gcd78395-dirty #2
   0000000000000009 ffff880020a95e30 ffffffff83ff4041 0000000000000000
   ffff880020a95e68 ffffffff8111fe12 fffffffffffffff0 00000000000082d0
   0000000000080000 0000000000080000 0000000001400000 ffff880020a95e78
  Call Trace:
   [<ffffffff83ff4041>] dump_stack+0x4e/0x82
   [<ffffffff8111fe12>] warn_slowpath_common+0x82/0xb0
   [<ffffffff8111fe55>] warn_slowpath_null+0x15/0x20
   [<ffffffff81243dcf>] kmalloc_slab+0x2f/0xb0
   [<ffffffff81278d54>] __kmalloc+0x24/0x4b0
   [<ffffffff8196ffe3>] ? security_capable+0x13/0x20
   [<ffffffff812a26b7>] ? pipe_fcntl+0x107/0x210
   [<ffffffff812a26b7>] pipe_fcntl+0x107/0x210
   [<ffffffff812b7ea0>] ? fget_raw_light+0x130/0x3f0
   [<ffffffff812aa5fb>] SyS_fcntl+0x60b/0x6a0
   [<ffffffff8403ca98>] tracesys+0xe1/0xe6

Andrew Morton writes:

  __GFP_NOWARN is frequently used by kernel code to probe for "how big
  an allocation can I get".  That's a bit lame, but it's used on slow
  paths and is pretty simple.

However, SLAB would still spew a warning when a big allocation happens
if the __GFP_NOWARN flag is _not_ set to expose kernel bugs.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
[ penberg@kernel.org: improve changelog ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants