While building a test kernel for the new esp driver (against
git-current), I hit this bug. Trivial fix, put the inline declaration
in the right place. :)
Signed-off-by: Tom "spot" Callaway <tcallawa@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Miller [Tue, 17 Apr 2007 21:40:46 +0000 (14:40 -0700)]
Fix compat sys_ipc() on sparc64
The 32-bit syscall trampoline for sys_ipc() on sparc64
was sign extending various arguments, which is bogus when
using compat_sys_ipc() since that function expects zero
extended copies of all the arguments.
This bug breaks the sparc64 kernel when built with gcc-4.2.x
among other things.
[SPARC64]: Fix arg passing to compat_sys_ipc().
Do not sign extend args using the sys32_ipc stub, that is
buggy and unnecessary.
Based upon an excellent report by Mikael Pettersson.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Miller [Tue, 17 Apr 2007 21:37:25 +0000 (14:37 -0700)]
Fix sparc64 SBUS IOMMU allocator
[SPARC64]: Fix SBUS IOMMU allocation code.
There are several IOMMU allocator bugs. Instead of trying to fix this
overly complicated code, just mirror the PCI IOMMU arena allocator
which is very stable and well stress tested.
I tried to make the code as identical as possible so we can switch
sun4u PCI and SBUS over to a common piece of IOMMU code. All that
will be need are two callbacks, one to do a full IOMMU flush and one
to do a streaming buffer flush.
This patch gets rid of a lot of hangs and mysterious crashes on SBUS
sparc64 systems, at least for me.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
sys_madvise has down_write of mmap_sem, then madvise_remove calls
vmtruncate_range which takes i_mutex and i_alloc_sem: no, we can
easily devise deadlocks from that ordering.
madvise_remove drop mmap_sem while calling vmtruncate_range: luckily,
since madvise_remove doesn't split or merge vmas, it's easy to handle
this case with a NULL prev, without restructuring sys_madvise. (Though
sad to retake mmap_sem when it's unlikely to be needed, and certainly
down_read is sufficient for MADV_REMOVE, unlike the other madvices.)
holepunch: fix disconnected pages after second truncate
shmem_truncate_range has its own truncate_inode_pages_range, to free any
pages racily instantiated while it was in progress: a SHMEM_PAGEIN flag
is set when this might have happened. But holepunching gets no chance
to clear that flag at the start of vmtruncate_range, so it's always set
(unless a truncate came just before), so holepunch almost always does
this second truncate_inode_pages_range.
shmem holepunch has unlikely swap<->file races hereabouts whatever we do
(without a fuller rework than is fit for this release): I was going to
skip the second truncate in the punch_hole case, but Miklos points out
that would make holepunch correctness more vulnerable to swapoff. So
keep the second truncate, but follow it by an unmap_mapping_range to
eliminate the disconnected pages (freed from pagecache while still
mapped in userspace) that it might have left behind.
Miklos Szeredi observes that during truncation of shmem page directories,
info->lock is released to improve latency (after lowering i_size and
next_index to exclude races); but this is quite wrong for holepunching,
which receives no such protection from i_size or next_index, and is left
vulnerable to races with shmem_unuse, shmem_getpage and shmem_writepage.
Hold info->lock throughout when holepunching? No, any user could prevent
rescheduling for far too long. Instead take info->lock just when needed:
in shmem_free_swp when removing the swap entries, and whenever removing
a directory page from the level above. But so long as we remove before
scanning, we can safely skip taking the lock at the lower levels, except
at misaligned start and end of the hole.
holepunch: fix shmem_truncate_range punching too far
Miklos Szeredi observes BUG_ON(!entry) in shmem_writepage() triggered
in rare circumstances, because shmem_truncate_range() erroneously
removes partially truncated directory pages at the end of the range:
later reclaim on pages pointing to these removed directories triggers
the BUG. Indeed, and it can also cause data loss beyond the hole.
Fix this as in the patch proposed by Miklos, but distinguish between
"limit" (how far we need to search: ignore truncation's next_index
optimization in the holepunch case - if there are races it's more
consistent to act on the whole range specified) and "upper_limit"
(how far we can free directory pages: generally we must be careful
to keep partially punched pages, but can relax at end of file -
i_size being held stable by i_mutex).
Avi Kivity [Sun, 22 Apr 2007 09:28:49 +0000 (12:28 +0300)]
KVM: MMU: Fix host memory corruption on i386 with >= 4GB ram
PAGE_MASK is an unsigned long, so using it to mask physical addresses on
i386 (which are 64-bit wide) leads to truncation. This can result in
page->private of unrelated memory pages being modified, with disasterous
results.
Fix by not using PAGE_MASK for physical addresses; instead calculate
the correct value directly from PAGE_SIZE. Also fix a similar BUG_ON().
Avi Kivity [Sun, 22 Apr 2007 09:28:05 +0000 (12:28 +0300)]
KVM: MMU: Fix guest writes to nonpae pde
KVM shadow page tables are always in pae mode, regardless of the guest
setting. This means that a guest pde (mapping 4MB of memory) is mapped
to two shadow pdes (mapping 2MB each).
When the guest writes to a pte or pde, we intercept the write and emulate it.
We also remove any shadowed mappings corresponding to the write. Since the
mmu did not account for the doubling in the number of pdes, it removed the
wrong entry, resulting in a mismatch between shadow page tables and guest
page tables, followed shortly by guest memory corruption.
This patch fixes the problem by detecting the special case of writing to
a non-pae pde and adjusting the address and number of shadow pdes zapped
accordingly.
This patch removes bogus zeroing of unused bits in output reports,
introduced in Simon's patch in commit d4ae650a.
According to the specification, any sane device should not care
about values of unused bits.
What is worse, the zeroing is done in a way which is broken and
might clear certain bits in output reports which are actually
_used_ - a device that has multiple fields with one value of
the size 1 bit each might serve as an example of why this is
bogus - the second call of hid_output_report() would clear the
first bit of report, which has already been set up previously.
This patch will break LEDs on SpaceNavigator, because this device
is broken and takes into account the bits which it shouldn't touch.
The quirk for this particular device will be provided in a separate
patch.
IB/mthca: Fix data corruption after FMR unmap on Sinai
In mthca_arbel_fmr_unmap(), the high bits of the key are masked off.
This gets rid of the effect of adjust_key(), which makes sure that
bits 3 and 23 of the key are equal when the Sinai throughput
optimization is enabled, and so it may happen that an FMR will end up
with bits 3 and 23 in the key being different. This causes data
corruption, because when enabling the throughput optimization, the
driver promises the HCA firmware that bits 3 and 23 of all memory keys
will always be equal.
Fix by re-applying adjust_key() after masking the key.
Thanks to Or Gerlitz for reproducing the problem, and Ariel Shahar for
help in debug.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[IPV4] nl_fib_lookup: Initialise res.r before fib_res_put(&res)
When CONFIG_IP_MULTIPLE_TABLES is enabled, the code in nl_fib_lookup()
needs to initialize the res.r field before fib_res_put(&res) - unlike
fib_lookup(), a direct call to ->tb_lookup does not set this field.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A security issue is emerging. Disallow Routing Header Type 0 by default
as we have been doing for IPv4.
Note: We allow RH2 by default because it is harmless.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.
The bug is present in all kernel versions since the feature appeared.
The patch also makes some minimal cleanup:
1. Return something consistent (-ENOENT) when fib table is missing
2. Do not crash when queue is empty (does not happen, but yet)
3. Put result of lookup
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Brian Pomerantz [Mon, 2 Apr 2007 06:49:41 +0000 (23:49 -0700)]
fix page leak during core dump
When the dump cannot occur most likely because of a full file system and
the page to be written is the zero page, the call to page_cache_release()
is missed.
Signed-off-by: Brian Pomerantz <bapper@mvista.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Howells <dhowells@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
"When we block_prepare_write() failed while ext3_prepare_write() we jump to
"failure" label and call ext3_prepare_failure() witch search last mapped bh
and invoke commit_write untill it. This is wrong!! because some bh from
begining to the last mapped bh may be not uptodate. As a result we commit to
disk not uptodate page content witch contains garbage from previous usage."
and
"Unexpected file size increasing."
Call trace the same as it was in first issue but result is different.
For example we have file with i_size is zero. we want write two blocks ,
but fs has only one free block.
->ext3_prepare_write(...from == 0, to == 2048)
retry:
->block_prepare_write() == -ENOSPC# we failed but allocated one block here.
->ext3_prepare_failure()
->commit_write( from == 0, to == 1024) # after this i_size becomes 1024 :)
if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
goto retry;
Finally when all retries will be spended ext3_prepare_failure return
-ENOSPC, but i_size was increased and later block trimm procedures can't
help here.
We don't appear to have the horsepower to fix these issues, so let's put
things back the way they were for now.
Based on this, those last two statements fill_result_tf()
appear to me to be in the wrong order, in that the tf->flags
are uninitialized at the point where tf_read() is invoked.
So for lba48 commands, tf_read() won't be reading back the
full lba48 register contents..
Correct?
This patch corrects fill_result_tf() so that the flags
get copied to result_tf before they are used by tf_read().
Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Conke Hu [Tue, 10 Apr 2007 17:06:56 +0000 (13:06 -0400)]
ahci.c: walkaround for SB600 SATA internal error issue
ahci.c: walkaround for SB600 SATA internal error issue
There is a HW issue in ATI SB600 SATA that PxSERR.E should not be
set on some conditions, for example, when there is no media in SATA
CD/DVD drive or media is not ready, AHCI controller fails to execute
ATAPI commands and reports PORT_IRQ_TF_ERR, but ATI SB600 SATA
controller sets PxSERR.E at the
same time, which is not necessary.
This patch is just to ignore the INTERNAL ERROR in such case.
Without this patch, ahci error handler will report many errors as
below:
----------- cut from dmesg -----------
ata9: soft resetting port
ata9: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata9.00: configured for UDMA/33
ata9: EH complete
ata9.00: exception Emask 0x40 SAct 0x0 SErr 0x800 action 0x2
ata9.00: (irq_stat 0x40000001)
ata9.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
res 51/24:03:00:00:20/00:00:00:00:00/a0 Emask 0x40 (internal error)
ata9: soft resetting port
ata9: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata9.00: configured for UDMA/33
ata9: EH complete
ata9.00: exception Emask 0x40 SAct 0x0 SErr 0x800 action 0x2
ata9.00: (irq_stat 0x40000001)
ata9.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x43 data 12 in
res 51/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x40 (internal error)
-------- end cut ---------
Signed-off-by: Conke Hu <conke.hu@amd.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
softmac: avoid assert in ieee80211softmac_wx_get_rate
[PATCH] softmac: avoid assert in ieee80211softmac_wx_get_rate
Unconfigured bcm43xx device can hit an assert() during wx_get_rate
queries. This is because bcm43xx calls ieee80211softmac_start late
(i.e. during open instead of probe).
Neil Brown [Wed, 4 Apr 2007 19:29:43 +0000 (15:29 -0400)]
knfsd: allow nfsd READDIR to return 64bit cookies
From Neil Brown <neilb@suse.de>
[PATCH] knfsd: allow nfsd READDIR to return 64bit cookies
->readdir passes lofft_t offsets (used as nfs cookies) to
nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it
becomes an 'off_t', which isn't good.
So filesystems that returned 64bit offsets would lose.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
IDE error recovery is using IDLE IMMEDIATE if the drive is busy or has DRQ set.
This violates the ATA spec (can only send IDLEÃ\82 IMMEDIATE when drive is not
busy) and really hoses up some drives (modern drives will not be able to
recover using this error handling). The correct thing to do is issue a SRST
followed by a SET FEATURES command. This is what Western Digital recommends
for error recovery and what Western Digital says Windows does. Ã\82 ItÃ\82 also does
not violate the ATA spec as far as I can tell.
Bart:
* port the patch over the current tree
* undo the recalibration code removal
* send SET FEATURES command after checking for good drive status
* don't check whether the current request is of REQ_TYPE_ATA_{CMD,TASK}
type because we need to send SET FEATURES before handling any requests
* some pre-ATA4 drives require INITIALIZE DEVICE PARAMETERS command before
other commands (except IDENTIFY) so send SET FEATURES only if there are
no pending drive->special requests
* update comments and patch description
* any bugs introduced by this patch are mine and not Suleiman's :-)
David Miller [Tue, 10 Apr 2007 20:39:35 +0000 (13:39 -0700)]
Fix TCP slow_start_after_idle sysctl
[TCP]: slow_start_after_idle should influence cwnd validation too
For the cases that slow_start_after_idle are meant to deal
with, it is almost a certainty that the congestion window
tests will think the connection is application limited and
we'll thus decrease the cwnd there too. This defeats the
whole point of setting slow_start_after_idle to zero.
So test it there too.
We do not cancel out the entire tcp_cwnd_validate() function
so that if the sysctl is changed we still have the validation
state maintained.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Userspace uses an integer for TCA_TCINDEX_SHIFT, the kernel was changed
to expect and use a u16 value in 2.6.11, which broke compatibility on
big endian machines. Change back to use int.
Reported by Ole Reinartz <ole.reinartz@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Herbert Xu [Tue, 10 Apr 2007 20:37:24 +0000 (13:37 -0700)]
Fix IPSEC replay window handling
[IPSEC]: Reject packets within replay window but outside the bit mask
Up until this point we've accepted replay window settings greater than
32 but our bit mask can only accomodate 32 packets. Thus any packet
with a sequence number within the window but outside the bit mask would
be accepted.
This patch causes those packets to be rejected instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The sense buffer code in scsi_send_eh_cmnd was changed to use
alloc_page() and a scatter list, but the sense data copy was not
updated to match so what we actually get in the sense buffer is total
grabage starting with the kernel address of the struct page we got.
Basically the stack frame of scsi_send_eh_cmd() is what ends up
in the sense buffer.
Depending upon how pointers look on a given platform, you can
end up getting sr_ioctl.c errors when you mount a cdrom. If
the CDROM gives a check condition for GPCMD_GET_CONFIGURATION issued
by drivers/cdrom/cdrom.c:cdrom_mmc_profile(), sr_ioctl will
spit out this error message in sr_do_ioctl() with the way pointers
are on sparc64:
[IPv6]: Fix incorrect length check in rawv6_sendmsg()
In article <20070329.142644.70222545.davem@davemloft.net> (at Thu, 29 Mar 2007 14:26:44 -0700 (PDT)), David Miller <davem@davemloft.net> says:
> From: Sridhar Samudrala <sri@us.ibm.com>
> Date: Thu, 29 Mar 2007 14:17:28 -0700
>
> > The check for length in rawv6_sendmsg() is incorrect.
> > As len is an unsigned int, (len < 0) will never be TRUE.
> > I think checking for IPV6_MAXPLEN(65535) is better.
> >
> > Is it possible to send ipv6 jumbo packets using raw
> > sockets? If so, we can remove this check.
>
> I don't see why such a limitation against jumbo would exist,
> does anyone else?
>
> Thanks for catching this Sridhar. A good compiler should simply
> fail to compile "if (x < 0)" when 'x' is an unsigned type, don't
> you think :-)
Dave, we use "int" for returning value,
so we should fix this anyway, IMHO;
we should not allow len > INT_MAX.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Patrick McHardy [Tue, 10 Apr 2007 20:29:44 +0000 (13:29 -0700)]
Fix IFB net driver input device crashes
[IFB]: Fix crash on input device removal
The input_device pointer is not refcounted, which means the device may
disappear while packets are queued, causing a crash when ifb passes packets
with a stale skb->dev pointer to netif_rx().
Fix by storing the interface index instead and do a lookup where neccessary.
Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Patrick McHardy [Tue, 10 Apr 2007 12:47:21 +0000 (14:47 +0200)]
NETFILTER: ipt_CLUSTERIP: fix oops in checkentry function
[NETFILTER]: ipt_CLUSTERIP: fix oops in checkentry function
The clusterip_config_find_get() already increases entries reference
counter, so there is no reason to do it twice in checkentry() callback.
This causes the config to be freed before it is removed from the list,
resulting in a crash when adding the next rule.
Signed-off-by: Jaroslav Kysela <perex@suse.cz> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Neil Brown [Wed, 11 Apr 2007 03:31:07 +0000 (13:31 +1000)]
Fix calculation for size of filemap_attr array in md/bitmap.
If 'num_pages' were ever 1 more than a multiple of 8 (32bit platforms)
for of 16 (64 bit platforms). filemap_attr would be allocated one
'unsigned long' shorter than required. We need a round-up in there.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Adam Kropelin [Wed, 11 Apr 2007 09:13:13 +0000 (11:13 +0200)]
HID: Do not discard truncated input reports
HID: Do not discard truncated input reports
Truncated reports should not be discarded since it prevents buggy
devices from communicating with userspace.
Prior to the regession introduced in 2.6.20, a shorter-than-expected
report in hid_input_report() was passed thru after having the missing
bytes cleared. This behavior was established over a few patches in the
2.6.early-teens days, including commit cd6104572bca9e4afe0dcdb8ecd65ef90b01297b.
This patch restores the previous behavior and fixes the regression.
The ADEF bits in the TSCR register have different meanings in read
and write mode. For this reason ADEF has to be reset on every
read-modify-write operation.
This patch introduces a special write function for this register, which
takes care of it.
Thanks to Holger Magnussen for pointing my nose at this problem.
Driver needs to turn off carrier when down, otherwise it can
confuse bonding and bridging and looks like carrier is on immediately
when it is brought back up.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Gleixner [Mon, 2 Apr 2007 12:25:31 +0000 (14:25 +0200)]
i386: fix file_read_actor() and pipe_read() for original i386 systems
The __copy_to_user_inatomic() calls in file_read_actor() and pipe_read()
are broken on original i386 machines, where WP-works-ok == false, as
__copy_to_user_inatomic() on such systems calls functions which might
sleep and/or contain cond_resched() calls inside of a kmap_atomic()
region.
The original check for WP-works-ok was in access_ok(), but got moved
during the 2.5 series to fix a race vs. swap.
Return the number of bytes to copy in the case where we are in an atomic
region, so the non atomic code pathes in file_read_actor() and
pipe_read() are taken.
This could be optimized to avoid the kmap_atomic by moving the check for
WP-works-ok into fault_in_pages_writeable(), but this is more intrusive
and can be done later.
Jan Beulich [Sun, 1 Apr 2007 18:41:26 +0000 (20:41 +0200)]
kbuild: fix dependency generation
Commit 2e3646e51b2d6415549b310655df63e7e0d7a080 changed the way
the split config tree is built, but failed to also adjust fixdep
accordingly - if changing a config option from or to m, files
referencing the respective CONFIG_..._MODULE (but not the
corresponding CONFIG_...) didn't get rebuilt.
The problem is that trisate symbol are represent with three
different symbols:
SYMBOL=n => no symbol defined
SYMBOL=y => CONFIG_SYMBOL defined to '1'
SYMBOL=m => CONFIG_SYMBOL_MODULE defined to '1'
But conf_split_config do not distingush between the =y and =m case,
so only the =y case is honoured.
This is fixed in fixdep so when a CONFIG symbol with
_MODULE is found we skip that part and only look
for the CONFIG_SYMBOL version.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There was a typo in commit b40b478e9972ec14cf144f1a03f88918789cbfe0,
preventing it from working - 32bit binaries crashed hopelessly before
the below fix and work perfectly now.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[chrisw: update changelog to reflect -stable commit id] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Jean Delvare [Thu, 5 Apr 2007 06:52:46 +0000 (23:52 -0700)]
APPLETALK: Fix a remotely triggerable crash
When we receive an AppleTalk frame shorter than what its header says,
we still attempt to verify its checksum, and trip on the BUG_ON() at
the end of function atalk_sum_skb() because of the length mismatch.
This has security implications because this can be triggered by simply
sending a specially crafted ethernet frame to a target victim,
effectively crashing that host. Thus this qualifies, I think, as a
remote DoS. Here is the frame I used to trigger the crash, in npg
format:
<Appletalk Killer>
{
# Ethernet header -----
XX XX XX XX XX XX # Destination MAC
00 00 00 00 00 00 # Source MAC
00 1D # Length
The destination MAC address must be set to those of the victim.
The severity is mitigated by two requirements:
* The target host must have the appletalk kernel module loaded. I
suspect this isn't so frequent.
* AppleTalk frames are non-IP, thus I guess they can only travel on
local networks. I am no network expert though, maybe it is possible
to somehow encapsulate AppleTalk packets over IP.
The bug has been reported back in June 2004:
http://bugzilla.kernel.org/show_bug.cgi?id=2979
But it wasn't investigated, and was closed in July 2006 as both
reporters had vanished meanwhile.
This code was new in kernel 2.6.0-test5:
http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=7ab442d7e0a76402c12553ee256f756097cae2d2
And not modified since then, so we can assume that vanilla kernels
2.6.0-test5 and later, and distribution kernels based thereon, are
affected.
Note that I still do not know for sure what triggered the bug in the
real-world cases. The frame could have been corrupted by the kernel if
we have a bug hiding somewhere. But more likely, we are receiving the
faulty frame from the network.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Daniel Drake [Tue, 27 Mar 2007 05:32:15 +0000 (21:32 -0800)]
generic_serial: fix decoding of baud rate
Commit d720bc4b8fc5d6d179ef094908d4fbb5e436ffad partially removed a private
implementation of baud speed decoding. However it doesn't seem to be
complete: after the speed is decoded, it is still being used as an index to
a local speed table (array overrun, no doubt).
This was found by Graham Murray who noticed it caused a 2.6.19 regression
with the SX driver: https://bugs.gentoo.org/170554
Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jeff Garzik [Wed, 28 Mar 2007 22:39:22 +0000 (18:39 -0400)]
libata: sata_mv: Fix 50xx irq mask
[libata] sata_mv: Fix 50xx irq mask
IRQ mask bits assumed a 60xx or newer generation chip, which is very
wrong for the 50xx series. Luckily both generations shared the per-port
interrupt mask bits, leaving only the "misc chip features" bits to be
completely mismatched.
Fix 50xx by ensuring we only program bits that exist.
Mark Lord [Wed, 28 Mar 2007 22:35:21 +0000 (18:35 -0400)]
libata bugfix: HDIO_DRIVE_TASK
libata bugfix: HDIO_DRIVE_TASK
I was trying to use HDIO_DRIVE_TASK for something today,
and discovered that the libata implementation does not copy
over the upper four LBA bits from args[6].
This is serious, as any tools using this ioctl would have their
commands applied to the wrong sectors on the drive, possibly resulting
in disk corruption.
Ideally, newer apps should use SG_IO/ATA_16 directly,
avoiding this bug. But with libata poised to displace drivers/ide,
better compatibility here is a must.
This patch fixes libata to use the upper four LBA bits passed
in from the ioctl.
The original drivers/ide implementation copies over all bits
except for the master/slave select bit. With this patch,
libata will copy only the four high-order LBA bits,
just in case there are assumptions elsewhere in libata (?).
Signed-off-by: Mark Lord <mlord@pobox.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tejun Heo [Wed, 28 Mar 2007 22:33:39 +0000 (18:33 -0400)]
libata: clear TF before IDENTIFYing
libata: clear TF before IDENTIFYing
Some devices chock if Feature is not clear when IDENTIFY is issued.
Set ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE for IDENTIFY such that whole
TF is cleared when reading ID data.
Kudos to Art Haas for testing various futile patches over several
months and Mark Lord for pointing out the fix.
Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Art Haas <ahaas@airmail.net> Cc: Mark Lord <mlord@pobox.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
which is a no-op in this case, so we don't advance to the next element
of the scatterlist array:
if (walk->offset >= walk->sg->offset + walk->sg->length)
scatterwalk_start(walk, sg_next(walk->sg));
and we end up copying the same data twice.
It appears that other callers of scatterwalk_{page}done first advance
walk->offset, so I believe that's the correct thing to do here.
This caused a bug in NFS when run with krb5p security, which would
cause some writes to fail with permissions errors--for example, writes
of less than 8 bytes (the des blocksize) at the start of a file.
Alan Tyson [Wed, 28 Mar 2007 21:40:35 +0000 (17:40 -0400)]
CIFS: reset mode when client notices that ATTR_READONLY is no longer set
[CIFS] reset mode when client notices that ATTR_READONLY is no longer set
[<cebbert@redhat.com>: removed changelog part of patch]
Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Alan Tyso <atyson@hp.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Steve French [Wed, 28 Mar 2007 21:40:03 +0000 (17:40 -0400)]
CIFS: Allow reset of file to ATTR_NORMAL when archive bit not set
[CIFS] Allow reset of file to ATTR_NORMAL when archive bit not set
When a file had a dos attribute of 0x1 (readonly - but dos attribute
of archive was not set) - doing chmod 0777 or equivalent would
try to set a dos attribute of 0 (which some servers ignore)
rather than ATTR_NORMAL (0x20) which most servers accept.
Does not affect servers which support the CIFS Unix Extensions.
[<cebbert@redhat.com>: removed changelog part of patch]
Cc: Chuck Ebbert <cebbert@redhat.com> Acked-by: Prasad Potluri <pvp@us.ibm.com> Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ide: revert "ide: fix drive side 80c cable check, take 2" for now
"ide: fix drive side 80c cable check, take 2" patch from Tejun Heo (commit fab59375b9543f84d1714f7dd00f5d11e531bd3e) fixed 80c bit test (bit13 of word93)
but we also need to fix master/slave IDENTIFY order (slave device should be
probed first in order to make it release PDIAG- signal) and we should also
check for pre-ATA3 slave devices (which may not release PDIAG- signal).
Unfortunately the fact that IDE driver doesn't reset devices itself helps
only a bit as it seems that some BIOS-es reset ATA devices after programming
the chipset, some BIOS-es can be set to not probe/configure selected devices,
there may be no BIOS in case of add-on cards etc.
Since we are quite late in the release cycle and the required changes will
affect a lot of systems just revert the fix for now.
Vasily Averin [Wed, 28 Mar 2007 16:35:29 +0000 (12:35 -0400)]
i2o: block IO errors on i2o disk
I2O subsystem has been broken in mainstream several months ago (after
2.6.18). Commit 4aff5e2333c9a1609662f2091f55c3f6fffdad36 from Jens
Axboe split struct request ->flags into two parts: cmd_type and
cmd_flags.
In i2o layer this patch has replaced flag REQ_SPECIAL by the according
cmd_type. However i2o has used REQ_SPECIAL not as command type but as
driver-specific flag for the debug purposes. As result all i2o requests
have type "special" now, are not processed to the hardware and fail with
I/O error:
i2o/hda:<3>Buffer I/O error on device i2o/hda, logical block 0
Buffer I/O error on device i2o/hda, logical block 0
Buffer I/O error on device i2o/hda, logical block 0
unable to read partition table
block-osm: device added (TID: 207): i2o/hda
The following patch removes the extra debug checks without any drawbacks and
restores the normal driver's work.
Tejun Heo [Wed, 28 Mar 2007 16:31:53 +0000 (12:31 -0400)]
jmicron: make ide jmicron driver play nice with libata ones
jmicron: make ide jmicron driver play nice with libata ones
When libata is configured, the device is configured such that SATA and
PATA ports live in separate functions with different programming
interfaces. pata_jmicron and ide jmicron drivers can drive only the
PATA part.
This patch makes jmicron match PCI class code such that it doesn't
attach itself to the SATA part preventing the proper ahci driver from
attaching.
Oliver Endriss [Thu, 29 Mar 2007 01:22:42 +0000 (21:22 -0400)]
V4L: saa7146: Fix allocation of clipping memory
V4L: saa7146: Fix allocation of clipping memory
Olaf Hering pointed out that SAA7146_CLIPPING_MEM would become
very large for PAGE_SIZE > 4K.
In fact, the number of clipping windows is limited to 16,
and calculate_clipping_registers_rect() does not use more
than 256 bytes. SAA7146_CLIPPING_MEM adjusted accordingly.
Simon Arlott [Thu, 29 Mar 2007 01:22:40 +0000 (21:22 -0400)]
dvb-core: fix several locking related problems
dvb-core: fix several locking related problems
Fix several instances of dvb-core functions using mutex_lock_interruptible
and returning -ERESTARTSYS where the calling function will either never
retry or never check the return value.
These cause a race condition with dvb_dmxdev_filter_free and
dvb_dvr_release, both of which are filesystem release functions whose
return value is ignored and will never be retried. When this happens it
becomes impossible to open dvr0 again (-EBUSY) since it has not been
released properly.
Hans Verkuil [Thu, 29 Mar 2007 01:22:35 +0000 (21:22 -0400)]
V4L: msp_attach must return 0 if no msp3400 was found.
V4L: msp_attach must return 0 if no msp3400 was found.
Returning -1 causes the probe to stop, but it should just continue
instead. This patch fixes an annoying 'i2c_adapter i2c-7: Client
creation failed at 0x44 (-1)' kernel message that appeared in 2.6.20
Trent Piepho [Thu, 29 Mar 2007 01:22:28 +0000 (21:22 -0400)]
V4L: radio: Fix error in Kbuild file
V4L: radio: Fix error in Kbuild file
All the radio drivers need video_dev, but they were depending on
VIDEO_DEV!=n. That meant that one could try to compile the driver into
the kernel when VIDEO_DEV=m, which will not work. If video_dev is a
module, then the radio drivers must be modules too.
Michael Krufky [Thu, 29 Mar 2007 01:22:16 +0000 (21:22 -0400)]
DVB: fix nxt200x rf input switching
DVB: fix nxt200x rf input switching
After dvb tuner refactoring, the pll buffer has been altered such that
the pll address is now stored in buf[0]. Instead of sending buf to
set_pll_input, we should send buf+1.
Thomas Graf [Thu, 29 Mar 2007 19:34:13 +0000 (12:34 -0700)]
NET: Fix FIB rules compatability
[NET]: Fix fib_rules compatibility breakage
Based upon a patch from Patrick McHardy.
The fib_rules netlink attribute policy introduced in 2.6.19 broke
userspace compatibilty. When specifying a rule with "from all"
or "to all", iproute adds a zero byte long netlink attribute,
but the policy requires all addresses to have a size equal to
sizeof(struct in_addr)/sizeof(struct in6_addr), resulting in a
validation error.
Check attribute length of FRA_SRC/FRA_DST in the generic framework
by letting the family specific rules implementation provide the
length of an address. Report an error if address length is non
zero but no address attribute is provided. Fix actual bug by
checking address length for non-zero instead of relying on
availability of attribute.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
gcc gets unhappy when the MSTK_SET macro's u8 __val variable
is updated with &= ~0xff (MSTK_YEAR_MASK). Making the constant
unsigned fixes the problem.
[ I fixed up the sparc32 side as well -DaveM ]
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alexey Dobriyan [Thu, 29 Mar 2007 19:22:40 +0000 (12:22 -0700)]
NET: Fix sock_attach_fd() failure in sys_accept()
[NET]: Correct accept(2) recovery after sock_attach_fd()
* d_alloc() in sock_attach_fd() fails leaving ->f_dentry of new file NULL
* bail out to out_fd label, doing fput()/__fput() on new file
* but __fput() assumes valid ->f_dentry and dereferences it
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Miller [Thu, 29 Mar 2007 19:16:27 +0000 (12:16 -0700)]
VIDEO: Fix FFB DAC revision probing
[VIDEO] ffb: Fix two DAC handling bugs.
The determination of whether the DAC has inverted cursor logic is
broken, import the version checks the X.org driver uses to fix this.
Next, when we change the timing generator, borrow code from X.org that
does 10 NOP reads of the timing generator register afterwards to make
sure the video-enable transition occurs cleanly.
Finally, use macros for the DAC registers and fields in order to
provide documentation for the next person who reads this code.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We were only checking if there was enough space to put the int, but
left len as specified by the (malicious) user, sigh, fix it by setting
len to sizeof(val) and transfering just one int worth of data, the one
asked for.
Also check for negative len values.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
G. Liakhovetski [Tue, 27 Mar 2007 02:07:40 +0000 (19:07 -0700)]
PPP: Fix PPP skb leak
[PPP]: Don't leak an sk_buff on interface destruction.
Signed-off-by: G. Liakhovetski <gl@dsa-ac.de> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Miller [Tue, 27 Mar 2007 01:56:59 +0000 (18:56 -0700)]
IPV6: Fix ipv6 round-robin locking.
[IPV6]: Fix routing round-robin locking.
As per RFC2461, section 6.3.6, item #2, when no routers on the
matching list are known to be reachable or probably reachable we
do round robin on those available routes so that we make sure
to probe as many of them as possible to detect when one becomes
reachable faster.
Each routing table has a rwlock protecting the tree and the linked
list of routes at each leaf. The round robin code executes during
lookup and thus with the rwlock taken as a reader. A small local
spinlock tries to provide protection but this does not work at all
for two reasons:
1) The round-robin list manipulation, as coded, goes like this (with
read lock held):
walk routes finding head and tail
spin_lock();
rotate list using head and tail
spin_unlock();
While one thread is rotating the list, another thread can
end up with stale values of head and tail and then proceed
to corrupt the list when it gets the lock. This ends up causing
the OOPS in fib6_add() later onthat many people have been hitting.
2) All the other code paths that run with the rwlock held as
a reader do not expect the list to change on them, they
expect it to remain completely fixed while they hold the
lock in that way.
So, simply stated, it is impossible to implement this correctly using
a manipulation of the list without violating the rwlock locking
semantics.
Reimplement using a per-fib6_node round-robin pointer. This way we
don't need to manipulate the list at all, and since the round-robin
pointer can only ever point to real existing entries we don't need
to perform any locking on the changing of the round-robin pointer
itself. We only need to reset the round-robin pointer to NULL when
the entry it is pointing to is removed.
The idea is from Thomas Graf and it is very similar to how this
was implemented before the advanced router selection code when in.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 27 Mar 2007 01:15:37 +0000 (18:15 -0700)]
NET_SCHED: Fix ingress qdisc locking.
[NET_SCHED]: Fix ingress locking
Ingress queueing uses a seperate lock for serializing enqueue operations,
but fails to properly protect itself against concurrent changes to the
qdisc tree. Use queue_lock for now since the real fix it quite intrusive.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
cls_basic doesn't allocate tp->root before it is linked into the
active classifier list, resulting in a NULL pointer dereference
when packets hit the classifier before its ->change function is
called.
Reported by Chris Madden <chris@reflexsecurity.com>
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Stefan Richter [Sun, 25 Mar 2007 19:24:43 +0000 (21:24 +0200)]
ieee1394: dv1394: fix CardBus card ejection
Fix NULL pointer dereference on hot ejection of a FireWire card while
dv1394 was loaded. http://bugzilla.kernel.org/show_bug.cgi?id=7121
I did not test card ejection with open /dev/dv1394 files yet.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently we have a confused udelay implementation.
* __const_udelay does not accept usecs but xloops in i386 and x86_64
* our implementation requires usecs as arg
* it gets a xloops count when called by asm/arch/delay.h
Bugs related to this (extremely long shutdown times) where reported by some
x86_64 users, especially using Device Mapper.
To hit this bug, a compile-time constant time parameter must be passed - that's
why UML seems to work most times.
Fix this with a simple udelay implementation.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jeff Dike [Sun, 25 Mar 2007 17:01:44 +0000 (13:01 -0400)]
UML - use correct register file size everywhere
This patch uses MAX_REG_NR consistently to refer to the register file
size. FRAME_SIZE isn't sufficient because on x86_64, it is smaller
than the ptrace register file size. MAX_REG_NR was introduced as a
consistent way to get the number of registers, but wasn't used
everywhere it should be.
When this causes a problem, it makes PTRACE_SETREGS fail on x86_64
because of a corrupted segment register value in the known-good
register file. The patch also adds a register dump at that point in
case there are any future problems here.
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jeff Dike [Sun, 25 Mar 2007 16:54:32 +0000 (12:54 -0400)]
UML - Fix static linking
During a static link, ld has started putting a .note section in the
.uml.setup.init section. This has the result that the UML setups
begin with 32 bytes of garbage and UML crashes immediately on boot.
This patch creates a specific .note section for ld to drop this stuff
into.
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jeff Dike [Fri, 23 Mar 2007 19:37:30 +0000 (15:37 -0400)]
UML - host VDSO fix
This fixes a problem seen by a number of people running UML on newer host
kernels. init would hang with an infinite segfault loop.
It turns out that the host kernel was providing a AT_SYSINFO_EHDR of
0xffffe000, which faked UML into believing that the host VDSO page could be
reused. However, AT_SYSINFO pointed into the middle of the address space, and
was unmapped as a result. Because UML was providing AT_SYSINFO_EHDR and
AT_SYSINFO to its own processes, these would branch to nowhere when trying to
use the VDSO.
The fix is to also check the location of AT_SYSINFO when deciding whether to
use the host's VDSO.
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Robert Hancock [Thu, 22 Mar 2007 16:39:04 +0000 (12:39 -0400)]
sata_nv: delay on switching between NCQ and non-NCQ commands
sata_nv: delay on switching between NCQ and non-NCQ commands
This patch appears to solve some problems with commands timing out in
cases where an NCQ command is immediately followed by a non-NCQ command
(or possibly vice versa). This is a rather ugly solution, but until we
know more about why this is needed, this is about all we can do.
[backport to 2.6.20 by Chuck Ebbert <cebbert@redhat.com>]
Signed-off-by: Robert Hancock <hancockr@shaw.ca> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Albert Lee [Wed, 21 Mar 2007 20:08:08 +0000 (16:08 -0400)]
ide: clear bmdma status in ide_intr() for ICHx controllers (revised #4)
ide: clear bmdma status in ide_intr() for ICHx controllers (revised #4)
patch 1/2 (revised):
- Fix drive->waiting_for_dma to work with CDB-intr devices.
- Do the dma status clearing in ide_intr() and add a new
hwif->ide_dma_clear_irq for Intel ICHx controllers.
Revised per Alan, Sergei and Bart's advice.
Patch against 2.6.20-rc6. Tested ok on my ICH4 and pdc20275 adapters.
Please review/apply, thanks.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Adam Hawks <awhawks@us.ibm.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kai Makisara [Mon, 19 Mar 2007 23:58:57 +0000 (16:58 -0700)]
st: fix Tape dies if wrong block size used, bug 7919
[SCSI] st: fix Tape dies if wrong block size used, bug 7919
On Thu, 1 Feb 2007, Andrew Morton wrote:
> On Thu, 1 Feb 2007 15:34:29 -0800
> bugme-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=7919
> >
> > Summary: Tape dies if wrong block size used
> > Kernel Version: 2.6.20-rc5
> > Status: NEW
> > Severity: normal
> > Owner: scsi_drivers-other@kernel-bugs.osdl.org
> > Submitter: dmartin@sccd.ctc.edu
> >
> >
> > Most recent kernel where this bug did *NOT* occur: 2.6.17.14
> >
> > Other Kernels Tested and Results:
> >
> > OK 2.6.15.7
> > OK 2.6.16.37
> > OK 2.6.17.14
> > BAD 2.6.18.6
> > BAD 2.6.18-1.2869.fc6
> > BAD 2.6.19.2 +
> > BAD 2.6.20-rc5
> >
> > NOTE: 2.6.18-1.2869.fc6 is a Fedora modified kernel, all others are from kernel.org
> >
...
> > Steps to reproduce:
> > Get a Adaptec AHA-2940U/UW/D / AIC-7881U card and a tape drive,
> > install a recent kernel
> > set the tape block size - mt setblk 4096
> > read from or write to tape using wrong block size - tar -b 7 -cvf /dev/tape foo
> >
Write does not trigger this bug because the driver refuses in fixed block
mode writes that are not a multiple of the block size. Read does trigger
it in my system.
The bug is not associated with any specific HBA. st tries to do direct i/o
in fixed block mode with reads that are not a multiple of tape block size.
The patch in this message fixes the st problem by switching to using the
driver buffer up to the next close of the device file in fixed block mode
if the user asks for a read like this.
I don't know why the bug has surfaced only after 2.6.17 although the st
problem is old. There may be another bug in the block subsystem and this
patch works around it. However, the patch fixes a problem in st and in
this way it is a valid fix.
This patch may also fix the bug 7900.
The patch compiles and is lightly tested.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jeff Dike [Mon, 19 Mar 2007 20:12:50 +0000 (16:12 -0400)]
UML - arch_prctl should set thread fs
x86_64 needs some TLS fixes. What was missing was remembering the child
thread id during clone and stuffing it into the child during each context
switch.
The %fs value is stored separately in the thread structure since the host
controls what effect it has on the actual register file. The host also needs
to store it in its own thread struct, so we need the value kept outside the
register file.
arch_prctl_skas was fixed to call PTRACE_ARCH_PRCTL appropriately. There is
some saving and restoring of registers in the ARCH_SET_* cases so that the
correct set of registers are changed on the host and restored to the process
when it runs again.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>