Pekka Enberg [Wed, 3 Oct 2012 07:54:05 +0000 (10:54 +0300)]
kvm tools: Drop lchown() calls from 9p
Creating new files in a guest:
echo "hello, world" > hello.txt
results in EPERM on my machine.
The problem is that virtio_p9_create() calls lchown() on UID zero (we
are logged in as root in the guest) which obviously doesn't work. In
fact, ownership changes can't work unless we have some sort of uid/gid
mapping. Therefore, drop the calls to lchown().
kvm tools: Do setup_fdt() later, get powerpc to boot again
In commit e3d3ced "kernel load/firmware cleanup", the call to
kvm__arch_setup_firmware() was moved. Previously it ran more or less at
the end of the init sequence, but that commit moved it into kvm__init(),
which is a core_init() call and so runs quite early.
This broke booting powerpc guests, as setup_fdt() needs to be called
later in the setup sequence. In particular it looks at kvm->nrcpus,
which is uninitialised at that point.
In general setup_fdt() needs to run late in the sequence, as it encodes
the setup of the machine into the device tree.
So move setup_fdt() out of kvm__arch_setup_firmware() and make it a
firmware_init() call of its own.
With this patch I am able to boot guests again on HV KVM.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Pekka Enberg <penberg@kernel.org>
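The ordering problem described above can be illustrated with a small model of prioritized init levels. This is only a sketch: the level names follow the macros mentioned in the commit message, but the registry, struct machine, and every function name below are hypothetical and not taken from kvmtool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of prioritized init levels; lower values run first. */
enum init_level { LEVEL_CORE, LEVEL_BASE, LEVEL_DEV_BASE, LEVEL_FIRMWARE, LEVEL_COUNT };

struct machine {
	int nrcpus;        /* set by a base-level routine in this model */
	int fdt_cpu_nodes; /* what the device tree would encode */
};

typedef int (*init_fn)(struct machine *m);

#define MAX_INITS 8

struct init_entry {
	enum init_level level;
	init_fn fn;
};

static struct init_entry init_table[MAX_INITS];
static size_t init_count;

static void register_init(enum init_level level, init_fn fn)
{
	assert(init_count < MAX_INITS);
	init_table[init_count].level = level;
	init_table[init_count].fn = fn;
	init_count++;
}

/* Run every registered routine level by level, whatever the registration order. */
static int run_inits(struct machine *m)
{
	for (int level = 0; level < LEVEL_COUNT; level++) {
		for (size_t i = 0; i < init_count; i++) {
			if (init_table[i].level != (enum init_level)level)
				continue;
			int r = init_table[i].fn(m);
			if (r < 0)
				return r;
		}
	}
	return 0;
}

static int cpus_init(struct machine *m)
{
	m->nrcpus = 4;
	return 0;
}

/* Encodes machine state, so it must see an initialised nrcpus. */
static int fdt_init(struct machine *m)
{
	if (m->nrcpus <= 0)
		return -1;
	m->fdt_cpu_nodes = m->nrcpus;
	return 0;
}
```

Registering fdt_init at LEVEL_FIRMWARE lets it run after cpus_init even if it is registered first; registering it at LEVEL_CORE in this model would make run_inits() fail because nrcpus is still zero.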
kvm tools: Fix segfault on powerpc in xics_register()
In commit 06e6648 "move kvm_cpus into struct kvm", kvm_cpu__init() became
kvm_cpu__arch_init() called from a new kvm_cpu__init(), and the call was moved
from the end of the init sequence to much earlier, and in particular prior to
irq__init().
This leads to a segfault on powerpc, because kvm_cpu__arch_init() calls into
xics_cpu_register(), which dereferences vcpu->kvm.icp which is uninitialised
until irq__init().
Later in commit a48488d "use init/exit where possible", irq__init() was pulled
out of the init sequence and made a dev_base_init() routine, on x86. On powerpc
the call to irq__init() was dropped entirely.
Finally, we now have a circular dependency between kvm_cpu__init() (which needs
kvm->arch.icp), and irq__init() (which needs kvm->nrcpus). This is caused by
the combination of commit 89f40a7 "move nrcpus into struct kvm_config",
which moved the global nrcpus into kvm->cfg, and commit 06e6648 "move kvm_cpus
into struct kvm", which moved the setup of kvm->nrcpus from kvm->cfg into
kvm_cpu__init().
To fix it we drop irq__init() entirely; if we ever have a non-xics irq option
we can bring it back. We turn xics_system_init() into xics_init(), and have it
do the allocation and setup of the icp/ics, including the per-vcpu setup,
removing the dependency from kvm_cpu__init() (via kvm_cpu__arch_init()).
xics_init() is a base_init() routine (it can't be core), which should be early
enough, fingers crossed.
Finally drop irq__exit(), it does nothing and is never called.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Pekka Enberg <penberg@kernel.org>
kvm tools: Fix powerpc build errors caused by recent changes
Several caused by commit 8074303 "remove global kvm object",
ioport__setup_arch(), term_getc_iov() & term_getc() in the
spapr_hvcons.c code, and kvm_cpu__reboot() in rtas_power_off().
Commit 221b584 "move active_console into struct kvm_config" added
checks in h_put_term_char() & h_get_term_char() of
kvm->cfg.active_console but needs to be vcpu->kvm->cfg.active_console.
That commit also missed updates to term_putc() & term_getc() in
spapr_rtas.c, and I'm guessing that we need similar checks of
active_console in rtas_put_term_char() & rtas_get_term_char().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Pekka Enberg <penberg@kernel.org>
While it shouldn't happen on regular guests, we sometimes hit it when fuzzing
within the guest, which would cause the lkvm process to exit - which is
undesired.
Our PIT tests were using the debug port to trigger a reboot. Instead of using
that port we now use the reboot line of our i8042 controller.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
We accidentally broke SMP when we moved mptable init to before we initialize the
vcpu count. That means that we always built an mptable which was not properly
initialized for the given configuration.
Instead of initializing mptable as part of the kvm arch initialization, let it
be initialized on its own in the firmware initialization level.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sort out the config initialization order so that configuration is fully initialized
before init functions start running, and move the firmware initialization code into
kvm.c.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Move the vesa initialization logic into sdl__init() and vnc__init(); builtin-run
shouldn't have to know about the conditions for initializing vesa on its own.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Move io debug delay into kvm_config, the parser out of builtin-run into the disk code
and make the init/exit functions match the rest of the code in style.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Support passing a private ptr to CALLBACK options. This will make it possible
to assign options into specific struct kvms by passing them directly to parsers.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
kvm tools: split struct kvm into arch specific part
Move all the non-arch specific members into a generic struct, and the arch specific
members into an arch specific kvm_arch. This prevents code duplication across different
archs.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
kvm tools: generate command line options dynamically
Since we now store options in a struct, we should generate the command line options
dynamically. This is a pre-requisite to the following patch moving the options
into struct kvm.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Thu, 30 Aug 2012 07:36:39 +0000 (09:36 +0200)]
kvm tools: Use the new KVM_SIGNAL_MSI ioctl to inject interrupts directly.
We still create GSIs and keep them for two reasons:
- They're required by virtio-* devices.
- There's not much overhead since we just create them when starting the
guest, they don't use anything when the guest is running.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Thu, 23 Aug 2012 12:47:51 +0000 (20:47 +0800)]
kvm tools: Respect guest tcp window size
Respect guest tcp window size and stop sending tcp segments to guest
if guest's receive window is closed.
This fixes the TCP hang I'm seeing where guest and host are transferring
big chunks of data.
This problem was not triggered when guest and external host
communicate, probably because guest to external host communication
goes through a real network and is much slower than guest and host
communication. Thus, guest's receive window has little chance to be
closed.
v2: use pthread_cond_wait to wait
Signed-off-by: Asias He <asias.hejun@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
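The waiting scheme mentioned in the v2 note can be sketched roughly as follows. This is an illustrative model only: struct tcp_conn, its fields, and the function names are hypothetical, and treating sent bytes as consuming the window is a simplification rather than the actual UIP code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical per-connection state; names are illustrative only. */
struct tcp_conn {
	pthread_mutex_t lock;
	pthread_cond_t  window_open;
	uint16_t        guest_window; /* last window size advertised by the guest */
};

static void conn_init(struct tcp_conn *c)
{
	assert(c);
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->window_open, NULL);
	c->guest_window = 0;
}

/* Called when a segment from the guest carries a new window size. */
static void conn_update_window(struct tcp_conn *c, uint16_t window)
{
	pthread_mutex_lock(&c->lock);
	c->guest_window = window;
	if (window > 0)
		pthread_cond_broadcast(&c->window_open);
	pthread_mutex_unlock(&c->lock);
}

/*
 * Block while the guest's receive window is closed, then return how many
 * of the 'pending' bytes may be sent without overrunning the window.
 */
static uint16_t conn_wait_sendable(struct tcp_conn *c, uint16_t pending)
{
	uint16_t n;

	pthread_mutex_lock(&c->lock);
	while (c->guest_window == 0)
		pthread_cond_wait(&c->window_open, &c->lock);
	n = pending < c->guest_window ? pending : c->guest_window;
	/* Simplification: bytes handed to the guest shrink the known window. */
	c->guest_window = (uint16_t)(c->guest_window - n);
	pthread_mutex_unlock(&c->lock);
	return n;
}
```

The while loop around pthread_cond_wait() guards against spurious wakeups, so the sender only proceeds once a non-zero window has really been observed.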
Asias He [Thu, 23 Aug 2012 12:47:50 +0000 (20:47 +0800)]
kvm tools: Make tcp between guest/host virtual ip work in UIP mode
This patch makes tcp between 'guest ip' and 'host virtual ip' work in UIP mode.
(The default guest ip is 192.168.33.15, host virtual ip is 192.168.33.1)
2) Run kvm tool with the new disk option '-d scsi:$wwpn:$tpgt', e.g
$ lkvm run -k /boot/bzImage -d ~/img/sid.img -d scsi:naa.0:0
Signed-off-by: Asias He <asias.hejun@gmail.com> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Fri, 17 Aug 2012 07:51:05 +0000 (10:51 +0300)]
kvm tools: Add 'install' target to Makefile
Add a new 'install' target to Makefile that installs 'lkvm' binary to
$HOME/bin by default. The installed binary still needs to be launched
from linux/tools/kvm directory because of our silly external dependency
on the stage 2 guest init file:
[penberg@tux ~]$ lkvm run
Fatal: Failed linking stage 2 of init.
The most convenient way to fix that is to embed the stage 2 image in
'lkvm' executable like we do with our mini-BIOS.
Paul Neumann [Mon, 13 Aug 2012 17:11:25 +0000 (18:11 +0100)]
kvm tools: Fix segfault on "lkvm run"
The segfault is triggered by just running "lkvm run". On my system, it
does not find any kernel, so kvm_cmd_run_init() returns EINVAL which
fails the (r < 0) check in kvm_cmd_run(). Since kvm_cmd_run_init() does
not get to initialize the cpus, kvm_cpus gets mistakenly dereferenced in
kvm_cmd_run_work().
The errors from kvm_cmd_run_init() are not handled properly as they are
returned as positive values.
Acked-by: Asias He <asias.hejun@gmail.com> Signed-off-by: Paul Neumann <paul104x@yahoo.de> Signed-off-by: Pekka Enberg <penberg@kernel.org>
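The sign problem can be shown with a small model. The functions below are hypothetical stand-ins for the init/work split described above, not the real kvm_cmd_run_init(); the negative-errno variant is the usual kernel-style convention and is assumed here as the direction of the fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the state that init is expected to set up. */
struct vm {
	int *cpus;
};

static int cpus_storage[1];

/* Buggy convention: a positive errno slips past an (r < 0) check. */
static int run_init_positive(struct vm *vm, const char *kernel)
{
	if (!kernel)
		return EINVAL;
	vm->cpus = cpus_storage;
	return 0;
}

/* Negative errno: the caller's (r < 0) check catches the failure. */
static int run_init_negative(struct vm *vm, const char *kernel)
{
	if (!kernel)
		return -EINVAL;
	vm->cpus = cpus_storage;
	return 0;
}

/* Returns 1 if the caller would go on to use vm.cpus, 0 if it bails out. */
static int reaches_work(int (*init)(struct vm *, const char *), const char *kernel)
{
	struct vm vm = { .cpus = NULL };
	int r = init(&vm, kernel);

	assert(init != NULL);
	if (r < 0)
		return 0;
	/* With the positive convention, vm.cpus is still NULL at this point. */
	return 1;
}
```

With no kernel found, reaches_work(run_init_positive, NULL) reports that the work phase would be entered with an uninitialised cpus pointer, which is the segfault scenario; the negative variant stops at the check.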
Asias He [Fri, 3 Aug 2012 09:19:51 +0000 (17:19 +0800)]
kvm tools: Enable O_DIRECT support
With Direct I/O, file reads and writes go directly from the applications
to the storage device, bypassing the operating system read and write
caches. This is useful for applications that manage their own caches.
Open a disk image with O_DIRECT:
$ lkvm run -d ~/img/test.img,direct
The original readonly flag is still supported.
Open a disk image with O_DIRECT and readonly:
$ lkvm run -d ~/img/test.img,direct,ro
Signed-off-by: Asias He <asias.hejun@gmail.com> Acked-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
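A rough sketch of how a "path,direct,ro" argument could be turned into open(2) flags. The struct and function names are hypothetical and this is not the parser from the patch; only O_DIRECT, O_RDONLY and O_RDWR are real flags.

```c
#define _GNU_SOURCE /* for O_DIRECT on glibc */
#include <assert.h>
#include <fcntl.h>
#include <string.h>

/* Hypothetical parsed form of a "-d path[,direct][,ro]" argument. */
struct disk_params {
	char path[256];
	int  direct;
	int  readonly;
};

/* Returns 0 on success, -1 on an empty, oversized, or unknown option. */
static int parse_disk_arg(const char *arg, struct disk_params *p)
{
	char buf[256];
	char *save = NULL;
	char *tok;

	assert(arg && p);
	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);
	memset(p, 0, sizeof(*p));

	tok = strtok_r(buf, ",", &save);
	if (!tok)
		return -1;
	strcpy(p->path, tok);

	while ((tok = strtok_r(NULL, ",", &save)) != NULL) {
		if (strcmp(tok, "direct") == 0)
			p->direct = 1;
		else if (strcmp(tok, "ro") == 0)
			p->readonly = 1;
		else
			return -1;
	}
	return 0;
}

/* Translate the parsed options into flags suitable for open(2). */
static int disk_open_flags(const struct disk_params *p)
{
	int flags = p->readonly ? O_RDONLY : O_RDWR;

#ifdef O_DIRECT
	if (p->direct)
		flags |= O_DIRECT;
#endif
	return flags;
}
```

Note that O_DIRECT I/O generally needs suitably aligned buffers, offsets, and lengths, which this sketch does not address.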
This removes the limit for p9 fids and the huge fid array that came along with
it. Instead, it dynamically allocates fids and stores them in a rb-tree.
This is useful when the guest needs a lot of fids, such as when stress
testing guests.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Tue, 31 Jul 2012 06:21:49 +0000 (09:21 +0300)]
Merge commit 'v3.5' into kvmtool/next
Michael Ellerman writes:
It occurred to me overnight that I forgot to mention that in order to
build the new code you need the headers from a 3.5-rc1 era kernel (for
the ioctl & KVM_CAP definitions).
The easiest way to do that is to merge linus' tree into kvmtool.
Merge emailed kgdb dmesg fixups patches from Anton Vorontsov:
"The dmesg command appears to be broken after the printk rework. The
old logic in the kdb code makes no sense in terms of current
printk/logging storage format, and KDB simply hangs forever upon
entering 'dmesg' command.
The first patch revives the command by switching to kmsg_dumper
iterator. As a side-effect, the code is now much simpler.
A few changes were needed in the printk.c: we needed unlocked variant
of the kmsg_dumper iterator, but these can surely wait for 3.6.
It's probably too late even for the first patch to go to 3.5, but I'll
try to convince otherwise. :-) Here we go:
- The current code is broken for sure, and has no hope to work at
all. It is a regression;
- The new code works for me, and probably works for everyone else;
- If it compiles (and I urge everyone to compile-test it on your
setup), it hardly can make things worse."
* Merge emailed patches from Anton Vorontsov: (4 commits)
kdb: Switch to nolock variants of kmsg_dump functions
printk: Implement some unlocked kmsg_dump functions
printk: Remove kdb_syslog_data
kdb: Revive dmesg command
Anton Vorontsov [Sat, 21 Jul 2012 00:27:37 +0000 (17:27 -0700)]
kdb: Revive dmesg command
The kgdb dmesg command is broken after the printk rework. The old logic
in kdb code makes no sense in terms of current printk/logging storage
format, and KDB simply hangs forever.
This patch revives the command by switching to kmsg_dumper iterator.
The code is now much simpler and shorter.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull late MIPS fixes from Ralf Baechle:
"This fixes a number of lose ends in the MIPS code and various bug
fixes.
Aside from dropping some patch that should not be in this pull request
everything has sat in -next for quite a while and there are no known
issues.
The biggest patch in this patch set moves the allocation of an array
that is aliased to a function (for runtime generated code) to
assembler code. This avoids an issue with certain toolchains when
building for microMIPS."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (35 commits)
MIPS: PCI: Move fixups from __init to __devinit.
MIPS: Fix bug.h MIPS build regression
MIPS: sync-r4k: remove redundant irq operation
MIPS: smp: Warn on too early irq enable
MIPS: call set_cpu_online() on cpu being brought up with irq disabled
MIPS: call ->smp_finish() a little late
MIPS: Yosemite: delay irq enable to ->smp_finish()
MIPS: SMTC: delay irq enable to ->smp_finish()
MIPS: BMIPS: delay irq enable to ->smp_finish()
MIPS: Octeon: delay enable irq to ->smp_finish()
MIPS: Oprofile: Fix build as a module.
MIPS: BCM63XX: Fix BCM6368 IPSec clock bit
MIPS: perf: Fix build error caused by unused counters_per_cpu_to_total()
MIPS: Fix Magic SysRq L kernel crash.
MIPS: BMIPS: Fix duplicate header inclusion.
mips: mark const init data with __initconst instead of __initdata
MIPS: cmpxchg.h: Add missing include
MIPS: Malta may also be equipped with MIPS64 R2 processors.
MIPS: Fix typo multipy -> multiply
MIPS: Cavium: Fix duplicate ARCH_SPARSEMEM_ENABLE in kconfig.
...
Merge tag 'dm-3.5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm
Pull device-mapper discard fixes from Alasdair G Kergon:
- avoid a crash in dm-raid1 when discards coincide with mirror
recovery;
- avoid discarding shared data that's still needed in dm-thin;
- don't guarantee that discarded blocks will be wiped in dm-raid1.
* tag 'dm-3.5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm raid1: set discard_zeroes_data_unsupported
dm thin: do not send discards to shared blocks
dm raid1: fix crash with mirror recovery and discard
Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd
Pull pnfs/ore fixes from Boaz Harrosh:
"These are catastrophic fixes to the pnfs objects-layout that were just
discovered. They are also destined for @stable.
I have found these and worked on them at around RC1 time but
unfortunately went to the hospital for kidney stones and had a very
slow recovery. I refrained from sending them as is, before proper
testing, and surely I have found a bug just yesterday.
So now they are all well tested, and have my sign-off. Other than
fixing the problem at hand, and assuming there are no bugs in the new
code, there is low risk to any surrounding code. And in any case they
affect only these paths that are now broken. That is RAID5 in pnfs
objects-layout code. It does also affect exofs (which was not broken)
but I have tested exofs and it is lower priority than objects-layout
because no one is using exofs, but objects-layout has lots of users."
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
pnfs-obj: Fix __r4w_get_page when offset is beyond i_size
pnfs-obj: don't leak objio_state if ore_write/read fails
ore: Unlock r4w pages in exact reverse order of locking
ore: Remove support of partial IO request (NFS crash)
ore: Fix NFS crash by supporting any unaligned RAID IO
and we finally have the fix. I am quite confident the fix is correct
because I could reproduce the problem with nandsim and verify the fix.
It was also verified by Iwo (the reporter).
I am also confident that this is OK to merge the fix so late because
this patch affects only the fixup functionality, which is not used by
most users."
* tag 'upstream-3.5-rc8' of git://git.infradead.org/linux-ubifs:
UBIFS: fix a bug in empty space fix-up
We can't guarantee that REQ_DISCARD on dm-mirror zeroes the data even if
the underlying disks support zero on discard. So this patch sets
ti->discard_zeroes_data_unsupported.
For example, if the mirror is in the process of resynchronizing, it may
happen that kcopyd reads a piece of data, then discard is sent on the
same area and then kcopyd writes the piece of data to another leg.
Consequently, the data is not zeroed.
When process_discard receives a partial discard that doesn't cover a
full block, it sends this discard down to that block. Unfortunately, the
block can be shared and the discard would corrupt the other snapshots
sharing this block.
This patch detects block sharing and ends the discard with success when
sending it to the shared block.
The above change means that if the device supports discard it can't be
guaranteed that a discard request zeroes data. Therefore, we set
ti->discard_zeroes_data_unsupported.
dm raid1: fix crash with mirror recovery and discard
This patch fixes a crash when a discard request is sent during mirror
recovery.
Firstly, some background. Generally, the following sequence happens during
mirror synchronization:
- function do_recovery is called
- do_recovery calls dm_rh_recovery_prepare
- dm_rh_recovery_prepare uses a semaphore to limit the number of
simultaneously recovered regions (by default the semaphore value is 1,
so only one region at a time is recovered)
- dm_rh_recovery_prepare calls __rh_recovery_prepare,
__rh_recovery_prepare asks the log driver for the next region to
recover. Then, it sets the region state to DM_RH_RECOVERING. If there
are no pending I/Os on this region, the region is added to
quiesced_regions list. If there are pending I/Os, the region is not
added to any list. It is added to the quiesced_regions list later (by
dm_rh_dec function) when all I/Os finish.
- when the region is on quiesced_regions list, there are no I/Os in
flight on this region. The region is popped from the list in
dm_rh_recovery_start function. Then, a kcopyd job is started in the
recover function.
- when the kcopyd job finishes, recovery_complete is called. It calls
dm_rh_recovery_end. dm_rh_recovery_end adds the region to
recovered_regions or failed_recovered_regions list (depending on
whether the copy operation was successful or not).
The above mechanism assumes that if the region is in DM_RH_RECOVERING
state, no new I/Os are started on this region. When I/O is started,
dm_rh_inc_pending is called, which increases reg->pending count. When
I/O is finished, dm_rh_dec is called. It decreases reg->pending count.
If the count is zero and the region was in DM_RH_RECOVERING state,
dm_rh_dec adds it to the quiesced_regions list.
Consequently, if we call dm_rh_inc_pending/dm_rh_dec while the region is
in DM_RH_RECOVERING state, it could be added to quiesced_regions list
multiple times or it could be added to this list when kcopyd is copying
data (it is assumed that the region is not on any list while kcopyd does
its jobs). This results in memory corruption and crash.
There already exist bypasses for REQ_FLUSH requests: REQ_FLUSH requests
do not belong to any region, so they are always added to the sync list
in do_writes. dm_rh_inc_pending does not increase count for REQ_FLUSH
requests. In mirror_end_io, dm_rh_dec is never called for REQ_FLUSH
requests. These bypasses avoid the crash possibility described above.
These bypasses were improperly implemented for REQ_DISCARD when
the mirror target gained discard support in commit 5fc2ffeabb9ee0fc0e71ff16b49f34f0ed3d05b4 (dm raid1: support discard).
In do_writes, REQ_DISCARD requests are always added to the sync queue and
immediately dispatched (even if the region is in DM_RH_RECOVERING). However,
dm_rh_inc and dm_rh_dec are called for REQ_DISCARD requests. So it violates the
rule that no I/Os are started on DM_RH_RECOVERING regions, and causes the list
corruption described above.
This patch changes it so that REQ_DISCARD requests follow the same path
as REQ_FLUSH. This avoids the crash.
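The double-queueing described above can be reproduced in a toy model of the pending count. This is a simplified sketch with hypothetical names (it counts list insertions instead of manipulating real lists) and is not the dm-region-hash code.

```c
#include <assert.h>

enum rh_state { RH_CLEAN, RH_RECOVERING };

/* Toy region: counts how often it would be put on the quiesced list. */
struct region {
	enum rh_state state;
	int pending;       /* I/Os in flight */
	int quiesced_adds; /* insertions into the quiesced list */
};

/* 'bypass' models requests that skip region accounting (the REQ_FLUSH path). */
static void rh_inc_pending(struct region *r, int bypass)
{
	if (bypass)
		return;
	r->pending++;
}

static void rh_dec(struct region *r, int bypass)
{
	if (bypass)
		return;
	assert(r->pending > 0);
	if (--r->pending == 0 && r->state == RH_RECOVERING)
		r->quiesced_adds++;
}

/* Mirrors the prepare step: an idle region is queued immediately. */
static void rh_recovery_prepare(struct region *r)
{
	r->state = RH_RECOVERING;
	if (r->pending == 0)
		r->quiesced_adds++;
}
```

If a request is accounted (bypass == 0) while the region is already recovering, the region ends up queued twice; routing it through the bypass, as the patch does for REQ_DISCARD, keeps it at a single insertion.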
Boaz Harrosh [Thu, 7 Jun 2012 23:02:30 +0000 (02:02 +0300)]
pnfs-obj: Fix __r4w_get_page when offset is beyond i_size
It is very common for the end of the file to be unaligned on
stripe size. But since we know it's beyond file's end then
the XOR should be performed with all zeros.
Old code used to just read zeros out of the OSD devices, which is a great
waste. But what scares me more about this situation is that we now have
pages attached to the file's mapping that are beyond i_size. I don't
like the kind of bugs this calls for.
Fix both birds, by returning a global zero_page, if offset is beyond
i_size.
TODO:
Change the API to ->__r4w_get_page() so a NULL can be
returned without being considered as error, since XOR API
treats NULL entries as zero_pages.
[Bug since 3.2. Should apply the same way to all Kernels since] CC: Stable Tree <stable@kernel.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
ore: Unlock r4w pages in exact reverse order of locking
The read-4-write pages are locked in address ascending order.
But were unlocked in a way easiest for coding. Fix that,
locks should be released in opposite order of locking, i.e.
descending address order.
I have not hit this dead-lock. It was found by inspecting the
debug print-outs. I suspect there is a higher lock at caller that
protects us, but fix it regardless.
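The lock/unlock discipline can be sketched with plain mutexes. The pages are modelled as an array of pthread mutexes, and the function names and the trace array are hypothetical; this is not the ORE page-locking code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Lock pages 0..n-1 in ascending order. */
static void lock_pages(pthread_mutex_t *locks, size_t n)
{
	assert(locks);
	for (size_t i = 0; i < n; i++)
		pthread_mutex_lock(&locks[i]);
}

/*
 * Release in the exact reverse order of locking (descending).
 * 'trace' records the index order actually used, for inspection.
 */
static void unlock_pages(pthread_mutex_t *locks, size_t n, size_t *trace)
{
	size_t t = 0;

	for (size_t i = n; i-- > 0; ) {
		pthread_mutex_unlock(&locks[i]);
		trace[t++] = i;
	}
}
```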
Boaz Harrosh [Fri, 8 Jun 2012 01:30:40 +0000 (04:30 +0300)]
ore: Remove support of partial IO request (NFS crash)
Due to OOM situations the ore might fail to allocate all resources
needed for IO of the full request. If some progress was possible
it would proceed with a partial/short request, for the sake of
forward progress.
Since this crashes NFS-core and exofs is just fine without it just
remove this contraption, and fail.
TODO:
Support real forward progress with some reserved allocations
of resources, such as mem pools and/or bio_sets
[Bug since 3.2 Kernel] CC: Stable Tree <stable@kernel.org> CC: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Boaz Harrosh [Thu, 7 Jun 2012 22:19:07 +0000 (01:19 +0300)]
ore: Fix NFS crash by supporting any unaligned RAID IO
In RAID_5/6 we used to not permit an IO whose end
byte is not stripe_size aligned and spans more than one stripe,
i.e. the caller must check if after submission the actual
transferred bytes is shorter, and would need to resubmit
a new IO with the remainder.
Exofs supports this, and NFS was supposed to support this
as well with its short write mechanism. But late testing has
exposed a CRASH when this is used with non-RPC layout-drivers.
The change at NFS is deep and risky; in its place the fix
at ORE to lift the limitation is actually clean and simple.
So here it is below.
The principle here is that in the case of unaligned IO on
both ends, beginning and end, we will send two read requests:
one like old code, before the calculation of the first stripe,
and also a new site, before the calculation of the last stripe.
If any "boundary" is aligned or the complete IO is within a single
stripe, we do a single read like before.
The code is clean and simple by splitting the old _read_4_write
into 3 even parts:
1. _read_4_write_first_stripe
2. _read_4_write_last_stripe
3. _read_4_write_execute
And calling 1+3 at the same place as before. 2+3 before last
stripe, and in the case of all in a single stripe then 1+2+3
is performed additively.
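One possible reading of the three-step split, expressed as a planning function. This is a sketch of the decision logic only, under the assumption that alignment is judged against stripe_size at both ends; struct r4w_plan and plan_r4w() are hypothetical names and not part of the ORE API.

```c
#include <assert.h>
#include <stdint.h>

/* Which read-4-write steps a write of [offset, offset + length) would need. */
struct r4w_plan {
	int read_first; /* step 1: head of the first stripe is unaligned */
	int read_last;  /* step 2: tail of the last stripe is unaligned */
	int executes;   /* how many times step 3 submits a read */
};

static struct r4w_plan plan_r4w(uint64_t offset, uint64_t length, uint64_t stripe_size)
{
	struct r4w_plan p = { 0, 0, 0 };
	uint64_t end;

	assert(stripe_size > 0 && length > 0);
	end = offset + length;
	p.read_first = (offset % stripe_size) != 0;
	p.read_last = (end % stripe_size) != 0;

	if (offset / stripe_size == (end - 1) / stripe_size) {
		/* Whole IO inside one stripe: 1+2+3, one combined read. */
		p.executes = (p.read_first || p.read_last) ? 1 : 0;
	} else {
		/* Separate first and last stripes: 1+3 and/or 2+3. */
		p.executes = p.read_first + p.read_last;
	}
	return p;
}
```

With a 64-byte stripe, a write of 100 bytes at offset 10 is unaligned at both ends and spans two stripes, so the plan contains two reads; the same write at offset 0 needs only the last-stripe read.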
Why did I not think of it before? Well, I had a stroke of
genius because I have stared at this code for 2 years, and did
not find this simple solution, till today. Not that I did not try.
This solution is much better for NFS than the previous supposed
solution because the short write was dealt with out-of-band after
IO_done, which would cause a seeky IO pattern, whereas here
we execute in order. In both solutions we do 2 separate reads, only
here we do it within a single IO request. (And actually combine two
writes into a single submission)
NFS/exofs code need not change since the ORE API communicates the new
shorter length on return; what will happen is that this case would not
occur anymore.
hurray!!
[Stable this is an NFS bug since 3.2 Kernel should apply cleanly] CC: Stable Tree <stable@kernel.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
UBIFS has a feature called "empty space fix-up" which is a quirk to work-around
limitations of dumb flasher programs. Namely, of those flashers that are unable
to skip NAND pages full of 0xFFs while flashing, resulting in empty space at
the end of half-filled eraseblocks to be unusable for UBIFS. This feature is
relatively new (introduced in v3.0).
The fix-up routine (fixup_free_space()) is executed only once at the very first
mount if the superblock has the 'space_fixup' flag set (can be done with -F
option of mkfs.ubifs). It basically reads all the UBIFS data and metadata and
writes it back to the same LEB. The routine assumes the image is pristine and
does not have anything in the journal.
There was a bug in 'fixup_free_space()' where it fixed up the log incorrectly.
All but one LEB of the log of a pristine file-system are empty. And one
contains just a commit start node. And 'fixup_free_space()' just unmapped this
LEB, which resulted in wiping the commit start node. As a result, some users
were unable to mount the file-system next time with the following symptom:
UBIFS error (pid 1): replay_log_leb: first log node at LEB 3:0 is not CS node
UBIFS error (pid 1): replay_log_leb: log error detected while replaying the log at LEB 3:0
The root-cause of this bug was that 'fixup_free_space()' wrongly assumed
that the beginning of empty space in the log head (c->lhead_offs) was known
on mount. However, it is not the case - it was always 0. UBIFS does not store
it in the master node and finds out by scanning the log on every mount.
The fix is simple - just pass commit start node size instead of 0 to
'fixup_leb()'.
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull last minute Ceph fixes from Sage Weil:
"The important one fixes a bug in the socket failure handling behavior
that was turned up in some recent failure injection testing. The
other two are minor bug fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: endian bug in rbd_req_cb()
rbd: Fix ceph_snap_context size calculation
libceph: fix messenger retry
Merge tag 'md-3.5-fixes' of git://neil.brown.name/md
Pull three md bugfixes from NeilBrown:
"One of the bugs was introduced in 3.5-rc1. Others have been there for
longer."
* tag 'md-3.5-fixes' of git://neil.brown.name/md:
md/raid1: close some possible races on write errors during resync
md: avoid crash when stopping md array races with closing other open fds.
md: fix bug in handling of new_data_offset
Pull networking changes from David Miller:
"Ok, we should be good to go now"
1) We have to statically initialize the init_net device list head rather
than do so in an initcall, otherwise netprio_cgroup crashes if it's
built statically rather than modular (Mark D. Rustad)
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Statically initialize init_net.dev_base_head
MAINTAINERS: Changes in qlcnic and qlge maintainers list
cipso: don't follow a NULL pointer when setsockopt() is called
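The difference between static and initcall-time initialization of a list head, from item 1 above, can be sketched in user space. The list_head layout and LIST_HEAD_INIT mirror the kernel's list.h; struct net is cut down to a single field, and static_net, late_net, and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Same shape as the kernel macro: an empty list points at itself. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }

struct net {
	struct list_head dev_base_head;
};

/* Statically initialised: valid before any init hook has run. */
static struct net static_net = {
	.dev_base_head = LIST_HEAD_INIT(static_net.dev_base_head),
};

/* Zero-initialised: next/prev stay NULL until a runtime hook fixes them. */
static struct net late_net;

static void late_net_init(void)
{
	late_net.dev_base_head.next = &late_net.dev_base_head;
	late_net.dev_base_head.prev = &late_net.dev_base_head;
}

/* A well-formed empty list head points at itself in both directions. */
static int list_is_valid_empty(const struct list_head *h)
{
	assert(h != NULL);
	return h->next == h && h->prev == h;
}
```

Any code that walks late_net.dev_base_head before late_net_init() has run would follow a NULL pointer, which is the kind of ordering hazard the static initializer removes.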
Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID update from Jiri Kosina:
"A final round of changes for HID for 3.5: just device ID additions."
* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: hid-multitouch: add support for Zytronic panels
HID: add Sennheiser BTD500USB device support
HID: add battery quirk for Apple Wireless ANSI
The strcpy was being used to set the name of the board. Since the
destination char* was read-only and the name is set statically at
compile time, this was both wrong and redundant.
The type of char* is changed to const char* to prevent future errors.
Reported-by: Radek Masin <radek@masin.eu> Signed-off-by: Ezequiel Garcia <elezegarcia@gmail.com>
[ Taking directly due to vacations - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
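The pattern the fix moves to can be sketched as follows. The struct, the board name string, and the accessor are hypothetical; the point is only that a name fixed at compile time can live behind a const char * with no runtime copy.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical board description; the name points at a string literal. */
struct board_info {
	const char *name; /* const: writes through this pointer will not compile */
};

/* Set once, statically; no strcpy() into read-only storage is needed. */
static const struct board_info board = {
	.name = "example-board",
};

static const char *board_name(void)
{
	assert(board.name != NULL);
	return board.name;
}
```

With the const qualifier in place, an attempt such as strcpy(board.name, "other") is rejected or warned about by the compiler instead of faulting at run time.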