Real syscalls generation is inside criu for a while
but will be moved out in the next patch.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Andrew Vagin reported the problem found by a checker:
CID 174702 (#1 of 1): Out-of-bounds access (INCOMPATIBLE_CAST)
incompatible_cast: Pointer &f->raw.counter points to an object whose
effective type is int (32 bits, signed) but is dereferenced as a wider
unsigned long (64 bits, unsigned). This may lead to memory corruption.
It looks like, this points to real problem, which may happen on big-endian
platforms. In the code I relay on the fact, that FDS_EVENT_BIT has a small
number and the value, it determines, fits into int type without problems.
But it's correct only for little-endian.
In case of big-endian, if the word size is 8 bytes, then FDS_EVENT value
is in the last bytes, so there is an access to wrong memory.
To fix the problem, I suggest to use little-endian byte order to work
with task_st futex. Then, the bits from 0 to 31 will be in the low adresses,
i.e. in task_st futex. There is new primitives test_and_set_bit_le() and
set_bit_le() borrowed from the linux kernel for that.
This fixes the problem, but I suppose, the checker does not see the problem
so deep, and just compares the types size, so it will fail again.
So, let's enlarge the bit field size to silence it.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The idea is symilar to kernel's wake_up() and wait_event().
One task needs some event. It checks the event has not
happened yet (fle hasn't received, unix peer hasn't bound, etc)
and calls get_fds_event(). Other task makes the event
(sends a fle, binds the peer to a name, etc) and calls set_fds_event().
So, while there is no an event, the first task is sleeping,
and the second wakes it up later:
Task A: clear_fds_event();
if (!socket_bound)
wait_fds_event(); /* sleep */
Task B: bind_socket();
set_fds_event(); /* wake up */
For the details of using see next patches.
v5: Use bit operations.
Split clear_fds_event from wait function.
v2: Do not wait for foreign transport sock is ready,
as it's guarantied by we create it before CR_STATE_FORKING.
travis-ci: success for Rework file opening scheme to make it asynchronous (rev5)
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
System call sys_futex() requires that (from futex(2)):
"On all platforms, futexes are four-byte integers
that must be aligned on a four-byte boundary".
travis-ci: success for locks: Mask futexes aligned
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
musl-libc fixed inconsistency between posix and kernl msghdr structures
by adding pad-s.
It initializes all pad-s before calling recvmsg and sendmsg syscalls.
CRIU calls raw system calls from pie code, so we need to intialize pads too.
In addition, we don't initialize msg_flags and iov_len.
https://github.com/xemul/criu/issues/276https://travis-ci.org/kolyshkin/criu/builds/198415449
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
C compiler might generate calls to memcpy, memset, memcmp, and memmove
as it seem fit (so far we haven't seen memmove being required). That
means we need to provide our own versions of it for code which is not
linked to a libc.
We already have a solution for that in commit bdf6051
("pie: provide memcpy/memcmp/memset for noglibc case")
but we faced another problem of compiler trying to optimize
our builtin_memset() by inserting calls to memset() which
is just an alias in our case and so it lead to infinite recursion.
This was workarounded in commit 8ea0ba7 ("string.h: fix memset
over-optimization with clang") but it's not clear that was a proper
fix.
This patch is considered to be the real solution. As we don't have
any other implementations of memset/memcpy/memcmp in non-libc case,
we can call ours without any prefixes and avoid using weak aliases.
Implementation notes:
1. mem*() functions code had to be moved from .h to .c for the functions
to be compatible with their prototypes declared in /usr/include/string.h
(i.e. "extern").
2. FORTIFY_SOURCE needed to be disabled for code not linked to libc,
because otherwise memcpy() may be replaced with a macro that expands
to __memcpy_chk() which of course can't be resolved during linking.
https://travis-ci.org/kolyshkin/criu/builds/198415449
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
This will be used to pass MSG_DONTWAIT in next patch.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Replace "-1" return with errno codes.
ENOMSG and EBADFD were choosen to do not cross with
standard recvmsg() errors (described in its man page).
This patch is need as preparation to making recv_msg()
be able to be non-block, and return EAGAIN and EWOULDBLOCK
in case of no data.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Remove getting opts from descriptors out from scm engine,
this stuff is pure criu thing, so make it collect the data.
The tricky change here is that parasite code needs memory
to keep fd_opts on. The memory is taken from parasite args
region, which is now bigger than it used to be. But that's
not a big deal, as previously this space was allocated on
the parasite stack (!, but with smaller chunks).
On the other hand, now we have one memcpy less, as opts are
put directly into the destination buffer.
travis-ci: success for files: Rework send/recv-fds to be more generic
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Currently criu built with criu/pie-util-fd (which
is a symlink to criu/pie/util-fd) with same flags
as we use in general compel infection code. Moreover
the criu link with libcompel.a, so we get a problem
where send_fds/recv_fds are multiple defined. Lets
rather unweave this mess:
- drop criu/pie-util-fd.c completely
- move send_fd/recv_fd inliners into scm.h
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
- Declare send_fds, recv_fds in sch.h, these
are prototypes used in both compel and criu
- Drop old protos from plugin-fds.h uapi file
- Drop old code from fds.c source
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
When SCM_FDSET_HAS_OPTS is not set the scm-code.c
can't be built because it declares struct fd_opts
in parameters. Lets rather hide this type and
allow to build without SCM_FDSET_HAS_OPTS definition.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
In pure-compel library messing with opts is not required,
only criu and criu's pie will need it, so make it possible
to compile out common/scm-code's opts management.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
travis-ci: success for headers: Switch to common linkage.h
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
travis-ci: success for series starting with [1/2] ppc: Add atomic_dec_return()
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Borrowed from Linux kernel.
travis-ci: success for series starting with [1/2] ppc: Add atomic_dec_return()
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Somehow clang doesn't always like -Wa flags, for example when making
dependencies (see commit 9303ed3 ("Makefiles: move -Wa,--noexecstack
out of CFLAGS"), which causes build break, scary error messages, and
even hair loss.
There are many ways to solve this. This patch employs the one
that is simple and clean.
The -Wa,-mimplicit-it=always flag was added by commit 79c4b74
("arm: fix compilation on ARMv7"). The reason is, ARM needs an IT
instruction before certain conditionals. Those IT instructions are
almost always automatically generated by assembler itself, but in some
cases a special assembler flag (like the one above) is needed.
As there is only one place in the code that need IT, it's easy to patch
it (add explicit IT) and remove the flag. Note that "IT" generates
no machine code per se, so there should not be any functional change
(although I haven't checked it).
For more info on IT, see http://tinyurl.com/z3ldsdr
Hope for a review from our ARM experts.
travis-ci: success for Fixes to compile on arm with clang
Cc: Christopher Covington <cov@codeaurora.org>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
clang-3.8 fails to compile inline asm having ldarx with 4 args.
Quoting [1]:
'''
Recent versions of the PowerPC architecture added a hint bit to the larx
instructions to differentiate between an atomic operation and a lock
operation:
> 0 Other programs might attempt to modify the word in storage addressed by EA
> even if the subsequent Store Conditional succeeds.
>
> 1 Other programs will not attempt to modify the word in storage addressed by
> EA until the program that has acquired the lock performs a subsequent
> store releasing the lock.
'''
I also found some more info about this in [2].
Anyway, we could either construct some preprocessor logic to omit this
argument for clang, or just drop it. This patch does the latter.
[1] https://patchwork.ozlabs.org/patch/45008/
[2] https://sourceware.org/ml/libc-alpha/2015-03/msg00085.html
travis-ci: success for PPC+clang compile fixes
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Subprojects usually have own "asm" directory,
so to eliminate collision specify complete path.
travis-ci: success for common: Use complete path in inclusion
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Since in criu we can't choose proper
arch inside include statements (well,
it will simply require more ifdefs),
I generate include/common/asm symlink
to point proper architecture.
travis-ci: success for Common headers
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The idea is to have one place for headers which
are shared between subprojects (zdtm, criu, compel).
travis-ci: success for Add directory for common headers (rev3)
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Since we need to align some allocations (but not most of them), let's
always align them when checking the current position.
v2: always rst_mem_align() before the beginning of each "set" of
allocations
v3: merge rst_mem_align and rst_mem_cpos
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
CC: Cyrill Gorcunov <gorcunov@gmail.com>
CC: Andrey Vagin <avagin@openvz.org>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Fix for commit 0ce8e42995 ("kerndat: do not report errors on feature
test").
That commit hid error messages for feature testing when you cannot
write to /proc/*/loginuid files because of missing kernel patch that
allows unsetting loginuid value on older kernels, but it didn't hide
error messages in case of disabled CONFIG_AUDITSYSCALL - then you
don't have loginuid files.
Also fixed comment for kerndat feature test: procfs file might fail
to open if it's missing and that's fine - !CONFIG_AUDITSYSCALL case,
but it can't fail due permission fault on _read_ (then something is
wrong, lets report a problem).
Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
We request all contracks via netlink and save netlink messages which
describe them in an image file, then we send these netlink messages back on restore.
https://github.com/xemul/criu/issues/54
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Stas found that if we don't align a pointer,
futex and atomic operations can fail.
v2: don't hard-code the size of void *
v3: add a function to allocate memory without gaps with
a privious slice. It's used to allocate arrays.
v4: don't change rst_mem_cpos
Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
crtools binary is linked with the C library and could rely on all the
services this library is providing, including system calls.
Thus it doesn't need to be linked with the builtin system calls code
made for the parasite/restorer binaries.
This patch does:
- remove the inclusion of syscall.h
- replace all call to sys_<syscall>() by C library <syscall>()
- replace unwrapped system calls by syscall(SYS_<syscall>,...)
- fix the generated compiler's issues.
There should not be any functional changes. The only 'code' changes is
appearing in locks.h when futex is called through the C library, the
errno value is fetched from errno variable instead of the return
value.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The CRIU internal define of CLONE_SUBNS should not be put in
syscall-types.h since this define is not part of a system call.
This move is required to prepare the removal of syscall.h from the
component of crtools binary.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The file include/prctl.h should define the struct prctl_mm_map only if
it is not already defined in the system include file linux/prctl.h.
The definition should be part of the '#ifndef PR_SET_MM_MAP' block
since this structure is not defined in that case.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
It is missged in first place and may cause
problem on exiting via alarm hanling.
Reported-by: Igor Sukhih <igor@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
AutoFS will need to create write pipe end file descriptor, if it was closed.
Thus, pipe_info structure have to be exported.
Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>