Support for apparmor namespaces and stacking is coming to Ubuntu kernels in
16.10, and should hopefully be upstreamed Soon (TM) :).
The basic idea is similar to how cgroups are done: we can restore the
apparmor namespace and profile blobs independently of the tasks, and then
at the end we can just set the task's label appropriately. This means the
code that moves tasks under a label stays the same, and the only new code
is the stuff that dumps and restores the policy blobs that are in the
namespace that were loaded by the container.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
My editor (vim) auto-removes whitespace at EOL for *.c and *.h files,
and I think it makes sense to have a separate commit for this, rather
than littering other commits with such changes.
To make sure this won't pile up again, add a line to Makefile under
the linter target to check for such things (so CI will fail).
This is all whitespace except an addition to Makefile.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This will suppress false gcc warnings like this:
criu/stats.c:85:10: error: array subscript 4 is above array bounds
of 'struct timing[2]' [-Werror=array-bounds]
85 | return &rstats->timings[t];
| ^~~~~~~~~~~~~~~~~~~
criu/stats.c:25:16: note: while referencing 'timings'
25 | struct timing timings[RESTORE_TIME_NS_STATS];
| ^~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Build on Ubuntu 18.04 amd64 with command "make DEBUG=1" produces the following error:
include/common/asm/bitops.h: Assembler messages:
include/common/asm/bitops.h:71: Error: incorrect register `%edx' used with `q' suffix
Signed-off-by: anatasluo <luolongjuna@gmail.com>
The clang analyzer, scan-build, cannot correctly handle the
LOCK_BUG_ON() macro. At multiple places there is the following warning:
Error: CLANG_WARNING:
criu/pie/restorer.c:1221:4: warning: Dereference of null pointer
include/common/lock.h:14:35: note: expanded from macro 'LOCK_BUG_ON'
*(volatile unsigned long *)NULL = 0xdead0000 + __LINE__
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
This just disables the clang analyzer for the LOCK_BUG_ON() macro.
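One way to hide the macro from the analyzer is the `__clang_analyzer__` define, which clang sets while running the static analyzer. A minimal sketch, assuming the guard is arranged this way: the macro body is taken from the warning above, while the empty fallback and the `checked_dec()` caller are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#ifndef __clang_analyzer__
/* Crash on purpose: write a recognizable value through a NULL pointer. */
#define LOCK_BUG_ON(condition)                                  \
	do {                                                    \
		if ((condition))                                \
			*(volatile unsigned long *)NULL =       \
				0xdead0000 + __LINE__;          \
	} while (0)
#else
/* Assumed fallback: the analyzer sees no dereference at all. */
#define LOCK_BUG_ON(condition) do { } while (0)
#endif

/* Hypothetical caller: the counter must be positive before decrementing. */
static int checked_dec(int counter)
{
	LOCK_BUG_ON(counter <= 0);
	return counter - 1;
}
```

A normal build keeps the deliberate crash; only the analyzer run gets the empty variant.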
Signed-off-by: Adrian Reber <areber@redhat.com>
Build on Fedora Core 33 produces the following warnings:
include/common/asm/bitops.h: Assembler messages:
include/common/asm/bitops.h:37: Warning: no instruction mnemonic suffix given and no register operands; using default for `bt'
include/common/asm/bitops.h: Assembler messages:
include/common/asm/bitops.h:63: Warning: no instruction mnemonic suffix given and no register operands; using default for `bts'
Update test_bit() and test_and_set_bit() implementation with recent
version from the Linux kernel to fix the warning.
Fixes: #1217
Reported-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
The file only includes other headers (which may not be needed).
If we aim for one-include-for-compel, we could instead paste all
subheaders into "compel.h".
Rather, I think it's worth migrating to more fine-grained compel
headers than following the 'one header to rule them all' strategy.
Further, the header creates problems for cross-compilation: it's
included in files that are used by host-compel, which rightfully
confuses the compiler/linker, as host's definitions for fpu regs and
other platform details leak into host's compel.
Signed-off-by: Dmitry Safonov <dima@arista.com>
All those compel functions can fail for various reasons:
the state of the system, an interruption by the user, or anything else.
It's highly desirable to handle as many PIE-related errors as possible,
otherwise it's hard to analyze the state of the parasite/restorer
and the C/R process.
At the very least a warning should be logged, or C/R even stopped.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
We're building PIEs in arm format rather than in thumb.
Copy helpers from libgcc, provide a proper define and
link them into blobs.
Also replace tabs with spaces, as it should have been
in pie/Makefile - tabs are for recipes.
Fixes:
LINK criu/pie/parasite.built-in.o
criu/pie/pie.lib.a(util-vdso.o): In function `elf_hash':
/criu/criu/pie/util-vdso.c:61: undefined reference to `__aeabi_uidivmod'
/criu/scripts/nmk/scripts/build.mk:209: recipe for target 'criu/pie/parasite.built-in.o' failed
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
There are a few places where spaces have been used instead of tabs for
indentation. This patch converts the spaces to tabs for consistency
with the rest of the code base.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
The macro __list_for_each is equivalent to list_for_each and it is not
used anywhere.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
We operate on long variables in our bit arithmetic, so our constants
should be marked as long too.
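The difference shows up as soon as the shift count reaches the width of `int`: `1 << nr` is evaluated in `int`, where such a shift is undefined, while `1UL << nr` is evaluated in `unsigned long`. A minimal sketch of the idea; the helper name `bit_mask()` is hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* Mask for bit nr inside one unsigned long word. */
static unsigned long bit_mask(unsigned int nr)
{
	assert(nr < BITS_PER_LONG);
	/* 1UL, not 1: the shift has to be done in unsigned long. */
	return 1UL << nr;
}
```

With a plain `1` the top bits of a 64-bit word could not be addressed at all.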
Cc: Adrian Reber <areber@redhat.com>
Reported-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Tested-by: Adrian Reber <areber@redhat.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
On ppc64/aarch64 Linux can be configured to use large pages, so PAGE_SIZE
isn't a build-time constant anymore. Define it through _SC_PAGESIZE.
There are different sizes for a page on ppc64:
: #if defined(CONFIG_PPC_256K_PAGES)
: #define PAGE_SHIFT 18
: #elif defined(CONFIG_PPC_64K_PAGES)
: #define PAGE_SHIFT 16
: #elif defined(CONFIG_PPC_16K_PAGES)
: #define PAGE_SHIFT 14
: #else
: #define PAGE_SHIFT 12
: #endif
And on aarch64 there are default sizes, and someone can possibly set their
own PAGE_SHIFT:
: config ARM64_PAGE_SHIFT
: int
: default 16 if ARM64_64K_PAGES
: default 14 if ARM64_16K_PAGES
: default 12
On the downside, each time we need PAGE_SIZE we make a libc
function call on aarch64/ppc64.
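A minimal sketch of a run-time page size lookup through sysconf(_SC_PAGESIZE). The helper names `page_size()` and `round_up_to_page()` are hypothetical; the patch itself defines PAGE_SIZE, and how it wires that up is not shown here.

```c
#include <assert.h>
#include <unistd.h>

/* Query the page size at run time instead of hard-coding PAGE_SHIFT. */
static unsigned long page_size(void)
{
	long ret = sysconf(_SC_PAGESIZE);

	assert(ret > 0);
	return (unsigned long)ret;
}

/* Hypothetical helper: round len up to a whole number of pages. */
static unsigned long round_up_to_page(unsigned long len)
{
	unsigned long ps = page_size();

	/* Relies on the page size being a power of two. */
	return (len + ps - 1) & ~(ps - 1);
}
```

Every use costs a function call, which is the downside mentioned above.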
Fixes: #415
Tested-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
For architectures like aarch64/ppc64 the page size needs to be
propagated into PIEs. For the parasite, the page size will be defined
during seizing, and for the restorer during early initialization.
Afterwards we can use PAGE_SIZE in PIEs like we did before.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The flogr instruction is not supported by debian jessie (z900).
So replace it with the gcc built-in.
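flogr finds the leftmost one bit of a word. Which built-in the patch uses is not stated here; the sketch below assumes `__builtin_clzl()` and a hypothetical helper name `last_set_bit()`.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((int)(sizeof(long) * CHAR_BIT))

/*
 * Index of the most significant set bit, counting from bit 0 (LSB).
 * word must be non-zero: __builtin_clzl(0) is undefined.
 */
static int last_set_bit(unsigned long word)
{
	assert(word != 0);
	return BITS_PER_LONG - 1 - __builtin_clzl(word);
}
```

The compiler then picks an instruction sequence that the selected machine level actually supports.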
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This patch only adds the support but does not enable it for building.
Reviewed-by: Alice Frosi <alice@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
I'll need them in parasite head and in exit.
travis-ci: success for Rectify 32-bit compatible C/R on x86
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Real syscall generation stays inside criu for now
but will be moved out in the next patch.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Andrew Vagin reported the problem found by a checker:
CID 174702 (#1 of 1): Out-of-bounds access (INCOMPATIBLE_CAST)
incompatible_cast: Pointer &f->raw.counter points to an object whose
effective type is int (32 bits, signed) but is dereferenced as a wider
unsigned long (64 bits, unsigned). This may lead to memory corruption.
It looks like this points to a real problem, which may happen on big-endian
platforms. In the code I rely on the fact that FDS_EVENT_BIT is a small
number and the value it determines fits into the int type without problems.
But that's correct only for little-endian.
In case of big-endian, if the word size is 8 bytes, then the FDS_EVENT value
is in the last bytes, so the wrong memory is accessed.
To fix the problem, I suggest using little-endian byte order to work
with the task_st futex. Then bits 0 to 31 will be at the low addresses,
i.e. in the task_st futex. There are new primitives test_and_set_bit_le() and
set_bit_le() borrowed from the linux kernel for that.
This fixes the problem, but I suppose the checker does not look at the
problem that deeply and just compares the type sizes, so it will fail again.
So, let's enlarge the bit field size to silence it.
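The addressing rule behind the little-endian primitives can be sketched byte by byte. This is not the kernel's implementation (that one is atomic and works on words); it only illustrates that bit nr always lands in byte nr / 8, so bits 0 to 31 stay within the first four bytes on any CPU.

```c
#include <assert.h>
#include <limits.h>

/*
 * Little-endian bit numbering: bit nr lives in byte nr / 8, at position
 * nr % 8, regardless of the byte order of the CPU.
 * Not atomic; a sketch of the addressing only.
 */
static void set_bit_le(unsigned int nr, void *addr)
{
	unsigned char *p = addr;

	p[nr / CHAR_BIT] |= (unsigned char)(1u << (nr % CHAR_BIT));
}

/* Set bit nr and return whether it was already set. */
static int test_and_set_bit_le(unsigned int nr, void *addr)
{
	unsigned char *p = addr;
	unsigned char mask = (unsigned char)(1u << (nr % CHAR_BIT));
	int old = (p[nr / CHAR_BIT] & mask) != 0;

	p[nr / CHAR_BIT] |= mask;
	return old;
}
```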
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The idea is similar to the kernel's wake_up() and wait_event().
One task needs some event. It checks that the event has not
happened yet (fle hasn't been received, unix peer hasn't been bound, etc)
and calls get_fds_event(). The other task makes the event
(sends a fle, binds the peer to a name, etc) and calls set_fds_event().
So, while there is no event, the first task is sleeping,
and the second wakes it up later:
    Task A: clear_fds_event();
            if (!socket_bound)
                    wait_fds_event(); /* sleep */
    Task B: bind_socket();
            set_fds_event();          /* wake up */
For usage details see the next patches.
v5: Use bit operations.
Split clear_fds_event from wait function.
v2: Do not wait until the foreign transport sock is ready,
as that is guaranteed because we create it before CR_STATE_FORKING.
travis-ci: success for Rework file opening scheme to make it asynchronous (rev5)
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
System call sys_futex() requires that (from futex(2)):
"On all platforms, futexes are four-byte integers
that must be aligned on a four-byte boundary".
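A minimal sketch of enforcing that requirement at build time, assuming C11 alignment specifiers. The type name `futex_word_t` and the helper `futex_is_aligned()` are hypothetical and do not reflect the actual lock structures.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical futex wrapper: a 32-bit word forced onto a 4-byte boundary. */
typedef struct {
	alignas(4) uint32_t raw;
} futex_word_t;

static_assert(sizeof(futex_word_t) == 4, "futex must be four bytes");
static_assert(alignof(futex_word_t) == 4, "futex must be 4-byte aligned");

/* Run-time check for a futex that sits inside a larger structure. */
static int futex_is_aligned(const void *addr)
{
	return ((uintptr_t)addr & 3) == 0;
}
```

The explicit alignment matters most when the futex is embedded in a packed or hand-laid-out structure.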
travis-ci: success for locks: Mask futexes aligned
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
musl-libc fixed an inconsistency between the posix and kernel msghdr
structures by adding pads.
It initializes all pads before calling the recvmsg and sendmsg syscalls.
CRIU calls raw system calls from pie code, so we need to initialize pads too.
In addition, we don't initialize msg_flags and iov_len.
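One way to cover pads without naming them is to clear the whole structure before filling in the visible fields. A minimal sketch, assuming a build against a regular libc (pie code would use its own memset); the helper name `init_msghdr()` is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: prepare a msghdr for a raw sendmsg/recvmsg call.
 * memset() clears every byte, including any padding members a libc may
 * have added, before the visible fields are filled in.
 */
static void init_msghdr(struct msghdr *hdr, struct iovec *iov,
			void *buf, size_t len)
{
	assert(hdr && iov);
	memset(hdr, 0, sizeof(*hdr));

	iov->iov_base = buf;
	iov->iov_len = len;

	hdr->msg_iov = iov;
	hdr->msg_iovlen = 1;
	hdr->msg_flags = 0;
}
```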
https://github.com/xemul/criu/issues/276
https://travis-ci.org/kolyshkin/criu/builds/198415449
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The C compiler might generate calls to memcpy, memset, memcmp, and memmove
as it sees fit (so far we haven't seen memmove being required). That
means we need to provide our own versions of them for code which is not
linked to a libc.
We already have a solution for that in commit bdf6051
("pie: provide memcpy/memcmp/memset for noglibc case")
but we faced another problem: the compiler tried to optimize
our builtin_memset() by inserting calls to memset(), which
is just an alias in our case, and so it led to infinite recursion.
This was worked around in commit 8ea0ba7 ("string.h: fix memset
over-optimization with clang") but it's not clear that was a proper
fix.
This patch is considered to be the real solution. As we don't have
any other implementations of memset/memcpy/memcmp in non-libc case,
we can call ours without any prefixes and avoid using weak aliases.
Implementation notes:
1. mem*() functions code had to be moved from .h to .c for the functions
to be compatible with their prototypes declared in /usr/include/string.h
(i.e. "extern").
2. FORTIFY_SOURCE needed to be disabled for code not linked to libc,
because otherwise memcpy() may be replaced with a macro that expands
to __memcpy_chk() which of course can't be resolved during linking.
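For illustration, plain byte-loop versions of the three functions might look as follows. This is a sketch and not the code from the patch: the `sketch_` prefix is used here only so that the example can be built next to a libc, whereas the patch provides the functions under their standard names.

```c
#include <assert.h>
#include <stddef.h>

/* Fill n bytes at s with the byte value c. */
static void *sketch_memset(void *s, int c, size_t n)
{
	unsigned char *p = s;

	while (n--)
		*p++ = (unsigned char)c;
	return s;
}

/* Copy n bytes from src to dest; the regions must not overlap. */
static void *sketch_memcpy(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dest;
}

/* Compare n bytes; negative, zero or positive like memcmp(). */
static int sketch_memcmp(const void *a, const void *b, size_t n)
{
	const unsigned char *x = a, *y = b;

	for (; n; n--, x++, y++)
		if (*x != *y)
			return *x < *y ? -1 : 1;
	return 0;
}
```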
https://travis-ci.org/kolyshkin/criu/builds/198415449
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
This will be used to pass MSG_DONTWAIT in next patch.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Replace "-1" return with errno codes.
ENOMSG and EBADFD were chosen so as not to clash with the
standard recvmsg() errors (described in its man page).
This patch is needed as preparation for making recv_msg()
able to be non-blocking and return EAGAIN or EWOULDBLOCK
in case of no data.
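A sketch of what distinct error codes could look like when validating a received message that should carry file descriptors. The function name `check_fd_msg()`, the negative-errno convention, and the mapping of conditions to ENOMSG and EBADFD are assumptions for illustration, not the patch's actual logic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical check of a received message that should carry file
 * descriptors. Which condition maps to which code is an assumption.
 */
static int check_fd_msg(struct msghdr *hdr)
{
	struct cmsghdr *cmsg;

	assert(hdr);
	cmsg = CMSG_FIRSTHDR(hdr);
	if (!cmsg)
		return -ENOMSG;	/* no control message at all */
	if (cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
		return -EBADFD;	/* control data is not an fd set */
	return 0;
}
```

Because neither code is produced by recvmsg() itself, a caller can tell a malformed message apart from a syscall failure such as EAGAIN.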
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Move getting opts from descriptors out of the scm engine;
this stuff is a pure criu thing, so make criu collect the data.
The tricky change here is that parasite code needs memory
to keep fd_opts on. The memory is taken from parasite args
region, which is now bigger than it used to be. But that's
not a big deal, as previously this space was allocated on
the parasite stack (!, but with smaller chunks).
On the other hand, now we have one memcpy less, as opts are
put directly into the destination buffer.
travis-ci: success for files: Rework send/recv-fds to be more generic
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Currently criu is built with criu/pie-util-fd (which
is a symlink to criu/pie/util-fd) with the same flags
as we use in the general compel infection code. Moreover,
criu links with libcompel.a, so we get a problem
where send_fds/recv_fds are multiply defined. Let's
rather untangle this mess:
- drop criu/pie-util-fd.c completely
- move send_fd/recv_fd inliners into scm.h
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
- Declare send_fds, recv_fds in scm.h, these
are prototypes used in both compel and criu
- Drop old protos from plugin-fds.h uapi file
- Drop old code from fds.c source
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
When SCM_FDSET_HAS_OPTS is not set, scm-code.c
can't be built because it declares struct fd_opts
in parameters. Let's rather hide this type and
allow building without the SCM_FDSET_HAS_OPTS definition.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
In the pure-compel library, messing with opts is not required;
only criu and criu's pie will need it, so make it possible
to compile out common/scm-code's opts management.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
travis-ci: success for headers: Switch to common linkage.h
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>