This commit enables checkpointing and restoring of applications as
non-root.
First goal was to enable checkpoint and restore of the env00 and
pthread00 test case.
This uses the information from opts.unprivileged and opts.cap_eff to
skip certain code paths which do not work as non-root.
Co-authored-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
A file's r/w/x changing between checkpoint and restore does
not necessarily imply that something is wrong. For example,
if a process opens a file having perms rw- for reading and
we change the perms to r--, the process can be restored and
will function as expected.
Therefore, this patch adds an option
--skip-file-rwx-check
to disable this check on restore. File validation is unaffected
and should still function as expected with respect to the content
of files.
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
We plan to switch to Mounts-v2 engine for restoring mounts by default,
this options is to allow switching to old engine. This patch only adds
an option, no engine behind it yet.
Cherry-picked from Virtuozzo criu:
https://src.openvz.org/projects/OVZ/repos/criu/commits/503f9ad2c
Changes: allow --mntns-compat-mode option only on restore and only if
MOVE_MOUNT_SET_GROUP is supported (this also requires change in
unittest/mock.c), change id in rpc criu_opts.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This commit adds feature check support to libcriu. It already exists in
the CLI and RPC and this just extends it to libcriu.
This commit provides one function to do all possible feature checks in
one call. The parameter to the feature check function is a structure and
the user can enable which features should be checked.
Using a structure makes the function extensible without the need to
break the API/ABI in the future.
Signed-off-by: Adrian Reber <areber@redhat.com>
In contrast to the CLI it is not possible to do a single pre-dump via
RPC and thus libcriu. In cr-service.c pre-dump always goes into a
pre-dump loop followed by a final dump. runc already works around this
to only do a single pre-dump by killing the CRIU process waiting for the
message for the final dump.
Trying to implement pre-dump in crun via libcriu it is not as easy to
work around CRIU's pre-dump loop expectations as with runc that directly
talks to CRIU via RPC.
We know that LXC/LXD also does single pre-dumps using the CLI and runc
also only does single pre-dumps by misusing the pre-dump loop interface.
With this commit it is possible to trigger a single pre-dump via RPC and
libcriu without misusing the interface provided via cr-service.c. So
this commit basically updates CRIU to the existing use cases.
The existing pre-dump loop still sounds like a very good idea, but so
far most tools have decided to implement the pre-dump loop themselves.
With this change we can implement pre-dump in crun to match what is
currently implemented in runc.
Signed-off-by: Adrian Reber <areber@redhat.com>
In runc we use the join-ns RPC API to enable checkpoint/restore of
containers with shared namespaces. Shared namespaces are often used
when containers run inside Kubernetes Pod.
In crun we use libcriu to interface with CRIU, however it currently
doesn't provide an API for join-ns. This patch adds the necessary
libcriu API to enable checkpoint/restore of containers with shared
namespaces with crun.
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
Support for apparmor namespaces and stacking is coming to Ubuntu kernels in
16.10, and should hopefully be upstreamed Soon (TM) :).
The basic idea is similar to how cgroups are done: we can restore the
apparmor namespace and profile blobs independently of the tasks, and then
at the end we can just set the task's label appropriately. This means the
code that moves tasks under a label stays the same, and the only new code
is the stuff that dumps and restores the policy blobs that are in the
namespace that were loaded by the container.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Fixes: #1560
The latest protobuf-c compiler breaks CRIU because they removed
leading underscores from structs in 1.4.0.
This replaces those definitions with the standard generated structs.
v2: remove struct _VmaEntry, struct _CredsEntry and struct _CoreEntry
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
Else we get error:
[root@fedora criu]# crit/crit x test/dump/zdtm/static/memfd00/56/1/ mems
...
Traceback (most recent call last):
File "/home/snorch/devel/ms/criu/crit/crit", line 6, in <module>
cli.main()
File "/home/snorch/devel/ms/criu/crit/pycriu/cli.py", line 430, in main
opts["func"](opts)
File "/home/snorch/devel/ms/criu/crit/pycriu/cli.py", line 361, in explore
explorers[opts['what']](opts)
File "/home/snorch/devel/ms/criu/crit/pycriu/cli.py", line 283, in explore_mems
fn = ' ' + get_file_str(opts, {
File "/home/snorch/devel/ms/criu/crit/pycriu/cli.py", line 214, in get_file_str
f = ft['get'](opts, ft, fd['id'])
File "/home/snorch/devel/ms/criu/crit/pycriu/cli.py", line 165, in ftype_reg
rf = ftype_find_in_image(opts, ft, fid, 'reg-files.img')
File "/home/snorch/devel/ms/criu/crit/pycriu/cli.py", line 154, in ftype_find_in_image
return f[ft['field']]
KeyError: 'reg'
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
It will broken when the cli `crit show ipcns-shm-9.img` is executed, msg:
{
"magic": "IPCNS_SHM",
"entries": [
{
"desc": {
"key": 0,
"uid": 0,
"gid": 0,
"cuid": 0,
"cgid": 0,
"mode": 438,
"id": 0
},
"size": 1048576,
"in_pagemaps": true,
"extra": Traceback (most recent call last):
File "/usr/bin/crit", line 6, in <module>
cli.main()
File "/usr/lib/python3/dist-packages/pycriu/cli.py", line 412, in main
opts["func"](opts)
File "/usr/lib/python3/dist-packages/pycriu/cli.py", line 45, in decode
json.dump(img, f, indent=indent)
File "/usr/lib/python3.9/json/__init__.py", line 179, in dump
for chunk in iterable:
File "/usr/lib/python3.9/json/encoder.py", line 431, in _iterencode
yield from _iterencode_dict(o, _current_indent_level)
File "/usr/lib/python3.9/json/encoder.py", line 405, in _iterencode_dict
yield from chunks
File "/usr/lib/python3.9/json/encoder.py", line 325, in _iterencode_list
yield from chunks
File "/usr/lib/python3.9/json/encoder.py", line 405, in _iterencode_dict
yield from chunks
File "/usr/lib/python3.9/json/encoder.py", line 438, in _iterencode
o = _default(o)
File "/usr/lib/python3.9/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
This is caused by `img['magic'][0]['extra']` which is bytes. I find
other load condtions, fix them at the same time.
Signed-off-by: fu.lin <fulin10@huawei.com>
pidfd_store_sk option will be used later to store tasks pidfds
between predumps to detect pid reuse reliably.
pidfd_store_sk should be a fd of a connectionless unix socket.
init_pidfd_store_sk() steals the socket from the RPC client using
pidfd_getfd, checks that it is a connectionless unix socket and
checks if it is not initialized before (i.e. unnamed socket).
If not initialized the socket is first bound to an abstract name
(combination of the real pid/fd to avoid overlap), then it is
connected to itself hence allowing us to store the pidfds in the
receive queue of the socket (this is similar to how fdstore_init()
works).
v2:
- avoid close(pidfd) overriding errno of SYS_pidfd_open in
init_pidfd_store_sk()
- close pidfd_store_sk because we might have leftover from
previous iterations
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
CI sometimes errors out encoding/decoding extra pipe data.
This should fix extra pipe data for Python 3 and still keep it working
on Python 2.
Signed-off-by: Adrian Reber <areber@redhat.com>
This changes stdin to be opened as binary if the input is not a tty.
This changes stdout to be opened as binary if encoding or if the output
is not a tty.
Signed-off-by: Adrian Reber <areber@redhat.com>
The recent fix to make Jenkins run crit-recode again broke
Python 2 support (because Python 2 based CI was not running).
This should fix the Python 2 based test run.
Signed-off-by: Adrian Reber <areber@redhat.com>
With the switch to Python3 and binary output it is not possible to use
code like: 'f.write('\0' * (rounded - size))'. Switching to binary
helps.
Signed-off-by: Adrian Reber <areber@redhat.com>
fromstring() and tostring() are deprecated since Python 3.2 and have
been removed in 3.9. Both functions were just aliases and this patch
changes images.py to directly call fromybytes() and tobytes().
Signed-off-by: Adrian Reber <areber@redhat.com>
Although we are running crit-recode.py also in all CI runs we never seen
following error except in Jenkins:
Traceback (most recent call last):
File "/usr/lib/python3.8/base64.py", line 510, in _input_type_check
m = memoryview(s)
TypeError: memoryview: a bytes-like object is required, not 'str'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./test/crit-recode.py", line 25, in recode_and_check
r_img = pycriu.images.dumps(pb)
File "/var/lib/jenkins/workspace/Q/test/pycriu/images/images.py", line 635, in dumps
dump(img, f)
File "/var/lib/jenkins/workspace/Q/test/pycriu/images/images.py", line 626, in dump
handler.dump(img['entries'], f)
File "/var/lib/jenkins/workspace/Q/test/pycriu/images/images.py", line 289, in dump
f.write(base64.decodebytes(item['extra']))
File "/usr/lib/python3.8/base64.py", line 545, in decodebytes
_input_type_check(s)
File "/usr/lib/python3.8/base64.py", line 513, in _input_type_check
raise TypeError(msg) from err
TypeError: expected bytes-like object, not str
This commit fixes this by encoding the string to bytes.
Signed-off-by: Adrian Reber <areber@redhat.com>
python3 fails to encode image with the following:
> [dima@Mindolluin criu]$ ./crit/crit encode -i tmp -o tmp.1
> Traceback (most recent call last):
> File "/home/dima/src/criu/./crit/crit", line 6, in <module>
> cli.main()
> File "/home/dima/src/criu/crit/pycriu/cli.py", line 410, in main
> opts["func"](opts)
> File "/home/dima/src/criu/crit/pycriu/cli.py", line 50, in encode
> pycriu.images.dump(img, outf(opts))
> File "/home/dima/src/criu/crit/pycriu/images/images.py", line 617, in dump
> f.write(struct.pack('i', magic.by_name['IMG_COMMON']))
> TypeError: write() argument must be str, not bytes
Opening the output file as binary seems to help.
Signed-off-by: Dmitry Safonov <dima@arista.com>
This commit enables CRIT to decode the contents of a protobuf image
that stores information related to BPF map
Signed-off-by: Abhishek Vijeev <abhishek.vijeev@gmail.com>
I always wondered why re-running make on a criu checkout always prints
out
GEN magic.py
even if no file has changed. It seems the Makefile was looking for the
file in the wrong location. Providing the full path to the file will now
only rebuild magic.py if something actually changed that requires a
rebuild.
Signed-off-by: Adrian Reber <areber@redhat.com>
Fixes#1165
Traceback (most recent call last):
File "../criu/crit/crit-python3", line 6, in <module>
cli.main()
File "/home/xcv/repos/criu/crit/pycriu/cli.py", line 410, in main
opts["func"](opts)
File "/home/xcv/repos/criu/crit/pycriu/cli.py", line 43, in decode
json.dump(img, f, indent=indent)
File "/usr/lib/python3.8/json/__init__.py", line 179, in dump
for chunk in iterable:
File "/usr/lib/python3.8/json/encoder.py", line 431, in _iterencode
yield from _iterencode_dict(o, _current_indent_level)
File "/usr/lib/python3.8/json/encoder.py", line 405, in _iterencode_dict
yield from chunks
File "/usr/lib/python3.8/json/encoder.py", line 325, in _iterencode_list
yield from chunks
File "/usr/lib/python3.8/json/encoder.py", line 405, in _iterencode_dict
yield from chunks
File "/usr/lib/python3.8/json/encoder.py", line 405, in _iterencode_dict
yield from chunks
File "/usr/lib/python3.8/json/encoder.py", line 438, in _iterencode
o = _default(o)
File "/usr/lib/python3.8/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
Co-authored-by: Julian <jb@futureplay.de>
Signed-off-by: Otto Bittner <otto-bittner@gmx.de>
When using libcriu with the notify callback functionality CRIU transmits
an FD during 'orphan-pts-master' back to libcriu user. This is message
is sent via sendmsg() to transmit the FD and not via write() as all
other protobuf messages.
libcriu was using recv() and to be able to receive the FD this needs to
be changed to recvmsg() and if an FD is attached to it (currently only
for 'orphan-pts-master' this FD is stored in a variable which can be
retrieved with the function criu_get_orphan_pts_master_fd().
Signed-off-by: Adrian Reber <areber@redhat.com>
Although the CRIU version is exported in macros in version.h it only
contains the CRIU version of libcriu during build time.
As it is possible that CRIU is upgraded since the last time something
was built against libcriu, this adds functions to query the actual CRIU
binary about its version.
Signed-off-by: Adrian Reber <areber@redhat.com>
The orphan pts master option was introduced with commit [1]
to enable checkpoint/restore of containers with a pty pair
used as a console.
[1] 6afe523d97
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
The time namespace allows for per-namespace offsets to the system
monotonic and boot-time clocks.
C/R of time namespaces are very straightforward. On dump, criu enters a
target time namespace and dumps currents clocks values, then on restore,
criu creates a new namespace and restores clocks values.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
See "man fcntl" for more information about seals.
memfd are the only files that can be sealed, currently. For this
reason, we dump the seal values in the MEMFD_INODE image.
Restoring seals must be done carefully as the seal F_SEAL_FUTURE_WRITE
prevents future write access. This means that any memory mapping with
write access must be restored before restoring the seals.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
See "man memfd_create" for more information of what memfd is.
This adds support for memfd open files, that are not not memory mapped.
* We add a new kind of file: MEMFD.
* We add two image types MEMFD_FILE, and MEMFD_INODE.
MEMFD_FILE contains usual file information (e.g., position).
MEMFD_INODE contains the memfd name, and a shmid identifier
referring to the content.
* We reuse the shmem facilities for dumping memfd content as it
would be easier to support incremental checkpoints in the future.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
lib/c/criu.c:343:30: error: implicit conversion from enumeration type
'enum criu_pre_dump_mode' to different enumeration type 'CriuPreDumpMode'
(aka 'enum _CriuPreDumpMode') [-Werror,-Wenum-conversion
opts->rpc->pre_dump_mode = mode;
~ ^~~~
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Two modes of pre-dump algorithm:
1) splicing memory by parasite
--pre-dump-mode=splice (default)
2) using process_vm_readv syscall
--pre-dump-mode=read
Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Instead of creating cgroup yard in CRIU, now we can create it externally
and pass it to CRIU. Useful if somebody doesn't want to grant
CAP_SYS_ADMIN to CRIU.
Signed-off-by: Michał Cłapiński <mclapinski@google.com>