184 commits

Author SHA1 Message Date
Bui Quang Minh
529f298913 cgroup-v2: make new field is_threaded optional
The new field is_threaded is currently marked as required, which causes a
backward compatibility problem when a newer CRIU version restores an image
dumped by an older version. This commit makes the field optional and
reworks the logic to skip fixing up threaded cgroup controllers if there
is no such information in the dumped image.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
bd9b66c8c0 sk-inet: support IP_PKTINFO and IPV6_RECVPKTINFO options
We see systemd-resolved relying on these options; after migration
the options are lost and systemd-resolved stops serving DNS requests.

The socket options make the kernel add a cmsg with the destination address
to packets. See how systemd-resolved uses them:

00a60eaf5f/src/resolve/resolved-manager.c (L826)

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Mathias Gibbens
c7211f52db Remove execute bit from source file
Signed-off-by: Mathias Gibbens <mathias@calenhad.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
f5e0f641a8 cgroup: Remove redundant code that handles zombie tasks
Zombie tasks are dumped in dump_zombies() so it is redundant to handle them
in dump_one_task().

Deprecate cg_set in task_core_entry as this field must be per thread now.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
17d1d8810e cgroup-v2: Dump cgroup controllers of every threads in a process
Currently, we assume all threads in a process are in the same cgroup controllers.
However, with threaded controllers, threads in a process may be in different
controllers. So we need to dump the cgroup controllers of every thread in a
process and fix up the procfs cgroup parsing to parse from self/task/<tid>/cgroup.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Younes Manton
6a30c7d1ed non-root: enable non-root checkpoint/restore
This commit enables checkpointing and restoring of applications as
non-root.

The first goal was to enable checkpoint and restore of the env00 and
pthread00 test cases.

This uses the information from opts.unprivileged and opts.cap_eff to
skip certain code paths which do not work as non-root.

Co-authored-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
18fba41255 config/files-reg: Add opt to skip file r/w/x check on restore
A file's r/w/x changing between checkpoint and restore does
not necessarily imply that something is wrong. For example,
if a process opens a file having perms rw- for reading and
we change the perms to r--, the process can be restored and
will function as expected.

Therefore, this patch adds an option

--skip-file-rwx-check

to disable this check on restore. File validation is unaffected
and should still function as expected with respect to the content
of files.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
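Illustration for the commit above ("config/files-reg: Add opt to skip file r/w/x check on restore"), not part of CRIU: a minimal Python sketch of the scenario the message describes, where a file opened for reading with perms rw- is changed to r-- and the already-open descriptor keeps working. The helper name `read_after_chmod` is hypothetical.

```python
import os
import stat
import tempfile


def read_after_chmod(payload: bytes) -> tuple:
    """Open a rw- file for reading, change it to r--, and keep reading."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.close(fd)
        os.chmod(path, 0o600)  # rw- at "checkpoint" time
        mode_before = stat.S_IMODE(os.stat(path).st_mode)
        with open(path, "rb") as reader:  # opened for reading only
            os.chmod(path, 0o400)  # perms change to r--
            mode_after = stat.S_IMODE(os.stat(path).st_mode)
            data = reader.read()  # the open descriptor still works
        return mode_before, mode_after, data
    finally:
        os.unlink(path)
```

The two modes differ, yet the reader is unaffected, which is the situation where skipping the r/w/x comparison on restore is harmless.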
Yuriy Vasiliev
c7858ba42b infect: add SIGTSTP support
Add SIGTSTP signal dump and restore. Add a corresponding field
in the image, save it only if a task is in the stopped state.

Restore the task state by sending the desired stop signal if it is present
in the image. Fall back to SIGSTOP if it is absent.

Signed-off-by: Yuriy Vasiliev <yuriy.vasiliev@openvz.org>
2023-04-15 21:17:21 -07:00
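Illustration for the commit above ("infect: add SIGTSTP support"), not CRIU code: a small Python sketch of the fallback rule from the message. The function name and the idea of passing the optional image field as `None` when absent are assumptions made for the example.

```python
import signal
from typing import Optional


def pick_stop_signal(image_stop_signal: Optional[int]) -> signal.Signals:
    """Choose the signal used to put a restored task back into the stopped state."""
    if image_stop_signal is None:
        # Field absent (for example, an image without this field): use SIGSTOP.
        return signal.SIGSTOP
    # Field present: use the stop signal that was recorded at dump time.
    return signal.Signals(image_stop_signal)
```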
Alexander Mikhalitsyn
1e0bed3d69 rseq: handle rseq/rseq_cs flags properly
Userspace may configure rseq cs abort policy by
setting RSEQ_CS_FLAG_NO_RESTART_ON_* flags.

In ("cr-dump: fixup thread IP when inside rseq cs") we added support for
the case where a process is caught by CRIU while executing an rseq cs, by
fixing up the IP to abort_ip. That's the common case, but there is a special
flag, RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL; in this case we have to leave the
process IP as it was before CRIU seized it. Unfortunately, that's not
all we need here. We must also preserve the (struct rseq)->rseq_cs field.

You may ask: "Why do we need to preserve it by hand? CRIU dumps all
process memory and restores it." That's true, but it is not that simple. The
problem is that the kernel clears this field when it realizes that
the process has left the rseq cs. During the dump/restore procedures we
execute the parasite/restorer from the process context. This means that the
process leaves the rseq cs in any case and (struct rseq)->rseq_cs is cleared
by the kernel. So we need to restore this field by hand at the *last* stage
of restore, just before releasing the processes.

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn
f81e3062ca rseq: initial support
Support basic rseq C/R scenario. Assume that:
- there are no processes with IP inside the rseq critical section (CS)
- kernel has ptrace(PTRACE_GET_RSEQ_CONFIGURATION) support

On dump:
1. use ptrace(PTRACE_GET_RSEQ_CONFIGURATION) to get
struct rseq pointer, rseq size and signature from the kernel.
2. save to the image

On restore:
1. get rseq ptr, size, signature from the image
2. register it back using rseq() from the restorer parasite

Fixes: #1696

Reported-by: Radostin Stoyanov <radostin@redhat.com>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Kir Kolyshkin
0194ed392f Fix some codespell warnings
Brought to you by

	codespell -w

(using codespell v2.1.0).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-04-28 17:53:52 -07:00
Kir Kolyshkin
2a60b4974c Rename useable to usable
I am not sure if this is going to bring any compatibility issues.
If yes, we need to remove this patch and add "useable" to the list of
ignored words instead.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-04-28 17:53:52 -07:00
Pavel Tikhomirov
f2d1c7fab8 config/rpc: add new option --mntns-compat-mode for old mount engine
We plan to switch to the Mounts-v2 engine for restoring mounts by default;
this option allows switching back to the old engine. This patch only adds
the option, with no engine behind it yet.

Cherry-picked from Virtuozzo criu:
https://src.openvz.org/projects/OVZ/repos/criu/commits/503f9ad2c

Changes: allow --mntns-compat-mode option only on restore and only if
MOVE_MOUNT_SET_GROUP is supported (this also requires change in
unittest/mock.c), change id in rpc criu_opts.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Adrian Reber
247cdc90db bpfmap: handle new field in fdinfo
Starting with Linux Kernel release 5.16 the fdinfo proc entry contains
a map_extra field which breaks CRIU parsing of bpfmap entries.

This commit adds the map_extra as a possible field to CRIU. The value of
map_extra is not passed to the kernel on restore as it does not seem to
be evaluated in the code paths CRIU restore is using for BPF.

This fixes CRIU CI using Fedora with 5.16.

See Linux commit 9330986c03006ab1d33d243b7cfe598a7a3c1baa
 "bpf: Add bloom filter map implementation"

Signed-off-by: Adrian Reber <areber@redhat.com>
2022-04-28 17:53:52 -07:00
Bui Quang Minh
e4fb1dd5f5 memfd, shmem: Add support for checkpoint/restore memfd and anon shared memory
Co-developed-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2022-04-28 17:53:52 -07:00
Bui Quang Minh
4d77b19eb3 ipc: Add support for checkpoint/restore hugetlb System V shared memory
Attach the System V shared memory segments to the address space via shmat() to
determine whether they are backed by hugetlb and what their page size is. Use
this information to set the correct flags on restore.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2022-04-28 17:53:52 -07:00
Adrian Reber
51a1adbc03 libcriu: add single pre-dump support
In contrast to the CLI it is not possible to do a single pre-dump via
RPC and thus libcriu. In cr-service.c pre-dump always goes into a
pre-dump loop followed by a final dump. runc already works around this
to only do a single pre-dump by killing the CRIU process waiting for the
message for the final dump.

When implementing pre-dump in crun via libcriu, it is not as easy to
work around CRIU's pre-dump loop expectations as it is with runc, which
talks to CRIU directly via RPC.

We know that LXC/LXD also does single pre-dumps using the CLI and runc
also only does single pre-dumps by misusing the pre-dump loop interface.

With this commit it is possible to trigger a single pre-dump via RPC and
libcriu without misusing the interface provided via cr-service.c. So
this commit basically updates CRIU to the existing use cases.

The existing pre-dump loop still sounds like a very good idea, but so
far most tools have decided to implement the pre-dump loop themselves.

With this change we can implement pre-dump in crun to match what is
currently implemented in runc.

Signed-off-by: Adrian Reber <areber@redhat.com>
2022-04-28 17:53:52 -07:00
Pavel Tikhomirov
e69be16db7 sockets: c/r buffer size locks
When one sets socket buffer sizes with setsockopt(SO_{SND,RCV}BUF*), the
kernel sets the corresponding SOCK_SNDBUF_LOCK or SOCK_RCVBUF_LOCK flags on
struct sock. It means that such a socket with an explicitly changed buffer
size cannot be auto-adjusted by the kernel (e.g. if there is free memory the
kernel can auto-increase default socket buffers to improve performance).
(see tcp_fixup_rcvbuf() and tcp_sndbuf_expand())

CRIU always changes buf sizes on restore, which means that all
sockets receive lock flags on struct sock and become non-auto-adjusted
after migration. In some cases it can decrease the performance of network
connections quite a lot.

So let's c/r socket buf locks (SO_BUF_LOCKS), so that sockets for which
auto-adjustment is available do not lose it.

Reviewed-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2022-04-28 17:53:52 -07:00
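Illustration for the commit above (socket buffer size locks), not CRIU code: a hedged Python sketch that sets SO_RCVBUF explicitly and then reads the lock flags back. The numeric values below (SO_BUF_LOCK = 72, SOCK_SNDBUF_LOCK = 1, SOCK_RCVBUF_LOCK = 2) are assumptions taken from the generic Linux socket headers; they are not exported by Python's socket module, may differ on some architectures, and the getsockopt is only available on kernels that support SO_BUF_LOCK, so the helper returns None when it fails.

```python
import socket
from typing import Optional

# Assumed values from Linux's asm-generic socket headers (not in Python's socket module).
SO_BUF_LOCK = 72
SOCK_SNDBUF_LOCK = 1
SOCK_RCVBUF_LOCK = 2


def buf_locks_after_setting_rcvbuf(size: int = 65536) -> Optional[int]:
    """Set SO_RCVBUF explicitly, then report the socket's buffer lock flags.

    Returns None when the running kernel does not support SO_BUF_LOCK.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sk:
        # An explicit size is what makes the kernel set SOCK_RCVBUF_LOCK.
        sk.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
        try:
            return sk.getsockopt(socket.SOL_SOCKET, SO_BUF_LOCK)
        except OSError:
            return None
```

Where supported, the returned mask is expected to contain SOCK_RCVBUF_LOCK, which is the flag a restore would otherwise set on every socket.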
Zeyad Yasser
ca3e3c50be inventory: save network lock method to reuse in restore
When the network is locked using a specific method like iptables
or nftables there is no need to require passing the same method
during restore.

We save the lock method during dump in the inventory image and
use that in restore.

This always overwrites the restore --network-lock option.

v2: store opts.network_lock_method directly to avoid dependency
    on rpc.proto's 'enum criu_network_lock_method'.
v3: fall back to iptables if image is generated with an older
    version of CRIU.
v4: remove --network-lock from netns_lock_* from restore

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
b85fad797c cr-service: add network_lock option to RPC and libcriu
v2: run make indent

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Tycho Andersen
8d992a680e lsm: support checkpoint/restore of stacked apparmor profiles
Support for apparmor namespaces and stacking is coming to Ubuntu kernels in
16.10, and should hopefully be upstreamed Soon (TM) :).

The basic idea is similar to how cgroups are done: we can restore the
apparmor namespace and profile blobs independently of the tasks, and then
at the end we can just set the task's label appropriately. This means the
code that moves tasks under a label stays the same, and the only new code
is the stuff that dumps and restores the policy blobs that are in the
namespace that were loaded by the container.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Liu Chao
80079fbb0d criu: dump and restore notify_thread_id of posix timer
When sigev_notify_thread_id is not set, get_pid returns a NULL
pointer and do_timer_create returns -EINVAL in the kernel, so criu
fails to create the posix timer:

(09.806760) pie: 41301: Error (criu/pie/restorer.c:1998): Can't restore posix timers -22
(09.806824) pie: 41301: Error (criu/pie/restorer.c:2133): Restorer fail 41301
(09.891880) Error (criu/cr-restore.c:2596): Restoring FAILED.

Signed-off-by: Liu Chao <liuchao173@huawei.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
64dd64e504 Enable changing of mount context on restore
This change is motivated by checkpointing and restoring containers in
Pods.

When restoring a container into a new Pod the SELinux label of the
existing Pod needs to be used and not the SELinux label saved during
checkpointing.

The option --lsm-profile already enables changing of process SELinux
labels on restore. Any checkpointed tmpfs mounts, however, will be
mounted during restore with the same context as during
checkpointing. This can look like the following example:

 context="system_u:object_r:container_file_t:s0:c82,c137"

On restore we want to change this context to match the mount label of
the Pod this container is restored into. Changing of the mount label
is now possible with the new option --mount-context:

 criu restore --mount-context "system_u:object_r:container_file_t:s0:c204,c495"

This will lead to mount options being changed to

 context="system_u:object_r:container_file_t:s0:c204,c495"

Now the restored container can access all the files in the container
again.

This has been tested in combination with runc and CRI-O.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
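Illustration for the commit above ("Enable changing of mount context on restore"), not CRIU code: a minimal Python sketch of rewriting the `context="..."` mount option shown in the message. The helper name and the regular-expression approach are assumptions; the quoted value has to be matched as a whole because SELinux categories such as `c82,c137` contain a comma.

```python
import re

# Match only the plain context="..." option, not fscontext=, defcontext= or rootcontext=.
_CONTEXT_RE = re.compile(r'\bcontext="[^"]*"')


def replace_mount_context(options: str, new_context: str) -> str:
    """Return the mount options with the context="..." value replaced."""
    return _CONTEXT_RE.sub(lambda _m: f'context="{new_context}"', options)
```

For example, `replace_mount_context('rw,context="system_u:object_r:container_file_t:s0:c82,c137"', "system_u:object_r:container_file_t:s0:c204,c495")` yields the options string with the second label.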
Zeyad Yasser
ba882893c3 cr-check: add ability to check if pidfd_store feature is supported
pidfd_store, which will be used for reliable pidfd-based pid reuse
detection for RPC clients, requires two recent syscalls (pidfd_open
and pidfd_getfd).

We allow checking if pidfd_store is supported using:
	1. CLI: criu check --feature pidfd_store
	2. RPC: CRIU_REQ_TYPE__FEATURE_CHECK and set pidfd_store to
	   true in the "features" field of the request

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
e3c9c3429a cr-service: add pidfd_store_sk option to rpc.proto
The pidfd_store_sk option will be used later to store tasks' pidfds
between predumps to detect pid reuse reliably.
pidfd_store_sk should be an fd of a connectionless unix socket.

init_pidfd_store_sk() steals the socket from the RPC client using
pidfd_getfd, checks that it is a connectionless unix socket, and
checks whether it has been initialized before (an uninitialized socket
is unnamed). If it is not initialized, the socket is first bound to an
abstract name (a combination of the real pid/fd to avoid overlap), then
it is connected to itself, which allows us to store the pidfds in the
receive queue of the socket (this is similar to how fdstore_init()
works).

v2:
	- avoid close(pidfd) overriding errno of SYS_pidfd_open in
	  init_pidfd_store_sk()
	- close pidfd_store_sk because we might have leftover from
	  previous iterations

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
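Illustration for the commit above ("cr-service: add pidfd_store_sk option to rpc.proto"), not CRIU code: a Python sketch of the technique the message describes, a connectionless unix socket bound to an abstract name, connected to itself, and used as a queue of file descriptors via SCM_RIGHTS. All function names and the abstract name format are hypothetical, and a pipe end stands in for a pidfd. Unlike a real store, `fetch_fd` consumes the queued message.

```python
import array
import os
import socket


def make_fd_store() -> socket.socket:
    """Create a datagram unix socket bound to an abstract name and connected to itself."""
    sk = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    # Abstract names start with a NUL byte; pid and fd keep the name unique.
    name = b"\0demo-fd-store-%d-%d" % (os.getpid(), sk.fileno())
    sk.bind(name)
    sk.connect(name)
    return sk


def store_fd(sk: socket.socket, fd: int) -> None:
    """Queue a descriptor in the socket's own receive queue."""
    rights = array.array("i", [fd])
    sk.sendmsg([b"\0"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])


def fetch_fd(sk: socket.socket) -> int:
    """Take one queued descriptor back out (a new fd number in this process)."""
    item = array.array("i").itemsize
    _, ancdata, _, _ = sk.recvmsg(1, socket.CMSG_SPACE(item))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            fds = array.array("i")
            fds.frombytes(data[: len(data) - len(data) % item])
            return fds[0]
    raise RuntimeError("no descriptor in the queue")
```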
Adrian Reber
eb5726c44a images: re-license as Expat license (so-called MIT)
This changes the license of all files in the images/ directory from
GPLv2 to the Expat license (so-called MIT).

According to git the files have been authored by:

   Abhishek Dubey
   Adrian Reber
   Alexander Mikhalitsyn
   Alice Frosi
   Andrei Vagin (Andrew Vagin, Andrey Vagin)
   Cyrill Gorcunov
   Dengguangxing
   Dmitry Safonov
   Guoyun Sun
   Kirill Tkhai
   Kir Kolyshkin
   Laurent Dufour
   Michael Holzheu
   Michał Cłapiński
   Mike Rapoport
   Nicolas Viennot
   Nikita Spiridonov
   Pavel Emelianov (Pavel Emelyanov)
   Pavel Tikhomirov
   Radostin Stoyanov
   rbruno@gsd.inesc-id.pt
   Sebastian Pipping
   Stanislav Kinsburskiy
   Tycho Andersen
   Valeriy Vdovin

The Expat license (so-called MIT) can be found here:
https://opensource.org/licenses/MIT

According to that link the correct SPDX short identifier is 'MIT'.

https://spdx.org/licenses/MIT.html

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Abhishek Vijeev
02f7e3434d images: adding support for BPF map file name and ifindex
This commit adds a BPF map's name and ifindex to its protobuf image.
ifindex is the index of the network interface to which the BPF map is
attached and can be specified via a parameter while creating the BPF
map (BPF_MAP_CREATE). This commit also provides a default value of
false to the field 'frozen'.

Source files modified:

* images/bpfmap-file.proto

Signed-off-by: Abhishek Vijeev <abhishek.vijeev@gmail.com>
2020-10-20 00:18:24 -07:00
Andrei Vagin
e42f5e032e tcp: allow to specify --tcp-close on dump
In this case, states of established tcp connections will not be dumped
and they will not be blocked. This will be useful in case of snapshots,
when we don't need to restore tcp connections.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-10-20 00:18:24 -07:00
Radostin Stoyanov
5bb5890cb4 socket: c/r support for SO_LINGER
The SO_LINGER option controls how a TCP connection is closed.
The default behavior is to return immediately when close() is called,
and any unsent data is not guaranteed to be delivered. When SO_LINGER
is enabled, the close() call would block until all final data is
delivered to the remote end, for a specified time interval. When the
time interval is set to zero, the connection is aborted and any pending
data is immediately discarded upon close().

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-10-20 00:18:24 -07:00
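Illustration for the commit above ("socket: c/r support for SO_LINGER"), not CRIU code: a minimal Python sketch of setting and reading back SO_LINGER. It assumes the Linux layout of `struct linger` (two C ints, `l_onoff` and `l_linger`); the helper names are hypothetical.

```python
import socket
import struct

# struct linger { int l_onoff; int l_linger; } as laid out on Linux.
LINGER_FMT = "ii"


def set_linger(sk: socket.socket, enabled: bool, seconds: int) -> None:
    """Enable or disable lingering on close() with the given timeout in seconds."""
    value = struct.pack(LINGER_FMT, int(enabled), seconds)
    sk.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_linger(sk: socket.socket) -> tuple:
    """Return (enabled, seconds), the pair a checkpoint would need to record."""
    raw = sk.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize(LINGER_FMT))
    onoff, seconds = struct.unpack(LINGER_FMT, raw)
    return bool(onoff), seconds
```

Calling `set_linger(sk, True, 0)` selects the abortive close described in the message, where pending data is discarded on close().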
Radostin Stoyanov
0aeddba7cc socket: c/r support for SO_OOBINLINE
This patch enables checkpoint/restore of the SO_OOBINLINE socket option.
When the SO_OOBINLINE option is used, out-of-band data is placed in the
normal input queue as it is received. This permits it to be read using
read or recv without specifying the MSG_OOB flag.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-10-20 00:18:24 -07:00
Pavel Tikhomirov
c0f3653108 images: kindly ask not to use fields with id 18 in unix_sk_entry
The field id 18 is used in Virtuozzo criu in multiple releases, so that
we can't change the id easily. So we can at least kindly ask not to use
this field in mainstream criu to decrease the pain of Virtuozzo criu
rebases.

Reference to related patch in Virtuozzo criu:
https://src.openvz.org/projects/OVZ/repos/criu/commits/58e61a20c22c#images/sk-unix.proto

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2020-10-20 00:18:24 -07:00
Abhishek Vijeev
c26cd1395f images: protobuf definitions for BPF map meta-data and data
This commit adds protobuf definitions needed to checkpoint and
restore BPF map files along with the data they contain.

Source files added:

* bpfmap-file.proto - Stores the meta-data about BPF maps

* bpfmap-data.proto - Stores the data (key-value pairs) contained
in BPF maps

Source files modified:

* fdinfo.proto - Added BPF map as a new kind of file descriptor.
'message file_entry' can now hold information about BPF map file
descriptors

* Makefile - Now generates build artifacts for bpfmap-file.proto
and bpfmap-data.proto

Signed-off-by: Abhishek Vijeev <abhishek.vijeev@gmail.com>
2020-10-20 00:18:24 -07:00
Ajay Bharadwaj
7b18c13c19 images/regfile.proto: adds additional fields to RegFileEntry
This adds build-id, checksum, checksum-config and checksum-parameter fields
to RegFileEntry to store metadata used for file verification.

build_id: Holds the build-id if it could be obtained

checksum: Holds the checksum if it could be obtained

checksum_config: Holds the configuration of bytes for which checksum has
been calculated (The entire file, first N bytes or every Nth byte)

checksum_parameter: Specifies the value of 'N', if required, for the
configuration of bytes

Signed-off-by: Ajay Bharadwaj <ajayrbharadwaj@gmail.com>
2020-10-20 00:18:24 -07:00
Adrian Reber
4e7ec3c88b pidns: add pidns image file definition
TODO: create correct magic

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
Guoyun Sun
158e8f8fe6 mips:proto: Add mips to protocol buffer files
Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Nicolas Viennot
7d79a58f4d img-streamer: introduction of criu-image-streamer
This adds the ability to stream images with criu-image-streamer

The workflow is the following:
1) criu-image-streamer is started, and starts listening on a UNIX
   socket.
2) CRIU is started. img_streamer_init() is invoked, which connects to the
   socket. During dump/restore operations, instead of using local disk to
   open an image file, img_streamer_open() is called to provide a UNIX pipe
   that is sent over the UNIX socket.
3) Once the operation is done, img_streamer_finish() is called, and the
   UNIX socket is disconnected.

criu-image-streamer can be found at:
https://github.com/checkpoint-restore/criu-image-streamer

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-10-20 00:18:24 -07:00
Andrei Vagin
4127ef4ab7 criu: Add support for time namespaces
The time namespace allows for per-namespace offsets to the system
monotonic and boot-time clocks.

C/R of time namespaces is very straightforward. On dump, criu enters the
target time namespace and dumps the current clock values; then on restore,
criu creates a new namespace and restores the clock values.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-04-25 00:43:23 -07:00
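Illustration for the commit above ("criu: Add support for time namespaces"), not CRIU code: a Python sketch of the arithmetic behind per-namespace clock offsets. Reading the clocks with `time.clock_gettime` and computing the offset as "dumped value minus the restore host's current value" is an assumption about how saved clock values map onto offsets; the function names are hypothetical.

```python
import time


def dump_clocks() -> dict:
    """Read the two clocks a time namespace can offset (Linux only)."""
    return {
        "monotonic": time.clock_gettime(time.CLOCK_MONOTONIC),
        "boottime": time.clock_gettime(time.CLOCK_BOOTTIME),
    }


def clock_offsets(dumped: dict, host_now: dict) -> dict:
    """Offsets that make the host clocks continue from the dumped values."""
    return {name: dumped[name] - host_now[name] for name in dumped}
```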
Andrei Vagin
e3a5d09752 memfd: save all memfd inodes in one image
Per-object image is acceptable if we expect to have 1-3 objects
per-container. If we expect to have more objects, it is better to save
them all into one image. There are a number of reasons for this:
* We need fewer system calls to read all objects from one image.
* It is faster to save or move one image.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-03-27 19:36:20 +03:00
Alexander Mikhalitsyn
1936608ce4 files: allow dumping opened symlinks
To actually open a symlink itself rather than the regular file it points to,
one needs to call open with the O_PATH|O_NOFOLLOW flags. It looks like systemd
started to open the /etc/localtime symlink this way sometimes; before that
nobody actually used this, so we never supported it in CRIU.

Error (criu/files-ext.c:96): Can't dump file 11 of that type [120777]
(unknown /etc/localtime)

It looks like this is quite easy to support, as c/r of a symlink file is almost
the same as c/r of a regular one. We only need to make fstatat not
follow links in check_path_remap.

Also we need to take into account support of ghost symlinks.

Signed-off-by: Alexander Mikhalitsyn (Virtuozzo) <alexander@mihalicyn.com>
Co-developed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2020-03-27 19:36:20 +03:00
Adrian Reber
4129d3262a cgroup2: add minimal cgroup2 support
The runc test cases are (sometimes) mounting a cgroup inside of the
container. For these tests to succeed, let CRIU know that cgroup2 exists
and how to restore such a mount.

This does not fix any specific cgroup2 settings, it just enables CRIU to
mount cgroup2 in the restored container.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-03-27 19:36:20 +03:00
Nicolas Viennot
56d8e2455f memfd: add seals support
See "man fcntl" for more information about seals.

memfd are the only files that can be sealed, currently. For this
reason, we dump the seal values in the MEMFD_INODE image.

Restoring seals must be done carefully as the seal F_SEAL_FUTURE_WRITE
prevents future write access. This means that any memory mapping with
write access must be restored before restoring the seals.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-03-27 19:36:20 +03:00
Nicolas Viennot
c1e72aa936 memfd: add file support
See "man memfd_create" for more information about what a memfd is.

This adds support for open memfd files that are not memory mapped.

* We add a new kind of file: MEMFD.
* We add two image types MEMFD_FILE, and MEMFD_INODE.
  MEMFD_FILE contains usual file information (e.g., position).
  MEMFD_INODE contains the memfd name, and a shmid identifier
  referring to the content.
* We reuse the shmem facilities for dumping memfd content as it
  would be easier to support incremental checkpoints in the future.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-03-27 19:36:20 +03:00
Valeriy Vdovin
4232b270b8 image: core -- Reserve start_time field
To ensure consistency of the runtime environment, processes within a
container need to see the same start time values across suspend/resume
cycles. We introduce a new field in the core image structure to
store the start time of a dumped process. Later, the same value would be
restored to the newly created task. In the future the feature is likely
to be pulled here, so we reserve the field id in the protobuf descriptor.

Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
2020-02-04 12:39:44 -08:00
Radostin Stoyanov
d4e6fc2a0d socket: c/r support for SO_KEEPALIVE
TCP keepalive packets can be used to determine if a connection
is still valid. When the SO_KEEPALIVE option is set, TCP packets
are periodically sent to keep the connection alive.

This patch implements checkpoint/restore support for SO_KEEPALIVE,
TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-02-04 12:39:05 -08:00
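Illustration for the commit above ("socket: c/r support for SO_KEEPALIVE"), not CRIU code: a minimal Python sketch that reads the four options named in the message from one TCP socket and applies them to another. The dictionary keys and function names are hypothetical, and the TCP_KEEP* constants are Linux-specific.

```python
import socket

# The four options named in the commit message, keyed by a hypothetical field name.
KEEPALIVE_OPTS = {
    "so_keepalive": (socket.SOL_SOCKET, socket.SO_KEEPALIVE),
    "tcp_keepidle": (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE),
    "tcp_keepintvl": (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL),
    "tcp_keepcnt": (socket.IPPROTO_TCP, socket.TCP_KEEPCNT),
}


def dump_keepalive(sk: socket.socket) -> dict:
    """Read the keepalive settings of a TCP socket."""
    return {name: sk.getsockopt(level, opt) for name, (level, opt) in KEEPALIVE_OPTS.items()}


def restore_keepalive(sk: socket.socket, saved: dict) -> None:
    """Apply previously saved keepalive settings to another TCP socket."""
    for name, value in saved.items():
        level, opt = KEEPALIVE_OPTS[name]
        sk.setsockopt(level, opt, value)
```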
Cyrill Gorcunov
ebe3b52353 unix: sysctl -- Preserve max_dgram_qlen value
The /proc/sys/net/unix/max_dgram_qlen is a per-net variable, and
we have already noticed that systemd inside a container may change its value
(for example, it now sets it to 512 instead of the kernel's default
value of 10), so we need to keep it in the image and restore it.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2020-02-04 12:39:05 -08:00
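Illustration for the commit above ("unix: sysctl -- Preserve max_dgram_qlen value"), not CRIU code: a Python sketch of reading the sysctl named in the message and writing a saved value back. The function names are hypothetical; writing requires sufficient privileges in the target network namespace, and the reader returns None when the file is unavailable.

```python
from pathlib import Path
from typing import Optional

MAX_DGRAM_QLEN = Path("/proc/sys/net/unix/max_dgram_qlen")


def dump_max_dgram_qlen(path: Path = MAX_DGRAM_QLEN) -> Optional[int]:
    """Read the per-netns datagram queue length, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def restore_max_dgram_qlen(value: int, path: Path = MAX_DGRAM_QLEN) -> None:
    """Write a previously saved value back (needs privileges in the netns)."""
    path.write_text(f"{value}\n")
```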
Andrei Vagin
3efe44382f image: avoid name conflicts in image files
Conflict register for file "sk-opts.proto": READ is already defined in
file "rpc.proto". Please fix the conflict by adding package name on the
proto file, or use different name for the duplication.  Note: enum
values appear as siblings of the enum type instead of children of it.

https://github.com/checkpoint-restore/criu/issues/815
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:39:04 -08:00
Abhishek Dubey
e0ea21ad5e Handling iov generation for non-PROT_READ regions
Skip iov-generation for regions not having
PROT_READ, since process_vm_readv syscall
can't process them during "read" pre-dump.
Handle random order of "read" & "splice"
pre-dumps.

Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:39:04 -08:00
Abhishek Dubey
20d4920a8b Adding --pre-dump-mode option
Two modes of pre-dump algorithm:
    1) splicing memory by parasite
        --pre-dump-mode=splice (default)
    2) using process_vm_readv syscall
        --pre-dump-mode=read

Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:39:02 -08:00
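Illustration for the commit above ("Adding --pre-dump-mode option"), not CRIU code: a small Python sketch that builds a `criu pre-dump` command line for either mode. Only `--pre-dump-mode=splice|read` comes from the message; the helper name and the surrounding `--tree` and `--images-dir` arguments are assumptions about a typical invocation. The command is built but not executed.

```python
from typing import List

# The two modes listed in the commit message; splice is the default.
PRE_DUMP_MODES = ("splice", "read")


def pre_dump_argv(pid: int, images_dir: str, mode: str = "splice") -> List[str]:
    """Build (but do not run) a pre-dump command line for the chosen mode."""
    if mode not in PRE_DUMP_MODES:
        raise ValueError(f"unknown pre-dump mode: {mode!r}")
    return [
        "criu", "pre-dump",
        "--tree", str(pid),
        "--images-dir", images_dir,
        f"--pre-dump-mode={mode}",
    ]
```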
Michał Cłapiński
2f337652ad Add new command line option: --cgroup-yard
Instead of creating cgroup yard in CRIU, now we can create it externally
and pass it to CRIU. Useful if somebody doesn't want to grant
CAP_SYS_ADMIN to CRIU.

Signed-off-by: Michał Cłapiński <mclapinski@google.com>
2020-02-04 12:37:37 -08:00
Andrei Vagin
1e2647f123 images: convert type of child_subreaper from int32 to bool
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:55 +03:00