Major changes:
* plugins/amdgpu: Implement parallel restore
* Handle processes with uprobes vma
* Fix: getsockopt usage for SO_PASSCRED/SO_PASSSEC on Linux 6.16
* Relax ELF magic check to support MIPS libraries
* pagemap: prevent integer overflow in pagemap_len
This release's name is a nod to the growing challenge we face in
maintaining compatibility across the rapidly evolving Linux kernel
ecosystem.
The full changelog can be found here: https://criu.org/Download/criu/4.2.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Using sizeof(hdr) where hdr is a pointer gives the size of the pointer,
not the size of the structure it points to.
Reported-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
vsnprintf does not always return the number of bytes actually written to
the buffer.
If the output was truncated due to the buffer limit, the return value is
the total number of bytes which WOULD have been written to the final
string if enough space had been available.
This means we must cap the return value to the buffer size excluding the
terminating null byte to correctly calculate the log entry size.
Signed-off-by: Andrei Vagin <avagin@google.com>
kerndat_init() can generate a significant volume of logs. If called
before log_init(), all these messages will be saved in the
early_log_buffer, which has a limited capacity. Additionally, saving to
the early_log_buffer can introduce a performance penalty, especially
when verbose mode is not enabled.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Andrei Vagin <avagin@google.com>
When we compare two list of vma-s, we need to take into account that
some of them could be merged.
Fixes#12286
Signed-off-by: Andrei Vagin <avagin@google.com>
This functionality (#2527) is being reverted and excluded from this
release due to issue #2812.
It will be included in a subsequent release once all associated issues
are resolved.
Signed-off-by: Andrei Vagin <avagin@google.com>
This patch fixes the following error:
$ sudo make -C test/others/criu-coredump run
...
Traceback (most recent call last):
File "/home/circleci/criu/coredump/coredump", line 55, in <module>
main()
File "/home/circleci/criu/coredump/coredump", line 47, in main
coredump(opts)
File "/home/circleci/criu/coredump/coredump", line 14, in coredump
cores = generator(os.path.realpath(opts['in']))
File "/home/circleci/criu/coredump/criu_coredump/coredump.py", line 192, in __call__
self.coredumps[pid] = self._gen_coredump(pid)
File "/home/circleci/criu/coredump/criu_coredump/coredump.py", line 214, in _gen_coredump
cd.vmas = self._gen_vmas(pid)
File "/home/circleci/criu/coredump/criu_coredump/coredump.py", line 992, in _gen_vmas
v.data = self._gen_mem_chunk(pid, vma, v.filesz)
File "/home/circleci/criu/coredump/criu_coredump/coredump.py", line 879, in _gen_mem_chunk
page_mem = self._get_page(pid, page_no)
File "/home/circleci/criu/coredump/criu_coredump/coredump.py", line 797, in _get_page
num_pages = m.get("nr_pages", m.compat_nr_pages)
AttributeError: 'dict' object has no attribute 'compat_nr_pages'
+ exit 1
make[1]: *** [Makefile:3: run] Error 1
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Andrei Vagin <avagin@google.com>
Use nr_pages when available, falling back to compat_nr_pages
for compatibility.
Signed-off-by: alam0rt <sam@samlockart.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The --mntns-compat-mode option is no longer parsed with CHECK.
Use --log-file instead to test the error message.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
_init__.py defines the public API for pycriu. It is important to use
explicit imports to avoid leaking every symbol from criu.py into the
pycriu namespace. This avoids import-time side effects, prevents name
collisions, and circular-import traps.
Fixes the following lint error:
F403 `from .criu import *` used; unable to detect undefined names
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This allows users to specify RPC options when
using the check() functionality.
Co-authored-by: Andrii Herheliuk <andrii@herheliuk.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The check() functionality is very different from dump, pre-dump,
and restore. It is used only to check if the kernel supports required
features, and does not need the majority of options set via RPC.
In particular, we don't need to open `image_dir` when running `check()`
because this functionality doesn't create or process image files. In
this case, `image_dir` is used as `work_dir`, only when the latter is
not specified and a log file is used.
This patch updates the RPC options parser so that it only handles the
logging options when check() is used. Logging to a file is required when
log_file is explicitly set or no log_to_stderr is used. In such case, we
also resolve images_dir and work_dir where the log file will be created.
Fixes: #2758
Suggested-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Move the logging initialization into a helper function that
can be reused.
No functional change intended.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Move the images_dir selection logic from setup_opts_from_req() into a
new function: resolve_images_dir_path(). This improves readability and
allows the code to be reused. While at it, use snprintf() instead of
sprintf() for the /proc path and ensure NULL termination after strncpy().
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Commit 9089ce8 ("service: use setproctitle") extended cr-service to
get the full path of images_dir using readlink(). However, the RPC
API was later extended to allow setting a custom path (folder) to
be set instead of passing a file descriptor, which causes readlink()
to fail as the path is not a symbolic link.
It would be better to drop the code setting the images-dir path as a
string in the proctitle.
Fixes: #2794
Suggested-by: Andrei Vagin <avagin@google.com>
Co-authored-by: Andrii Herheliuk <andrii@herheliuk.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Move the code that opens the images directory, resolves its absolute
path via readlink(), selects the work_dir, and chdir()s into it into a
new function: setup_images_and_workdir(). This reduces the size of
`setup_opts_from_req()`, improves its readability, and allows this
functionality to be reused.
While at it, change open_image_dir() to take a const char *dir
parameter, reflecting that the path is not modified by the function and
allowing callers to pass string literals without casts.
No functional changes are intended.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This change allows users to call criu.use_sk() without any
parameters to use the default socket name.
Co-authored-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Andrii Herheliuk <andrii@herheliuk.com>
[Errno 2] No such file or directory -> Socket file not found.
[Errno 111] Connection refused -> Service not running.
Signed-off-by: Andrii Herheliuk <andrii@herheliuk.com>
Use system-installed CRIU binary instead of a local file
Thanks to @avagin for suggesting this solution.
Co-authored-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Andrii Herheliuk <andrii@herheliuk.com>
Container runtimes that use libcriu (e.g., crun) need to specify a CRIU
configuration file that allows to overwrite default options set via RPC.
This is particularly useful to set options such as `--tcp-established`
via `/etc/criu/runc.conf` in Kubernetes.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Unlike "which", which is a separate executable not always installed by
default, "command -v" is a shell built-in available at least for bash,
dash, and busybox shell.
Unlike "which", "command -v" is also easier to grep for, and it is
already used in a few places here.
Inspired by commit 57251d811.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
which is used in Makefiles to check for dependencies:
Example:
export USE_ASCIIDOCTOR ?= $(shell which asciidoctor 2>/dev/null)
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Don't install external pip dependencies when running `make install`.
As we are not really into developing a Python project, we should
not install additional packages. CRIU does that nowhere else.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The existing test collects all action-script hooks triggered during
`h`, `ns`, and `uns` runs with ZDTM into `actions_called.txt`, then
verifies that each hook appears at least once. However, the test does
not verify that hooks are invoked *exactly once* or in *correct order*.
This change updates the test to run ZDTM only with ns flavour as this
seems to cover all action-script hooks, and checks that all hooks are
called correctly.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This patch consolidates the action-script tests into
`test/others/action-script` to ensure all tests are executed
consistently and reduce duplication. Since we had two tests that appear
to do the same thing, we can remove the one that doesn't use zdtm.py.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Regardless of the actual error message, "Unknown" was always appended
to the end of the string, resulting in messages like:
"DUMP failed: Error(3): No process with such pidUnknown".
Fixed by changing standalone if statements to else-if blocks so
"Unknown" is only added when no specific error condition matches.
Signed-off-by: Andrii Herheliuk <andrii@herheliuk.com>
pycriu depends on protobuf to function correctly. Currently,
it raises an error if protobuf is not installed. Adding
protobuf to the dependencies ensures it is available after
installing pycriu.
Signed-off-by: Andrii Herheliuk <andrii@herheliuk.com>
We use LGPL-v2.1 license for the libcriu and pycriu as they are
intended to be usable by both proprietary and open-source applications.
Signed-off-by: Andrii Herheliuk <andrii@herheliuk.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
* call shstk_vma_restore() for VMA_AREA_SHSTK in vma_remap()
* delete map/copy/unmap from shstk_restore() and keep token setup + finalize
* before the loop naturally stopped at cet->ssp-8, so a -8 nudge is required here
Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
Co-Authored-By: Andrei Vagin <avagin@gmail.com>
[ alex: small code cleanups ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
1. create shadow stack vma during vma_remap cycle
2. copy contents from a premapped non-shstk VMA into it
3. unmap premapped non-shstk VMA
4. Mark shstk VMA for remap into the final destination
Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
Co-Authored-By: Andrei Vagin <avagin@gmail.com>
Co-Authored-By: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
[ alex: debugging, rework together with Andrei and code cleanup ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
* reserve space for restorer shadow stack
* set tmp_shstk at mem, advance mem by PAGE_SIZE
* forget the extra PAGE_SIZE (shstk) for premapped VMAs
Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
Co-Authored-By: Andrei Vagin <avagin@gmail.com>
[ alex: small code cleanups ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
* default: return whatever passed in
eg. to be used as
shtk_min_mmap_addr(kdat.mmap_min_addr)
* x86: ignore def and return 4G
On x86, CET shadow stack is required to be mapped above 4GiB
On the other hand forcing 4GiB globally would break 32-bit restores.
Co-Authored-By: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Extend the test for overwriting config options via RPC with
repeatable option (--action-script) and verify that the value
will not be silently duplicated.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
When an additional configuration file is specified via RPC, this file is
parsed twice: first at an early stage to load options such as --log-file,
--work-dir, and --images-dir; and again after all RPC options and
configuration files have been evaluated.
This allows users to overwrite options specified via RPC by the
container runtime (e.g., --tcp-established). However, processing
the RPC config file twice leads to silently duplicating the values
of repeatable options such as `--action-script`.
To address this problem, we adjust the order of options parsing so
that the RPC config file is evaluated only once. This change should
not introduce any functional changes. Note that this change does
not affect the logging functionality, as early log messages are
temporarily buffered and only written to the log file once it has
been initialized (see commit 1ff2333 "Printout early log messages").
Fixes#2727
Suggested-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Program flow:
- Parse the test's own executable to calculate the file offset of the uprobe
target function symbol
- Enable the uprobe at the target function
- Call the target function to trigger the uprobe, and hence the uprobes vma
creation
- C/R
- Call the target function again to check that no SIGTRAP is sent, since the
uprobe is still active
At least v1.7 of libtracefs is required because that's when
tracefs_instance_reset was introduced. The uprobes API was introduced in v1.4,
and the dynamic events API was introduced in v1.3.
Ubuntu Focal doesn't have libtracefs. Jammy has v1.2.5, and Noble has v1.7.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
This commit teaches criu to deal with processes which have a "[uprobes]" vma.
This vma is mapped by the kernel when execution hits a uprobe location. This
is done so as to execute the uprobe'd instruciton out-of-line in the special
vma. The uprobe'd location is replaced by a software breakpoint instruction,
which is int3 on x86. When execution reaches that location, control is
transferred over to the kernel, which then executes whatever handler code
it has to, for the uprobe, and then executed the replaced instruction out-of-line
in the special vma. For more details, refer to this commit:
d4b3b6384f
Reason for adding a new option
------------------------------
A new option is added instead of making the uprobes vma handling transparent
to the user, so that when a dump is attempted on a process tree in which a
process has the uprobes vma, criu will error, asking the user to use this option.
This gives the user a chance to check what uprobes are attached to the processes
being dumped, and try to ensure that those uprobes are active on restore as well.
Again, the same reason for requiring this option on restore as well. Because
if a process is dumped with an active uprobe, and on restore if the uprobe
is not active, then if execution reaches the uprobe location, then the process
will be sent a SIGTRAP, whose default behaviour will terminate and core dump
the process. This is because the code pages are dumped with the software
breakpoint instruction replacement at the uprobe'd locations. On restore, if
execution reaches these locations and the kernel sees no associated active
uprobes, then it'll send a SIGTRAP.
So, using this option is on dump and restore is an implicit guarantee on the
user's behalf that they'll take care of the active uprobes and that any future
SIGTRAPs because of this are not on us! :)
Handling uprobes vma on dump
----------------------------
We don't need to store any information about the uprobes vma because it's
completely handled by the kernel, transparent to userspace. So, when a uprobes
vma is detected, we check if the --allow-uprobes option was specified or not.
If so, then the allow_uprobes boolean in the inventory image is set (this is
used on restore). The uprobes vma is skipped from being added to the vma list.
Handling uprobes vma on restore
-------------------------------
If allow_uprobes is set in the inventory image, then check if --allow-uprobes
is specified or not. Restoring the vma is not required.
Fixes: checkpoint-restore#1961
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Add a ZDTM test case where CRIU uses a helper process to restore
a non-empty process group with a terminated leader and a Unix
domain socket. This reproduces a corner case in which mount
namespace switching can fail during restore:
https://github.com/checkpoint-restore/criu/issues/2687
Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>