Commit graph

11471 commits

Author SHA1 Message Date
Andrei Vagin
c2b48ff423 criu: Version 4.0 (CRIUDA)
Major changes:
* CUDA plugin to support checkpointing and restoring NVIDIA CUDA applications.
* Shadow stack support
* Pagemap cache: Added support for PAGEMAP_SCAN ioctl

The full changelog can be found here: https://criu.org/Download/criu/4.0.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2024-09-21 16:34:14 -07:00
Andrei Vagin
a8cbe76d4f util: dump fsfd log messages
It should help to investigate errors of fsconfig, fsmount and etc.

Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-19 15:23:42 -07:00
David Francis
096c1f7a4d plugins/amdgpu - Increase maximum parameter length
The topology parsing assumed that all parameter names were
30 characters or fewer, but

recommended_sdma_engine_id_mask

is 31 characters.

Make the maximum length a macro, and set it to 64.

Signed-off-by: David Francis <David.Francis@amd.com>
2024-09-19 15:23:42 -07:00
David Francis
60ee5ebd9d plugins/amdgpu: Zero ib_info on initialization
This struct was being used un-initialized, meaning it
was filled with random garbage.

Mea culpa.

Signed-off-by: David Francis <David.Francis@amd.com>
2024-09-19 15:23:42 -07:00
Andrei Vagin
6918998897 plugin/cuda: disable CUDA plugin if /dev/nvidiactl isn't present
The presence of /dev/nvidiactl indicates that the system has a
compatible NVIDIA GPU driver installed and that the GPU is accessible to
the operating system.

Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-19 15:23:42 -07:00
Andrei Vagin
e1331a4b60 fault: allow to check dont_use_freeze_cgroup
Adds a new "fault" to call dont_use_freeze_cgroup.

Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-19 15:23:42 -07:00
Andrei Vagin
651df375bd criu: Allow disabling freeze cgroups
Some plugins (e.g., CUDA) may not function correctly when processes are
frozen using cgroups. This change introduces a mechanism to disable the
use of freeze cgroups during process seizing, even if explicitly
requested via the --freeze-cgroup option.

The CUDA plugin is updated to utilize this new mechanism to ensure
compatibility.

Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-19 15:23:42 -07:00
Radostin Stoyanov
59f49c6276 codespell: fix typos
This patch fixes the following typos reported by codespell:

./test/others/bers/bers.c:394: dependin ==> depending, depend in
./criu/kerndat.c:837: hitted ==> hit

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-19 15:23:42 -07:00
Radostin Stoyanov
edb6fbb820 scripts/uninstall_module: fix package discovery
The `uninstall_module.py` script is a wrapper for the `pip uninstall`
command that enables support for specifying installation prefix
(i.e., `--prefix`). When this functionality is used, we intentionally
set `sys.path` to include only search paths for the specified prefix
to avoid unintentional uninstallation of packages in system paths.

Since `importlib_metadata` version 8.1.0, the `Distribution.from_name()`
method has been modified [1] to perform additional pre-processing of
Distribution objects [2] that requires loading distribution metadata
and results in the following error:

  File "/usr/local/lib/python3.12/site-packages/importlib_metadata/__init__.py", line 422, in <lambda>
    buckets = bucket(dists, lambda dist: bool(dist.metadata))
                                              ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/importlib_metadata/__init__.py", line 454, in metadata
    from . import _adapters
  File "/usr/local/lib/python3.12/site-packages/importlib_metadata/_adapters.py", line 3, in <module>
    import email.message
  File "/usr/lib64/python3.12/email/message.py", line 11, in <module>
    import quopri
  ModuleNotFoundError: No module named 'quopri'

This error occurs because we have excluded system paths from the list
of search paths (`sys.path`).

However, this pre-processing is not required for our use case, as we
only use the discovery mechanism of importlib_metadata to resolve the
metadata directory path of the module being uninstalled.

To fix this problem, this patch updates `uninstall_module` to avoid the
`from_name()` method and use `discover(name=package_name)` directly.

[1] a65c29adc0
[2] a65c29ad/importlib_metadata/__init__.py (L391)

Fixes: #2468

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-19 15:23:42 -07:00
Radostin Stoyanov
b1b3c14b17 cuda: unlock on timeout error
When attempting to checkpoint a container with CUDA processes,
CRIU could fail with the following error:

	Error (criu/cr-dump.c:1791): Timeout reached. Try to interrupt: 1
	Error (cuda_plugin.c:143): cuda_plugin: Unable to read output of cuda-checkpoint: Interrupted system call
	Error (cuda_plugin.c:384): cuda_plugin: PAUSE_DEVICES failed with

In this situation, the target process is locked, but CRIU fails due to
a timeout and exits with an error. We need to make sure that the target
PID is unlocked in such case.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-19 15:23:42 -07:00
Adrian Reber
dbfa450246 ci: run aarch64 tests native via actuated
Signed-off-by: Adrian Reber <areber@redhat.com>
2024-09-19 15:23:42 -07:00
Adrian Reber
8beac656fc coredump: fail on unsupported architectures early
Currently coredump only works on x86_64. Fail early on any other
architecture.

Signed-off-by: Adrian Reber <areber@redhat.com>
2024-09-19 15:23:42 -07:00
Adrian Reber
d44fc0de5a test: only run macvlan tests if macvlan devices can be created
Some test environments (Actuated runners for example) do not support
maclvan devices. Skip tests depending on it automatically.

Signed-off-by: Adrian Reber <areber@redhat.com>
2024-09-19 15:23:42 -07:00
Adrian Reber
01c65732b6 test: better test for SELinux tools
Previously the check was just if /sys/fs/selinux is mounted. This
extends the check to see if all necessary tools are installed.

Signed-off-by: Adrian Reber <areber@redhat.com>
2024-09-19 15:23:42 -07:00
Adrian Reber
615ccf98cf crit: do not crash on aarch64 doing 'crit x ./ rss'
Running 'crit x ./ rss' on aarch64 crashes with:

    File "/home/criu/crit/crit/__main__.py", line 331, in explore_rss
      while vmas[vmi]['start'] < pme:
            ~~~~^^^^^
  IndexError: list index out of range

This adds an additional check to the while loop to do access indexes out
of range.

Signed-off-by: Adrian Reber <areber@redhat.com>
2024-09-19 15:23:42 -07:00
Radostin Stoyanov
21ea718f9f plugins/amdgpu: fix printf format specifiers
Errors on aarch64:

	In file included from amdgpu_plugin_drm.h:10,
			 from amdgpu_plugin.c:33:
	amdgpu_plugin.c: In function 'amdgpu_plugin_dump_file':
	amdgpu_plugin_util.h:24:20: error: format '%lld' expects argument of type 'long long int', but argument 6 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
	   24 | #define LOG_PREFIX "amdgpu_plugin: "
	      |                    ^~~~~~~~~~~~~~~~~
	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
	      |                                                    ^~~~~~~~~~
	amdgpu_plugin.c:1236:9: note: in expansion of macro 'pr_info'
	 1236 |         pr_info("devices:%d bos:%d objects:%d priv_data:%lld\n", args.num_devices, args.num_bos, args.num_objects,
	      |         ^~~~~~~
	cc1: all warnings being treated as errors

Errors on ppc64:

	In file included from amdgpu_plugin_drm.h:10,
			 from amdgpu_plugin.c:33:
	amdgpu_plugin.c: In function 'amdgpu_plugin_dump_file':
	amdgpu_plugin_util.h:24:20: error: format '%llu' expects argument of type 'long long unsigned int', but argument 6 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
	   24 | #define LOG_PREFIX "amdgpu_plugin: "
	      |                    ^~~~~~~~~~~~~~~~~
	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
	      |                                                    ^~~~~~~~~~
	amdgpu_plugin.c:1236:9: note: in expansion of macro 'pr_info'
	 1236 |         pr_info("devices:%u bos:%u objects:%u priv_data:%llu\n",
	      |         ^~~~~~~
	cc1: all warnings being treated as errors
	In file included from amdgpu_plugin_util.c:38:
	amdgpu_plugin_util.c: In function 'print_kfd_bo_stat':
	amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
	   24 | #define LOG_PREFIX "amdgpu_plugin: "
	      |                    ^~~~~~~~~~~~~~~~~
	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
	      |                                                    ^~~~~~~~~~
	amdgpu_plugin_util.c:196:17: note: in expansion of macro 'pr_info'
	  196 |                 pr_info("%s(), %d. KFD BO Addr: %llx \n", __func__, idx, bo->addr);
	      |                 ^~~~~~~
	amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
	   24 | #define LOG_PREFIX "amdgpu_plugin: "
	      |                    ^~~~~~~~~~~~~~~~~
	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
	      |                                                    ^~~~~~~~~~
	amdgpu_plugin_util.c:197:17: note: in expansion of macro 'pr_info'
	  197 |                 pr_info("%s(), %d. KFD BO Size: %llx \n", __func__, idx, bo->size);
	      |                 ^~~~~~~
	amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
	   24 | #define LOG_PREFIX "amdgpu_plugin: "
	      |                    ^~~~~~~~~~~~~~~~~
	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
	      |                                                    ^~~~~~~~~~
	amdgpu_plugin_util.c:198:17: note: in expansion of macro 'pr_info'
	  198 |                 pr_info("%s(), %d. KFD BO Offset: %llx \n", __func__, idx, bo->offset);
	      |                 ^~~~~~~
	amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
	   24 | #define LOG_PREFIX "amdgpu_plugin: "
	      |                    ^~~~~~~~~~~~~~~~~
	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
	      |                                                    ^~~~~~~~~~
	amdgpu_plugin_util.c:199:17: note: in expansion of macro 'pr_info'
	  199 |                 pr_info("%s(), %d. KFD BO Restored Offset: %llx \n", __func__, idx, bo->restored_offset);
	      |                 ^~~~~~~
	cc1: all warnings being treated as errors

Co-developed-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-19 15:23:42 -07:00
Radostin Stoyanov
3e2ed18790 plugins/amdgpu: use C99-standard types
Co-developed-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-19 15:23:42 -07:00
Radostin Stoyanov
d68205e919 ci: enable cross compile testing for amdgpu-plugin
Skip cross-compilation on armv7 because, among many other errors,
it fails with the following:

	In file included from ../../include/common/lock.h:9,
			 from ../../criu/include/files.h:9,
			 from amdgpu_plugin.c:30:
	../../include/common/asm/atomic.h:60:2: error: #error ARM architecture version (CONFIG_ARMV*) not set or unsupported.
	   60 | #error ARM architecture version (CONFIG_ARMV*) not set or unsupported.
	      |  ^~~~~
	../../include/common/asm/atomic.h: In function 'atomic_add_return':
	../../include/common/asm/atomic.h:81:9: error: implicit declaration of function 'smp_mb' [-Werror=implicit-function-declaration]
	   81 |         smp_mb();
	      |         ^~~~~~

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-19 15:23:42 -07:00
Radostin Stoyanov
2ee5844411 plugins/amdgpu: fix cross-compilation
To enable cross-compile we need to use the CC definition from
criu/scripts/nmk/scripts/tools.mk:

CC := $(CROSS_COMPILE)$(HOSTCC)

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-19 15:23:42 -07:00
Andrei Vagin
9a19cf34de scripts/ci: run tests with the mocked cuda-checkpoint tool
Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-11 16:02:11 -07:00
Andrei Vagin
de31abb970 criu/plugin: don't call plugin device hooks for non-alive tasks
Dead tasks don't hold any resources.

Fixes: 2465
Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-11 16:02:11 -07:00
Andrei Vagin
dea6305914 test/zdtm: allow to run tests with the mocked cuda-checkpoint tool
Here is an example how to run one test:
$ python test/zdtm.py run -t zdtm/static/env00 --ignore-taint --mocked-cuda-checkpoint

Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-11 16:02:11 -07:00
haozi007
67fe44e981 support user set remote mmap vma address
1. os auto assignment vma addr maybe conflict with vma in gpu living migrate scene;
2. so, we should give choice to user;

Signed-off-by: haozi007 <liuhao27@huawei.com>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
551cd92447 timer: fix printf specifiers for __suseconds64_t
New internal glibc types __timeval64 [1] and __suseconds64_t [2] have
been introduced as a solution for the Y2038 problem [3]. These 64-bit
types are used across all architectures. However, this change causes
the following build errors when cross-compiling on ARMv7 (armhf):

criu/timer.c:49:17: error: format '%ld' expects argument of type 'long int', but argument 5 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=]
   49 |         pr_info("Restored %s timer to %" PRId64 ".%ld -> %" PRId64 ".%ld\n", n,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~
   50 |                 (int64_t)val->it_value.tv_sec, val->it_value.tv_usec,
      |                                                ~~~~~~~~~~~~~~~~~~~~~
      |                                                             |
      |                                                             __suseconds64_t {aka long long int}

criu/timer.c:49:17: error: format '%ld' expects argument of type 'long int', but argument 7 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=]
   49 |         pr_info("Restored %s timer to %" PRId64 ".%ld -> %" PRId64 ".%ld\n", n,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~
   50 |                 (int64_t)val->it_value.tv_sec, val->it_value.tv_usec,
   51 |                 (int64_t)val->it_interval.tv_sec, val->it_interval.tv_usec);
      |                                                   ~~~~~~~~~~~~~~~~~~~~~~~~
      |                                                                   |
      |                                                                   __suseconds64_t {aka long long int}

ns.c:234:48: error: format '%ld' expects argument of type 'long int', but argument 5 has type 'time_t' {aka 'long long int'} [-Werror=format=]
  234 |         len = snprintf(buf, sizeof(buf), "%d %ld 0", clk_id, offset);
      |                                              ~~^             ~~~~~~
      |                                                |             |
      |                                                long int      time_t {aka long long int}
      |                                              %lld

msg.c:58:41: error: format '%ld' expects argument of type 'long int', but argument 3 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=]
   58 |         off += sprintf(buf + off, ".%.3ld: ", tv.tv_usec / 1000);
      |                                     ~~~~^     ~~~~~~~~~~~~~~~~~
      |                                         |                |
      |                                         long int         __suseconds64_t {aka long long int}
      |                                     %.3lld

../lib/zdtmtst.h:137:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
  137 |                 test_msg("ERR: %s:%d: " format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \
      |                          ^~~~~~~~~~~~~~
pthread_timers_h.c:72:17: note: in expansion of macro 'pr_perror'
   72 |                 pr_perror("wrong interval: %ld:%ld", itimerspec.it_interval.tv_sec, itimerspec.it_interval.tv_nsec);
      |                 ^~~~~~~~~

vdso00.c:22:32: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
   22 |         test_msg("%d time: %10li\n", getpid(), tv.tv_sec);
      |                            ~~~~^               ~~~~~~~~~
      |                                |                 |
      |                                long int          __time64_t {aka long long int}
      |                            %10lli

vdso00.c:29:32: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
   29 |         test_msg("%d time: %10li\n", getpid(), tv.tv_sec);
      |                            ~~~~^               ~~~~~~~~~
      |                                |                 |
      |                                long int          __time64_t {aka long long int}
      |                            %10lli

vdso01.c:357:42: error: format '%li' expects argument of type 'long int', but argument 2 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
  357 |         test_msg("gettimeofday: tv_sec %li vdso_gettimeofday: tv_sec %li\n", tv1.tv_sec, tv2.tv_sec);
      |                                        ~~^                                   ~~~~~~~~~~
      |                                          |                                      |
      |                                          long int                               __time64_t {aka long long int}
      |                                        %lli

vdso01.c:357:72: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
  357 |         test_msg("gettimeofday: tv_sec %li vdso_gettimeofday: tv_sec %li\n", tv1.tv_sec, tv2.tv_sec);
      |                                                                      ~~^                 ~~~~~~~~~~
      |                                                                        |                    |
      |                                                                        long int             __time64_t {aka long long int}
      |

vdso01.c:328:43: error: format '%li' expects argument of type 'long int', but argument 2 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
  328 |         test_msg("clock_gettime: tv_sec %li vdso_clock_gettime: tv_sec %li\n", ts1.tv_sec, ts2.tv_sec);
      |                                         ~~^                                    ~~~~~~~~~~
      |                                           |                                       |
      |                                           long int                                __time64_t {aka long long int}
      |                                         %lli

vdso01.c:328:74: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
  328 |         test_msg("clock_gettime: tv_sec %li vdso_clock_gettime: tv_sec %li\n", ts1.tv_sec, ts2.tv_sec);
      |                                                                        ~~^                 ~~~~~~~~~~
      |                                                                          |                    |
      |                                                                          long int             __time64_t {aka long long int}
      |

../lib/zdtmtst.h:144:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type 'time_t' {aka 'long long int'} [-Werror=format=]
  144 |                 test_msg("FAIL: %s:%d: " format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \
      |                          ^~~~~~~~~~~~~~~
mtime_mmap.c:80:17: note: in expansion of macro 'fail'
   80 |                 fail("mtime %ld wasn't updated on mmapped %s file", mtime_new, filename);
      |                 ^~~~

../lib/zdtmtst.h:144:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
  144 |                 test_msg("FAIL: %s:%d: " format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \
      |                          ^~~~~~~~~~~~~~~
mtime_mmap.c:101:17: note: in expansion of macro 'fail'
  101 |                 fail("After migration, mtime changed to %ld", fst.st_mtime);
      |                 ^~~~

[1] https://sourceware.org/git/?p=glibc.git;h=504c98717062cb9bcbd4b3e59e932d04331ddca5
[2] https://sourceware.org/git/?p=glibc.git;h=3fced064f23562ec24f8312ffbc14950993969e6
[3] https://en.wikipedia.org/wiki/Year_2038_problem

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
a045c874cb ci: run tests with amdgpu and cuda plugins
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
2453ed69a2 zdtm: add option to run tests with criu plugins
By default, if the "CRIU_LIBS_DIR" environment variable is not set,
CRIU will load all plugins installed in `/usr/lib/criu`. This may
result in running the ZDTM tests with plugins for a different version
of CRIU (e.g., installed from a package).

This patch updates ZDTM to always set the "CRIU_LIBS_DIR" environment
variable and use a local "plugins" directory. This directory contains
copies of the plugin files built from source. In addition, this patch
adds the `--criu-plugin` option to the `zdtm.py run` command, allowing
tests to be run with specified CRIU plugins.

Example:

  - Run test only with AMDGPU plugin
    ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin amdgpu

  - Run test only with CUDA plugin
    ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin cuda

  - Run test with both AMDGPU and CUDA plugins
    ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin amdgpu cuda

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
ad66c27a11 cuda: fix launch cuda-checkpoint
When the cuda-checkpoint tool is not installed, execvp() is expected to
fail and return -1. In this case, we need to call exit() to terminate
the child process that was created earlier with fork().

Since CRIU can be used with applications that do not use CUDA, even
when the CUDA plugin is installed, this patch also updates the log
messages to show debug and warning (instead of error) when the
cuda-checkpoint tool is not found in $PATH.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
fde0b7ac69 cuda: don't leak fds to cuda-checkpoint
Leaking open file descriptors to third-party tools can lead
to security risks.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
4dde52a308 ci/podman: show mounts
Show information about mounts available on the host filesystem.
This is useful for debugging.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
9a85fb6382 ci/podman: show criu logs in case of error
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
liuchao173
8437663cc6 delete redundant include header files
restorer.h has been included in line 43.

Fixes: 22963d2827 ("Hide asm/restorer.h from sources")

Signed-off-by: liuchao173 <liuchao173@huawei.com>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
c42b58f4fb plugin: enable multiple plugins for the same hook
CRIU provides two plugins for checkpoint/restore of GPU applications:
amdgpu and cuda. Both plugins use the `RESUME_DEVICES_LATE` hook to
enable restore:

    CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, amdgpu_plugin_resume_devices_late)
    CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, cuda_plugin_resume_devices_late)

However, CRIU currently does not support running more than one plugin
for the same hook. As a result, when both plugins are installed, the
resume function for CUDA applications is not executed. To fix this,
we need to make sure that both `plugin_resume_devices_late()` functions
return `-ENOTSUP` when restore is not supported.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
85050be66b seize: fix pause-devices plugin hook
The plugin hook "PAUSE_DEVICES" was recently introduced in the following
commit. This hook was intended to execute the cuda-checkpoint tool
before the process tree is frozen. However, the run_plugins() call has
been placed immediately *after* freeze_processes(). This causes the
cuda-checkpoint tool to hang indefinitely during the checkpointing
of CUDA applications running in containers, eventually leading to its
termination by the timeout alarm.

a85f488595
criu/plugin: Introduce new plugin hooks PAUSE_DEVICES and CHECKPOINT_DEVICES to be used during pstree collection

This problem can be reproduced with the following example:

sudo podman run -d --rm \
        --device nvidia.com/gpu=all --security-opt=label=disable \
        quay.io/radostin/cuda-counter

sudo podman container checkpoint -l -e /tmp/checkpoint.tar

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Andrei Vagin
21108b40de test/zdtm: mount a new tmpfs to the zdtm root /dev
The current file system can be mounted with nodev.

Fixes #2441

Signed-off-by: Andrei Vagin <avagin@google.com>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
fcbadfbdbf plugins: set executable bit on .so files
For historical reasons, some tools like rpm [1] or ldd [2,3]
may expect the executable bit to be present for the correct
identification of shared libraries. The executable bit on .so
files is set by default by compilers (e.g., GCC). It is not
strictly necessary but primarily a convention.

[1] https://docs.fedoraproject.org/en-US/package-maintainers/CommonRpmlintIssues/#unstripped_binary_or_object
[2] https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/ldd.bash.in;h=d6b640df;hb=HEAD#l154

[3] $ sudo ldd /usr/lib/criu/*.so
/usr/lib/criu/amdgpu_plugin.so:
ldd: warning: you do not have execution permission for `/usr/lib/criu/amdgpu_plugin.so'
	linux-vdso.so.1 (0x00007fd0a2a3e000)
	libdrm.so.2 => /lib64/libdrm.so.2 (0x00007fd0a29eb000)
	libdrm_amdgpu.so.1 => /lib64/libdrm_amdgpu.so.1 (0x00007fd0a29de000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fd0a27fc000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fd0a2a40000)
/usr/lib/criu/cuda_plugin.so:
ldd: warning: you do not have execution permission for `/usr/lib/criu/cuda_plugin.so'
	linux-vdso.so.1 (0x00007f1806e13000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f1806c08000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f1806e15000)

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
5783706d57 docs: update amdgpu-plugin man page
This patch updates the dependencies section of the AMDGPU plugin man
page to reflect that the plugin has been merged upstream and to fix a
formatting issue.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Florian Weimer
089345f77a Adjust to glibc __rseq_size semantic change
In commit 2e456ccf0c34a056e3ccafac4a0c7effef14d918 ("Linux: Make
__rseq_size useful for feature detection (bug 31965)") glibc 2.40
changed the meaning of __rseq_size slightly: it is now the size
of the active/feature area (20 bytes initially), and not the size
of the entire initially defined struct (32 bytes including padding).
The reason for the change is that the size including padding does not
allow detection of newly added features while previously unused
padding is consumed.

The prep_libc_rseq_info change in criu/cr-restore.c is not necessary
on kernels which have full ptrace support for obtaining rseq
information because the code is not used.  On older kernels, it is
a correctness fix because with size 20 (the new value), rseq
registeration would fail.

The two other changes are required to make rseq unregistration work
in tests.

Signed-off-by: Florian Weimer <fweimer@redhat.com>
2024-09-11 16:02:11 -07:00
Bui Quang Minh
b9081ca56b zdtm: make cgroup testcases run non-parallel
cgroup testcases live in the same cgroup root zdtmtst and
zdtmtst.defaultroot controller then create child subgroup for testing. This
can cause problems when cgroup testcases run in parallel. For example,
testcase A dumps the child subgroup of testcase B since it's in the cgroup
root but in the middle of restoring of testcase A, testcase B completes and
cleans up the subgroup directory. This causes error in testcase A restore.
This commit adds excl flag to all cgroup testcases description so that
these don't run parallel.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2024-09-11 16:02:11 -07:00
Andrei Vagin
4f45572fde util: use close_range when it's supported
close_range is faster than reading /proc/self/fd and closing descriptors
one by one.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
42b177da62 scripts/build: drop centos 7 targets
The CI tests with CentOS 7 have been disabled and removed [1,2].
This patch removes the obsolete Makefile targets for these tests.

[1] 24bc083653
[2] f8466ca798

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00
Andrei Vagin
1815838191 vdso: proxify the __vdso_clock_gettime64 function
It was added in v5.3-rc1~211^2~4^2~10.

Fixes #2390

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2024-09-11 16:02:11 -07:00
Andrei Vagin
ac22aaf576 apparmor: get_suspend_policy must return NULL in error cases
Before this fix, it could return MAP_FAILED which is ((void *) -1).

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2024-09-11 16:02:11 -07:00
Pavel Tikhomirov
71999d8883 cgroupd: unblock SIGTERM to make stop_cgroupd actually work
Sometimes due to sigblockmask inheritance cgroupd can inherit SIGTERM
blocked. That will lead cgroupd ignoring SIGTERM from stop_cgroupd() and
CRIU will get stuck due to waiting for never-stopping cgroupd.

I see this happening in lxc-checkpoint, also saw this in OpenVZ jenkins
on cgroup_inotify00 test.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2024-09-11 16:02:11 -07:00
Liu Hua
daed6c3535 irmap: duplicate string in irmap_scan_path_add
Duplicate string in irmap_scan_path_add, otherwise it will free before
parsing next configuration input.

[ avagin: handle errors of xstrdup ]

Signed-off-by: Liu Hua <weldonliu@tencent.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2024-09-11 16:02:11 -07:00
Andrei Vagin
b169e3b63d plugins/cuda: fix crosscompilation
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2024-09-11 16:02:11 -07:00
Pratyush Yadav
ca971b7f8b compel: fix build on Amazon Linux 2 due to missing PTRACE_ARCH_PRCTL
Commit fc683cb01 ("compel: shstk: save CET state when CPU supports it")
started using PTRACE_ARCH_PRCTL to query shadow stack status. While
PTRACE_ARCH_PRCTL has existed in the kernel for a long time, it was only
added to glibc in version 2.27. Amazon Linux 2 (AL2) has glibc 2.26,
which does not have this definition. As a result, build on AL2 fails
with the below error:

    compel/arch/x86/src/lib/infect.c: In function ‘get_task_xsave’:
    compel/arch/x86/src/lib/infect.c:276:14: error: ‘PTRACE_ARCH_PRCTL’ undeclared (first use in this function)
    276 |   if (ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long)&features, ARCH_SHSTK_STATUS)) {
        |              ^~~~~~~~~~~~~~~~~

While the definition is present on the system via the kernel headers (in
asm/ptrace-abi.h) which can be reached by including linux/ptrace.h, the
comment in compel/include/uapi/ptrace.h says:

    We'd want to include both sys/ptrace.h and linux/ptrace.h, hoping
    that most definitions come from either one or another. Alas, on
    Alpine/musl both files declare struct ptrace_peeksiginfo_args, so
    there is no way they can be used together. Let's rely on libc one.

Since including linux/ptrace.h is not an option, define
PTRACE_ARCH_PRCTL if it doesn't already exist. An interesting point to
note is that in sys/ptrace.h, PTRACE_ARCH_PRCTL is an enum value so the
preprocessor doesn't know about it. PT_ARCH_PRCTL is the preprocessor
symbol that matches the value of PTRACE_ARCH_PRCTL. So look for
PT_ARCH_PRCTL to decide if PTRACE_ARCH_PRCTL is available or not.

Another interesting point to note is that AL2 ships with GCC 7 by
default, which does not support the -mshstk option, causing other build
failures. Luckily, it also ships GCC 10 which does have the option.
Using GCC 10 lets the build succeed.

Fixes: fc683cb01 ("compel: shstk: save CET state when CPU supports it")
Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
2024-09-11 16:02:11 -07:00
Jesus Ramos
bf417dd050 criu/plugin: Add NVIDIA CUDA plugin
Adding support for the NVIDIA cuda-checkpoint utility, requires the use of an
r555 or higher driver along with the cuda-checkpoint binary.

Signed-off-by: Jesus Ramos <jeramos@nvidia.com>
2024-09-11 16:02:11 -07:00
Jesus Ramos
5f486d5aee criu/plugin: Introduce new plugin hooks PAUSE_DEVICES and CHECKPOINT_DEVICES to be used during pstree collection
PAUSE_DEVICES is called before a process is frozen and is used by the CUDA
plugin to place the process in a state that's ready to be checkpointed and
quiesce any pending work

CHECKPOINT_DEVICES is called after all processes in the tree have been frozen
and PAUSE'd and performs the actual checkpointing operation for CUDA
applications

Signed-off-by: Jesus Ramos <jeramos@nvidia.com>
2024-09-11 16:02:11 -07:00
Jesus Ramos
1012e542e5 criu: Restore rseq_cs state slightly earlier in the restore sequence and run the plugin finalizer later in the dump sequence
Restore rseq_cs state before calling RESUME_DEVICES_LATE as the CUDA plugin will
temporarily unfreeze a thread during the plugin hook to assist with device
restore

Run the plugin finalizer later in the dump sequence since the finalizer is used
by the CUDA plugin to handle some process cleanup

Signed-off-by: Jesus Ramos <jeramos@nvidia.com>
2024-09-11 16:02:11 -07:00
Radostin Stoyanov
7ac4537069 readme: update link to FAQ page
The current link opens a page with the following text:

    The MediaWiki FAQ can be found at:
    https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:FAQ

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-09-11 16:02:11 -07:00