This patch implements the entire logic to enable the offloading of
buffer object content restoration.
The goal of this patch is to offload the buffer object content
restoration to the main CRIU process so that this restoration can occur
in parallel with other restoration logic (mainly the restoration of
memory state in the restore blob, which is time-consuming) to speed up
the restore phase. The restoration of buffer object content usually
takes a significant amount of time for GPU applications, so
parallelizing it with other operations can reduce the overall restore
time.
It has three parts: the first replaces the restoration of buffer objects
in the target process by sending a parallel restore command to the main
CRIU process; the second implements the POST_FORKING hook in the amdgpu
plugin to enable buffer object content restoration in the main CRIU
process; the third stops the parallel thread in the RESUME_DEVICES_LATE
hook.
This optimization only targets the single-process case (the common
case). In other scenarios, it falls back to the original method. This is
controlled by the new `parallel_disabled` flag.
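The control flow described above can be sketched roughly as follows. This
is a simplified illustration, not the plugin's code: it uses a worker
thread in one process instead of a command sent to the main CRIU process,
and `struct bo_job`, `struct restore_ctx`, `restore_start()` and
`restore_finish()` are hypothetical names. Only the `parallel_disabled`
flag name comes from this patch.

```c
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for one buffer object: saved content and its destination. */
struct bo_job {
	const char *src;
	char *dst;
	size_t len;
};

struct restore_ctx {
	struct bo_job *jobs;
	size_t nr_jobs;
	bool parallel_disabled; /* flag name taken from the commit message */
	bool started;           /* true while a worker thread is running */
	pthread_t thread;
};

/* Copy every saved buffer back to its destination. */
static void restore_all(struct restore_ctx *ctx)
{
	for (size_t i = 0; i < ctx->nr_jobs; i++)
		memcpy(ctx->jobs[i].dst, ctx->jobs[i].src, ctx->jobs[i].len);
}

static void *restore_worker(void *arg)
{
	restore_all(arg);
	return NULL;
}

/* Start content restoration: in the background, or inline when disabled. */
static int restore_start(struct restore_ctx *ctx)
{
	ctx->started = false;
	if (ctx->parallel_disabled) {
		restore_all(ctx); /* original, synchronous path */
		return 0;
	}
	if (pthread_create(&ctx->thread, NULL, restore_worker, ctx))
		return -1;
	ctx->started = true;
	return 0;
}

/* Wait for the worker, as a late hook would do before devices resume. */
static int restore_finish(struct restore_ctx *ctx)
{
	if (!ctx->started)
		return 0;
	ctx->started = false;
	return pthread_join(ctx->thread, NULL) ? -1 : 0;
}
```

Between `restore_start()` and `restore_finish()` the caller is free to do
other restore work, which is where the time saving would come from.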
Signed-off-by: Yanning Yang <yangyanning@sjtu.edu.cn>
This functionality (#2527) is being reverted and excluded from this
release due to issue #2812.
It will be included in a subsequent release once all associated issues
are resolved.
Signed-off-by: Andrei Vagin <avagin@google.com>
This patch implements the entire logic to enable the offloading of
buffer object content restoration.
The goal of this patch is to offload the buffer object content
restoration to the main CRIU process so that this restoration can occur
in parallel with other restoration logic (mainly the restoration of
memory state in the restore blob, which is time-consuming) to speed up
the restore phase. The restoration of buffer object content usually
takes a significant amount of time for GPU applications, so
parallelizing it with other operations can reduce the overall restore
time.
It has three parts: the first replaces the restoration of buffer objects
in the target process by sending a parallel restore command to the main
CRIU process; the second implements the POST_FORKING hook in the amdgpu
plugin to enable buffer object content restoration in the main CRIU
process; the third stops the parallel thread in the RESUME_DEVICES_LATE
hook.
This optimization only targets the single-process case (the common
case). In other scenarios, it falls back to the original method. This is
controlled by the new `parallel_disabled` flag.
Signed-off-by: Yanning Yang <yangyanning@sjtu.edu.cn>
By default, CRIU uses the path "/usr/lib/criu" to install and load
plugins at runtime. This path is defined by the `PLUGINDIR` variable
in Makefile.install and `CR_PLUGIN_DEFAULT` in `criu/include/plugin.h`.
However, some distribution packages might install the CRIU plugins at
"/usr/lib64/criu" instead. This patch updates the makefile to align
the path defined by `CR_PLUGIN_DEFAULT` with the value of `PLUGINDIR`.
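One way such an alignment can look on the C side is sketched below. It
assumes the build passes the install path as a preprocessor definition
(for example `-DCR_PLUGIN_DEFAULT="\"$(PLUGINDIR)\""`); that flag and the
`plugin_dir()` helper are illustrative assumptions, not the actual
contents of `criu/include/plugin.h`.

```c
#include <stddef.h>

/*
 * Assumption: the makefile may define CR_PLUGIN_DEFAULT from PLUGINDIR.
 * When it does not, fall back to the historical default path.
 */
#ifndef CR_PLUGIN_DEFAULT
#define CR_PLUGIN_DEFAULT "/usr/lib/criu"
#endif

/* Hypothetical helper: an explicit override wins, else the compiled-in default. */
static const char *plugin_dir(const char *override)
{
	if (override != NULL && override[0] != '\0')
		return override;
	return CR_PLUGIN_DEFAULT;
}
```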
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
To enable cross-compilation, we need to use the CC definition from
criu/scripts/nmk/scripts/tools.mk:
CC := $(CROSS_COMPILE)$(HOSTCC)
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
For historical reasons, some tools like rpm [1] or ldd [2,3]
may expect the executable bit to be present for the correct
identification of shared libraries. The executable bit on .so
files is set by default by compilers (e.g., GCC). It is not
strictly necessary but primarily a convention.
[1] https://docs.fedoraproject.org/en-US/package-maintainers/CommonRpmlintIssues/#unstripped_binary_or_object
[2] https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/ldd.bash.in;h=d6b640df;hb=HEAD#l154
[3] $ sudo ldd /usr/lib/criu/*.so
/usr/lib/criu/amdgpu_plugin.so:
ldd: warning: you do not have execution permission for `/usr/lib/criu/amdgpu_plugin.so'
linux-vdso.so.1 (0x00007fd0a2a3e000)
libdrm.so.2 => /lib64/libdrm.so.2 (0x00007fd0a29eb000)
libdrm_amdgpu.so.1 => /lib64/libdrm_amdgpu.so.1 (0x00007fd0a29de000)
libc.so.6 => /lib64/libc.so.6 (0x00007fd0a27fc000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd0a2a40000)
/usr/lib/criu/cuda_plugin.so:
ldd: warning: you do not have execution permission for `/usr/lib/criu/cuda_plugin.so'
linux-vdso.so.1 (0x00007f1806e13000)
libc.so.6 => /lib64/libc.so.6 (0x00007f1806c08000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1806e15000)
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add a new compilation unit to host symbols and methods that will be
needed to C&R DRM devices. Refactor code that indicates support for
C&R and checkpoints KFD and DRM devices.
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
This patch adds a missing definition for `__nmk_dir` in the Makefile
for the amdgpu plugin. This definition is required, for example, when
building the `test_topology_remap` target:
make -C plugins/amdgpu/ test_topology_remap
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Building the criu packages for Ubuntu/Debian fails with:
mkdir: cannot create directory '/var/lib/criu': Permission denied
This patch updates PLUGINDIR with the value /usr/lib/criu
Fixes: #1877
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
When building packages for CRIU the source directory might have a
name different than 'criu'.
Fixes: #1877
Reported-by: @siris
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
On newer kernels (> 5.13), the KFD and DRM drivers only allow mmap of
the VMAs through the /dev/renderD* file descriptors that were used
during the CRIU_RESTORE ioctl.
During restore, after opening /dev/renderD*, amdgpu_plugin keeps the
FDs opened and instead returns a copy of the FDs to CRIU. The same FDs
are then returned during the UPDATE_VMAMAP hooks so that they can be
used by CRIU to call mmap. Duplicated FDs created using dup are
references to the same struct file inside the kernel so they are also
allowed to mmap.
To prevent the opened FDs inside amdgpu_plugin from conflicting with
FDs used by the target restore application, we make sure that the
lowest-numbered FD that amdgpu_plugin will use is greater than the
highest-numbered FD that is used by the target application.
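A minimal sketch of the FD placement idea, using the standard
fcntl(2) F_DUPFD command, which returns the lowest free descriptor
greater than or equal to its argument. `dup_fd_above()` is a
hypothetical helper name and is not taken from amdgpu_plugin.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Duplicate fd onto the lowest free descriptor strictly greater than
 * highest_app_fd. The copy refers to the same open file description as
 * the original, so both share state such as the file offset.
 */
static int dup_fd_above(int fd, int highest_app_fd)
{
	return fcntl(fd, F_DUPFD, highest_app_fd + 1);
}
```

Because the duplicate shares the open file description, a seek through
one descriptor is visible through the other, which is the same property
that lets the kernel accept either descriptor for mmap.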
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
AMD Radeon GPUs have special sDMA (system DMA engine) IPs that can be
used to speed up read and write operations on VRAM and GTT memory.
Depends on:
* The kernel mode driver (kfd) creating the dmabuf objects for the kfd
BOs in both checkpoint and restore operations.
* libdrm and libdrm_amdgpu libraries
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Libhsakmt (thunk) uses a shared memory file in /dev/shm/hsakmt_shared_mem
and its semaphore in /dev/shm/sem.hsakmt_semaphore. Add a check during
checkpoint to see if these two files exist. If they exist, the plugin
will try to restore them during restore.
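The existence check can be sketched with access(2). The helper names
below are hypothetical and the paths are passed in by the caller; this
only illustrates the "both files are present" test, not how the plugin
records or restores the files.

```c
#include <stdbool.h>
#include <unistd.h>

/* True if path exists, regardless of its permissions. */
static bool file_exists(const char *path)
{
	return access(path, F_OK) == 0;
}

/* True only when both the shared memory file and its semaphore exist. */
static bool shm_pair_present(const char *shm_path, const char *sem_path)
{
	return file_exists(shm_path) && file_exists(sem_path);
}
```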
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Implement multi-threaded code to read and write the contents of each
GPU's VRAM BOs in parallel in order to speed up the dump process when
using multiple GPUs.
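The one-thread-per-GPU pattern can be sketched as follows. This is an
illustration under simplifying assumptions: `struct gpu_job`, `MAX_GPUS`
and `dump_all_gpus()` are hypothetical, and a plain memcpy stands in for
the actual VRAM transfer.

```c
#include <pthread.h>
#include <string.h>

#define MAX_GPUS 8 /* arbitrary limit chosen for this sketch */

/* Hypothetical per-GPU work item: copy len bytes from vram to image. */
struct gpu_job {
	const char *vram;
	char *image;
	size_t len;
	int ret; /* 0 once the worker has finished successfully */
};

static void *gpu_worker(void *arg)
{
	struct gpu_job *job = arg;

	memcpy(job->image, job->vram, job->len);
	job->ret = 0;
	return NULL;
}

/* Run one worker per GPU and wait for all of them; returns 0 on success. */
static int dump_all_gpus(struct gpu_job *jobs, int nr)
{
	pthread_t threads[MAX_GPUS];
	int started = 0, ret = 0;

	if (nr < 0 || nr > MAX_GPUS)
		return -1;
	for (; started < nr; started++) {
		jobs[started].ret = -1;
		if (pthread_create(&threads[started], NULL, gpu_worker, &jobs[started])) {
			ret = -1;
			break;
		}
	}
	/* Join only the threads that were actually created. */
	for (int i = 0; i < started; i++) {
		pthread_join(threads[i], NULL);
		if (jobs[i].ret != 0)
			ret = -1;
	}
	return ret;
}
```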
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Add unit tests for the GPU remapping code when checkpointing and
restoring on different nodes with different topologies.
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Parse the local system topology in /sys/class/kfd/kfd/topology/nodes/ and
store the properties of each GPU in the CRIU image files. The GPU
properties can then be used during restore to make sure the process is
restored on GPUs with similar properties.
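As an illustration of the parsing step, the sketch below looks up one
numeric property in a text buffer. It assumes the properties text is a
list of `name value` lines; `parse_property()` is a hypothetical helper,
the sample property names are only examples, and reading the sysfs file
itself is left out.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find "key <number>" in a buffer of "name value" lines.
 * Returns 0 and stores the value in *out, or -1 if the key is absent.
 */
static int parse_property(const char *text, const char *key, unsigned long long *out)
{
	const char *line = text;

	while (line != NULL && *line != '\0') {
		char name[64];
		unsigned long long value;

		if (sscanf(line, "%63s %llu", name, &value) == 2 &&
		    strcmp(name, key) == 0) {
			*out = value;
			return 0;
		}
		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return -1;
}
```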
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
To support Checkpoint Restore with AMDGPUs for ROCm workloads, introduce
a new plugin to assist CRIU with the help of AMD KFD kernel driver. This
initial commit just provides the basic framework to build up further
capabilities. Like CRIU, the amdgpu plugin also uses protobuf to
serialize and save the amdkfd data, which is mostly VRAM contents with
some metadata.
We generate a data file "amdgpu-kfd-<id>.img" during the dump stage. On restore
this file is read and extracted to re-create various types of buffer
objects that belonged to the previously checkpointed process. Upon
restore the mmap page offset within a device file might change, so we use
the new hook to update and adjust the mmap offsets for the newly created
target process. This is needed for the sys_mmap call in the PIE restorer phase.
Support for queues and events is added in future patches of this series.
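The offset adjustment amounts to a lookup from the offset recorded at
checkpoint time to the offset that is valid after restore. A minimal
sketch, with a hypothetical `struct offset_pair` table and
`translate_offset()` helper that are not the plugin's data structures:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: offset seen at checkpoint and its value after restore. */
struct offset_pair {
	uint64_t old_offset;
	uint64_t new_offset;
};

/*
 * Look up the restored offset for a checkpointed one.
 * Returns 0 and fills *new_offset on a match, -1 if old_offset is unknown.
 */
static int translate_offset(const struct offset_pair *table, size_t nr,
			    uint64_t old_offset, uint64_t *new_offset)
{
	for (size_t i = 0; i < nr; i++) {
		if (table[i].old_offset == old_offset) {
			*new_offset = table[i].new_offset;
			return 0;
		}
	}
	return -1;
}
```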
With the current implementation (amdgpu_plugin), we support:
- Compute workloads only (non-Gfx)
- GPU visible inside a container
- AMD GPU Gfx 9 Family
- PyTorch benchmarks such as BERT Base
The amdgpu plugin depends on libdrm and libdrm_amdgpu, which are typically
installed with the libdrm-dev package. We build amdgpu_plugin only when the
dependencies are met on the target system and when the user intends to
install the amdgpu plugin, not by default with the criu build.
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Co-authored-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
This is just a placeholder dummy plugin and will be replaced by a proper
plugin that implements support for AMD GPU devices. This just
facilitates the initial pull request and CI build test trigger for early
code review of CRIU specific changes. Future PRs will bring in more
support for amdgpu_plugin to enable CRIU with AMD ROCm.
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>