Adding support for the NVIDIA cuda-checkpoint utility, requires the use of an
r555 or higher driver along with the cuda-checkpoint binary.
Signed-off-by: Jesus Ramos <jeramos@nvidia.com>
Ruff (https://github.com/astral-sh/ruff) is a Python linter
written in Rust, designed to replace Flake8. It is significantly
faster and actively maintained.
In addition to replacing flake8 with ruff, this patch also
creates separate makefile targets for ruff, shellcheck and
codespell, so that they can be tested independently.
RUFF_FLAGS can be used to specify options such as '--fix'.
Example:
make lint
make ruff RUFF_FLAGS=--fix
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
WARNINGS variable should be amended, not redefined.
We still need, e.g., `-Wno-dangling-pointer` to build
criu on loongarch64 with gcc13.
Signed-off-by: Ivan A. Melnikov <iv@altlinux.org>
Newer versions of pip use an isolated virtual environment when building
Python projects. However, when the source code of CRIT is copied into
the isolated environment, the symlink for `../lib/py` (pycriu) becomes
invalid. As a workaround, we used the `--no-build-isolation` option for
`pip install`. However, this functionality has issues in some versions
of PIP [1, 2]. To fix this problem, this patch adds separate packages
for pycriu and crit, and each package is installed independently.
[1] https://github.com/pypa/pip/pull/8221
[2] https://github.com/pypa/pip/issues/8165#issuecomment-625401463
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
There are multiple cases where good human readable code block is
converted to an unreadable mess by clang-format, so we don't want to
rely on clang-format completely. Also there is no way, as far as I can
see, to make clang-format only fix what we want it to fix without
breaking something.
So let's just display hints inline where clang-format is unhappy. When
reviewer sees such a warning it's a good sign that something is broken
in coding-style around this warning.
We add special script which parses diff generated by indent and
generates warning for each hunk.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This commit is introducing a test for the action-script functionality
of CRIU to verify that pre-dump, post-dump, pre-restore, pre-resume,
post-restore, post-resume hooks are executed during dump/restore.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This patch reverts changes introduced with the following commits:
4feb07020d
crit: enable python2 or python3 based crit
b78c4e071a
test: fix crit test and extend it
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This patch reverts changes introduced for Python 2 compatibility
in commits:
1c866db (Add new files for running criu-coredump via python 2 or 3)
3180d35 (Add support for python3 in criu-coredump).
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
If the error is ignored it is not important enough - make it a warning
instead.
From: Mian Luo <mianl@google.com>
Change-Id: If2641c3d4e0a4d57fdf04e4570c49be55f526535
Signed-off-by: Michał Mirosław <emmir@google.com>
The patch is similar to what has been done in linux kernel, as this
warning effectively prevents us from adding list elements to local list
head. See 49beadbd47
Else we have:
CC criu/mount.o
In file included from criu/include/cr_options.h:7,
from criu/mount.c:13:
In function '__list_add',
inlined from 'list_add' at include/common/list.h:41:2,
inlined from 'mnt_tree_for_each' at criu/mount.c:1977:2:
include/common/list.h:35:19: error: storing the address of local variable 'postpone' in
'((struct list_head *)((char *)start + 8))[24].prev' [-Werror=dangling-pointer=]
35 | new->prev = prev;
| ~~~~~~~~~~^~~~~~
criu/mount.c: In function 'mnt_tree_for_each':
criu/mount.c:1972:19: note: 'postpone' declared here
1972 | LIST_HEAD(postpone);
| ^~~~~~~~
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Refactor lib/py/images/images.py to reduce code duplication
by extracting repetitive code into helper functions and
private methods. This improves code readability and maintainability,
as well as reducing the risk of bugs caused by duplicated code.
Additionally, in Makefile, lib/py/images/images.py is added to the
list of files to run by flake8 during CI.
Fixes: #340
Signed-off-by: Kouame Behouba Manasse <behouba@gmail.com>
If we build tags for our repo:
[criu]$ make tags
GEN tags
And then run codespell, we get an error:
[criu]$ codespell
./tags:3755: struc ==> struct
Let's exclude tags file from codespell search, this would add usability
to `make lint`.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
As our pr_* functions are complex and can call different system calls
inside before actual printing (e.g. gettimeofday for timestamps) actual
errno at the time of printing may be changed.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Previousely "make indent" checked all files in criu source directory for
codding style flaws. We have several problems with it:
- clang-format default format sometimes changes in new versions of the
package and we need to reformat all our code base each time it happens
- on different systems we may have different versions of clang-format
and on latest criu-dev "make indent" may be still unhappy on your system
- when we want to update clang-format rules ourselves we need to update
all our code base each time
- sometimes clang-format rules are not fitting all our cases, (e.g.: an
option IndentGotoLabels works nice for simple C code, but is a no go for
assembler and C macros) and putting "clang-format off" everywhere is a
mess
- sometimes we intentionally want to break clang-format rules (e.g.:
we want to put function arguments on a new line separating them
"logically" not "mechanically" following 120-char rule like clang-format
does).
This adds a BASE option for "make indent" where all commits in range
BASE..HEAD would be checked with git-clang-format for codding style
flaws. For instance when developing on top of criu-dev, one can use
"make BASE=origin/criu-dev indent" to check all their commits for
compliance with the clang-format rules. Default base is HEAD~1 to make
last commit checked when "make indent" is called. The closest thing to
the old behaviour would then be "make indent BASE=init", note that only
commited files would be checked.
Extra options to git-clang-format may be passed through OPTS variable.
Also reuse "make indent" in github lint workflow.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Name collision with an abandoned project named 'crit' in pypi causes pip
to show crit (CRiu Image Tool) as outdated. This patch updates crit to
use the same version and license as criu.
Fixes#1878
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
To support Checkpoint Restore with AMDGPUs for ROCm workloads, introduce
a new plugin to assist CRIU with the help of AMD KFD kernel driver. This
initial commit just provides the basic framework to build up further
capabilities. Like CRIU, the amdgpu plugin also uses protobuf to
serialize
and save the amdkfd data which is mostly VRAM contents with some
metadata.
We generate a data file "amdgpu-kfd-<id>.img" during the dump stage. On restore
this file is read and extracted to re-create various types of buffer
objects that belonged to the previously checkpointed process. Upon
restore the mmap page offset within a device file might change so we use
the new hook to update and adjust the mmap offsets for newly created
target process. This is needed for sys_mmap call in pie restorer phase.
Support for queues and events is added in future patches of this series.
With the current implementation (amdgpu_plugin), we support:
- Only compute workloads such (Non Gfx) are supported
- GPU visible inside a container
- AMD GPU Gfx 9 Family
- Pytorch Benchmarks such as BERT Base
amdgpu plugin dependes on libdrm and libdrm_amdgpu which are typically
installed with libdrm-dev package. We build amdgpu_plugin only when the
dependencies are met on the target system and when user intends to
install the amdgpu plugin and not by default with criu build.
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Co-authored-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Starting with gcc-11, Debian's armhf compiler no longer builds with
a default -mfpu= option. Instead it enables the FPU via an extension
to the -march flag (--with-arch=armv7-a+fp). criu's Makefile explicitly
passes its own -march=armv7-a setting, which overrides the +fp default,
so we end up with no FPU:
cc1: error: '-mfloat-abi=hard': selected architecture lacks an FPU
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
This is just a placeholder dummy plugin and will be replaced by a proper
plugin that implements support for AMD GPU devices. This just
facilitates the initial pull request and CI build test trigger for early
code review of CRIU specific changes. Future PRs will bring in more
support for amdgpu_plugin to enable CRIU with AMD ROCm.
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
This is another attempt to introduce a tool to format CRIU's source
code. This time it is based on clang-format.
The .clang-format file is taken from the linux kernel git tree (5.13).
I removed all comments from lines which state that it requires at least
clang-format 4 or 5. For this resulting file at least clang-format 11
is required. See scripts/fetch-clang-format.sh for all the changes
done to the Linux kernel .clang-format file.
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
This adds a test run to ensure known (but fixed) configuration file
parser errors are not crashing CRIU anymore.
Based on missing test code coverage this script also tests code paths of
the option handling which have not been tested until now.
Signed-off-by: Adrian Reber <areber@redhat.com>
Unlike pr_perror, pr_err and other macros do not append \n
to the message being printed, so the caller needs to take care of it.
Sometimes it was not done, so let's add this manually.
To make sure it won't happen again, add a line to Makefile under the
linter target to check for such missing \n. NOTE this check is only
done for part of such cases (where the pr_* statement fits in one line
and there's no comment after), but it's better than nothing.
Add comments after pr_msg and pr_info statements where we deliberately
don't add \n, so that the above check ignores them.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
My editor (vim) auto-removes whitespace at EOL for *.c and *.h files,
and I think it makes sense to have a separate commit for this, rather
than littering other commits with such changes.
To make sure this won't pile up again, add a line to Makefile under
the linter target to check for such things (so CI will fail).
This is all whitespace except an addition to Makefile.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
On my system (shellcheck v0.7.1) make lint shows a few warnings about
needing to quote variables.
Fix those.
PS I am not sure why those are not shown by GHA CI, I assume there is
different shellcheck version used. Add shellcheck -- version to the
appropriate Makefile target to avoid confusion.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In many cases developers forget that pr_perror and fail macros
are a bit special, in particular:
1. they already show errno;
2. they already append \n to the message.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Running zdtm tests does not require input and therefore it is not
necessary to use -it. This change also allows to run the test in CI
where it currently fails with:
the input device is not a TTY
make: *** [Makefile:388: docker-test] Error 1
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
All zdtm tests pass on Fedora 33 for `make docker-build && make docker-test`
with devicemapper storage driver.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The test/zdtm_mount_cgroups script fails with 'permission denied'
when running tests with private cgroup namespace.
Using the host network namespace allows us to test criu as if
it is running on the host, sharing iptables rules etc.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This fixes the others/crit test to work again and extends it to make
sure all possible input and output options are correctly handled by
crit.
Signed-off-by: Adrian Reber <areber@redhat.com>
CRIU is already using multiple CI systems and not just Travis. This
renames all Travis related things to 'ci' to show it is actually
independent of Travis.
Just a simple rename.
Signed-off-by: Adrian Reber <areber@redhat.com>
Shellcheck (https://github.com/koalaman/shellcheck) can identify common
errors in shell scripts. This initial integration of shellcheck only
checks the scripts in the 'scripts/' folder. This commit fixes (or
disables) all reports of shellcheck to ensure this part starts error
free. I am not convinced this is really necessary as most changes do not
seem to be necessary for their circumstances. On the other hand it
probably does not hurt to use a checker to avoid unnecessary errors.
Signed-off-by: Adrian Reber <areber@redhat.com>
Source files modified:
* Makefile.config - Checks whether libbpf is installed on the system.
If so, we add -lbpf to LIBS_FEATURES, -DCONFIG_HAS_LIBBPF to
FEATURE_DEFINES and set CONFIG_HAS_LIBBPF. This allows us to check for
the presence of libbpf before compiling or executing BPF c/r code and
ZDTM tests.
* Makefile - Set CONFIG_HAS_LIBBPF to clean all files.
Signed-off-by: Abhishek Vijeev <abhishek.vijeev@gmail.com>
Include warnings that the kernel uses during compilation:
-Wstrict-prototypes: enforces full declaration of functions.
Previously, when declaring extern void func(), one can call func(123)
and have no compilation error. This is dangerous. The correct declaration
is extern void func(void).
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
[Generated a commit message from the pull request]
Signed-off-by: Dmitry Safonov <dima@arista.com>