413 commits

Author SHA1 Message Date
Radostin Stoyanov
ddf7a170ff infect-types: fix user_gcs redefine error
In file included from compel/arch/aarch64/src/lib/infect.c:10:
compel/include/uapi/compel/asm/infect-types.h:24:8: error: redefinition of 'user_gcs'
   24 | struct user_gcs {
      |        ^
/usr/include/asm/ptrace.h:329:8: note: previous definition is here
  329 | struct user_gcs {
      |        ^
1 error generated.
make[1]: *** [/criu/scripts/nmk/scripts/build.mk:215: compel/arch/aarch64/src/lib/infect.o] Error 1

Suggested-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2026-01-08 07:48:23 -08:00
Radostin Stoyanov
b1a51489dd compel: fix sys_clock_gettime function signature
The initialization of the struct timespec used as clockid input
parameter was removed in commit:
b4441d1bd8 ("restorer.c: rm unneded struct init")

This causes the build to fail on Alpine with clang version 21.1.2:

  GEN      criu/pie/parasite-blob.h
  criu/pie/restorer.c:1230:39: error: variable 'ts' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
   1230 |                         if (sys_clock_gettime(t->clockid, &ts)) {
        |                                                            ^~
  1 error generated.
  make[2]: *** [/criu/scripts/nmk/scripts/build.mk:118: criu/pie/restorer.o] Error 1
  make[1]: *** [criu/Makefile:59: pie] Error 2
  make: *** [Makefile:278: criu] Error 2

To fix this, we remove the "const" from the declaration of
clock_gettime. Since the kernel writes the current time into
the struct timespec provided by the caller, the pointer must
be writable.
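
The reasoning above can be illustrated with a minimal sketch. It assumes 64-bit Linux; demo_clock_gettime() is a hypothetical stand-in and not compel's generated sys_clock_gettime() wrapper.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a raw clock_gettime wrapper. The second
 * parameter is a plain (non-const) pointer because the kernel stores
 * the current time through it.
 */
static long demo_clock_gettime(clockid_t which, struct timespec *ts)
{
	return syscall(SYS_clock_gettime, which, ts);
}
```

With this prototype the caller can pass an uninitialized struct timespec without the compiler treating it as an input that is read.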

Suggested-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2026-01-08 07:48:23 -08:00
Igor Svilenkov Bozic
d591e320e0 criu/restore: gcs: adds restore implementation for Guarded Control Stack
This commit finalizes AArch64 Guarded Control Stack (GCS)
support by wiring the full dump and restore flow.

The restore path adds the following steps:

 - Define shared AArch64 GCS types and constants in a dedicated header
   for both compel and CRIU inclusion
 - compel: add get/set NT_ARM_GCS via ptrace, enabling user-space
   GCS state save and restore.
 - During restore, switch to the new GCS (via GCSSTR) to place the capability
   token and the sa_restorer address
 - arch_shstk_trampoline(): enable GCS in a trampoline that uses
   prctl(PR_SET_SHADOW_STACK_STATUS, ...) via inline SVC. The trampoline
   is needed because we can’t RET without a valid GCS.
 - restorer: map the recorded GCS VMA, populate contents top-down with
   GCSSTR, write the signal capability at GCSPR_EL0 and the valid token at
   GCSPR_EL0-8, then switch to the rebuilt GCS (GCSSS1)
 - Save and restore registers via ptrace
 - Extend restorer argument structures to carry GCS state
   into post-restore execution
 - Add shstk_set_restorer_stack(): sets tmp_gcs to temporary restorer
   shadow stack start
 - Add gcs_vma_restore implementation (required for mremap of the GCS VMA)

Tested with:
    GCS_ENABLE=1 ./zdtm.py run -t zdtm/static/env00

Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
2025-12-07 19:20:00 +01:00
Igor Svilenkov Bozic
92e6e523b5 compel: gcs: add opt-in GCS test support for AArch64
Introduce an opt-in mode for building and running compel tests
with Guarded Control Stack (GCS) enabled on AArch64.

Changes:
 - Extend compel/test/infect to support `GCS_ENABLE=1` builds,
   adding `-mbranch-protection=standard` and
   `-z experimental-gcs=check` to CFLAGS/LDFLAGS.
 - Export required GLIBC_TUNABLES at runtime via `TEST_ENV`.

Usage:
    make -C compel/test/infect GCS_ENABLE=1
    make -C compel/test/infect GCS_ENABLE=1 run

By default (`GCS_ENABLE` unset or 0), builds and runs are unchanged.

Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
2025-12-07 19:20:00 +01:00
Igor Svilenkov Bozic
2f676d20e4 compel: gcs: set up GCS token/restorer for rt_sigreturn
When GCS is enabled, the kernel expects a capability token at GCSPR_EL0-8
and sa_restorer at GCSPR_EL0-16 on rt_sigreturn. The sigframe must be
consistent with the kernel’s expectations, with GCSPR_EL0 advanced by -8
so that it points to the token on signal entry. On rt_sigreturn, the kernel
verifies the cap at GCSPR_EL0, invalidates it and increments GCSPR_EL0 by 8
at the end of gcs_restore_signal().

Implement parasite_setup_gcs() to:
- read NT_ARM_GCS via ptrace(PTRACE_GETREGSET)
- write (via ptrace) the computed capability token and restorer address
- update GCSPR_EL0 to point to the token's location

Call parasite_setup_gcs() from parasite_start_daemon() so that the sigreturn
frame satisfies the kernel's expectations.
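
The layout described above can be sketched as plain address arithmetic. This is only an illustration, not the body of parasite_setup_gcs(): demo_gcs_frame and demo_gcs_layout() are hypothetical names, the token value itself is left to the caller, and the real code performs the two writes and the register update through ptrace.

```c
#include <stdint.h>

/* Hypothetical description of what has to be set up for rt_sigreturn. */
struct demo_gcs_frame {
	uint64_t token_addr;    /* where the capability token goes */
	uint64_t restorer_addr; /* where the sa_restorer address goes */
	uint64_t new_gcspr;     /* value to load into GCSPR_EL0 */
};

/*
 * Given the current GCSPR_EL0, compute the slots named in the commit
 * message: token at GCSPR_EL0-8, sa_restorer at GCSPR_EL0-16, and the
 * updated GCSPR_EL0 pointing at the token's location.
 */
static struct demo_gcs_frame demo_gcs_layout(uint64_t gcspr)
{
	struct demo_gcs_frame f;

	f.token_addr = gcspr - 8;
	f.restorer_addr = gcspr - 16;
	f.new_gcspr = f.token_addr;
	return f;
}
```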

Tests with GCS remain opt‑in:
	make -C compel/test/infect GCS_ENABLE=1 && make -C compel/test/infect run

Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
[ alex: cleanup fixes ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Acked-by: Mike Rapoport <rppt@kernel.org>
2025-12-07 19:20:00 +01:00
Igor Svilenkov Bozic
6bb856b0af compel: gcs: initial GCS support for signal frames
Add basic prerequisites for Guarded Control Stack (GCS) state on AArch64.

This adds a gcs_context to the signal frame and extends user_fpregs_struct_t to
carry GCS metadata, preparing the groundwork for GCS in the parasite.

For now, the GCS fields are zeroed during compel_get_task_regs(), technically
ignoring GCS since it does not reach the control logic yet; that will be
introduced in the next commit.

The code path is gated and does not affect normal tests. It can be explicitly
enabled and tested via:

    make -C infect GCS_ENABLE=1 && make -C infect run

Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
[ alex: clean up fixes ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Acked-by: Mike Rapoport <rppt@kernel.org>
2025-12-07 19:20:00 +01:00
Igor Svilenkov Bozic
73ca071483 gcs: add GCS constants and helper macros
Introduce ARM64 Guarded Control Stack (GCS) constants and macros
in a new uapi header for use in both CRIU and compel.

Includes:
 - NT_ARM_GCS type
 - prctl(2) constants for GCS enable/write/push modes
 - Capability token helpers (GCS_CAP, GCS_SIGNAL_CAP)
 - HWCAP_GCS definition

These are based on upstream Linux definitions
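
A minimal sketch of what such token helpers can look like, assuming the token layout of the upstream arm64 kernel (token address in bits 63:12, token type in the low bits). The DEMO_ and demo_ names are hypothetical, and the constants should be checked against the kernel headers before relying on them.

```c
#include <stdint.h>

/* Assumed layout: bits 63:12 carry the address, bits 11:0 the token type. */
#define DEMO_GCS_CAP_ADDR_MASK   (~UINT64_C(0xfff))
#define DEMO_GCS_CAP_VALID_TOKEN UINT64_C(0x1)

/* Token that marks a valid, switchable GCS at the given address. */
static uint64_t demo_gcs_cap(uint64_t addr)
{
	return (addr & DEMO_GCS_CAP_ADDR_MASK) | DEMO_GCS_CAP_VALID_TOKEN;
}

/* Token used for signal frames: the masked address with no type bits set. */
static uint64_t demo_gcs_signal_cap(uint64_t addr)
{
	return addr & DEMO_GCS_CAP_ADDR_MASK;
}
```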

Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Acked-by: Mike Rapoport <rppt@kernel.org>
2025-12-07 19:20:00 +01:00
Igor Svilenkov Bozic
501b714f76 compel/aarch64: refactor fpregs handling
Refactor user_fpregs_struct_t to wrap user_fpsimd_state in a
dedicated struct, so that it can later be extended simply by
adding new members.

Signed-off-by: Igor Svilenkov Bozic <svilenkov@gmail.com>
[ alex: fixes ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Acked-by: Mike Rapoport <rppt@kernel.org>
2025-12-07 19:20:00 +01:00
Pepper Gray
77553f07d3 make: prevent redefinition of 'struct sigcontext'
Compilation on gentoo/arm64 (llvm+musl) fails with:

In file included from compel/include/uapi/compel/asm/sigframe.h:4,
                 from compel/plugins/std/infect.c:14:
/usr/include/asm/sigcontext.h:28:8: error: redefinition of 'struct sigcontext'
   28 | struct sigcontext {
      |        ^~~~~~~~~~

In file included from criu/arch/aarch64/include/asm/restorer.h:4,
                 from criu/arch/aarch64/crtools.c:11:
/usr/include/asm/sigcontext.h:28:8: error: redefinition of 'struct sigcontext'
   28 | struct sigcontext {
      |        ^~~~~~~~~~

This is happening because <asm/sigcontext.h> and <signal.h> are
mutually incompatible on Linux.

To fix, use <signal.h> instead of <asm/sigcontext.h> for arm64
(like all other arches do).

Fixes: #2766
Signed-off-by: Pepper Gray <hello@peppergray.xyz>
2025-11-05 15:40:55 -08:00
dong sunchao
80c280610e compel/mips: Relax ELF magic check to support MIPS libraries
On MIPS platforms, shared libraries may use EI_ABIVERSION = 5 to indicate
support for .MIPS.xhash sections. The previous ELF header check in
handle_binary() strictly compared e_ident against a hardcoded value,
causing legitimate shared objects to be rejected.

This patch replaces the memcmp-based check with a structured validation
of ELF magic and class, and allows EI_ABIVERSION values other than 0.
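
A structured check of this kind might look like the sketch below. It is not the code from handle_binary(); demo_elf_ident_ok() is a hypothetical helper, and the exact set of fields compel validates is an assumption.

```c
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/*
 * Validate only the fields that matter: the 4-byte ELF magic and the
 * class (32-bit or 64-bit). EI_ABIVERSION is deliberately not compared,
 * so objects that set it to a non-zero value are still accepted.
 */
static bool demo_elf_ident_ok(const unsigned char *ident, unsigned char want_class)
{
	if (memcmp(ident, ELFMAG, SELFMAG) != 0)
		return false;
	if (ident[EI_CLASS] != want_class)
		return false;
	return true;
}
```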

fixes: #2745
Signed-off-by: dong sunchao <dongsunchao@gmail.com>
2025-11-02 07:48:23 -08:00
Ignacio Moreno Gonzalez
95d5e2e59b compel: flush caches after parasite injection
After the CRIU process saves the parasite code for the target thread in
the shared mmap, it is necessary to call __clear_cache before the target
thread executes the code.

Without this step, the target thread may not see the correct code to
execute, which can result in a SIGILL signal.

For the specific arm64 case, this is important so that the newly copied
code is flushed from d-cache to RAM, so that the target thread sees the
new code.
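
The copy-then-flush pattern can be sketched as follows, assuming GCC or Clang. demo_install_code() is a hypothetical helper, not the function changed by this commit.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy code bytes into their destination and then synchronize the
 * caches for that range, so that a later instruction fetch from dst
 * observes the new bytes. __builtin___clear_cache() is the compiler
 * builtin behind __clear_cache().
 */
static void demo_install_code(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	__builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

On architectures with coherent instruction caches the builtin typically compiles to nothing, so the call is cheap to keep unconditional.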

The change is based on commit 6be10a2 by @fu.lin and on input received
from @adrianreber.

[ avagin: tweak code comment ]

Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2025-11-02 07:48:23 -08:00
Andrei Vagin
9a1e979666 compel: fix the stack test
The stack test incorrectly assumed the page immediately
following the stack pointer could never be changed. This doesn't work,
because this page can be a part of another mapping.

This commit introduces a dedicated "stack redzone," a small guard region
directly after the stack. The stack test is modified to specifically
check for corruption within this redzone.
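
The redzone idea can be sketched as below. The size, the fill byte and the demo_ names are assumptions made for illustration; the actual test may record and compare the region differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_REDZONE_SIZE 128  /* assumed size of the guard region */
#define DEMO_REDZONE_BYTE 0xa5 /* assumed fill pattern */

/* Fill the guard region with a known pattern before the stack is used. */
static void demo_redzone_arm(unsigned char *rz)
{
	memset(rz, DEMO_REDZONE_BYTE, DEMO_REDZONE_SIZE);
}

/* Return true if no byte of the guard region has been overwritten. */
static bool demo_redzone_intact(const unsigned char *rz)
{
	for (size_t i = 0; i < DEMO_REDZONE_SIZE; i++) {
		if (rz[i] != DEMO_REDZONE_BYTE)
			return false;
	}
	return true;
}
```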

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2025-11-02 07:42:55 -08:00
AV
8ae5db37bb arm64: C/R PAC keys
PAC stands for Pointer Authentication Code. Each process has 5 PAC keys
and a mask of enabled keys. All these properties have to be C/R-ed.

As they are per-process properties, we can save/restore them just for
one thread.

Signed-off-by: Andrei Vagin <avagin@google.com>
2025-03-21 12:40:31 -07:00
Adrian Reber
54795f174b criu: use libuuid for criu_run_id generation
criu_run_id will be used in upcoming changes to create and remove
network rules for network locking. Instead of trying to come up with
a way to create unique IDs, just use an existing library.

libuuid should be installed on most systems as it is indirectly required
by systemd (via libmount).

Signed-off-by: Adrian Reber <areber@redhat.com>
2025-03-21 12:40:31 -07:00
Alexander Mikhalitsyn
40b7f04b7c compel/arch/riscv64: properly implement compel_task_size()
We need to dynamically calculate TASK_SIZE depending
on the MMU on a RISC-V system. [We use an analogous
approach on aarch64/ppc64le.]

This change was tested on physical machine:
StarFive VisionFive 2
isa		: rv64imafdc_zicntr_zicsr_zifencei_zihpm_zca_zcd_zba_zbb
mmu		: sv39
uarch		: sifive,u74-mc
mvendorid	: 0x489
marchid		: 0x8000000000000007
mimpid		: 0x4210427
hart isa	: rv64imafdc_zicntr_zicsr_zifencei_zihpm_zca_zcd_zba_zbb

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-03-21 12:40:31 -07:00
Alexander Mikhalitsyn
399d7bdcbb compel: fix gitignore and remove autogenerated code
We don't need to have compel/arch/riscv64/plugins/std/syscalls/syscalls.S
tracked in git. It is autogenerated. We also need to update our .gitignore
to ignore autogenerated files with syscall tables.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-03-21 12:40:31 -07:00
Haorong Lu
95359a62aa compel: add riscv64 support
Co-authored-by: Yixue Zhao <felicitia2010@gmail.com>
Co-authored-by: stove <stove@rivosinc.com>
Signed-off-by: Haorong Lu <ancientmodern4@gmail.com>
---
- rebased
- added a membarrier() to syscall table (fix authored by Cryolitia PukNgae)
Signed-off-by: PukNgae Cryolitia <Cryolitia@gmail.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-03-21 12:40:31 -07:00
haozi007
67fe44e981 support user set remote mmap vma address
1. A VMA address assigned automatically by the OS may conflict with an existing VMA in the GPU live migration scenario;
2. so we should let the user choose the address;

Signed-off-by: haozi007 <liuhao27@huawei.com>
2024-09-11 16:02:11 -07:00
Andrei Vagin
4f45572fde util: use close_range when it's supported
close_range is faster than reading /proc/self/fd and closing descriptors
one by one.
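
A minimal sketch of the approach, assuming Linux. demo_close_from() is a hypothetical helper, not the function in CRIU's util code; it issues the raw close_range syscall when the headers define it and otherwise falls back to walking /proc/self/fd.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Close every descriptor numbered first or higher. */
static int demo_close_from(int first)
{
	DIR *dir;
	struct dirent *de;

#ifdef SYS_close_range
	/* Fast path: one syscall closes the whole range. */
	if (syscall(SYS_close_range, (unsigned int)first, ~0U, 0U) == 0)
		return 0;
#endif
	/* Slow path: list the open descriptors and close them one by one. */
	dir = opendir("/proc/self/fd");
	if (!dir)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		char *end;
		long fd = strtol(de->d_name, &end, 10);

		if (end == de->d_name || *end != '\0')
			continue; /* skips "." and ".." */
		if (fd < first || fd == dirfd(dir))
			continue; /* keep low fds and the directory itself */
		close((int)fd);
	}
	closedir(dir);
	return 0;
}
```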

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2024-09-11 16:02:11 -07:00
Pratyush Yadav
ca971b7f8b compel: fix build on Amazon Linux 2 due to missing PTRACE_ARCH_PRCTL
Commit fc683cb01 ("compel: shstk: save CET state when CPU supports it")
started using PTRACE_ARCH_PRCTL to query shadow stack status. While
PTRACE_ARCH_PRCTL has existed in the kernel for a long time, it was only
added to glibc in version 2.27. Amazon Linux 2 (AL2) has glibc 2.26,
which does not have this definition. As a result, build on AL2 fails
with the below error:

    compel/arch/x86/src/lib/infect.c: In function ‘get_task_xsave’:
    compel/arch/x86/src/lib/infect.c:276:14: error: ‘PTRACE_ARCH_PRCTL’ undeclared (first use in this function)
    276 |   if (ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long)&features, ARCH_SHSTK_STATUS)) {
        |              ^~~~~~~~~~~~~~~~~

While the definition is present on the system via the kernel headers (in
asm/ptrace-abi.h) which can be reached by including linux/ptrace.h, the
comment in compel/include/uapi/ptrace.h says:

    We'd want to include both sys/ptrace.h and linux/ptrace.h, hoping
    that most definitions come from either one or another. Alas, on
    Alpine/musl both files declare struct ptrace_peeksiginfo_args, so
    there is no way they can be used together. Let's rely on libc one.

Since including linux/ptrace.h is not an option, define
PTRACE_ARCH_PRCTL if it doesn't already exist. An interesting point to
note is that in sys/ptrace.h, PTRACE_ARCH_PRCTL is an enum value so the
preprocessor doesn't know about it. PT_ARCH_PRCTL is the preprocessor
symbol that matches the value of PTRACE_ARCH_PRCTL. So look for
PT_ARCH_PRCTL to decide if PTRACE_ARCH_PRCTL is available or not.
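
The detection described above can be sketched like this. The value 30 is the x86 request number from the kernel's asm/ptrace-abi.h; demo_arch_prctl_request() is a hypothetical helper that exists only so the result can be inspected.

```c
#include <sys/ptrace.h>

/*
 * In glibc's sys/ptrace.h PTRACE_ARCH_PRCTL is an enum value, which
 * the preprocessor cannot see. PT_ARCH_PRCTL is the matching macro,
 * so its absence is used as the signal that a fallback is needed.
 */
#ifndef PT_ARCH_PRCTL
#define PTRACE_ARCH_PRCTL 30
#endif

static int demo_arch_prctl_request(void)
{
	return PTRACE_ARCH_PRCTL;
}
```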

Another interesting point to note is that AL2 ships with GCC 7 by
default, which does not support the -mshstk option, causing other build
failures. Luckily, it also ships GCC 10 which does have the option.
Using GCC 10 lets the build succeed.

Fixes: fc683cb01 ("compel: shstk: save CET state when CPU supports it")
Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
2024-09-11 16:02:11 -07:00
Mike Rapoport (IBM)
a48aa33eaa restorer: shstk: implement shadow stack restore
The restore of a task with shadow stack enabled adds these steps:

* switch from the default shadow stack to a temporary shadow stack
  allocated in the premmaped area
* unmap CRIU mappings; nothing changed here, but it's important that
  CRIU mappings can be removed only after switching to a temporary
  shadow stack
* create shadow stack VMA with map_shadow_stack()
* restore shadow stack contents with wrss
* switch to "real" shadow stack
* lock shadow stack features

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-09-11 16:02:11 -07:00
Mike Rapoport (IBM)
0aba3dcfa1 compel: shstk: prepare shadow stack signal frame
When calling sigreturn with CET enabled, the kernel verifies that the
shadow stack has proper address of sa_restorer and a "restore token".
Normally, they are pushed to the shadow stack when signal processing is
started.

Since compel calls sigreturn directly, the shadow stack should be
updated to match the kernel expectations for sigreturn invocation.

Add parasite_setup_shstk() that sets up the shadow stack with the
address of __export_parasite_head_start as sa_restorer and with the
required restore token.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-09-11 16:02:11 -07:00
Mike Rapoport (IBM)
63a45e1c8a compel: infect: prepare parasite_service() for addition of CET support
To support sigreturn with CET enabled, the parasite must rewind its stack
before calling sigreturn so that the shadow stack will be compatible with
the actual calling sequence.

In addition, calling sigreturn from top level routine
(__export_parasite_head_start) will significantly simplify the shadow
stack manipulations required to execute sigreturn.

For x86 make fini_sigreturn() return the stack pointer for the signal
frame that will be used by sigreturn and propagate that return value up
to __export_parasite_head_start.

In non-daemon mode parasite_trap_cmd() returns a non-positive value,
which makes it possible to distinguish daemon and non-daemon mode and
properly stop at int3 in non-daemon mode.

Architectures other than x86 remain unchanged and will still call
sigreturn from fini_sigreturn().

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-09-11 16:02:11 -07:00
Mike Rapoport (IBM)
6e491a19a3 compel: shstk: save CET state when CPU supports it
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-09-11 16:02:11 -07:00
Mike Rapoport (IBM)
17f4dd0959 compel: always pass user_fpregs_struct_t to compel_get_task_regs()
All architectures create on-stack structure for floating point save area
in compel_get_task_regs() if the caller passes NULL rather than a valid
pointer.

The only place that calls compel_get_task_regs() with NULL for floating
point save area is parasite_start_daemon() and it is simpler to define
this structure on the stack of parasite_start_daemon().

The availability of floating point save data is required in
parasite_start_daemon() to detect shadow stack presence early during
parasite infection and will be used in later patches.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-09-11 16:02:11 -07:00
sally kang
9f9737c800 comple: correct the syscall number of bind on ARM64
In the compel/arch/arm/plugins/std/syscalls/syscall.def, the syscall number of bind on ARM64 should be 200 instead of 235

Signed-off-by: Sally Kang <snapekang@gmail.com>
2024-09-11 16:02:11 -07:00
Vladislav Khmelevsky
28adebefb7 Return page size as unsigned long
Currently page_size() returns an unsigned int value that, after a "bitwise
not", is promoted to unsigned long, e.g. in uffd.c handle_page_fault.
Since the value is unsigned, the promotion zero-fills the upper bits,
which loses the upper bits of the page fault address. So make
page_size() return unsigned long to avoid this situation.
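
The effect can be reproduced with a small sketch. The demo_ functions are hypothetical and use fixed-width types (uint32_t for the old return type, uint64_t for unsigned long on LP64) so that the result does not depend on the platform.

```c
#include <stdint.h>

/*
 * Old behaviour: the mask ~(page_size - 1) is computed in 32 bits and
 * then zero-extended, so the upper 32 bits of the address are cleared.
 */
static uint64_t demo_page_base_u32(uint64_t addr, uint32_t page_size)
{
	return addr & ~(page_size - 1);
}

/* New behaviour: the mask is computed in 64 bits and keeps the upper bits. */
static uint64_t demo_page_base_u64(uint64_t addr, uint64_t page_size)
{
	return addr & ~(page_size - 1);
}
```

For the address 0x7f1234567abc and a 4096-byte page, the first variant yields 0x34567000 while the second yields 0x7f1234567000.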

Signed-off-by: Vladislav Khmelevsky <och95@yandex.ru>
2023-10-22 13:29:25 -07:00
Younes Manton
59fcfa80d8 compel: Add support for ppc64le scv syscalls
Power ISA 3.0 added a new syscall instruction. Kernel 5.9 added
corresponding support.

Add CRIU support to recognize the new instruction and kernel ABI changes
to properly dump and restore threads executing in syscalls. Without this
change threads executing in syscalls using the scv instruction will not
be restored to re-execute the syscall, they will be restored to execute
the following instruction and will return unexpected error codes
(ERESTARTSYS, etc) to user code.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-10-22 13:29:25 -07:00
Michał Mirosław
f7d7dc9c08 compel/infect: include the relevant pid in "no-breakpoints restore" debug message
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-10-22 13:29:25 -07:00
Andrei Vagin
8c17535f3f loongarch64: fix syscall_64.tbl
The 288d6a61e2 change broke all the syscall numbers.

Reported-by: Michał Mirosław <emmir@google.com>
Fixes: (288d6a61e2 "loongarch64: reformat syscall_64.tbl for 8-wide tabs")
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-10-22 13:29:25 -07:00
Michał Mirosław
e07155e194 dump+restore: Implement membarrier() registration c/r.
Note: Silently drops MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED as it's
not currently detectable. This is still better than silently dropping
all membarrier() registrations.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-10-22 13:29:25 -07:00
Andrei Vagin
5b790aa181 loongarch64: reformat syscall_64.tbl for 8-wide tabs
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-10-22 13:29:25 -07:00
znley
c9df09eeab compel: add loongarch64 support
Signed-off-by: znley <shanjiantao@loongson.cn>
2023-10-22 13:29:25 -07:00
Michał Mirosław
5a723937a2 compel: Log the status word with "Task is still running" errors.
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-10-22 13:29:25 -07:00
Haorong Lu
4455444eeb compel/test: Return 0 in case of error in fdspy
This commit revises the error handling in the fdspy test. Previously,
a failure case could have been incorrectly reported as successful because
of a specific check `pass != 0`, leading to potential false positives
when `check_pipe_ends()` returned `-1` due to a read/write pipe error.

To improve this, we've adjusted the error handling to return `0` in case
of any error. As such, the final success condition remains unchanged. This
approach will help accurately differentiate between successful and failed
cases, ensuring the output "All OK" is printed for success, and "Something
went WRONG" for any failure.

Fixes: 5364ca3 ("compel/test: Fix warn_unused_result")

Signed-off-by: Haorong Lu <ancientmodern4@gmail.com>
2023-10-22 13:29:25 -07:00
Michał Mirosław
4c1409b8f6 Fill FPU init state if it's not provided by kernel.
Apparently Skylake uses init-optimization when saving FPU state, and ptrace()
returns XSTATE_BV[0] = 0 meaning FPU was not used by a task (in init state).
Since CRIU restore uses sigreturn to restore registers, FPU state is always
restored. Fill the state with default values on dump to make restore happy.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-10-22 13:29:25 -07:00
Adrian Reber
df7b897a22 ci: fix new codespell errors
Signed-off-by: Adrian Reber <areber@redhat.com>
2023-10-22 13:29:25 -07:00
Adrian Reber
727d796505 compel: support XSAVE on newer Intel CPUs
Newer Intel CPUs (Sapphire Rapids) have a much larger xsave area than
before. Looking at older CPUs I see 2440 bytes.

    # cpuid -1 -l 0xd -s 0
    ...
        bytes required by XSAVE/XRSTOR area     = 0x00000988 (2440)

On newer CPUs (Sapphire Rapids) it grows to 11008 bytes.

    # cpuid -1 -l 0xd -s 0
    ...
        bytes required by XSAVE/XRSTOR area     = 0x00002b00 (11008)

This increases the xsave area from one page to four pages.

Without this patch the fpu03 test fails, with this patch it works again.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-10-22 13:29:25 -07:00
Radostin Stoyanov
6d7c0d007e compel/mips: fix parasite with GCC 12
This patch applies the '-ffreestanding' flag that was introduced
with https://github.com/checkpoint-restore/criu/pull/1726 to MIPS.

Fixes: #1725

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
8cfda2748c log: remove all uses of %m specifier in pr_* functions
As our pr_* functions are complex and can call various system calls
before the actual printing (e.g. gettimeofday for timestamps), errno
may have changed by the time the message is printed.

Let's just use %s + strerror(errno) instead of %m with pr_* functions to
be explicit that errno to string transformation happens before calling
anything else.
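
The ordering argument can be sketched as below. demo_log() is a hypothetical stand-in for a pr_* function; it resets errno on purpose to mimic internal calls that clobber it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for a logging function. The reason string is passed in as
 * an argument, so it was computed before this function started and is
 * unaffected by whatever happens to errno inside.
 */
static int demo_log(char *buf, size_t len, const char *msg, const char *reason)
{
	errno = 0; /* mimics internal calls (e.g. timestamping) changing errno */
	return snprintf(buf, len, "Error: %s: %s", msg, reason);
}
```

A call such as demo_log(buf, sizeof(buf), "Can't open file", strerror(errno)) evaluates strerror(errno) before entering the function, which is the property the commit relies on.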

Note: tcp_repair_off is called from pie with no pr_perror defined due to
CR_NOGLIBC set and if I use errno variable there I get "Unexpected
undefined symbol: `__errno_location'. External symbol in PIE?", so it
seems there is no way to print errno there, so let's just skip it.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Younes Manton
a39d416568 compel: Fix ppc64le parasite stack layout
The ppc64le ABI allows functions to store data in caller frames.
When initializing the stack pointer prior to executing parasite code
we need to pre-allocate the minimum sized stack frame before
jumping to the parasite code.
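
As a rough sketch of the idea: align the top of the parasite stack and reserve one minimal frame below it. The 16-byte alignment and the 32-byte minimum frame are assumptions taken from the ELFv2 ABI, and demo_initial_sp() is a hypothetical helper, not the code in compel.

```c
#include <stdint.h>

#define DEMO_STACK_ALIGN    16 /* assumed stack alignment */
#define DEMO_MIN_FRAME_SIZE 32 /* assumed ELFv2 minimum frame size */

/*
 * Compute the initial stack pointer so that the callee finds a valid
 * caller frame above it and may store data there, as the ABI allows.
 */
static uint64_t demo_initial_sp(uint64_t stack_top)
{
	uint64_t sp = stack_top & ~(uint64_t)(DEMO_STACK_ALIGN - 1);

	return sp - DEMO_MIN_FRAME_SIZE;
}
```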

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
17ec539132 compel: Add test to check parasite stack setup
Some ABIs allow functions to store data in caller frame, which
means that we have to allocate an initial stack frame before
executing code on the parasite stack.

This test saves the contents of writable memory that follows the stack
after the victim has been infected but before we start using the
parasite stack. It later checks that the saved data matches the
current contents of the two memory areas. This is done while the
victim is halted so we expect a match unless executing parasite code
caused memory corruption. The test doesn't detect cases where we
corrupted memory by writing the same value.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
556ab0deaf compel: Fix infect test to not override failures
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>

return zero on chk success

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

Co-authored-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Younes Manton
461fa72715 compel: Add APIs to facilitate testing
Starting the daemon is the first time we run code in the victim
using the parasite stack.

It's useful for testing to be able to infect the victim without starting
the daemon so that we can inspect the victim's state, set up stack
guards, and so on before stack-related corruption can happen.

Add compel_infect_no_daemon() to infect the victim but not start the
daemon and compel_start_daemon() to start the daemon after the victim
is infected.

Add compel_get_stack() to get the victim's main and thread parasite
stacks.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
fu.lin
dfe9d006ad breakpoint: enable breakpoints by default on amd64 and arm64
Signed-off-by: fu.lin <fulin10@huawei.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-04-15 21:17:21 -07:00
fu.lin
bb73e1cf5a breakpoint: implement hw breakpoint for arm64 platform
x86 implements hardware breakpoints to speed up syscall tracing,
instead of using `ptrace(PTRACE_SYSCALL)`. arm64 has the same
capability according to <<Learn the architecture: Armv8-A self-hosted
debug>>[[1]].

<<Arm Architecture Reference Manual for A-profile architecture>>[[2]]
describes the usage in detail:
- D2.8 Breakpoint Instruction exceptions
- D2.9 Breakpoint exceptions
- D13.3.2 DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n

Note:
[1]: https://developer.arm.com/documentation/102120/0100
[2]: https://developer.arm.com/documentation/ddi0487/latest

Signed-off-by: fu.lin <fulin10@huawei.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-04-15 21:17:21 -07:00
fu.lin
b7953c6c7f compel: switch breakpoint functions to non-inline at arm64 platform
Signed-off-by: fu.lin <fulin10@huawei.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-04-15 21:17:21 -07:00
Andrei Vagin
719fea2fc9 compel: clear a breakpoint right after it's been triggered
Breakpoints are used to stop as close as possible to a target system call.

First, we don't need it after this point.
Second, PTRACE_CONT can't pass through a breakpoint on arm64.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-04-15 21:17:21 -07:00
Andrei Vagin
d7477dac03 compel: set TRACESYSGOOD to distinguish breakpoints from syscalls
When delivering system call traps, set bit 7 in the signal number (i.e.,
deliver SIGTRAP|0x80). This makes it easy for the tracer to distinguish
normal traps from those caused by a system call.
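
On the tracer side the distinction can be made from the wait status, as in this sketch. It assumes the tracee was set up with PTRACE_O_TRACESYSGOOD; the demo_ helpers are hypothetical and not part of compel.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/* Stop caused by a system call: the stop signal has bit 7 set. */
static bool demo_is_syscall_stop(int status)
{
	return WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80);
}

/* Ordinary trap, for example a breakpoint: plain SIGTRAP. */
static bool demo_is_plain_trap(int status)
{
	return WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP;
}
```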

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-04-15 21:17:21 -07:00
Alexander Mikhalitsyn
c502d480f9 x86/compel/fault-inject: fixup mxcsr for PTRACE_SETFPREGS
Error from:
./test/zdtm.py run -t zdtm/static/fpu00 --fault 134 -f h --norst

(00.003111) Dumping GP/FPU registers for 56
(00.003121) Error (compel/arch/x86/src/lib/infect.c:310): Corrupting fpuregs for 56, seed 1651766595
(00.003125) Error (compel/arch/x86/src/lib/infect.c:314): Can't set FPU registers for 56: Invalid argument
(00.003129) Error (compel/src/lib/infect.c:688): Can't obtain regs for thread 56
(00.003174) Error (criu/cr-dump.c:1564): Can't infect (pid: 56) with parasite

See also:
145e9e0d8c6 ("x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values")

We decided to move away from the mxcsr clean-up scheme and use the mxcsr
mask (0x0000ffbf), as the kernel does. Thanks to Dmitry Safonov for
pointing this out.
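
Applying the mask amounts to a single AND, as in this sketch. demo_fixup_mxcsr() is a hypothetical helper; only the mask value comes from the message above.

```c
#include <stdint.h>

#define DEMO_MXCSR_MASK UINT32_C(0x0000ffbf)

/* Clear the MXCSR bits outside the mask before handing the value to ptrace. */
static uint32_t demo_fixup_mxcsr(uint32_t mxcsr)
{
	return mxcsr & DEMO_MXCSR_MASK;
}
```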

Tested-on: Intel(R) Xeon(R) CPU E3-1246 v3 @ 3.50GHz

Reported-by: Mr. Jenkins
Suggested-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2023-04-15 21:17:21 -07:00