Mirrors/criu

mirror of https://github.com/checkpoint-restore/criu.git synced 2026-01-23 02:14:37 +00:00

Author	SHA1	Message	Date
Kirill Tkhai	c9afd17ad6	net: Add ip rule save/restore Add support for save and restore of ip rules. It uses new functionality of iproute which is already in iproute git: http://git.kernel.org/cgit/linux/kernel/git/shemminger/iproute2.git/commit/?id=2f4e171f7df22107b38fddcffa56c1ecb5e73359 v2: Use xstrdup() instead of strdup(). v3: Use open/close instead of helper. v4: Return -1 on empty dump. Signed-off-by: Kirill Tkhai <ktkhai@odin.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-27 22:56:33 +03:00
Andrew Vagin	1d8fcb6b94	bfd: add breadchr Reading stops after an EOF or a specified character. Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-27 22:51:09 +03:00
Cyrill Gorcunov	7a99e699ce	mnt: Export __open_mountpoint We gonna need it for inotify handle testing. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-21 15:08:03 +03:00
Kir Kolyshkin	5940e3d14c	xfree(): simplify Contrary to a popular opinion, there is no need to check an argument for being non-NULL before calling free(). From the free(3) man page: "If ptr is NULL, no operation is performed." Let's change the xfree macro to be a synonym for free(). Signed-off-by: Kir Kolyshkin <kir@openvz.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-21 14:58:39 +03:00
Pavel Emelyanov	68baf8e77d	criu: Fault injection core This patch(set) is inspired by a similar one from Andrey Vagin sent some time earlier. The major idea is to artificially fail criu dump or restore at specific places and let zdtm tests check whether the failed dump or restore resulted in anything bad. This particular patch introduces the ability to tell criu "fail at X point". Each point is specified with an integer constant, and the next patches will add places in the code that check whether a specific fail code is set and fail accordingly. Two points are introduced -- early on dump, right after loading the parasite, and right after creation of the root task. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-19 12:42:29 +03:00
Cyrill Gorcunov	61859d1176	fsnotify: Filter out internal inotify bits when restoring marks Kernels prior to 4.3 export the FS_EVENT_ON_CHILD bit via the procfs fdinfo interface. This bit is kernel-internal and should not be passed in the inotify_add_watch call. Thus simply filter it out when it is obtained from old images, for backward compatibility. More details here https://lkml.org/lkml/2015/9/21/680 Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-14 15:51:55 +03:00
Matthew Krafczyk	29c08d8672	Add pre-dump and pre-restore action scripts This allows the user to perform actions before dumping or restoration occurs. Signed-off-by: Matthew Krafczyk <krafczyk.matthew@gmail.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-09 18:23:41 +03:00
Christopher Covington	871da9a111	pie: Give VDSO symbol table local scope In commit c2271198, Laurent Dufour kindly reunified the VDSO code that had become duplicated between architectures. Unfortunately this introduced a regression in AArch64 where apparently due to the scope of vdso_symbols array of pointers to characters changing from local to global, load-time relocations became necessary. The following thread on the GCC mailing list discusses why load-time relocations can be necessary when pointers are used, although it doesn't mention the potential for locally scoped arrays to be handled differently: https://gcc.gnu.org/ml/gcc/2004-05/msg01016.html Because the alternatives, such as porting piegen to AArch64, are far more involved, simply revert the change in scope. Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-05 13:21:16 +03:00
Christopher Covington	627f9a9e5f	aarch64: Fix write_intraprocedure_branch types In the recent VDSO code reunification, some types were changed but a pair of necessary corresponding changes was omitted. Fix that so the AArch64 build succeeds without type-related warnings-turned-errors. Also move the definition to the AArch64-specific header since it's not currently being used by any other architectures. Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-05 13:20:01 +03:00
Tycho Andersen	f79f4546cf	sysctl: move sysctl calls to usernsd When in a userns, tasks can't write to certain sysctl files: (00.009653) 1: Error (sysctl.c:142): Can't open sysctl kernel/hostname: Permission denied See inline comments for details on affected namespaces. Mostly for my own education in what is required to port something to be userns restorable, I ported the sysctl stuff. A potential concern for this patch is that copying structures with pointers around is kind of gory. I did it ad-hoc here, but it may be worth inventing some mechanisms to make it easier, although I'm not sure what exactly that would look like (potentially re-using some of the protobuf bits; I'll investigate this more if it looks helpful when doing the cgroup user namespaces port?). Another issue is that there is not a great way to return non-fd stuff in memory right now from userns_call; one of the little hacks in this code would be "simplified" if we invented a way to do this. v2: coalesce the individual struct sysctl_req requests into one big sysctl_userns_req that is in a contiguous region of memory so that we can pass it via userns_call. Hopefully nobody finds my little ascii diagram too offensive :) v3: use the fork/setns trick to change the sysctl values in the right ns for IPC/UTS nses; see inline comment for details v4: only use sysctl_userns_req when actually doing a userns_call. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-05 13:16:14 +03:00
Andrew Vagin	a973e6fcb3	net: dump ipv6 routes "ip route dump" dumps only ipv4 routes. Reported-by: Ross Boucher <boucher@gmail.com> Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-05 13:11:31 +03:00
Tycho Andersen	97cb181cbc	irmap: don't leak irmap objects in --irmap-scan-path v2: use struct irmap directly in irmap_path_opt Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 22:02:51 +03:00
Pavel Emelyanov	efa7dcf7c2	ghost: Remove ghost files if restore fails Issue #18. When restore fails ghost files remain there. And to remove them we have to know their list, paths to original files (to construct the ghost name) and the namespace ghost lives in. For the latter we keep the restore task namespace at hands till the final stage and setns into it to kill ghosts. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 22:00:37 +03:00
Pavel Emelyanov	a7c9f3011d	mnt: Read mount images early Mappings from mount id to namespace will be required to remove ghosts on restore failure. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 22:00:36 +03:00
Pavel Emelyanov	b0e23c3d4f	files: Collect ghosts and regfiles early Info about the presence and paths of ghosts will be needed to remove the ghosts themselves, and thus is needed in criu. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 22:00:35 +03:00
Pavel Emelyanov	152222a6b7	remap: Sanitize ghost file path printing First -- avoid two memory copies by printing ns root directly, and second -- remove extra argument from create_ghost, the mnt_id value we need there can be found on the ghost_file object. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:59:45 +03:00
Pavel Emelyanov	6cf77f6726	remap: Rename fields for easier grep Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:58:28 +03:00
Pavel Emelyanov	7ca6cc1eb2	mnt: Clean roots yard from criu process So here it is. If root task dies on restore the roots yard dir remains unrmdired :( Since we already know its name, we can remove it from criu. By the time we get to this place the sub mount namespace(s) are already dead and the yard dir is empty. But umounting should be done by tasks after successful restore, so keep depopulation there. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:57:35 +03:00
Pavel Emelyanov	3e7c92ed02	mnt: Renames around roots yard Same thing as in previous patch -- we have too many generic clean_ and fini_ prefixes over the code. And we need more (see next patch), so let's specify what exactly we clean or fini. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:57:21 +03:00
Pavel Emelyanov	c5c65fe17a	mnt: Create roots in criu context In case of root task restore failure we'll have to remove the roots yard dir from criu, so we have to create it in criu to at least have the dir name. It's OK to do it in criu, since the yard is created in the opts.root, which is the same for any mnt ns we deal with on restore. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:56:51 +03:00
Pavel Emelyanov	e3f5ba3c37	ns: Prepare namespaces before tasks There's already two things we do in criu namespaces before forking the init task (start unsd and keep netnsfd for back reference). Next patches will introduce the 3rd action for mount namespaces, so have a special pre-call for all this stuff. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:56:26 +03:00
Pavel Emelyanov	9b3189fed1	util: Add make_yard helper Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 11:32:18 +03:00
Pavel Emelyanov	9353051ba7	ns: Check ns type with type field Actually make use of the ns->type field and remove all getpid()'s and other strange/inconsistent checks. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-21 12:15:28 +03:00
Pavel Emelyanov	22b7256612	ns: Introduce ns type We (may) have 3 types of namespace objects in criu -- criu's one, root task's one and others. All of them sometimes make sense and we differentiate them in a weird way -- by checking the ns->pid field against getpid() or by comparing with root_item's. The proposal is to mark ns_id objects explicitly with type field. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-21 12:14:07 +03:00
Tycho Andersen	85ebf0a83b	usernsd: also pass pid of process that made the req We'll use this in the next patch to correctly write sysctls. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-21 12:01:01 +03:00
Tycho Andersen	72ff44d0dc	usernsd: move MAX_MSG_SIZE to namespaces.h We'll use this size in the next patch to avoid having to do some dynamic allocation. v2: call it MAX_UNSFD_MSG_SIZE instead v3: fix all uses of MAX_MSG_SIZE :) Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-21 11:57:40 +03:00
Andrey Vagin	1174a2ad0f	mount: handle mnt_flags and sb_flags separately (v4) They both can contain the MS_READONLY flag. In one case it will be a read-only bind-mount and in the other case it will be a read-only super-block. v2: set mnt and sb in one call of mount() when it's possible v3: return a comment which was deleted by mistake v4: Fix the sentence about restoring mnt flags Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-21 11:55:17 +03:00
Tycho Andersen	4f2e4ab3be	irmap: add --irmap-scan-path option This option allows users to specify their own irmap paths to scan in the event that they don't have a path in one of the hard coded hints. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-21 11:46:12 +03:00
Andrey Vagin	d3be641acd	cgroups: get controllers from /proc/self/cgroups (v2) Some controllers can be disabled in kernel options. In this case they are shown in /proc/cgroups, but they could not be mounted. All enabled controllers can be collected from /proc/self/cgroup. https://github.com/xemul/criu/issues/28 v2: ',' is used to separate controllers Cc: Tycho Andersen <tycho.andersen@canonical.com> Reported-by: Ross Boucher <boucher@gmail.com> Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-16 15:46:10 +03:00
Laurent Dufour	7f01d691c7	vdso: Rework vdso processing files There were multiple copies of the same code spread over the different architectures handling the vDSO. This patch merges the duplicated code in arch/*/vdso-pie.c and arch/*/include/asm/vdso.h into common files and leaves only the architecture-specific part in the arch/*/* files. The files are now organized this way: include/asm-generic/vdso.h contains basic definitions which could be overwritten by architectures. arch/*/include/asm/vdso.h contains per-architecture definitions. It may include include/asm-generic/vdso.h. pie/util-vdso.c and include/util-vdso.h contain code and definitions common to both criu and the parasite code. The file include/util-vdso.h includes arch/*/include/asm/vdso.h. pie/parasite-vdso.c and include/parasite-vdso.h contain code and definitions specific to the parasite code handling the vDSO. The file include/parasite-vdso.h includes include/util-vdso.h. arch/*/vdso-pie.c contains the architecture-specific code installing the vDSO trampoline. vdso.c and include/vdso.h contain code and definitions specific to the criu code handling the vDSO. The file include/vdso.h includes include/util-vdso.h. CC: Christopher Covington <cov@codeaurora.org> CC: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-10 14:07:22 +03:00
Cyrill Gorcunov	80ef8fd2fb	mount: Handle deleted bindmounts To handle deleted bindmounts we simply create the former directory bindmount lived at, mount the target and remove the directory back. For this sake we add @deleted entry into the image. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-21 21:26:17 +03:00
Cyrill Gorcunov	60f6ec7dd6	files-reg: Rework strip_deleted helper Make it handle both postfixes and return a non-zero code if stripping happened. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-21 21:26:01 +03:00
Cyrill Gorcunov	4f8f97e0bd	mount: mount_info -- Drop unused @is_file Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-21 21:25:14 +03:00
Cyrill Gorcunov	40ed330c88	kcmp: Stop showing ids tree Useless, at least in the form present now it's unreadable anyway. So stop swelling the logs. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-18 18:17:31 +03:00
Cyrill Gorcunov	3ae67d7b3a	mount: Move mount_info and ext_mount to mount.h It's quite unclean while this structure lives in proc_parse.h, which only have to fill this structure on procfs read, but real handling is inside mount.c. Move it as appropriate. Same time ext_mount structure should be moved into a header as well with sane @list name used instead of @l. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-13 17:16:47 +03:00
Cyrill Gorcunov	291e632e60	mount: Reorder declarations - gather structs on top - then extern vars Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-13 17:16:05 +03:00
Cyrill Gorcunov	17dc1fea57	mount: Align members of mount_info This is way easier to read. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-13 17:16:04 +03:00
Cyrill Gorcunov	ced8f88401	opts: Allow to specify the maximum size of ghost files For example we hit a case where systemd carries a journal file 4M in size. https://jira.sw.ru/browse/PSBM-38571 Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-10 16:51:11 +03:00
Cyrill Gorcunov	1bdb8298d0	util: Fix mega/giga typos Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-10 16:49:43 +03:00
Andrey Vagin	8ea020388e	dump: use freezer cgroup to seize processes (v4) Without using a freezer cgroup, we need to do a few iterations to catch all tasks, because new tasks can be born. If new tasks appear faster than criu collects them, criu fails. The freezer cgroup allows us to solve this problem. We freeze the freezer cgroup, then attach to tasks with ptrace and thaw the freezer cgroup. We suppose that all tasks which are going to be dumped are in a specified freezer cgroup. v2: fix comments from Christopher Reviewed-by: Christopher Covington <cov@codeaurora.org> v3: refactor task_seize v4: fix comments from Pavel Cc: Christopher Covington <cov@codeaurora.org> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-10 16:47:51 +03:00
Andrey Vagin	011231af3b	util: add ability to execute programs in a specified userns It's required for dumping tmpfs, where we use tar to save content. We need to execute tar from a proper userns to get the right uid-s and gid-s for files. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-07 14:42:01 +03:00
Gabriel Guimaraes	dbaab31f31	Workaround for the OverlayFS bug present before Kernel 4.2 This is here only to support the Linux Kernel between versions 3.18 and 4.2. After that, this workaround is not needed anymore, but it will work properly on both a kernel with and without the bug. The bug is that when a process has a file open in an OverlayFS directory, the information in /proc/<pid>/fd/<fd> and /proc/<pid>/fdinfo/<fd> is wrong, so we grab that information from the mountinfo table instead. This is done every time fill_fdlink is called. We first check to see if the mnt_id and st_dev numbers currently match some entry in the mountinfo table. If so, we already have the correct mnt_id and no fixup is needed. Then we proceed to see if there are any overlayFS mounted directories in the mountinfo table. If so, we concatenate the mountpoint with the name of the file, and stat the resulting path to check if we found the correct device id and node number. If that is the case, we update the mount id and link variables with the correct values. Signed-off-by: Gabriel Guimaraes <gabriellimaguimaraes@gmail.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-07 14:30:41 +03:00
Andrey Vagin	b9b0730cb1	ptrace: split task_seize into seize_catch_task and seize_wait_task It's preparation to use a freezer cgroup for freezing tasks. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-07 13:47:11 +03:00
Andrey Vagin	2f172c8b24	crtools: split cr-dump.c in two files Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-06 14:31:06 +03:00
Christopher Covington	1438f013a2	Pass task_size to vma_area_is_private() If we want one CRIU binary to work across all AArch64 kernel configurations, a single task size value cannot be hard coded. Since vma_area_is_private() is used by both restorer blob code and non restorer blob code, which must use different variables for recording the task size, make task_size a function argument and modify the call sites accordingly. This fixes the following error on AArch64 kernels with CONFIG_ARM64_64K_PAGES=y. pie: Error (pie/restorer.c:929): Can't restore 0x3ffb7e70000 mapping with 0xfffffffffffffff7 Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:14:18 +03:00
Christopher Covington	7451fc7d23	restorer: Replace most hard-coded TASK_SIZE use If we want one CRIU binary to work across all AArch64 kernel configurations, a single task size value cannot be hard coded. This fixes the following error on AArch64 kernels with CONFIG_ARM64_64K_PAGES=y. pie: Error (pie/restorer.c:772): Unable to unmap (-): -1211695104 Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:14:17 +03:00
Christopher Covington	c0c0546c31	kerndat: Introduce task_size variable If we want one CRIU binary to work across all AArch64 kernel configurations, a single task size value cannot be hard coded. Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:14:15 +03:00
Andrew Vagin	f13ec96e58	restore: fix race in calculation of a number of zombies Currently each task subtracts number of zombies from task_entries->nr_threads without locks, so if two tasks will do this operation concurrently, the result may be unpredictable. https://github.com/xemul/criu/issues/13 Cc: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Andrew Vagin <avagin@openvz.org> Acked-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:12:10 +03:00
Andrey Vagin	7e413b0771	socket: remove unused code Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:06:19 +03:00
Artem Kuzmitskiy	b790b586eb	Add restoring of unnamed unix sockets. Added functionality for restoring unnamed unix sockets using already implemented feature - inherit fd and using same command line option. Usage example: criu restore -d -D images -o restore.log --pidfile restore.pid -v4 \ -x --inherit-fd fd[3]:socket:[9677263] Signed-off-by: Artem Kuzmitskiy <artem.kuzmitskiy@lge.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-29 17:53:36 +03:00
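
The reasoning in commit 5940e3d14c (xfree(): simplify) can be sketched in a few lines of C. This is an illustration only, not CRIU's actual header: `struct ghost_name`, `ghost_name_set()` and `ghost_name_clear()` are hypothetical names invented for the example. It relies on the fact quoted in the commit message that free() performs no operation when given NULL.

```c
#include <stdlib.h>
#include <string.h>

/* As the commit describes: xfree is simply a synonym for free(). */
#define xfree(p) free(p)

/* Hypothetical holder for an owned string, used only for illustration. */
struct ghost_name {
	char *path;
};

/* Replace the stored path with a copy of @path; returns 0 on success. */
static int ghost_name_set(struct ghost_name *g, const char *path)
{
	char *copy = malloc(strlen(path) + 1);

	if (!copy)
		return -1;
	strcpy(copy, path);

	/* No NULL check needed: free(NULL) performs no operation. */
	xfree(g->path);
	g->path = copy;
	return 0;
}

/* Release the stored path; safe to call repeatedly. */
static void ghost_name_clear(struct ghost_name *g)
{
	xfree(g->path);
	g->path = NULL;
}
```

Because the pointer is reset to NULL after release, calling `ghost_name_clear()` twice is harmless, which is the property that makes the old NULL guard redundant.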


1725 commits
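
The filtering described in commit 61859d1176 (fsnotify: Filter out internal inotify bits when restoring marks) amounts to clearing one bit before the mask reaches inotify_add_watch. The sketch below is an assumption-laden illustration rather than the code from that commit: the helper name `sanitize_inotify_mask()` is hypothetical, and the numeric value of `FS_EVENT_ON_CHILD` is assumed to match the kernel's internal definition, so check it against the kernel headers before relying on it.

```c
#include <stdint.h>

/*
 * Assumed value of the kernel-internal flag; it is not part of the
 * userspace inotify API, so it is defined locally for this sketch.
 */
#define FS_EVENT_ON_CHILD 0x08000000u

/*
 * Hypothetical helper: drop the kernel-internal bit from a mask that
 * was read from an old image, leaving all other event bits untouched.
 */
static inline uint32_t sanitize_inotify_mask(uint32_t mask_from_image)
{
	return mask_from_image & ~FS_EVENT_ON_CHILD;
}
```

A mask that never carried the internal bit passes through unchanged, so the same helper can be applied to images produced on both older and newer kernels.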