Mirrors/criu

mirror of https://github.com/checkpoint-restore/criu.git synced 2026-01-24 02:35:41 +00:00

Author	SHA1	Message	Date
Cyrill Gorcunov	6217a84ae3	mnt: Carry run-time device ID in mount_info When we're restoring fsnotify watchees we need to resolve path to a handle at some mountpoint referred by @s_dev member (device ID) which is saved inside image. This ID actually may be changed at every mount (say one restores container after machine reboot) or in case of container's migration. Thus the test for overmounting in __open_mountpoint will fail and we get an error. Let's do a trick: introduce @s_dev_rt member which is supposed to carry run-time device ID. When dumping this member is simply equal to traditional @s_dev fetched from the procfs, but when restoring we fetch it from stat call once mountpoint becomes alive. https://jira.sw.ru/browse/PSBM-41610 v2: - predefine MOUNT_INVALID_DEV - use fetch_rt_stat instead of assigning device in restore_shared_options - copy @s_dev_rt in propagate_siblings and propagate_mount Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-12-08 14:58:32 +03:00
Andrew Vagin	7017181849	mount: don't inherit mount namespace descriptors to each process close_olds_fds() knows nothing about more than one set of service file descriptors, so it's better to call it before forking children as it was before `9d60724eca` ("restore: restore mntns before creating private vma-s") The root task restores all processes and pins them with file descriptors, then a task restores a mount namespace by opening the file descriptor of the root task via /proc/pid/fd/X. Reported-by: Mr Jenkins Signed-off-by: Andrew Vagin <avagin@virtuozzo.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-11-17 10:45:09 +03:00
Cyrill Gorcunov	7a99e699ce	mnt: Export __open_mountpoint We're going to need it for inotify handle testing. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-21 15:08:03 +03:00
Pavel Emelyanov	a7c9f3011d	mnt: Read mount images early Mappings from mount id to namespace will be required to remove ghosts on restore failure. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 22:00:36 +03:00
Pavel Emelyanov	152222a6b7	remap: Sanitize ghost file path printing First -- avoid two memory copies by printing ns root directly, and second -- remove extra argument from create_ghost, the mnt_id value we need there can be found on the ghost_file object. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:59:45 +03:00
Pavel Emelyanov	7ca6cc1eb2	mnt: Clean roots yard from criu process So here it is. If root task dies on restore the roots yard dir remains unrmdired :( Since we already know its name, we can remove one from criu. By the time we get to this place the sub mount namespace(s) are already dead and yard dir is empty. But umounting should be done by tasks after successful restore, so keep depopulation there. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:57:35 +03:00
Pavel Emelyanov	3e7c92ed02	mnt: Renames around roots yard Same thing as in previous patch -- we have too many generic clean_ and fini_ prefixes over the code. And we need more (see next patch), so let's specify what exactly we clean or fini. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:57:21 +03:00
Pavel Emelyanov	c5c65fe17a	mnt: Create roots in criu context In case of root task restore failure we'll have to remove the roots yard dir from criu, so we have to create one by criu to at least have the dir name. It's OK to do it in criu, since the yard is created in the opts.root which is the same for any mnt ns we deal with on restore. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:56:51 +03:00
Andrey Vagin	1174a2ad0f	mount: handle mnt_flags and sb_flags separately (v4) They both can contain the MS_READONLY flag. And in one case it will be read-only bind-mount and in another case it will be read-only super-block. v2: set mnt and sb for one call of mount() when it's possible v3: return a comment which was deleted by mistake v4: Fix the sentence about restoring mnt flags Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-21 11:55:17 +03:00
Cyrill Gorcunov	80ef8fd2fb	mount: Handle deleted bindmounts To handle deleted bindmounts we simply create the former directory bindmount lived at, mount the target and remove the directory back. For this sake we add @deleted entry into the image. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-21 21:26:17 +03:00
Cyrill Gorcunov	4f8f97e0bd	mount: mount_info -- Drop unused @is_file Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-21 21:25:14 +03:00
Cyrill Gorcunov	3ae67d7b3a	mount: Move mount_info and ext_mount to mount.h It's quite unclean while this structure lives in proc_parse.h, which only has to fill this structure on procfs read, but real handling is inside mount.c. Move it as appropriate. At the same time ext_mount structure should be moved into a header as well with sane @list name used instead of @l. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-13 17:16:47 +03:00
Cyrill Gorcunov	291e632e60	mount: Reorder declarations - gather structs on top - then extern vars Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-13 17:16:05 +03:00
Gabriel Guimaraes	dbaab31f31	Workaround for the OverlayFS bug present before Kernel 4.2 This is here only to support the Linux Kernel between versions 3.18 and 4.2. After that, this workaround is not needed anymore, but it will work properly on both a kernel with and without the bug. The bug is that when a process has a file open in an OverlayFS directory, the information in /proc/<pid>/fd/<fd> and /proc/<pid>/fdinfo/<fd> is wrong, so we grab that information from the mountinfo table instead. This is done every time fill_fdlink is called. We first check to see if the mnt_id and st_dev numbers currently match some entry in the mountinfo table. If so, we already have the correct mnt_id and no fixup is needed. Then we proceed to see if there are any overlayFS mounted directories in the mountinfo table. If so, we concatenate the mountpoint with the name of the file, and stat the resulting path to check if we found the correct device id and node number. If that is the case, we update the mount id and link variables with the correct values. Signed-off-by: Gabriel Guimaraes <gabriellimaguimaraes@gmail.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-07 14:30:41 +03:00
Oleg Nesterov	e2c38245c6	introduce --enable-fs cli option Finally add --enable-fs option to specify the comma separated list of filesystem names which should be treated as FSTYPE_AUTO. Note: obviously this option is not safe, use at your own risk. "dump" will always succeed if the mntpoint is auto, but "restore" can fail or do something wrong if mount(src, mountpoint, flags, options) can not actually "just work" as FSTYPE_AUTO logic expects. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-04-10 17:35:43 +03:00
Oleg Nesterov	9fee3dc817	pass "bool for_dump" argument down to collect_mntinfo() and parse_mountinfo() Preparation. 1. Add the new "bool for_dump" arg to collect/parse_mntinfo(). 2. Introduce "struct collect_mntns_arg" to pass the additional "bool for_dump" field to collect_mntinfo() and change it to pass this boolean to collect_mntinfo()->parse_mountinfo() path. 3. Change other callers of collect_mntinfo() to pass "false". Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-04-03 17:55:18 +03:00
Pavel Emelyanov	5f2a7ac27b	img: Rename fdset -> imgset Since we're going to switch from int-fd-s to class-image soon the fdset name will not fit into the new terminology. This patch is sed -e 's/fdset/imgset/g' -i * sed -e 's/imgset_fd/img_from_set/g' -i * git mv include/fdset.h include/imgset.h Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>	2014-09-30 21:48:10 +04:00
Pavel Emelyanov	9fd793e565	stat: Pass namespace into phys_stat_resolve_dev, not mnt tree This makes the API simpler. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-08-06 10:57:27 +04:00
Pavel Emelyanov	090587e1a1	stat: Pass namespace into phys_stat_dev_match, not mnt tree This makes the API simpler. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-08-06 10:57:25 +04:00
Andrey Vagin	967dba606a	mount: add helper mntns_get_root_by_mnt_id Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-08-05 16:38:19 +04:00
Pavel Emelyanov	c7e0042946	crtools: Introduce the --ext-mount-map option (v3) On dump one uses one or more --ext-mount-map option with A:B arguments. A denotes a mountpoint (as seen from the target mount namespace) criu dumps and B is the string that will be written into the image file instead of the mountpoint's root. On restore one uses the same --ext-mount-map option(s) with similar A:B arguments, but this time criu treats A as string from the image's root field (foobar in the example above) and B as the path in criu's mount namespace that should be bind mounted into the mountpoint. v3: * Added documentation * Added RPC bits * Changed option name into --ext-mount-map * Use colon as key and value separator Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-06-17 10:36:30 +04:00
Pavel Emelyanov	68e2841a9b	mnt: Turn mntns_get_root_fd into accepting mnt ns_id The only exception (for now) is the irmap -- it should operate on ns as well. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-23 02:31:16 +04:00
Pavel Emelyanov	1435617c40	mnt: Rename _collect_root into _get_root_fd Nowadays this routine is mainly used for getting an fd, rather than keeping one for future reference. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-23 01:38:58 +04:00
Pavel Emelyanov	79f3e90856	rst: Less arguments to restore_task_mnt_ns Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-22 23:48:46 +04:00
Pavel Emelyanov	8550f52017	mnt: Move local mntns collecting on restore into prepare_mnt_ns Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-22 23:48:43 +04:00
Pavel Emelyanov	f4b7a6fedd	mnt: Mark rst_collect_local_mntns as void Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-22 23:48:38 +04:00
Pavel Emelyanov	88eef43e41	mnt: Mark dump_mnt_ns as static Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-22 23:48:33 +04:00
Pavel Emelyanov	4ffa79695d	mnt: Remove unneeded argument from prepare_mnt_ns Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-22 23:48:23 +04:00
Andrey Vagin	2f4be997b6	mount: use per-namespace mntinfo_tree (v2) This patch removes the global mntinfo_tree and collect_mount_info where it was constructed. The mntinfo list is filled from dump_mnt_ns, rst_collect_local_mntns, collect_mnt_namespaces and read_mnt_ns_img. A mountinfo entry contains a reference on a proper ns_id entry, so we can use mnt_id to look up a proper mount namespace. v2: remove trash after rebasing. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:40:19 +04:00
Andrey Vagin	fb3ce0fbeb	mount: prepare to work without mnt_id Kernels before 3.15 don't show mnt_id and mnt_id isn't saved in images, if mntns isn't dumped. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:40:10 +04:00
Andrey Vagin	b6d3314c54	check: collect mounts of the current mntns They are used for collecting unix sockets Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:40:04 +04:00
Andrey Vagin	26a0dc91dd	mount: add a function to get a temporary root for mntns On restore all mount namespaces are restored in the root mntns and sub-namespaces are restored in temporary places. This function allows to get paths to these places. It will be used in open_remap_ghost(), because it's called in the root task, when other tasks are not forked yet. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:39:59 +04:00
Andrey Vagin	5418938ec3	restore: collect mounts of current mntns It's required for restoring in the current mntns. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:39:46 +04:00
Andrey Vagin	e827a695f3	mount: separate collect_mnt_ns from dump_mnt_ns We are going to support nested mntns, so the global mntinfo_tree variable is useless and information about tree should be connected to a proper namespace. But when we don't dump mntns, we need to collect mounts for the current mntns. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:39:41 +04:00
Andrey Vagin	cc1fd5760a	mount: save mount tree for each namespace We are going to support nested mount namespaces and each NS has own tree. The mount tree is used for checking that a file is reachable. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:39:34 +04:00
Andrey Vagin	3a291e33ff	crtools: restore nested mount namespaces (v2) Known issue: * currently only namespaces with the same root are supported * nested namespaces can be dumped and restored only if the root task has its own mount namespace. All nested namespaces are restored in a root namespace in temporary directories. All mount points are restored in one tree and then they are divided into namespaces. The task with minimal pid for each namespace unshares mntns and then it makes pivot_root in a proper temporary directory. All other tasks make setns to enter into a mount namespace of the task with minimal pid. v2: clean up Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:38:17 +04:00
Andrey Vagin	e7e9c2ee6e	mounts: create a temporary directory for restoring non-root mntns (v2) All non-root namespaces will be restored as sub-trees of the root tree. This patch adds helpers to create a temporary directory and mount tmpfs in it, then create directories for each non-root mount namespace. tmpfs is quite useful here to simplify destroying this construction, we don't need to unmount each namespace separately. v2: add a comment why MNT_DETACH is not dangerous here Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:38:12 +04:00
Andrey Vagin	87a49bdfaf	servicefd: add a service fd for current root It's already used for dumping files and it will be used for restoring, so it should be service fd to avoid intersection with restored descriptors. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-17 12:03:11 +04:00
Pavel Emelyanov	ae98ef6ae0	mount: Factor out mount tree build for NEWNS and non-NS cases We anyway build the tree, in the NS case -- few calls later. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-12-12 16:19:48 +04:00
Cyrill Gorcunov	1ba08ca664	mount: Extend phys_stat_dev_match to use path resolving instead of btrfs engine Instead of scanning btrfs subvolumes (which can be even inaccessible if mount point lies on a directory instead of subvolume itself) we use path resolving feature here -- once we need to figure out if some device number needs to be altered up to mount point (as we know stat() called on subvolume returns st_dev for subvolume itself, but not one that associated with a superblock and shown in /proc/self/mountinfo output). This as well implies that we need to check if device numbers for ghost files are to be updated to match mountinfo, thus we use phys_stat_resolve_dev helper here. After this patch the previously merged btrfs engine is no longer needed (at least it seems so) and can be dropped. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-12-11 16:05:22 +04:00
Cyrill Gorcunov	5372e3910c	mount: btrfs -- Introduce phys_stat_resolve_dev helper (v2) This routine is aimed to find a mount point on which the path passed as argument lies. We walk over all mount points and see which one is matching. Once found (in worst case it will be a root mount point so the function never fails) we're checking if this is btrfs and then return subvolume0 device id. See commit `921cf873f3` for details what the hell we're doing here. v2: rewrite mount_resolve_path w/o recursion Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-12-11 16:05:21 +04:00
Cyrill Gorcunov	cf1ce5f817	mount: Build mount tree on dump restore early, if needed For paths resolution we will need mount tree to be parsed and built, but it's not that simple -- the current code implies that once parsed the tree must not be re-parsed again, so we pass @parse argument from a caller: if a task we're restoring does not use mount namespace, we should parse mount tree early, otherwise defer this action until mount tree is read from the image. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-12-11 16:05:19 +04:00
Cyrill Gorcunov	e54ad19a06	mount: Add phys_stat_dev_match helper This helper serves to hide fs specifics (in particular btrfs) thus the caller won't need the details. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-12-04 19:23:32 +04:00
Cyrill Gorcunov	291aa3f6d6	headers: Add extern specifier to functions We really have a mess of extern/non-extern declaration of functions in our headers. Always use extern for unification purpose. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-11-15 17:00:58 +04:00
Andrey Vagin	b895c73c82	mntns: don't use global fdset for dumping namespace We are going to replace pid with id in names of image files. The id is unique for each namespace, so it's more convenient, if image files are opened per namespace. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-10-01 12:11:09 +04:00
Pavel Emelyanov	b18fb09eb9	show: Replace one-line show_foo calls with args array We have generic do_pb_show() call and tons of show_foo routines, that just call one with proper args. Compact the code by putting the args into array and calling the do_pb_show() in one place. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-08-24 04:00:32 +04:00
Pavel Emelyanov	6bf22f8c75	crtools: Get rid of on-stack cr_options We have global instance of them, that's enough. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-05-28 21:11:13 +04:00
Pavel Emelyanov	add21b75c9	show: Remove options args from ->show callback This thing is global, we can address one explicitly. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-05-08 00:23:42 +04:00
Andrey Vagin	13a8e60bca	cr-dump: collect mount points in the target namespace Information about mount points is used for dumping fanotify. Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-04-11 22:06:48 +04:00
Pavel Emelyanov	3a1c7d1d76	ns: Introduce ns descriptors These are structs that (now) tie together ns string and the CLONE_ flag. It's nice to have one (some code becomes simpler) and will help us with auto-namespaces detection. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-01-15 23:24:01 +04:00


60 commits