Discussion:
system boot hangs with the latest systemd
Vladimir Elisseev
2012-11-20 07:51:00 UTC
Hello,

Updating systemd-188 to >=193 leads to a boot error: immediately after
issuing switch_root, systemd hangs with the message "systemd[1] - nfs
branch is not exportable", though the system boots properly when using
SysV init. I've been using AUFS with NFS root (NFS ro with NFS rw
branches) for a long while, and lately successfully with systemd, until
I got this error. I've already asked this question on the systemd ML,
and below is the answer:
_________________________________________________________________________
Umm, aufs is out of the kernel due to issues. I have no idea really what
this might be caused by.

Maybe this is related to MS_SHARED change? I.e. the default mount
propagation flag we now set to MS_SHARED rather than MS_PRIVATE, so that
containers are happy. We encountered numerous issues with that change,
maybe aufs is another piece in the puzzle that trips over it?

Lennart
_________________________________________________________________________

I know that it looks like systemd broke (again) capabilities, but,
anyway, is it possible to fix this without changing something in
systemd?

Regards,
Vlad.
s***@users.sourceforge.net
2012-11-20 12:42:47 UTC
Hello Vlad,
Post by Vladimir Elisseev
I know that it looks like systemd broke (again) capabilities, but,
anyway, is it possible to fix this without changing something in
systemd?
It seems to be a problem of aufs.
The message from aufs, "nfs branch is not exportable", means that
- your branch fs is NFS, which obviously is not exportable via NFS.
- aufs's internal function "encode_fh()" is called.
  Usually it is NFSD which calls this function.
- aufs tries responding to NFSD, but as long as its branch fs is not
  exportable, supporting NFSD is impossible.
- aufs produces the message and returns an error.

Recently some changes may have been made to the error code for NFSD,
and I am afraid aufs doesn't follow such changes.
Additionally, some versions ago a new system call,
name_to_handle_at(), was introduced which calls "encode_fh()" internally.
Aufs should support this too.

Since systemd appears to use name_to_handle_at(), aufs's lack of
support for name_to_handle_at() is the cause of your problem.
This is my current guess.
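
A minimal sketch of the system call in question, useful for checking which error a given path yields. It is not systemd's code: the helper name probe_fhandle() is made up for illustration, and it assumes Linux with a glibc recent enough to provide the name_to_handle_at() wrapper (2.14 or later).

```c
#define _GNU_SOURCE             /* exposes name_to_handle_at() in <fcntl.h> */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper: ask the kernel for a file handle of `path`.
 * Returns 0 on success (the mount ID is stored in *mount_id),
 * otherwise the errno value reported by name_to_handle_at().
 */
int probe_fhandle(const char *path, int *mount_id)
{
        struct file_handle *h;
        int err = 0;

        h = malloc(sizeof(*h) + MAX_HANDLE_SZ);
        if (!h)
                return ENOMEM;
        h->handle_bytes = MAX_HANDLE_SZ;   /* room offered to the filesystem */

        if (name_to_handle_at(AT_FDCWD, path, h, mount_id, 0) < 0)
                err = errno;

        free(h);
        return err;
}
```

Calling probe_fhandle() on a directory inside the aufs mount and passing the result to strerror() would show whether the call succeeds there or which error it fails with.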

I will dive into these two problems,
- the return code of "encode_fh()"
- supporting name_to_handle_at()
when I have enough time.
Unfortunately I am busy these days, and I don't know when it will be.


J. R. Okajima
Elisseev V.
2012-11-20 12:55:23 UTC
Thanks for your extended answer! For the time being I'll use the last
working version of systemd. Hopefully, you'll have some time to fix
these problems soon!

Regards,
Vlad.
s***@users.sourceforge.net
2012-11-20 15:05:39 UTC
Post by Elisseev V.
Thanks for your extended answer! For the time being I'll use the last
working version of systemd. Hopefully, you'll have some time to fix
these problems soon!
If the problem is related to name_to_handle_at() only, and if you disable
CONFIG_FHANDLE, which is at "General setup" --> "open by fhandle
syscalls", then the latest systemd may work.
Just my guess.


J. R. Okajima
s***@users.sourceforge.net
2012-11-22 07:36:07 UTC
Post by s***@users.sourceforge.net
I will dive into these two problems,
- the return code of "encode_fh()"
- supporting name_to_handle_at()
when I have enough time.
Unfortunately I am busy in these days, and I don't know when it will be.
I have taken a glance at "encode_fh()" and name_to_handle_at(), and found
that aufs needs nothing more to support them.
- the protocol around the return code of "encode_fh()" is unchanged.
- aufs already supports name_to_handle_at().

Also I looked at systemd and found suspicious code around the handling
of the error from name_to_handle_at().
- systemd calls name_to_handle_at()
- name_to_handle_at() calls aufs's "encode_fh()"
- "encode_fh()" returns an error since your branch fs is NFS, which
  doesn't support exporting via NFS.
- name_to_handle_at() gets 255 from aufs and returns EOVERFLOW to
  userspace.
  linux/fs/fhandle.c:do_sys_name_to_handle()
        /* we ask for a non connected handle */
        retval = exportfs_encode_fh(path->dentry,
                                    (struct fid *)handle->f_handle,
                                    &handle_dwords, 0); // which calls aufs's "encode_fh()"
        :::
        if ((handle->handle_bytes > f_handle.handle_bytes) ||
            (retval == 255) || (retval == -ENOSPC)) {
                /* As per old exportfs_encode_fh documentation
                 * we could return ENOSPC to indicate overflow
                 * But file system returned 255 always. So handle
                 * both the values
                 */
                /*
                 * set the handle size to zero so we copy only
                 * non variable part of the file_handle
                 */
                handle_bytes = 0;
                retval = -EOVERFLOW;
        } else
                retval = 0;
- systemd gets EOVERFLOW from name_to_handle_at(), but it doesn't handle
this error code.
  systemd/src/shared/path-util.c:path_is_mount_point()
        r = name_to_handle_at(AT_FDCWD, t, h, &mount_id, allow_symlink ? AT_SYMLINK_FOLLOW : 0);
        if (r < 0) {
                if (errno == ENOSYS || errno == ENOTSUP)
                        /* This kernel or file system does not support
                         * name_to_handle_at(), hence fallback to the
                         * traditional stat() logic */
                        goto fallback;

                if (errno == ENOENT)
                        return 0;

                return -errno;
        }
        :::
        r = name_to_handle_at(AT_FDCWD, parent, h, &mount_id_parent, 0);
        :::
        if (r < 0) {
                /* The parent can't do name_to_handle_at() but the
                 * directory we are interested in can? If so, it must
                 * be a mount point */
                if (errno == ENOTSUP)
                        return 1;

                return -errno;
        }

Although I didn't read systemd's fallback routine, my current
conclusion is,
- as long as systemd doesn't handle EOVERFLOW (and doesn't go into the
  fallback routine), systemd will not work with many(? a few?) fs, I am
  afraid.
- if systemd supports EOVERFLOW, aufs can co-work with systemd. The "nfs
  branch is not exportable" message will still be produced, but systemd
  will go into its fallback routine.
- since systemd has a "configure" script, if you disable
  CONFIG_FHANDLE, then systemd will not try calling name_to_handle_at().
  It will go into the fallback routine, and you will be happy too.
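
The second point can be sketched as follows. This is only an illustration of the idea under stated assumptions, not a patch for systemd: both function names are hypothetical, and the stat()-based check is a simplified version of the "traditional stat() logic" that the systemd comment mentions, written from scratch rather than copied from its source.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical policy: which name_to_handle_at() errors should send the
 * caller to the stat()-based check instead of failing outright.
 * EOVERFLOW is the case discussed above.
 */
int should_fall_back(int err)
{
        return err == ENOSYS || err == ENOTSUP || err == EOVERFLOW;
}

/*
 * Simplified traditional check: a directory is treated as a mount point
 * if it lives on a different device than its parent, or if it is its own
 * parent (the root directory). Returns 1 or 0, or -errno on failure.
 */
int is_mount_point_stat(const char *path)
{
        char parent[4096];
        struct stat a, b;

        if (snprintf(parent, sizeof(parent), "%s/..", path) >= (int)sizeof(parent))
                return -ENAMETOOLONG;
        if (lstat(path, &a) < 0 || lstat(parent, &b) < 0)
                return -errno;

        return a.st_dev != b.st_dev || a.st_ino == b.st_ino;
}
```

A caller would try name_to_handle_at() first and, when it fails with an errno for which should_fall_back() returns 1, use is_mount_point_stat() instead of returning the error.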


J. R. Okajima
Santiago Gimeno
2016-05-10 13:51:12 UTC
Hello!
Post by s***@users.sourceforge.net
Although I didn't read the systemd's fallback routine, my current
conclusion is,
- as long as systemd doesn't handle EOVERFLOW (and doesn't go into the
fallback routine), systemd will not work with many(? a few?) fs, I am
afraid.
- if systemd support EOVERFLOW, aufs can co-work with systemd. the "nfs
branch is not exportable" message will be still produced, but systemd
will go into its fallback routine.
- since systemd has "configure" script and if you disable
CONFIG_FHANDLE, then systemd will not try calling name_to_handle_at().
And it will go into the fallback routine. and you will be happy too.
Sorry for resurrecting this old thread, but I'm experiencing this very same
issue.

I'm observing this in a Debian Jessie (kernel 3.16.7) box, so my guess is
that the issue was never solved in the systemd part (I'm using version
215-17+deb8u4). Can anyone confirm this?

Also, regarding disabling CONFIG_FHANDLE, how would I do that? Should I
set CONFIG_FHANDLE=n and recompile the kernel (with aufs)? When I tried that
I got this error:

fs/built-in.o: In function `aufs_encode_fh':
/home/sgimeno/kernel-patches/linux-backports/fs/aufs/export.c:752: undefined reference to `exportfs_encode_fh'
fs/built-in.o: In function `decode_by_path':
/home/sgimeno/kernel-patches/linux-backports/fs/aufs/export.c:518: undefined reference to `exportfs_decode_fh'
Makefile:893: recipe for target 'vmlinux' failed

... so I think it must be something else.

Any help would be appreciated!!

Best regards,

Santiago
s***@users.sourceforge.net
2016-05-10 16:01:06 UTC
Hello Santiago,
Post by Santiago Gimeno
Sorry for resurrecting this old thread, but I'm experiencing this very same
issue.
Whoa, it is back in Nov 2012... Incredibly old thread!
Post by Santiago Gimeno
Also, regarding the disable of CONFIG_FHANDLE, how would I do that? Should I
set CONFIG_FHANDLE=n and recompile the kernel (with aufs)? When I tried that
/home/sgimeno/kernel-patches/linux-backports/fs/aufs/export.c:752: undefined
reference to `exportfs_encode_fh'
/home/sgimeno/kernel-patches/linux-backports/fs/aufs/export.c:518: undefined
reference to `exportfs_decode_fh'
While I don't remember the details of what I investigated at that time,
I am afraid you need to reconfigure and rebuild your kernel. As you
know, CONFIG_FHANDLE is a kernel configuration option. Disabling it is
what I meant.

On the other hand, the error "undefined reference to
`exportfs_encode_fh'" is related to CONFIG_EXPORTFS and
CONFIG_AUFS_EXPORT. Such an error will happen if you disable
CONFIG_EXPORTFS and enable CONFIG_AUFS_EXPORT. Generally it should not
happen since CONFIG_AUFS_EXPORT depends upon CONFIG_EXPORTFS (see
fs/aufs/Kconfig for details).


J. R. Okajima
Santiago Gimeno
2016-05-11 13:37:25 UTC
Hello again,

(Adding everyone in copy now, sorry for the double post)

Thanks for the prompt response!

Post by s***@users.sourceforge.net
On the other hand, the error "undefined reference to
`exportfs_encode_fh'" is related to CONFIG_EXPORTFS and
CONFIG_AUFS_EXPORT. Such error will happen if you disable
CONFIG_EXPORTFS and enable CONFIG_AUFS_EXPORT. Generally it should not
happen since CONFIG_AUFS_EXPORT depends upon CONFIG_EXPORTFS (See
fs/aufs/Kconfig in detail).
I'm still having issues with this. Apparently disabling CONFIG_FHANDLE
makes CONFIG_EXPORTFS=m instead of CONFIG_EXPORTFS=y and I imagine that is
causing the error.
Am I doing something wrong?

Thanks in advance.

Santiago
s***@users.sourceforge.net
2016-05-11 13:57:45 UTC
Post by Santiago Gimeno
I'm still having issues with this. Apparently disabling CONFIG_FHANDLE
makes CONFIG_EXPORTFS=m instead of CONFIG_EXPORTFS=y and I imagine that is
causing the error.
Am I doing something wrong?
How are you configuring?
If you set CONFIG_AUFS=y and CONFIG_EXPORTFS=m, then I am afraid such an
error will happen. The ordinary and proper way of configuring should not
produce such a configuration.


J. R. Okajima
Santiago Gimeno
2016-05-11 14:18:01 UTC
Hello
Post by s***@users.sourceforge.net
How are you configuring?
If you set CONFIG_AUFS=y and CONFIG_EXPORTFS=m, then such error will
happen I am afraid. The ordinary and proper way should not such
configuration.
I'm running make menuconfig. If I disable CONFIG_FHANDLE in

General setup ---> open by fhandle syscalls

The .config file generated contains CONFIG_EXPORTFS=m,

If I run make menuconfig again and enable CONFIG_FHANDLE, the .config file
generated contains CONFIG_EXPORTFS=y


I've tried to set the values in the .config file manually (CONFIG_FHANDLE=n
and CONFIG_EXPORTFS=y ) and compile the kernel with make-kpkg, but it
regenerates the .config file and sets back CONFIG_EXPORTFS=m, so the build
fails :(.

Any other ideas?

Thanks!

Santiago
s***@users.sourceforge.net
2016-05-11 14:24:43 UTC
Post by Santiago Gimeno
I'm running make menuconfig. If I disable CONFIG_FHANDLE in
General setup ---> open by fhandle syscalls
The .config file generated contains CONFIG_EXPORTFS=m,
And how did you get the aufs source files?
If it is aufs[34]-linux.git instead of aufs[34]-standalone.git, then you
cannot select =m for CONFIG_AUFS and such error will happen obviously.


J. R. Okajima
Santiago Gimeno
2016-05-11 14:36:57 UTC
Hello,
Post by s***@users.sourceforge.net
And how did you get the aufs source files?
If it is aufs[34]-linux.git instead of aufs[34]-standalone.git, then you
cannot select =m for CONFIG_AUFS and such error will happen obviously.
I've tried two ways:

1) Getting the 3.16.7 vanilla kernel from kernel.org and patching it with
aufs3-standalone.git

2) Getting the Debian Jessie kernel (3.16.7-ckt25-2) that already contains
aufs.

In both cases the result is the same.

Thanks,

Santiago
s***@users.sourceforge.net
2016-05-11 14:39:25 UTC
Post by Santiago Gimeno
1) Getting the 3.16.7 vanilla kernel from kernel.org and patching it with
aufs3-standalone.git
Which branch?
Post by Santiago Gimeno
2) Getting the Debian Jessie kernel (3.16.7-ckt25-2) that already contains
aufs.
Unfortunately I am not a debian kernel maintainer.


J. R. Okajima
Santiago Gimeno
2016-05-11 14:46:40 UTC
Hello,
Post by s***@users.sourceforge.net
Post by Santiago Gimeno
1) Getting the 3.16.7 vanilla kernel from kernel.org and patching it with
aufs3-standalone.git
Which branch?
git clone --depth 1 --single-branch --branch v3.16.7 git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git


Thanks

Santiago
s***@users.sourceforge.net
2016-05-11 15:07:34 UTC
Post by Santiago Gimeno
git clone --depth 1 --single-branch --branch v3.16.7 git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
I am asking which branch in aufs3-standalone.git.
And if you set CONFIG_AUFS=y and CONFIG_EXPORTFS=m, then you cannot set
CONFIG_AUFS_EXPORT=y.
But if you set CONFIG_AUFS=m, then you can set CONFIG_EXPORTFS=m and
CONFIG_AUFS_EXPORT=y.
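
The rule in the two sentences above can be restated as a small sketch. It encodes only the link-time constraint (built-in code cannot reference symbols that live in a module); it is not the content of fs/aufs/Kconfig, and the enum and function names are hypothetical.

```c
/* Kconfig tristate values, ordered so that n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/*
 * Hypothetical check: can aufs be built with CONFIG_AUFS_EXPORT=y for the
 * given CONFIG_AUFS and CONFIG_EXPORTFS settings? Returns 1 if yes.
 */
int aufs_export_links(enum tristate aufs, enum tristate exportfs)
{
        if (aufs == TRI_N || exportfs == TRI_N)
                return 0;       /* nothing to build, or nothing to call */

        /* exportfs must be "at least as built-in" as aufs itself */
        return exportfs >= aufs;
}
```

With this ordering, aufs=y with exportfs=m is rejected, while aufs=m with exportfs=m and any combination with exportfs=y are accepted.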

----------------------------------------------------------------------
Post by Santiago Gimeno
This is exactly what it's happening to me in 3.16. With this configuration,
the build fails :(.
Then you should investigate why your build system compiles
fs/aufs/export.c even though CONFIG_AUFS_EXPORT is NOT set.


J. R. Okajima
s***@users.sourceforge.net
2016-05-11 15:23:57 UTC
Post by s***@users.sourceforge.net
I am asking which branch in aufs3-standalone.git.
And if you set CONFIG_AUFS=y and CONFIG_EXPORTFS=m, then you cannot set
CONFIG_AUFS_EXPORT=y.
But if you set CONFIG_AUFS=m, then you can set CONFIG_EXPORTFS=m and
CONFIG_AUFS_EXPORT=y.
Also, which aufs patch did you apply?


J. R. Okajima
Santiago Gimeno
2016-05-11 15:39:03 UTC
Hi,
Post by s***@users.sourceforge.net
Post by s***@users.sourceforge.net
I am asking which branch in aufs3-standalone.git.
And if you set CONFIG_AUFS=y and CONFIG_EXPORTFS=m, then you cannot set
CONFIG_AUFS_EXPORT=y.
But if you set CONFIG_AUFS=m, then you can set CONFIG_EXPORTFS=m and
CONFIG_AUFS_EXPORT=y.
Also, which aufs patch did you apply?
I'm applying (in this order)

- aufs3-kbuild.patch
- aufs3-base.patch
- aufs3-mmap.patch
(no standalone patch for the moment)

- Copying the files {Documentation,fs} to the source tree
- cp include/linux/aufs_type.h ./include/linux/aufs_type.h
Santiago Gimeno
2016-05-11 15:41:45 UTC
Post by s***@users.sourceforge.net
But if such incorrect config is really producible, then fs/aufs/Kconfig
should be fixed.
Even if I fix Kconfig, aufs3.16 is not supported anymore as you might
know.
Yes, I totally understand that.

Thanks!

Santiago
Santiago Gimeno
2016-05-19 09:39:19 UTC
Hello again!,

Sorry for the delay, it's been quite a few busy days for me.

After making some more tests I can confirm:

1) Setting the following options:

CONFIG_FHANDLE disabled
CONFIG_AUFS=m
CONFIG_EXPORTFS=m
CONFIG_AUFS_EXPORT=y

and applying all the patches (including the standalone patch) from
aufs3.16 fixes the original issue (the "nfs branch is not exportable" issue)

2) This configuration that fails to build is reproducible for me using a
kernel 3.16.7 with aufs3.16

# CONFIG_FHANDLE is not set
CONFIG_EXPORTFS=m
CONFIG_AUFS_EXPORT=y

Thanks a lot for your help and patience.

Santiago
s***@users.sourceforge.net
2016-05-19 14:10:17 UTC
Post by Santiago Gimeno
2) This configuration that fails to build is reproducible for me using a
kernel 3.16.7 with aufs3.16
# CONFIG_FHANDLE is not set
CONFIG_EXPORTFS=m
CONFIG_AUFS_EXPORT=y
Now I am testing next Monday's aufs release, which contains this commit:

commit b180f236a10e6fab2daa19d5ee708ec2369979e9
Author: J. R. Okajima <***@gmail.com>
Date: Wed May 18 04:58:00 2016 +0900

aufs: build bugfix, unsupport CONFIG_EXPORTFS=m and CONFIG_AUFS=y explicitly

Reported-by: Santiago Gimeno <***@gmail.com>
See-also: http://www.mail-archive.com/aufs-***@lists.sourceforge.net/msg05313.html
Signed-off-by: J. R. Okajima <***@gmail.com>

diff --git a/fs/aufs/export.c b/fs/aufs/export.c
index 08c8637..2ff86c6 100644
--- a/fs/aufs/export.c
+++ b/fs/aufs/export.c
@@ -822,6 +822,11 @@ void au_export_init(struct super_block *sb)
 	struct au_sbinfo *sbinfo;
 	__u32 u;
 
+	BUILD_BUG_ON_MSG(IS_BUILTIN(CONFIG_AUFS_FS)
+			 && IS_MODULE(CONFIG_EXPORTFS),
+			 AUFS_NAME ": unsupported configuration "
+			 "CONFIG_EXPORTFS=m and CONFIG_AUFS_FS=y");
+
 	sb->s_export_op = &aufs_export_op;
 	sbinfo = au_sbi(sb);
 	sbinfo->si_xigen = NULL;
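
The effect of the added BUILD_BUG_ON_MSG() can be imitated outside the kernel with C11's _Static_assert. The following is a user-space analogy only, assuming a C11 compiler; the DEMO_* macros are hypothetical stand-ins for what IS_BUILTIN(CONFIG_AUFS_FS) and IS_MODULE(CONFIG_EXPORTFS) would evaluate to in a real kernel build.

```c
/*
 * Hypothetical stand-ins for the kernel's IS_BUILTIN(CONFIG_AUFS_FS)
 * and IS_MODULE(CONFIG_EXPORTFS) results.
 */
#define DEMO_AUFS_IS_BUILTIN    0       /* 1 would mean CONFIG_AUFS_FS=y */
#define DEMO_EXPORTFS_IS_MODULE 1       /* 1 means CONFIG_EXPORTFS=m */

/* Compilation stops here, with the message, if both values are 1. */
_Static_assert(!(DEMO_AUFS_IS_BUILTIN && DEMO_EXPORTFS_IS_MODULE),
               "unsupported configuration: EXPORTFS=m and AUFS_FS=y");

/* Run-time view of the same condition, for callers that want to report it. */
int demo_config_is_supported(void)
{
        return !(DEMO_AUFS_IS_BUILTIN && DEMO_EXPORTFS_IS_MODULE);
}
```

Changing DEMO_AUFS_IS_BUILTIN to 1 makes the compiler reject the file with the quoted message, which mirrors how the commit turns the confusing link error into an explicit configuration error.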

Santiago Gimeno
2016-05-11 14:59:37 UTC
Now that I think of it, I've noticed the failure in the Debian kernel only;
I haven't tried a complete build of the vanilla kernel yet (build in
progress now). As I saw the failure in the Debian kernel, and the
configuration CONFIG_EXPORTFS=m appeared in both, I didn't bother to run
it. I'll come back with the results once the build ends. Sorry for the
trouble.

Best regards,

Santiago
s***@users.sourceforge.net
2016-05-11 14:37:49 UTC
Post by s***@users.sourceforge.net
Post by Santiago Gimeno
I'm running make menuconfig. If I disable CONFIG_FHANDLE in
General setup ---> open by fhandle syscalls
The .config file generated contains CONFIG_EXPORTFS=m,
And how did you get the aufs source files?
If it is aufs[34]-linux.git instead of aufs[34]-standalone.git, then you
cannot select =m for CONFIG_AUFS and such error will happen obviously.
I've tried aufs4-linux.git#aufs4.4.
By disabling CONFIG_FHANDLE, .config changed like this.

--- .config 2016-05-11 23:26:48.000000000 +0900
+++ .config 2016-05-11 23:34:40.000000000 +0900
@@ -71,7 +71,7 @@
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
-CONFIG_FHANDLE=y
+# CONFIG_FHANDLE is not set
# CONFIG_USELIB is not set
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
@@ -2336,7 +2336,7 @@
# CONFIG_F2FS_IO_TRACE is not set
CONFIG_FS_DAX=y
CONFIG_FS_POSIX_ACL=y
-CONFIG_EXPORTFS=y
+CONFIG_EXPORTFS=m
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
@@ -2454,8 +2454,6 @@
CONFIG_AUFS_SBILIST=y
CONFIG_AUFS_HNOTIFY=y
CONFIG_AUFS_HFSNOTIFY=y
-CONFIG_AUFS_EXPORT=y
-CONFIG_AUFS_INO_T_64=y
CONFIG_AUFS_XATTR=y
CONFIG_AUFS_FHSM=y
CONFIG_AUFS_RDU=y

As long as CONFIG_AUFS_EXPORT is disabled, fs/aufs/export.c won't be compiled.


J. R. Okajima
Santiago Gimeno
2016-05-11 14:44:40 UTC
Hello,
Post by s***@users.sourceforge.net
I've tried aufs4-linux.git#aufs4.4.
By disabling CONFIG_FHANDLE, .config changed like this.
--- .config 2016-05-11 23:26:48.000000000 +0900
+++ .config 2016-05-11 23:34:40.000000000 +0900
@@ -71,7 +71,7 @@
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
-CONFIG_FHANDLE=y
+# CONFIG_FHANDLE is not set
# CONFIG_USELIB is not set
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
@@ -2336,7 +2336,7 @@
# CONFIG_F2FS_IO_TRACE is not set
CONFIG_FS_DAX=y
CONFIG_FS_POSIX_ACL=y
-CONFIG_EXPORTFS=y
+CONFIG_EXPORTFS=m
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
@@ -2454,8 +2454,6 @@
CONFIG_AUFS_SBILIST=y
CONFIG_AUFS_HNOTIFY=y
CONFIG_AUFS_HFSNOTIFY=y
-CONFIG_AUFS_EXPORT=y
-CONFIG_AUFS_INO_T_64=y
CONFIG_AUFS_XATTR=y
CONFIG_AUFS_FHSM=y
CONFIG_AUFS_RDU=y
This is exactly what is happening to me in 3.16. With this configuration,
the build fails :(.