Discussion:
Aufs returns EOPNOTSUPP when writing in the middle of a file on XFS
Vasily Tarasov
2017-06-07 20:13:36 UTC
Hi J. R. and others,

I'm using aufs in a Docker environment and am experiencing the following problem:

1. Writing in the *middle* of a file fails from inside the container with
an EOPNOTSUPP ("Operation not supported") error. Writes work without errors
if I overwrite the whole file from the start.
2. The underlying file system is XFS. If I use Ext4, it works.
3. My aufs version is 4.x-rcN-20170522, and the kernel version is 4.11.3
(see more details at the end of this e-mail).
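
A quick way to double-check point 2 is to ask the kernel which filesystem backs a directory. The following is only a minimal sketch: backing_fs() and the two FS_MAGIC_* names are made up for illustration, and the magic numbers are the well-known XFS and ext2/3/4 superblock values. It should be pointed at the branch directory on the host (for example /var/lib/docker/aufs), since a path inside the container reports the aufs mount itself.

```c
#include <stddef.h>
#include <sys/vfs.h>	/* statfs(2), Linux-specific */

/* Hypothetical local names for the well-known superblock magic numbers. */
#define FS_MAGIC_XFS 0x58465342L	/* "XFSB" */
#define FS_MAGIC_EXT 0xEF53L		/* shared by ext2, ext3 and ext4 */

/*
 * Return a short name for the filesystem that holds `path`,
 * or NULL if statfs(2) fails (errno is left set by statfs).
 */
static const char *backing_fs(const char *path)
{
	struct statfs st;

	if (statfs(path, &st) < 0)
		return NULL;
	if ((long)st.f_type == FS_MAGIC_XFS)
		return "xfs";
	if ((long)st.f_type == FS_MAGIC_EXT)
		return "ext2/3/4";
	return "other";
}
```

Calling backing_fs("/var/lib/docker/aufs") on the host described here would be expected to return "xfs", but that depends on the actual setup.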

Before I dig deeper, I wonder whether you have any ideas about what the reason could be?

Some setup details (sorry for not providing more):

$ cat /sys/module/aufs/version
4.x-rcN-20170522

$ uname -a
Linux localhost 4.11.3-1.el7.centos.x86_64 #1 SMP Sat May 27 00:19:26 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

$ rpm -qa | grep 4.11.3-1.el7.centos.x86_64
kernel-ml-aufs-tools-libs-devel-4.11.3-1.el7.centos.x86_64
kernel-ml-aufs-4.11.3-1.el7.centos.x86_64
kernel-ml-aufs-tools-4.11.3-1.el7.centos.x86_64
kernel-ml-aufs-headers-4.11.3-1.el7.centos.x86_64
kernel-ml-aufs-tools-libs-4.11.3-1.el7.centos.x86_64
kernel-ml-aufs-devel-4.11.3-1.el7.centos.x86_64

$ cat /boot/config-4.11.3-1.el7.centos.x86_64 | grep AUFS
CONFIG_AUFS_FS=y
CONFIG_AUFS_BRANCH_MAX_127=y
# CONFIG_AUFS_BRANCH_MAX_511 is not set
# CONFIG_AUFS_BRANCH_MAX_1023 is not set
# CONFIG_AUFS_BRANCH_MAX_32767 is not set
CONFIG_AUFS_SBILIST=y
# CONFIG_AUFS_HNOTIFY is not set
CONFIG_AUFS_EXPORT=y
CONFIG_AUFS_INO_T_64=y
CONFIG_AUFS_XATTR=y
CONFIG_AUFS_FHSM=y
# CONFIG_AUFS_RDU is not set
# CONFIG_AUFS_SHWH is not set
CONFIG_AUFS_BR_RAMFS=y
CONFIG_AUFS_BR_FUSE=y
CONFIG_AUFS_POLL=y
# CONFIG_AUFS_BR_HFSPLUS is not set
CONFIG_AUFS_BDEV_LOOP=y
# CONFIG_AUFS_DEBUG is not set

Thank you!

Vasily
Vasily Tarasov
2017-06-07 21:44:22 UTC
Hi J.R., thank you for the quick reply.

1) Here is strace output:

pwrite(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
4096, 532480) = -1 EOPNOTSUPP (Operation not supported)

So it is the pwrite() syscall that fails for me.
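
For anyone who wants to try the same call pattern outside of the original application, here is a minimal sketch in C. It is not the program that produced the strace output above; write_in_middle() is a made-up helper, and whether it triggers the error depends on the aufs version, the branch filesystem and the file being written.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Overwrite `len` zero bytes at `offset` in an existing file, similar to
 * the failing pwrite() in the strace output. Returns 0 on success or a
 * negative errno value (for example -EOPNOTSUPP) on failure.
 */
static int write_in_middle(const char *path, off_t offset, size_t len)
{
	char buf[4096];
	int fd, err = 0;
	ssize_t n;

	if (len > sizeof(buf))
		return -EINVAL;
	memset(buf, 0, sizeof(buf));

	fd = open(path, O_WRONLY);	/* no O_TRUNC: existing data is kept */
	if (fd < 0)
		return -errno;

	n = pwrite(fd, buf, len, offset);
	if (n < 0)
		err = -errno;
	else if ((size_t)n != len)
		err = -EIO;		/* short write, treated as an error here */

	if (close(fd) < 0 && !err)
		err = -errno;
	return err;
}
```

The values from the strace output correspond to write_in_middle(path, 532480, 4096), where path is a file inside the container that already exists in a lower (read-only) branch.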

2) Here is why I'm using aufs4.x-rcN. I run CentOS 7.3, which does not
support aufs natively. I did not want to spend time compiling my own
kernel with aufs support, so I used a pre-compiled kernel with aufs
support from here:

https://github.com/bnied/kernel-ml-aufs (sources and procedure)
https://yum.spaceduck.org/kernel-ml-aufs/7/x86_64/ (binary RPMs themselves).

And the people who maintain this RPM seem to use 4.x-rcN, not 4.11.

3) Quick, somewhat related, question. I do not see aufs4.11 branch in

https://github.com/sfjro/aufs4-linux

Do I look in the wrong place?

4) Here is some additional information that you requested:

# cat /proc/mounts
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=49409204k,nr_inodes=12352301,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/mapper/centos_b-root / xfs rw,relatime,attr2,inode64,noquota 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=36,pgrp=1,timeout=300,minproto=5,maxproto=5,direct,pipe_ino=37070 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/sdc2 /boot xfs rw,relatime,attr2,inode64,noquota 0 0
/dev/sdc1 /boot/efi vfat rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro 0 0
/dev/mapper/centos_b-home /home xfs rw,relatime,attr2,inode64,noquota 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
tmpfs /run/user/54948 tmpfs rw,nosuid,nodev,relatime,size=9884224k,mode=700,uid=54948,gid=100 0 0
/dev/sda /var/lib/docker xfs rw,relatime,inode64,noquota 0 0
/dev/sda /var/lib/docker/aufs xfs rw,relatime,inode64,noquota 0 0
none /var/lib/docker/aufs/mnt/6032d552977ac12c689ee7a038f1f90b48c239aa634b73cbc58bab41edeccba8 aufs rw,relatime,si=c153134408543131,dio,dirperm1 0 0
shm /var/lib/docker/containers/623fd99d7802cb01fc9e8f44dd0ee52ebe353ae5d704e4436c95fc1e204f3827/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k 0 0
nsfs /run/docker/netns/6c297567a4e0 nsfs rw 0 0

# ls /sys/module/aufs/*
/sys/module/aufs/uevent /sys/module/aufs/version

/sys/module/aufs/parameters:
allow_userns brs

# ls /sys/module/aufs/version
/sys/module/aufs/version

# cat /sys/module/aufs/version
4.x-rcN-20170522

# cat /sys/module/aufs/parameters/brs
1
# cat /sys/module/aufs/parameters/allow_userns
N


# ls /sys/fs/aufs/
si_c153134408543131

# ls /sys/fs/aufs/si_c153134408543131/
br0 br1 br2 br3 brid0 brid1 brid2 brid3 xi_path

# uname -a
Linux b 4.11.3-1.el7.centos.x86_64 #1 SMP Sat May 27 00:19:26 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

# dmesg | grep aufs
[ 6.713336] aufs 4.x-rcN-20170522
[77671.824164] aufs au_opts_verify:1585:dockerd[11934]: dirperm1 breaks the protection by the permission bits on the lower branch

Thank you!

Vasily


On Wed, Jun 7, 2017 at 2:10 PM, <***@users.sourceforge.net> wrote:

> Hello Vasily,
>
> Vasily Tarasov:
> > 1. Writing in the *middle* of a file fails from inside the container with
> > an EOPNOTSUPP ("Operation not supported") error. Writes work without errors
> > if I overwrite the whole file from the start.
> > 2. The underlying file system is XFS. If I use Ext4, it works.
> > 3. My aufs version is 4.x-rcN-20170522, and the kernel version is 4.11.3
> > (see more details at the end of this e-mail).
>
> - Do you mean the open, lseek, and write system calls? If so, try strace and
> find which system call returns the error.
> - Why do you use aufs4.x-rcN? It is not for the v4.11 series. Use aufs4.11
> for your kernel version.
>
> Last but not least, please provide this info.
>
> (from aufs README file)
> ----------------------------------------
> When you have any problems or strange behaviour in aufs, please let me
> know with:
> - /proc/mounts (instead of the output of mount(8))
> - /sys/module/aufs/*
> - /sys/fs/aufs/* (if you have them)
> - /debug/aufs/* (if you have them)
> - linux kernel version
> if your kernel is not plain, for example modified by distributor,
> the url where i can download its source is necessary too.
> - aufs version which was printed at loading the module or booting the
> system, instead of the date you downloaded.
> - configuration (define/undefine CONFIG_AUFS_xxx)
> - kernel configuration or /proc/config.gz (if you have it)
> - behaviour which you think to be incorrect
> - actual operation, reproducible one is better
> - mailto: aufs-users at lists.sourceforge.net
> ----------------------------------------
>
>
> J. R. Okajima
>
>
Vasily Tarasov
2017-06-08 00:56:52 UTC
Thank you, J. R.!

Here is the content of the files:

# for f in /sys/fs/aufs/si_c15313440863e131/*; do echo $f; cat $f; done
/sys/fs/aufs/si_c15313440863e131/br0
/var/lib/docker/aufs/diff/7695f20e87e2346b7100f466beff845a0e79880945ca5f503bd3c29bfbb6ca68=rw
/sys/fs/aufs/si_c15313440863e131/br1
/var/lib/docker/aufs/diff/7695f20e87e2346b7100f466beff845a0e79880945ca5f503bd3c29bfbb6ca68-init=ro+wh
/sys/fs/aufs/si_c15313440863e131/br2
/var/lib/docker/aufs/diff/b9eb592be093122328b53fa360619755ef85cd26cd93abfd62676e923ba408cf=ro+wh
/sys/fs/aufs/si_c15313440863e131/br3
/var/lib/docker/aufs/diff/1964cd7e003935516b5374e1509556fca29919695a2c452ceb77b2ecf217ed2a=ro+wh
/sys/fs/aufs/si_c15313440863e131/brid0
64
/sys/fs/aufs/si_c15313440863e131/brid1
65
/sys/fs/aufs/si_c15313440863e131/brid2
66
/sys/fs/aufs/si_c15313440863e131/brid3
67
/sys/fs/aufs/si_c15313440863e131/xi_path
/dev/shm/aufs.xino

Best,
Vasily

On Wed, Jun 7, 2017 at 4:20 PM, <***@users.sourceforge.net> wrote:

> Vasily Tarasov:
> > 1) Here is strace output:
> >
> > pwrite(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> > 4096, 532480) = -1 EOPNOTSUPP (Operation not supported)
>
> Ok, pwrite(2) instead of lseek + write.
>
>
> > 3) Quick, somewhat related, question. I do not see aufs4.11 branch in
> >
> > https://github.com/sfjro/aufs4-linux
> >
> > Do I look in the wrong place?
>
> Ah, sorry.
> aufs4.11 is not released yet because of a bug in mainline.
> I know that a patch already exists, but it is not merged into v4.11.x
> series yet.
> Anyway aufs4.x-rcN-20170522 should work for you.
>
>
> > # ls /sys/fs/aufs/si_c153134408543131/
> > br0 br1 br2 br3 brid0 brid1 brid2 brid3 xi_path
>
> I need the contents of these files.
>
> I am going to dive into XFS in v4.11.3 and
> https://github.com/bnied/kernel-ml-aufs (sources and procedure)
>
>
> J. R. Okajima
>
>
Vasily Tarasov
2017-06-08 23:11:46 UTC
Thank you, J.R., that was fast! :)

Just to clarify, by "the fix won't be included in next Monday release" you
mean that you won't push this change to the public git until then?

Sorry, I'm not too familiar with Aufs release policy...

Best,
Vasily

On Thu, Jun 8, 2017 at 3:48 PM, <***@users.sourceforge.net> wrote:

> Vasily Tarasov:
> > Here is the content of the files:
>
> Ok, thanx.
> I could reproduce the problem on my local aufs4.11 (not 4.11.3), and
> here is a fix for you.
> Even if everything goes well, the fix won't be included in next Monday
> release since it is too late.
>
>
> J. R. Okajima
>
> commit aa77e32fdb0a56fed2d3120055f01c39cecb8add
> Author: J. R. Okajima <***@gmail.com>
> Date: Fri Jun 9 07:44:36 2017 +0900
>
> aufs: bugfix, copy-up using clone_file_range() on XFS branch
>
> XFS has its version and older one doesn't support clone_file_range(). It
> simply returns EOPNOTSUPP. Additionally XFS acquires inode_lock for a
> normal read().
> This commit fixes the error path after trying clone_file_range().
>
> Reported-by: Vasily Tarasov <***@vasily.name>
> See-also: http://www.mail-archive.com/aufs-***@lists.sourceforge.net/msg05498.html
> Signed-off-by: J. R. Okajima <***@gmail.com>
>
> diff --git a/fs/aufs/cpup.c b/fs/aufs/cpup.c
> index 125fef6..5c51d94 100644
> --- a/fs/aufs/cpup.c
> +++ b/fs/aufs/cpup.c
> @@ -352,6 +352,41 @@ int au_copy_file(struct file *dst, struct file *src, loff_t len)
>          return err;
>  }
>
> +static int au_clone_or_copy(struct file *dst, struct file *src, loff_t len)
> +{
> +        int err;
> +        struct super_block *h_src_sb;
> +        struct inode *h_src_inode;
> +
> +        h_src_inode = file_inode(src);
> +        h_src_sb = h_src_inode->i_sb;
> +        if (!au_test_nfs(h_src_sb)) {
> +                inode_unlock(h_src_inode);
> +                err = vfsub_clone_file_range(src, dst, len);
> +                inode_lock(h_src_inode);
> +        } else
> +                err = vfsub_clone_file_range(src, dst, len);
> +        /* older XFS has a condition in cloning */
> +        if (err != -EOPNOTSUPP)
> +                goto out;
> +
> +        /*
> +         * XFS acquires inode_lock.
> +         * the backend fs on NFS may not support cloning.
> +         */
> +        if (!au_test_xfs(h_src_sb))
> +                err = au_copy_file(dst, src, len);
> +        else {
> +                inode_unlock(h_src_inode);
> +                err = au_copy_file(dst, src, len);
> +                inode_lock(h_src_inode);
> +        }
> +
> +out:
> +        AuTraceErr(err);
> +        return err;
> +}
> +
>  /*
>   * to support a sparse file which is opened with O_APPEND,
>   * we need to close the file.
> @@ -406,21 +441,9 @@ static int au_cp_regular(struct au_cp_generic *cpg)
>          if (h_src_sb != file_inode(file[DST].file)->i_sb
>              || !file[DST].file->f_op->clone_file_range)
>                  err = au_copy_file(file[DST].file, file[SRC].file, cpg->len);
> -        else {
> -                if (!au_test_nfs(h_src_sb)) {
> -                        inode_unlock(h_src_inode);
> -                        err = vfsub_clone_file_range(file[SRC].file,
> -                                                     file[DST].file, cpg->len);
> -                        inode_lock(h_src_inode);
> -                } else
> -                        err = vfsub_clone_file_range(file[SRC].file,
> -                                                     file[DST].file, cpg->len);
> -                if (unlikely(err == -EOPNOTSUPP && au_test_nfs(h_src_sb)))
> -                        /* the backend fs on NFS may not support cloning */
> -                        err = au_copy_file(file[DST].file, file[SRC].file,
> -                                           cpg->len);
> -                AuTraceErr(err);
> -        }
> +        else
> +                err = au_clone_or_copy(file[DST].file, file[SRC].file,
> +                                       cpg->len);
>
>          /* i wonder if we had O_NO_DELAY_FPUT flag */
>          if (tsk->flags & PF_KTHREAD)
>
>
Vasily Tarasov
2017-06-09 00:16:41 UTC
Got it, thank you!

Vasily

On Thu, Jun 8, 2017 at 5:10 PM, <***@users.sourceforge.net> wrote:

> Vasily Tarasov:
> > Just to clarify, by "the fix won't be included in next Monday release" you
> > mean that you won't push this change to the public git until then?
> >
> > Sorry, I'm not too familiar with Aufs release policy...
>
> Basically I release, or used to release, aufs every Monday. Needless to
> say, when nothing has changed, the release is skipped. I test every
> release, which takes several days.
>
> Now I am testing the patch I've sent (and found a problem). The whole
> test won't complete before next Monday. That's why "the fix won't be
> included in next Monday release."
>
> Here is a refined one. If you build the aufs module yourself, use this.
>
>
> J. R. Okajima
>
> commit eaa5a0db8e6977c884cb0d3c99f30c91d3aaba50
> Author: J. R. Okajima <***@gmail.com>
> Date: Fri Jun 9 07:44:36 2017 +0900
>
> aufs: bugfix, copy-up using clone_file_range() on XFS branch
>
> XFS has its version and older one doesn't support clone_file_range(). It
> simply returns EOPNOTSUPP. Additionally XFS acquires inode_lock for a
> normal read().
> This commit fixes the error path after trying clone_file_range().
>
> Reported-by: Vasily Tarasov <***@vasily.name>
> See-also: http://www.mail-archive.com/aufs-***@lists.sourceforge.net/msg05498.html
> Signed-off-by: J. R. Okajima <***@gmail.com>
>
> diff --git a/fs/aufs/cpup.c b/fs/aufs/cpup.c
> index 125fef6..dd9f90a 100644
> --- a/fs/aufs/cpup.c
> +++ b/fs/aufs/cpup.c
> @@ -352,6 +352,59 @@ int au_copy_file(struct file *dst, struct file *src, loff_t len)
>          return err;
>  }
>
> +static int au_do_copy(struct file *dst, struct file *src, loff_t len)
> +{
> +        int err;
> +        struct super_block *h_src_sb;
> +        struct inode *h_src_inode;
> +
> +        h_src_inode = file_inode(src);
> +        h_src_sb = h_src_inode->i_sb;
> +
> +        /* XFS acquires inode_lock */
> +        if (!au_test_xfs(h_src_sb))
> +                err = au_copy_file(dst, src, len);
> +        else {
> +                inode_unlock(h_src_inode);
> +                err = au_copy_file(dst, src, len);
> +                inode_lock(h_src_inode);
> +        }
> +
> +        return err;
> +}
> +
> +static int au_clone_or_copy(struct file *dst, struct file *src, loff_t len)
> +{
> +        int err;
> +        struct super_block *h_src_sb;
> +        struct inode *h_src_inode;
> +
> +        h_src_inode = file_inode(src);
> +        h_src_sb = h_src_inode->i_sb;
> +        if (h_src_sb != file_inode(dst)->i_sb
> +            || !dst->f_op->clone_file_range) {
> +                err = au_do_copy(dst, src, len);
> +                goto out;
> +        }
> +
> +        if (!au_test_nfs(h_src_sb)) {
> +                inode_unlock(h_src_inode);
> +                err = vfsub_clone_file_range(src, dst, len);
> +                inode_lock(h_src_inode);
> +        } else
> +                err = vfsub_clone_file_range(src, dst, len);
> +        /* older XFS has a condition in cloning */
> +        if (unlikely(err != -EOPNOTSUPP))
> +                goto out;
> +
> +        /* the backend fs on NFS may not support cloning */
> +        err = au_do_copy(dst, src, len);
> +
> +out:
> +        AuTraceErr(err);
> +        return err;
> +}
> +
>  /*
>   * to support a sparse file which is opened with O_APPEND,
>   * we need to close the file.
> @@ -402,25 +455,7 @@ static int au_cp_regular(struct au_cp_generic *cpg)
>          h_src_sb = h_src_inode->i_sb;
>          if (!au_test_nfs(h_src_sb))
>                  IMustLock(h_src_inode);
> -
> -        if (h_src_sb != file_inode(file[DST].file)->i_sb
> -            || !file[DST].file->f_op->clone_file_range)
> -                err = au_copy_file(file[DST].file, file[SRC].file, cpg->len);
> -        else {
> -                if (!au_test_nfs(h_src_sb)) {
> -                        inode_unlock(h_src_inode);
> -                        err = vfsub_clone_file_range(file[SRC].file,
> -                                                     file[DST].file, cpg->len);
> -                        inode_lock(h_src_inode);
> -                } else
> -                        err = vfsub_clone_file_range(file[SRC].file,
> -                                                     file[DST].file, cpg->len);
> -                if (unlikely(err == -EOPNOTSUPP && au_test_nfs(h_src_sb)))
> -                        /* the backend fs on NFS may not support cloning */
> -                        err = au_copy_file(file[DST].file, file[SRC].file,
> -                                           cpg->len);
> -                AuTraceErr(err);
> -        }
> +        err = au_clone_or_copy(file[DST].file, file[SRC].file, cpg->len);
>
>          /* i wonder if we had O_NO_DELAY_FPUT flag */
>          if (tsk->flags & PF_KTHREAD)
>
>
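
The idea behind the patch (try to clone, and fall back to an ordinary copy when the filesystem answers EOPNOTSUPP) can be illustrated from user space. The following is only a rough analogue, not aufs code: clone_or_copy(), copy_by_rw() and clone_unsupported() are made-up names, the FICLONE ioctl stands in for the in-kernel clone_file_range() call, and the inode locking that the patch handles has no counterpart here. Falling back on errno values other than EOPNOTSUPP is an assumption added for portability and is not something the patch does.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>	/* FICLONE */
#include <sys/ioctl.h>
#include <unistd.h>

/* Plain read/write copy of all of src_fd into dst_fd (assumed empty). */
static int copy_by_rw(int dst_fd, int src_fd)
{
	char buf[65536];
	ssize_t n;
	off_t off = 0;

	while ((n = pread(src_fd, buf, sizeof(buf), off)) > 0) {
		ssize_t done = 0;

		while (done < n) {
			ssize_t w = pwrite(dst_fd, buf + done, n - done,
					   off + done);
			if (w < 0)
				return -errno;
			done += w;
		}
		off += n;
	}
	return n < 0 ? -errno : 0;
}

/*
 * EOPNOTSUPP is the case discussed in this thread. The other values are
 * an assumption about what other kernels or filesystems may report.
 */
static int clone_unsupported(int err)
{
	return err == EOPNOTSUPP || err == ENOTTY || err == ENOSYS
		|| err == EXDEV || err == EINVAL;
}

/*
 * Try a reflink clone first; if cloning is not supported, copy instead
 * of failing. *cloned is set to 1 if the clone worked, 0 if the copy ran.
 * Returns 0 on success or a negative errno value.
 */
static int clone_or_copy(int dst_fd, int src_fd, int *cloned)
{
	*cloned = 0;
	if (ioctl(dst_fd, FICLONE, src_fd) == 0) {
		*cloned = 1;
		return 0;
	}
	if (!clone_unsupported(errno))
		return -errno;
	return copy_by_rw(dst_fd, src_fd);
}
```

On a filesystem without reflink support the ioctl is expected to fail and the copy path runs, which is the behaviour the refined patch restores for XFS branches.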