Programming odds and ends — InfiniBand, RDMA, and low-latency networking for now.

Introducing s3fuse, a FUSE driver for Amazon S3

I temporarily lost access to some data not long ago as a result of an unplanned outage, and the incident woke me up to the utility of offsite backup. I wanted something I could mount as a local file system under Linux, that I could access over the Web, and that was backed by a reasonably reliable storage infrastructure. There are some decent options out there; I eventually settled on Jungle Disk, which has Windows, Linux, and MacOS clients, as well as a Web client. Unfortunately, after a few weeks of use, an outage resulted in the loss of a non-trivial chunk of my Jungle Disk data. This prompted me to look into using Amazon's S3 directly (rather than through Jungle Disk). Several FUSE drivers for S3 already exist, but I wanted something written from the ground up with support for concurrent requests and extended attributes, and with a directory structure compatible with what Amazon's S3 Web client expects. I also wanted to learn about Amazon Web Services and libcurl.

The end result is s3fuse, my FUSE driver for Amazon S3. It’s very much alpha-level, but it has the features I need and is reliable enough for my purposes. Try it out, feel free to comment, make changes, and report bugs.


44 responses

  1. s3fs's lack of support for folders created with other software makes it unusable for me, so s3fuse sounds very promising. Unfortunately, I couldn't build it where I need it most – on Amazon Linux.

    configure: error: Package requirements (libxml-2.0 >= 2.7.8 libxml++-2.6 >= 2.30.0) were not met:
    Requested 'libxml-2.0 >= 2.7.8' but version of libXML is 2.7.6
    No package ‘libxml++-2.6’ found

    Amazon doesn't have libxml++ in its repository at all, and only version 2.7.6 of libXML 😦

    September 30, 2011 at 2:32 pm

    • libxml++ is available in EPEL, which you should be able to use from the Linux AMI. As for the version numbers — I’ve not tested this, but you should be able to modify configure.ac at line 25 to version 2.7.6 for libxml2 and 2.20 for libxml++-2.6. Run autoreconf --force --install afterwards, then ./configure. Let me know how it goes!

      October 1, 2011 at 2:56 pm

      • # yum install libxml++

        Error: Package: libxml++-2.30.0-1.el6.i686 (epel)
        Requires: libglibmm-2.4.so.1
        Error: Package: libxml++-2.30.0-1.el6.i686 (epel)
        Requires: libsigc-2.0.so.0

        # yum install libsigc
        Loaded plugins: fastestmirror, priorities, security, update-motd
        Loading mirror speeds from cached hostfile
        * amzn-main: packages.us-east-1.amazonaws.com
        * amzn-updates: packages.us-east-1.amazonaws.com
        * epel: mirror.itc.virginia.edu
        223 packages excluded due to repository priority protections
        Setting up Install Process
        No package libsigc available.
        Error: Nothing to do

        October 1, 2011 at 6:21 pm

        • You could try installing the missing packages manually.

          October 4, 2011 at 9:08 pm

            I did manage to install libsigc and libglibmm (and *-devel) manually, and libxml++-devel using yum.

            autoreconf didn't work, because Amazon Linux has autoconf version 2.63 and configure.ac requires 2.67, so I had to manually edit configure.

            Next I had to install boost-devel (configure didn't mention that boost is required). Amazon Linux has boost-1.41, and s3fuse seems to require some later version, because I get undefined references to `boost::thread::join()' etc.

            I'm not ready to download and upgrade so many packages each time I start an instance. It would be nice to have a version of s3fuse that is compatible with the Amazon versions of libxml, boost, etc. A binary RPM would be just great.

            October 4, 2011 at 10:21 pm

          • So here's what I had to do:
            1) change configure.ac:
            -AC_PREREQ([2.67])
            +AC_PREREQ([2.63])
            -AC_CHECK_LIB([boost_thread], [main])
            +AC_CHECK_LIB([boost_thread-mt], [thread_proxy])
            -PKG_CHECK_MODULES([DEPS], [libxml-2.0 >= 2.7.6 libxml++-2.6 >= 2.30.0])
            +PKG_CHECK_MODULES([DEPS], [libxml-2.0 >= 2.7.6 libxml++-2.6 >= 2.28.0])
            2) autoreconf -i:
            I had to run it twice: the first run gave 3 warnings like `configure.ac:2: warning: AC_INIT: not a literal: m4_esyscmd_s(echo $VERSION)' and aborted with an error. The second run produced 4 such warnings, but didn't abort.

            Unfortunately, it still doesn’t work:
            init: caught exception while initializing: curl reports unsupported non-OpenSSL SSL library. cannot continue.

            Amazon Linux uses NSS instead of OpenSSL…

            October 10, 2011 at 9:32 pm
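For context, the init failure above comes from a guard that checks which SSL backend libcurl was built against: s3fuse's ssl_locks code registers OpenSSL locking callbacks, which only make sense when libcurl actually uses OpenSSL. libcurl reports its backend as a string like "OpenSSL/1.0.1e" or "NSS/3.21" (via `curl_version_info(CURLVERSION_NOW)->ssl_version`). A minimal sketch of that kind of check, with an illustrative function name rather than the actual s3fuse code:

```cpp
#include <string>

// Sketch only: decide whether libcurl's reported SSL backend is OpenSSL.
// In the real driver the string would come from
// curl_version_info(CURLVERSION_NOW)->ssl_version.
bool is_openssl_backend(const std::string &ssl_version)
{
  // libcurl prefixes the backend name, e.g. "OpenSSL/1.0.1e" or "NSS/3.21",
  // so a prefix comparison is enough.
  return ssl_version.compare(0, 7, "OpenSSL") == 0;
}
```

A libcurl built against NSS, as on Amazon Linux, fails this test, which is why init bails out rather than registering locking callbacks that would never be used.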

  2. It seems to be leaking file descriptors. Of course it might be a file cache, but I couldn't find any options controlling cache size. After I read a hundred files from an s3fuse fs, the s3fuse process has one open file descriptor for each file that has been read, and the free disk space also decreases by the amount corresponding to the total size of the files that have been read.

    Here's what the leaked fds look like:

    ls -laps /proc/25879/fd/1??

    0 lrwx------ 1 root root 64 Mar 5 20:14 /proc/25879/fd/145 -> /tmp/s3fuse.local-6llfW8 (deleted)
    0 lrwx------ 1 root root 64 Mar 5 20:14 /proc/25879/fd/146 -> /tmp/s3fuse.local-zoNlSM (deleted)
    0 lrwx------ 1 root root 64 Mar 5 20:14 /proc/25879/fd/147 -> /tmp/s3fuse.local-QRQr1A (deleted)

    March 5, 2012 at 2:19 pm

    • Good catch — I must have introduced a regression at some point with my open_file changes. I’ve fixed this in the trunk, but haven’t built new packages yet. What did you change to get your Amazon RPMs to build? I can incorporate those changes into the next set of packages I build.

      March 10, 2012 at 5:03 pm
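The usual way to make this class of leak impossible is to tie the temporary file's lifetime to an object, as in this sketch (illustrative code, not the actual open_file implementation):

```cpp
#include <cstdlib>
#include <fcntl.h>
#include <unistd.h>

// Sketch of an RAII wrapper for an anonymous temporary file: the backing
// file is unlinked immediately after creation, and the descriptor is closed
// in the destructor, so a missed manual close() can no longer leave
// "(deleted)" entries behind in /proc/<pid>/fd.
class temp_file
{
public:
  temp_file()
  {
    char name[] = "/tmp/s3fuse.XXXXXX";  // template; mkstemp fills it in
    fd_ = mkstemp(name);
    if (fd_ != -1)
      unlink(name);  // the file now lives only as long as the descriptor
  }

  ~temp_file()
  {
    if (fd_ != -1)
      close(fd_);
  }

  int fd() const { return fd_; }

private:
  temp_file(const temp_file &);             // non-copyable
  temp_file &operator=(const temp_file &);  // (pre-C++11 style)

  int fd_;
};
```

Because the file is unlinked right after creation, its data disappears as soon as the last descriptor is closed; that is exactly the "(deleted)" state visible in /proc above, and the destructor guarantees the close happens.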

      • Thanks, that's great; reverting to s3fs was not an option for me, given the ugly way it handles directories.

        I only had to change configure.ac, and I guess my modified version only works with Amazon Linux/RHEL6.

        Index: configure.ac
        ===================================================================
        --- configure.ac (revision 189)
        +++ configure.ac (working copy)
        @@ -1,4 +1,6 @@
        -AC_PREREQ([2.67])
        +AC_PREREQ([2.63])
        +m4_define([m4_chomp_all], [m4_format([[%.*s]], m4_bregexp(m4_translit([[$1]], [/], [/ ]), [/*$]), [$1])])
        +m4_define([m4_esyscmd_s], [m4_chomp_all(m4_esyscmd([$1]))])
        AC_INIT([FUSE Driver for Amazon S3], m4_esyscmd_s([echo $VERSION]), [], [s3fuse])
        AM_INIT_AUTOMAKE([no-define foreign])
        AC_CONFIG_SRCDIR([src/logger.h])
        @@ -18,11 +20,12 @@
        AM_CONDITIONAL([BUILD_RPM], [test x$build_rpm = xtrue])

        AC_CHECK_LIB([boost_thread], [main])
        +AC_CHECK_LIB([boost_thread-mt], [thread_proxy])
        AC_CHECK_LIB([curl], [Curl_perform])
        AC_CHECK_LIB([fuse], [fuse_opt_parse])
        AC_CHECK_LIB([pthread], [pthread_create])

        -PKG_CHECK_MODULES([DEPS], [libxml-2.0 >= 2.7.8 libxml++-2.6 >= 2.30.0])
        +PKG_CHECK_MODULES([DEPS], [libxml-2.0 >= 2.7.6 libxml++-2.6 >= 2.28.0])

        AC_CHECK_HEADERS([inttypes.h stdint.h stdlib.h string.h sys/time.h syslog.h])

        March 10, 2012 at 5:11 pm

        • Thanks. I removed the m4_esyscmd_s dependency altogether, and changed the package versions to match yours. You should be able to fetch the latest sources and build an RPM without any modification. Let me know if this works.

          March 10, 2012 at 6:09 pm

          • Works perfectly out-of-the box. Thanks again!

            March 10, 2012 at 6:25 pm

          • Fantastic! I’m glad it works.

            March 10, 2012 at 6:33 pm

  3. Oops, one more problem – wanted to test it with Google Storage, and s3fuse_gs_get_token crashes after I enter the token… That's because I have NSS instead of OpenSSL, so ssl_locks init didn't initialize the locks, but teardown tries to destroy them.

    Here’s the fix:

    Index: src/ssl_locks.cc
    ===================================================================
    --- src/ssl_locks.cc (revision 192)
    +++ src/ssl_locks.cc (working copy)
    @@ -88,6 +88,9 @@

    void teardown()
    {
    + if (NULL == s_openssl_locks)
    + return;
    +
    CRYPTO_set_id_callback(NULL);
    CRYPTO_set_locking_callback(NULL);

    March 10, 2012 at 7:32 pm

  4. And one more problem: I remove a directory with 4 files in it and try to copy them again, but often some of the files end up somewhere between existing and not existing. They aren't visible in other tools, so I guess this is some caching problem.

    [root@ip-10-85-33-77 RPMS]# rm -rf /mnt/s3.cuetools.net/RPMS/repodata
    [root@ip-10-85-33-77 RPMS]# cp -r repodata/ /mnt/s3.cuetools.net/RPMS/
    cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz': File exists
    [root@ip-10-85-33-77 RPMS]# ll /mnt/s3.cuetools.net/RPMS/repodata
    total 8
    -rw-r--r-- 1 root root 2045 Mar 11 01:21 other.xml.gz
    -rw-r--r-- 1 root root 4138 Mar 11 01:21 primary.xml.gz
    -rw-r--r-- 1 root root 1359 Mar 11 01:21 repomd.xml
    [root@ip-10-85-33-77 RPMS]# cp -i -r repodata/ /mnt/s3.cuetools.net/RPMS/
    cp: overwrite `/mnt/s3.cuetools.net/RPMS/repodata/repomd.xml'? y
    cp: overwrite `/mnt/s3.cuetools.net/RPMS/repodata/other.xml.gz'? y
    cp: overwrite `/mnt/s3.cuetools.net/RPMS/repodata/primary.xml.gz'? y
    cp: overwrite `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz'? y
    cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz': No such file or directory
    [root@ip-10-85-33-77 RPMS]# umount /mnt/s3.cuetools.net/
    [root@ip-10-85-33-77 RPMS]# mount /mnt/s3.cuetools.net/
    [root@ip-10-85-33-77 RPMS]# cp -i -r repodata/ /mnt/s3.cuetools.net/RPMS/
    cp: overwrite `/mnt/s3.cuetools.net/RPMS/repodata/repomd.xml'? y
    cp: overwrite `/mnt/s3.cuetools.net/RPMS/repodata/other.xml.gz'? y
    cp: overwrite `/mnt/s3.cuetools.net/RPMS/repodata/primary.xml.gz'? y
    [root@ip-10-85-33-77 RPMS]# ll /mnt/s3.cuetools.net/RPMS/repodata
    total 11
    -rw-r--r-- 1 root root 2993 Mar 11 01:22 filelists.xml.gz
    -rw-r--r-- 1 root root 2045 Mar 11 01:21 other.xml.gz
    -rw-r--r-- 1 root root 4138 Mar 11 01:21 primary.xml.gz
    -rw-r--r-- 1 root root 1359 Mar 11 01:21 repomd.xml

    March 10, 2012 at 7:38 pm

    • I think I just checked in a fix for this, but I wasn’t able to replicate in the first place so I can’t verify that this fixes anything.

      March 11, 2012 at 4:49 pm

      • Unfortunately it doesn't seem to help 😦 It might be happening a bit less frequently now, but I can't be sure.

        [root@ip-10-85-33-77 RPMS]# umount /mnt/s3.cuetools.net/ ; mount /mnt/s3.cuetools.net/
        [root@ip-10-85-33-77 RPMS]# rm -rf /mnt/s3.cuetools.net/RPMS/repodata ; cp -r repodata/ /mnt/s3.cuetools.net/RPMS/
        [root@ip-10-85-33-77 RPMS]# rm -rf /mnt/s3.cuetools.net/RPMS/repodata ; cp -r repodata/ /mnt/s3.cuetools.net/RPMS/

        [root@ip-10-85-33-77 RPMS]# rm -rf /mnt/s3.cuetools.net/RPMS/repodata ; cp -r repodata/ /mnt/s3.cuetools.net/RPMS/
        [root@ip-10-85-33-77 RPMS]# rm -rf /mnt/s3.cuetools.net/RPMS/repodata ; cp -r repodata/ /mnt/s3.cuetools.net/RPMS/
        cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/other.xml.gz': File exists

        March 11, 2012 at 5:42 pm

        • I’m still not able to reproduce this behavior. Can you try running s3fuse in the foreground with verbose logging (s3fuse -f -v -v -v)? I’m particularly interested in knowing if you see something like “attempt to overwrite object at path xyz”.

          March 11, 2012 at 6:02 pm

          • No; when it happens, the log is just shorter, as if it didn't even attempt to do anything for the problem file.

            Also, when it gets to this state and I try to copy the problem file specifically, no log messages are produced:

            [root@ip-10-85-33-77 RPMS]# cp repodata/filelists.xml.gz /mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz': No such file or directory

            March 11, 2012 at 6:21 pm

          • Disregard the last part; I'm feeling under the weather, so I do stupid things – the directory didn't exist at that point.

            So here's what happens when I try to copy a problem file:

            [root@ip-10-85-33-77 RPMS]# ll /mnt/s3.cuetools.net/RPMS/repodata/
            total 9
            -rw-r--r-- 1 root root 2311 Mar 11 23:25 other.xml.gz
            -rw-r--r-- 1 root root 4756 Mar 11 23:25 primary.xml.gz
            -rw-r--r-- 1 root root 1359 Mar 11 23:25 repomd.xml
            [root@ip-10-85-33-77 RPMS]# cp repodata/filelists.xml.gz /mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz
            cp: overwrite `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz'? y
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz': No such file or directory

            And here’s a log:

            readdir: path: /RPMS/repodata
            open: path: /RPMS/repodata/filelists.xml.gz
            open_file::open_file: opening [RPMS/repodata/filelists.xml.gz] in [/tmp/s3fuse.local-kdl2dy].
            open_file::init: file [RPMS/repodata/filelists.xml.gz] ready.
            object_cache::open_handle: failed to open file [RPMS/repodata/filelists.xml.gz] with error -2.
            open_file::~open_file: closing temporary file for [RPMS/repodata/filelists.xml.gz].

            March 11, 2012 at 6:28 pm

          • Also, it happens much less frequently with logging turned on.

            March 11, 2012 at 6:32 pm

          • I have a suspicion that the error is triggered by S3's eventual consistency. For example, folder creation has not yet propagated to all servers before we try to create a file in that folder. Propagation usually takes a fraction of a second, so any additional delay (e.g. from logging) reduces the chances of hitting it.

            So to reproduce it you probably need to run s3fuse on an EC2 host in the same availability zone as the S3 bucket, as I do.

            March 12, 2012 at 5:28 pm

          • Yeah, that's probably the root cause, but I think what's happening here is that the file is still around when cp calls stat() to check whether the target file exists. This causes the file metadata to be loaded into the object cache, but when cp calls open() on the file, it fails (with an HTTP 404 error). I just checked in a change that removes the metadata from the object cache if the open() call fails. This likely won't fix your problem, but it should at least keep you from having to remount (or wait until the cache entry expires).

            March 12, 2012 at 8:45 pm
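The change described above amounts to treating a failed open() as proof that the cached metadata is stale. A simplified sketch of the idea (the `metadata` and `object_cache` types here are illustrative, not the real s3fuse classes):

```cpp
#include <map>
#include <string>

// Minimal stand-in for cached object metadata.
struct metadata { long size; };

// Sketch: a metadata cache whose open path evicts stale entries.
class object_cache
{
public:
  void store(const std::string &path, const metadata &md) { cache_[path] = md; }
  bool cached(const std::string &path) const { return cache_.count(path) != 0; }

  // open_fn returns 0 on success or a negative errno-style code
  // (e.g. -2 for the ENOENT produced by an HTTP 404).
  template <typename OpenFn>
  int open_handle(const std::string &path, OpenFn open_fn)
  {
    int r = open_fn(path);
    if (r != 0)
      cache_.erase(path);  // stale entry: the backing object is gone
    return r;
  }

private:
  std::map<std::string, metadata> cache_;
};
```

With the eviction in place, the next stat() after a failed open goes back to S3 instead of replaying the stale cache entry, so the mount recovers without waiting for expiry.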

          • The fix didn't seem to work 😦 Errors still accumulate until I remount:

            [root@ip-10-85-33-77 RPMS]# for i in {1..40}; do echo $i; rm -rf /mnt/s3.cuetools.net/RPMS/repodata ; cp -r repodata/ /mnt/s3.cuetools.net/RPMS/ ; done
            1
            ..

            12
            13
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/other.xml.gz': File exists
            14
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/other.xml.gz': File exists
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz': File exists
            ..

            29
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/other.xml.gz': File exists
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/primary.xml.gz': File exists
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz': File exists
            ..

            40
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/repomd.xml': File exists
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/other.xml.gz': File exists
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/primary.xml.gz': File exists
            cp: cannot create regular file `/mnt/s3.cuetools.net/RPMS/repodata/filelists.xml.gz': File exists

            March 12, 2012 at 9:15 pm

          • I think now you’re bumping into the eventual-consistency issue. Before, the copy was failing with ENOENT. Here, you’re getting EEXIST, which suggests an attempt to create a file where one already exists (that is, has yet to be deleted).

            March 12, 2012 at 9:21 pm
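If the root cause really is eventual consistency, one caller-side workaround is to retry on EEXIST/ENOENT with a short delay, since the namespace usually converges within a fraction of a second. A sketch of that idea (`retry_on_consistency_errors` is an illustrative name, not part of s3fuse):

```cpp
#include <cerrno>

// Retry an operation that may fail transiently with EEXIST or ENOENT while
// S3's view of the namespace converges. op() returns 0 on success or a
// negative errno-style code.
template <typename Op>
int retry_on_consistency_errors(Op op, int max_attempts)
{
  int r = 0;
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    r = op();
    if (r != -EEXIST && r != -ENOENT)
      return r;  // success, or an unrelated error: don't retry
    // in real code: sleep briefly here before retrying
  }
  return r;
}
```

This doesn't make the operation atomic; it only papers over the window in which a just-deleted object still appears to exist (or a just-created one doesn't yet).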

  5. Some more strange behaviour: the first time I copy a directory, it is created and copied, but the second time I do this, its contents are copied instead:

    [root@ip-10-85-33-77 RPMS]# mkdir /mnt/s3.cuetools.net/test1
    [root@ip-10-85-33-77 RPMS]# cp -r repodata/ /mnt/s3.cuetools.net/test1/
    [root@ip-10-85-33-77 RPMS]# ll /mnt/s3.cuetools.net/test1
    total 0
    drwxr-xr-x 1 root root 0 Mar 11 01:42 repodata
    [root@ip-10-85-33-77 RPMS]# rm -rf /mnt/s3.cuetools.net/test1
    [root@ip-10-85-33-77 RPMS]# cp -r repodata/ /mnt/s3.cuetools.net/test1/
    [root@ip-10-85-33-77 RPMS]# ll /mnt/s3.cuetools.net/test1
    total 11
    -rw-r--r-- 1 root root 2993 Mar 11 01:43 filelists.xml.gz
    -rw-r--r-- 1 root root 2045 Mar 11 01:43 other.xml.gz
    -rw-r--r-- 1 root root 4138 Mar 11 01:43 primary.xml.gz
    -rw-r--r-- 1 root root 1359 Mar 11 01:43 repomd.xml

    March 10, 2012 at 7:44 pm

    • Never mind, I'm an idiot – forgot to create test1 the second time 🙂

      March 10, 2012 at 7:53 pm

  6. I'm running 'ls' on a folder with about 400k files, and each time s3fuse leaks about 70 MB of memory.

    March 12, 2012 at 1:53 pm

    • It also continues to eat up CPU time after ‘ls’ is done:
      11689 root 20 0 250m 77m 4176 S 10.6 12.9 0:31.76 s3fuse

      March 12, 2012 at 2:30 pm

      • This is because when readdir() is called, s3fuse’s background worker threads will populate the object cache with the metadata for the files in the directory you just listed. It can take some time to process all the files in a large directory.

        March 12, 2012 at 8:47 pm

        • I’m not sure that’s desirable behavior.

          March 12, 2012 at 8:57 pm

          • It is if you’re trying to optimize for interactive use. You can set precache_on_readdir to false (after fetching r197) to disable this behavior.

            March 12, 2012 at 9:05 pm
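For reference, the option goes in the s3fuse configuration file as a simple key/value line (the option name is as given above; the exact config file location depends on your install):

```
# skip populating the object cache after readdir() (requires r197 or later)
precache_on_readdir=false
```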

          • Thanks a lot, that will come in handy.

            March 12, 2012 at 9:20 pm

    • The metadata cache isn’t pruned periodically — objects that expire are removed only on access. In retrospect this probably wasn’t a great decision. I’ll see what I can do about this.

      March 12, 2012 at 8:51 pm
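A periodic sweep is one straightforward way to bound this. The sketch below shows the general shape of such a cache (types and names are illustrative, not s3fuse's actual cache):

```cpp
#include <cstddef>
#include <ctime>
#include <map>
#include <string>

// Sketch: a TTL cache whose expired entries can be dropped by a periodic
// sweep, instead of only on access.
class expiring_cache
{
public:
  explicit expiring_cache(long ttl_seconds) : ttl_(ttl_seconds) {}

  void store(const std::string &key, const std::string &value, std::time_t now)
  {
    entries_[key] = entry(value, now + ttl_);
  }

  // periodic sweep: drop everything whose expiry has passed
  void prune(std::time_t now)
  {
    std::map<std::string, entry>::iterator it = entries_.begin();
    while (it != entries_.end()) {
      if (it->second.expiry <= now)
        entries_.erase(it++);  // erase-while-iterating idiom for std::map
      else
        ++it;
    }
  }

  std::size_t size() const { return entries_.size(); }

private:
  struct entry
  {
    entry() : expiry(0) {}
    entry(const std::string &v, std::time_t e) : value(v), expiry(e) {}
    std::string value;
    std::time_t expiry;
  };

  long ttl_;
  std::map<std::string, entry> entries_;
};
```

A hard size cap (evicting the oldest entries once the map exceeds a limit) would compose naturally with the same structure and would address the out-of-memory crash reported below.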

      • There has to be at least an upper limit on cache size, because after several calls to ls it just crashes with an 'out of memory' error.

        March 12, 2012 at 8:53 pm

        • Also, I don't understand why each new call to 'ls' in the same folder allocates an additional 70 megabytes – can there be several copies of the metadata for the same file in the cache?

          March 12, 2012 at 8:55 pm

          • Do you have cache_directories enabled? There shouldn’t be more than one copy of the metadata for any given file.

            March 12, 2012 at 9:16 pm

          • No…

            March 12, 2012 at 9:20 pm

          • Yeah, that's going to require a little investigation on my part. Can you open an issue for this?

            March 12, 2012 at 9:23 pm

  7. Le Fleur

    I too am trying to install s3fuse on an AWS box running RHEL. Any chance you could make those RPMs public?

    Cheers

    June 1, 2012 at 7:01 am

  8. Jim

    Any way to support http vs. https in the config file? I see the https in the source, and I'm currently working on compiling from source, but it would have been a lot easier just to specify http the same way as with the endpoint.
    Using https gives me certificate errors on my Ubuntu EC2 instance when installed from the ppm.

    December 8, 2012 at 11:49 pm

    • I’ve committed a change for this (r421) and it’ll be in 0.13. I don’t want to break configuration files that already specify an endpoint, so I’ve added a new option, aws_use_ssl, that you can set to false.

      December 9, 2012 at 9:54 am
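For anyone else hitting this, the resulting configuration line (in s3fuse 0.13 and later, per the reply above) would look like:

```
# talk to the existing endpoint over plain HTTP instead of HTTPS
aws_use_ssl=false
```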
