Commit

docs: update wording for juicefs_vs_s3fs.md (#4221)
CaitinChen authored Nov 30, 2023
1 parent 73dfd51 commit 2d3627b
Showing 1 changed file with 11 additions and 10 deletions.
21 changes: 11 additions & 10 deletions docs/en/introduction/comparison/juicefs_vs_s3fs.md
@@ -1,30 +1,31 @@
---
slug: /comparison/juicefs_vs_s3fs
description: This document compares S3FS and JuiceFS, examining their product positioning, architecture, caching, and features.
---

# JuiceFS vs. S3FS

[S3FS](https://github.com/s3fs-fuse/s3fs-fuse) is an open source tool developed in C++ that mounts S3 object storage locally via FUSE for read and write access as a local disk. In addition to Amazon S3, it supports all S3 API-compatible object stores.

In terms of basic functionality, S3FS and JuiceFS can both mount object storage bucket locally via FUSE and use them through POSIX interfaces. However, in terms of functional details and technical implementation, they are essentially different.
While both S3FS and JuiceFS share the basic functionality of mounting object storage buckets locally via FUSE and using them through POSIX interfaces, they differ significantly in functional details and technical implementation.

## Product positioning

S3FS is a utility that allows users to mount object storage buckets locally and read and write in a way that the users used to. It targets the general use scenarios that are not sensitive to performance and network latency.
S3FS is a utility that lets users mount object storage buckets locally and read and write files in the way they are used to. It targets general use scenarios that are not sensitive to performance and network latency.

JuiceFS is a distributed file system with a unique approach to data management and a series of technical optimizations for high performance, reliability and security, which primarily addresses the storage needs of large volumes of data.
JuiceFS is a distributed file system with a unique approach to data management and a series of technical optimizations for high performance, reliability, and security. It primarily addresses the storage needs of large volumes of data.

## Architecture

S3FS does not do special optimization for files. It acts like an access channel between local and object storage, allowing the same content to be seen on the local mount point and the object storage browser, which makes it easy to use cloud storage locally. On the other hand, with this simple architecture, retrieving, reading and writing of files with S3FS require direct interaction with the object store, and network latency can impact strongly on performance and user experience.
S3FS does not perform any special optimization for files. It acts as an access channel between the local file system and object storage, so the same content is visible at the local mount point and in the object storage browser. This makes it easy to use cloud storage locally. On the other hand, with this simple architecture, retrieving, reading, and writing files with S3FS requires direct interaction with the object store, so network latency can strongly affect performance and user experience.

JuiceFS uses a technical architecture that separates data and metadata, where any file is first split into data blocks according to specific rules before being uploaded to the object storage, and the corresponding metadata is stored in a separated database. The advantage of this is that retrieval of files and modification of metadata such as file names can directly interact with the database with a faster response, bypassing the network latency impact of interacting with the object store.
JuiceFS uses an architecture that separates data and metadata. Files are split into data blocks according to specific rules before being uploaded to object storage, and the corresponding metadata is stored in a separate database. As a result, file retrieval and metadata modifications such as renaming a file interact directly with the database and get a faster response, bypassing the network latency of interacting with the object store.
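
The effect of keeping metadata in a separate database can be illustrated with a minimal sketch. The two classes below are hypothetical in-memory stand-ins, not S3FS or JuiceFS code, and the copy-then-delete rename is an assumption about how a path-keyed object store typically behaves.

```python
# Illustrative sketch only: hypothetical in-memory stand-ins, not JuiceFS or S3FS code.


class FlatObjectStore:
    """Models a store where the file path is the object key."""

    def __init__(self):
        self.objects = {}        # key -> bytes
        self.bytes_copied = 0    # data moved by rename operations

    def put(self, key, data):
        self.objects[key] = data

    def rename(self, old_key, new_key):
        # Assumption: no rename primitive, so the data is copied and the old key deleted.
        data = self.objects.pop(old_key)
        self.objects[new_key] = data
        self.bytes_copied += len(data)


class MetadataStore:
    """Models a separate metadata database: path -> list of block keys."""

    def __init__(self):
        self.paths = {}
        self.bytes_copied = 0    # stays 0: renames never touch file data

    def create(self, path, block_keys):
        self.paths[path] = list(block_keys)

    def rename(self, old_path, new_path):
        # Only the metadata record changes; blocks in object storage are untouched.
        self.paths[new_path] = self.paths.pop(old_path)


flat = FlatObjectStore()
flat.put("logs/a.bin", b"x" * 1024)
flat.rename("logs/a.bin", "logs/b.bin")

meta = MetadataStore()
meta.create("logs/a.bin", ["blk-0", "blk-1"])
meta.rename("logs/a.bin", "logs/b.bin")

print(flat.bytes_copied, meta.bytes_copied)
```

In this toy model the path-keyed store moves every byte of the file on rename, while the metadata-backed layout only rewrites one small record.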

In addition, although S3FS can handle the transfer of large files by uploading them in chunks, the nature of object storage dictates that appending to a file requires rewriting the entire object. For large files of tens or hundreds of gigabytes or even terabytes, repeated uploads waste a lot of time and bandwidth.

JuiceFS avoids such problems by splitting individual files into chunks locally according to specific rules (default 4MiB) before uploading, regardless of their size. The rewriting and appending operations will eventually become new data blocks instead of modifying already generated data blocks, which greatly reduces the waste of time and bandwidth resources.
JuiceFS avoids such problems by splitting every file, regardless of size, into data blocks locally according to specific rules (4 MiB by default) before uploading. Rewrites and appends produce new data blocks instead of modifying existing ones, which greatly reduces wasted time and bandwidth.
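
The difference in upload cost can be sketched with a little arithmetic. The helper names below are hypothetical and the model is deliberately simplified (it ignores metadata traffic and any compaction); it is not the JuiceFS implementation, only an illustration of fixed-size splitting and of appending as new blocks versus rewriting a whole object.

```python
# Illustrative sketch only: hypothetical helpers, not the JuiceFS implementation.

MIB = 1024 ** 2
GIB = 1024 ** 3
BLOCK_SIZE = 4 * MIB  # the default block size mentioned in the text


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def append_whole_object(existing_size, appended_size):
    """Bytes uploaded when an append rewrites the entire object."""
    return existing_size + appended_size


def append_as_new_blocks(existing_size, appended_size):
    """Bytes uploaded when appended data is written as new blocks only."""
    return appended_size


blocks = split_into_blocks(b"\0" * (10 * MIB))
print([len(b) // MIB for b in blocks])           # [4, 4, 2]

print(append_whole_object(100 * GIB, 1 * MIB))   # the whole object is uploaded again
print(append_as_new_blocks(100 * GIB, 1 * MIB))  # only the appended data is uploaded
```

Under these assumptions, appending 1 MiB to a 100 GiB file uploads just over 100 GiB in the whole-object model and 1 MiB in the block-based model.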

For a detailed description of the architecture of JuiceFS, please refer to [documentation](../../introduction/architecture.md).
For a detailed description of the JuiceFS architecture, refer to the [documentation](../../introduction/architecture.md).

## Caching

@@ -34,11 +35,11 @@ S3FS does not limit the cache capacity by default, which may cause the cache to

JuiceFS uses a completely different caching approach than S3FS. First, JuiceFS guarantees data consistency. Second, JuiceFS sets a default disk cache limit of 100 GiB, which users can adjust as needed, and by default stops using more cache space once free disk space falls below 10%. When cache usage reaches the limit, JuiceFS automatically cleans up using an LRU-like algorithm so that cache space is always available for subsequent read and write operations.
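
As a rough illustration of a size-limited cache with least-recently-used eviction, consider the sketch below. The `LRUBlockCache` class is hypothetical and implements plain LRU; the text only says JuiceFS uses an "LRU-like" algorithm, so this should not be read as its actual eviction logic, and the free-disk-space check is left out.

```python
# Illustrative sketch only: a plain LRU cache, not the JuiceFS cache implementation.
from collections import OrderedDict


class LRUBlockCache:
    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._blocks = OrderedDict()  # key -> bytes, least recently used first

    def get(self, key):
        data = self._blocks.get(key)
        if data is not None:
            self._blocks.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key, data):
        if key in self._blocks:
            self.used_bytes -= len(self._blocks.pop(key))
        self._blocks[key] = data
        self.used_bytes += len(data)
        # Evict least recently used blocks until usage fits the limit.
        while self.used_bytes > self.capacity_bytes and self._blocks:
            _, evicted = self._blocks.popitem(last=False)
            self.used_bytes -= len(evicted)

    def keys(self):
        return list(self._blocks)


cache = LRUBlockCache(capacity_bytes=8)
cache.put("a", b"1234")
cache.put("b", b"1234")
cache.get("a")            # "a" is now more recently used than "b"
cache.put("c", b"1234")   # usage would exceed 8 bytes, so "b" is evicted
print(cache.keys())       # ['a', 'c']
```

The capacity here is 8 bytes purely to keep the example readable; the 100 GiB default from the text would be passed the same way.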

For more on JuiceFS caching, see [documentation](../../guide/cache.md).
For more information on JuiceFS caching, see the [documentation](../../guide/cache.md).

## Features

| | S3FS | JuiceFS |
| Comparison basis | S3FS | JuiceFS |
|---------------------------|----------------------------------------------------------------|----------------------------------------------|
| Data Storage | S3 | S3, other object storage, WebDAV, local disk |
| Metadata Storage | No | Database |
@@ -61,4 +62,4 @@ For more on JuiceFS caching, see [documentation](../../guide/cache.md).

## Additional notes

[OSSFS](https://github.com/aliyun/ossfs), [COSFS](https://github.com/tencentyun/cosfs), [OBSFS](https://github.com/huaweicloud/huaweicloud-obs-obsfs) are all derivatives based on S3FS and have essentially the same functional features and usage as S3FS.
[OSSFS](https://github.com/aliyun/ossfs), [COSFS](https://github.com/tencentyun/cosfs), and [OBSFS](https://github.com/huaweicloud/huaweicloud-obs-obsfs) are all derivatives based on S3FS and have essentially the same functional features and usage as S3FS.
