Add sync flag to flatfs #30

rht · 2015-09-27T09:57:43Z

(should only be enabled for the paranoid, cc: @tv42 and ipfs-inactive/archives#20)

jbenet · 2015-09-27T14:52:01Z

Ugh -- writes to disk should be durable...

@whyrusleeping didn't you find that the syncs weren't the bottleneck?

rht · 2015-09-27T16:40:13Z

Ugh -- writes to disk should be durable...

cp doesn't do this [1], and not git by default either [2].

Perf increases by an order magnitude (1.5s -> 0.13s) with 10 1.1MB files, and 6min -> 1.7min with 999 1MB files. Of course 0.13s is still much slower compared to git/cp.

Reproducible test script (or if someone already wrote a benchmark script for git vs cp vs rsync etc) coming soon like @ion1 did in ipfs/kubo#1750.

[1] https://github.com/coreutils/coreutils/search?utf8=%E2%9C%93&q=fsync
[2] ipfs-inactive/archives#20 (comment) copied here for convenience

Also, found this http://git.kernel.org/cgit/git/git.git/tree/Documentation/config.txt#n693:

This is a total waste of time and effort on a filesystem that orders data writes properly, but can be useful for filesystems that do not use journalling (traditional UNIX filesystems) or that only journal metadata and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").

jbenet · 2015-09-27T23:42:36Z

Hm, how do @tv42 @whyrusleeping @cryptix @chriscool feel about this?

As far as IPFS is concerned, maybe we can make this a config setting, or even be able to do it per-command (like ipfs add --no-sync type of thing).

tv42 · 2015-09-28T00:31:58Z

I feel comparing cp to something providing services to other hosts is apples to oranges wrt crash safety.

The git config option seems to be only about loose objects; there still is an fsync_or_die in e.g. pack creation -- that is, there is an fsync for pushed data. Git is fast because of a series of smart design decisions, not because it's unsafe.

Would like to hear what @whyrusleeping has to say about his experiments.

jbenet · 2015-09-28T01:49:52Z

I feel comparing cp to something providing services to other hosts is apples to oranges wrt crash safety.

👍

rht · 2015-09-28T01:59:30Z

I feel comparing cp to something providing services to other hosts is apples to oranges wrt crash safety.

ipfs add is used under several conditions, some of which intersects with cp (offline add where the blocks being added don't have to be streamed live), while some others (distributed db on ipfs) don't.

Ok the fsync flag in git is only used in loose object writes.
Maybe fsync is not git's bottleneck (since its perf ~ cp perf), but it is for flatfs.
edit: but git is fast even with fsync.

chriscool · 2015-09-28T02:22:52Z

When writing many files or creating many directories, I think it is probably overkill to fsync after each write. It might be a good idea to sync when all the writes are done though.

chriscool · 2015-09-28T02:43:31Z

I mean that it is impossible anyway to prevent crashes from happening while writing files (or between writing files), but it could be a good idea to sync after writing all the stuff and before advertising it on the network, to make sure that everything advertised has been writen on disk at one point. Though it is of course impossible to ensure that a disk crash will not wipe out the content anyway.

jbenet · 2015-09-28T03:53:47Z

@chriscool yeah, i believe that's what @whyrusleeping's attempt did previously -- fsync at the and of all writes. would that be good enough for you rht?

also if we had a write-ahead log style arena-storage repo, it might work well.

rht · 2015-09-28T07:12:46Z

The current implementation is already to fsync after all files have been written to tmpfiles, then mv to 'blocks/'.
This PR is meant to make these explicit fsync calls optional.

I have no idea what magic git uses, but I'd opt for no-sync as default, and only do ipfs add --sync for ipdb. My reason: the former case ("offline" batch add) is more common today.

Still relevant today: https://web.archive.org/web/20150315020954/http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/ ?

jbenet · 2015-09-28T07:23:49Z

I think right now, if we add the option to disable fsyncs, it should be an explicit opt in.

I think later, we can move to mmapped files and perhaps a journal. That will likely have a different sync model with different knobs to turn.

jbenet · 2015-09-28T07:25:49Z

flatfs/flatfs.go

 }

 var _ datastore.Datastore = (*Datastore)(nil)

-func New(path string, prefixLen int) (*Datastore, error) {
+func New(path string, prefixLen int, sync bool) (*Datastore, error) {


perhaps we should keep New as is, setting sync to true and just add another constructor:

NewSync(...)

with the bool flag option. this prevents changing the interface on whoever depends on this code

Enabled for the paranoid

rht · 2015-09-28T11:39:43Z

(assuming sync is disabled.
Still with the 10 1.1MB files test, the bulk of the 0.13s comes from merkledag pb Marshal, in particular the line https://github.com/ipfs/go-ipfs/blob/master/merkledag/pb/merkledag.pb.go#L557 (~100ms).
One option is to use unsafe_marshaler_all in the protobuf generation.
The doc says, https://github.com/gogo/protobuf/blob/master/plugin/marshalto/marshalto.go#L44, but currently checking how fast it could be.
e.g. syscall.Read in zsyscall_linux_*.go opts to use unsafe pointer.
)

we can move to mmapped files and perhaps a journal.

Mind to elaborate?
For the latter, what if the host fs is already journaled?
Here is another compromise: run this once after all the files have been written to tmpfiles https://golang.org/pkg/syscall/#Sync.

cryptix · 2015-09-28T11:53:30Z

If this is about ipfs add performance and not a consistency issue, it feels like a band-aid patch to me.

rht · 2015-09-28T13:08:05Z

@cryptix the per-file sync is the most significant bottleneck. Even if add perf issue were to be addressed later, it will have to go down first.

chriscool · 2015-09-28T15:46:47Z

Here is another compromise: run this once after all the files have been written to tmpfiles
https://golang.org/pkg/syscall/#Sync.

Yeah, this might be worth trying. It should just call: https://www.cl.cam.ac.uk/cgi-bin/manpage?2+sync and might be more efficient than many calls to fsync

rht · 2015-09-28T16:45:45Z

(This still does not guarantee data integrity: modern disks have large caches.)

With syscall.Sync(): 1.3s -> ~0.9s.

chriscool · 2015-09-28T20:15:00Z

@rht did you try with 10 1.1MB files? What about with 999 1MB files?

jbenet · 2015-09-28T20:16:49Z

It would be very useful to have good benchmarks and graphs of all this, across disk types, host fses, and OSes. Maybe we can include this in our thought re golang/build

—
Sent from Mailbox

On Mon, Sep 28, 2015 at 4:15 PM, Christian Couder
[email protected] wrote:

@rht did you try with 10 1.1MB files? What about with 999 1MB files?

Reply to this email directly or view it on GitHub:
#30 (comment)

rht · 2015-09-29T01:50:24Z

10 1MB files (actually it is 1): 1.3s -> 0.9s.
999 1MB files: 6 min -> 1.7.

(making the benchmark. should including adding updated files as well)

tv42 · 2015-09-29T02:28:57Z

Does that calling code use whyrusleeping's batching feature?

rht · 2015-09-29T02:46:46Z

Iirc the batching is used in the Commit() call, since flatfsBatch uses putMany, then yes.
I modified doPut and putMany to use syscall.Read().

(but then again, fsync is required for ipdb, but only for ipdb use case)

rht · 2015-11-06T02:06:14Z

What is the decision on the --no-sync flag?
Benchmark is shown in #31, and it hasn't changed much in the latest master ipfs/kubo@b508c23.

Notice that even sqlite has http://www.sqlite.org/pragma.html#pragma_synchronous.

Here is the rationale behind the rare usage of fsync in git: https://lwn.net/Articles/326505/, https://plus.google.com/+JonathanCorbet/posts/JBxiKPe3VXa (+comments).

rht · 2015-11-13T03:10:50Z

(duplicate of #32)

rht force-pushed the syncflag branch from 7332f0c to 4feefce Compare September 27, 2015 16:44

jbenet reviewed Sep 28, 2015
View reviewed changes

rht force-pushed the syncflag branch from 4feefce to 5cce390 Compare September 28, 2015 07:29

Add sync flag to flatfs

c37960d

Enabled for the paranoid

rht force-pushed the syncflag branch from 5cce390 to c37960d Compare September 28, 2015 07:31

This was referenced Nov 7, 2015

Add 'nosync' option ipfs/kubo#1616

Closed

add option to disable syncing #32

Merged

rht closed this Nov 13, 2015

rht deleted the syncflag branch November 13, 2015 03:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add sync flag to flatfs #30

Add sync flag to flatfs #30

rht commented Sep 27, 2015

jbenet commented Sep 27, 2015

rht commented Sep 27, 2015

jbenet commented Sep 27, 2015

tv42 commented Sep 28, 2015

jbenet commented Sep 28, 2015

rht commented Sep 28, 2015

chriscool commented Sep 28, 2015

chriscool commented Sep 28, 2015

jbenet commented Sep 28, 2015

rht commented Sep 28, 2015

jbenet commented Sep 28, 2015

jbenet Sep 28, 2015

rht commented Sep 28, 2015

cryptix commented Sep 28, 2015

rht commented Sep 28, 2015

chriscool commented Sep 28, 2015

rht commented Sep 28, 2015

chriscool commented Sep 28, 2015

jbenet commented Sep 28, 2015

@rht did you try with 10 1.1MB files? What about with 999 1MB files?

rht commented Sep 29, 2015

tv42 commented Sep 29, 2015

rht commented Sep 29, 2015

rht commented Nov 6, 2015

rht commented Nov 13, 2015

Add sync flag to flatfs #30

Add sync flag to flatfs #30

Conversation

rht commented Sep 27, 2015

jbenet commented Sep 27, 2015

rht commented Sep 27, 2015

jbenet commented Sep 27, 2015

tv42 commented Sep 28, 2015

jbenet commented Sep 28, 2015

rht commented Sep 28, 2015

chriscool commented Sep 28, 2015

chriscool commented Sep 28, 2015

jbenet commented Sep 28, 2015

rht commented Sep 28, 2015

jbenet commented Sep 28, 2015

jbenet Sep 28, 2015

Choose a reason for hiding this comment

rht commented Sep 28, 2015

cryptix commented Sep 28, 2015

rht commented Sep 28, 2015

chriscool commented Sep 28, 2015

rht commented Sep 28, 2015

chriscool commented Sep 28, 2015

jbenet commented Sep 28, 2015

@rht did you try with 10 1.1MB files? What about with 999 1MB files?

rht commented Sep 29, 2015

tv42 commented Sep 29, 2015

rht commented Sep 29, 2015

rht commented Nov 6, 2015

rht commented Nov 13, 2015