Arc Forum
1 point by venizio 5646 days ago | link | parent

a bit surprised there isn't a sync call in writefile before the rename. how come? if there is data in a user-space buffer (there's no fflush) and the rename gets committed, it could lead to corruption if the process crashes. maybe?


1 point by CatDancer 5646 days ago | link

The data will be flushed when the file is closed, which happens before the rename. This means that the mzscheme process can be hard killed (kill -9) or can crash right after the rename, and the data will still be saved safely by the operating system.
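A minimal sketch of that write, close, then rename sequence, in Python rather than Arc. The name `write_file_atomic` and the `.tmp` suffix are made up for illustration; this is not Arc's actual writefile code.

```python
import os
import tempfile


def write_file_atomic(path, text):
    """Write text to path via a temporary file and a rename (illustrative helper)."""
    tmp_path = path + ".tmp"  # assumed naming; it must be on the same filesystem
    with open(tmp_path, "w") as f:
        f.write(text)  # may still sit in a user-space buffer at this point
    # Leaving the with-block closes the file, which flushes the user-space
    # buffer to the kernel. No fsync is done, so the data may not be on disk yet.
    os.replace(tmp_path, path)  # rename over the old file (atomic on POSIX)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "data")
    write_file_atomic(target, "foo")
    write_file_atomic(target, "bar")
    with open(target) as f:
        print(f.read())
```

Because the close comes before the rename, the new name never points at a file whose contents are still stuck in the process's own buffers.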

sync, now, is a completely different issue: that's for when you want to be able to crash your operating system or power off your computer and have your data already physically written out to disk. I've heard it said (though I don't know if this is true) that sync will really slow down your write performance, unless you carefully use a complicated technique such as first writing your updates to a journal log file (which can be replayed if your computer crashes) and then writing those changes to their permanent location (file or database tables).
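For comparison, here is a rough sketch of what the sync variant might look like, assuming a POSIX system. `write_file_durable` is a hypothetical name, and fsyncing the containing directory is an extra step assumed here so that the rename itself reaches disk; none of this is from Arc.

```python
import os
import tempfile


def write_file_durable(path, data):
    """Write bytes to path, forcing them to disk before and after the rename."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)  # temp file in the same directory
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # user-space buffer -> kernel
            os.fsync(f.fileno())  # kernel -> disk; the potentially slow step
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)   # do not leave a stray temp file behind
        raise
    # Sync the directory entry too, so the rename survives a power failure.
    dir_fd = os.open(dir_name, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "data")
    write_file_durable(target, b"bar")
    with open(target, "rb") as f:
        print(f.read())
```

The extra fsync calls are what buy protection against an operating system crash or power loss, and they are also where any slowdown would come from.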

-----

1 point by venizio 5646 days ago | link

hello,

i was missing the flush upon close. so for instance in this scenario:

1. write

2. close (flush happens; buffers sit in kernel space but not yet on disk)

3. rename

4. a soft-update commit happens so the rename is already out on disk.

5. process crashes

the big question is then: will the os then eventually sync the data flushed in step 2? even after a process crash? i would think yes, but not sure.

an os crash at this point will probably just zero the file out. http://www.freebsd.org/doc/en/books/handbook/configtuning-di... (11.12.2.1)

i get your point regarding performance. arc does it right. it basically leverages file system magic to sync asynchronously at the (small) risk of zeroing out files on an os crash or power failure.

thanks.

-----

1 point by CatDancer 5646 days ago | link

Yes, once the operating system has the data it will write it out to disk, regardless of what happens to the process.
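One way to see this for yourself, sketched in Python and assuming a Unix-like system: a child process writes, closes, renames, and then hard kills itself, and the parent reads the file afterwards. The helper name `write_then_hard_kill` and the child script are invented for this demo. It only shows that data handed to the kernel outlives the process; it does not show when the data reaches the physical disk.

```python
import os
import subprocess
import sys
import tempfile

# Script run in the child process: write, close, rename, then kill -9 itself.
CHILD = r"""
import os, signal, sys
path = sys.argv[1]
with open(path + ".tmp", "w") as f:
    f.write("bar")
os.replace(path + ".tmp", path)
os.kill(os.getpid(), signal.SIGKILL)  # hard kill: no cleanup code runs
"""


def write_then_hard_kill(path):
    """Run the child script and return its exit status (negative if killed by a signal)."""
    proc = subprocess.run([sys.executable, "-c", CHILD, path])
    return proc.returncode


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "data")
    status = write_then_hard_kill(target)
    with open(target) as f:
        print(status, f.read())
```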

-----

1 point by venizio 5646 days ago | link

out of curiosity, do you know what the backup policy is? thanks @CatDancer.

-----

1 point by CatDancer 5646 days ago | link

the backup policy? of what?

-----

[deleted]
1 point by CatDancer 5646 days ago | link

Is your question, "how do I back up my data files written by my Arc program?"

In Unix, rename is an atomic operation in the kernel. Renaming a file over an existing one does not immediately delete the old contents; instead, the old contents are deleted once no process has the old file open any longer.

Say you have some file "data" which contains "foo". You have some process B such as tar or rsync which is reading your files to back them up. Meanwhile Arc is writing "bar" to "data".

B opens "data" for reading.

Arc creates "data.tmp" and writes "bar" to it. Then Arc renames "data.tmp" to "data".

A process which now opened "data" would read "bar" from it. However B is still reading the previous version of the file, and so reads "foo".

Thus the backup process will always read either the complete previous version of the file, or the complete next version of the file.

(Whether or not the data has already been physically written out to disk makes no difference to what the processes see.)
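The scenario above can be sketched in Python on a Unix-like system. For simplicity a single process plays both roles, with one already-open file handle standing in for backup process B; the function name `backup_reads_old_version` is made up for the demo.

```python
import os
import tempfile


def backup_reads_old_version(directory):
    """Return (what the already-open reader sees, what a fresh reader sees)."""
    data = os.path.join(directory, "data")
    tmp = os.path.join(directory, "data.tmp")
    with open(data, "w") as f:
        f.write("foo")               # the previous version of the file

    backup = open(data)              # B opens "data" for reading
    with open(tmp, "w") as f:        # the writer creates "data.tmp" with "bar"
        f.write("bar")
    os.replace(tmp, data)            # rename "data.tmp" to "data"

    with open(data) as f:            # a process opening "data" now
        new_reader = f.read()
    old_reader = backup.read()       # B still reads the file it opened
    backup.close()
    return old_reader, new_reader


if __name__ == "__main__":
    print(backup_reads_old_version(tempfile.mkdtemp()))  # ('foo', 'bar')
```

B's handle refers to the old file itself, not to the name "data", which is why the rename does not disturb it.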

-----