Merge "doc: Last write wins behaviour"

Jenkins 2015-12-08 14:00:40 +00:00 committed by Gerrit Code Review
commit 55857a2e21
2 changed files with 83 additions and 2 deletions

@ -3,7 +3,7 @@
# Swift-on-File
Swift-on-File is a Swift Object Server implementation that enables users to
access the same data, both as an object and as a file. Data can be stored and
retrieved through Swift's REST interface or as files from NAS interfaces
retrieved through Swift's REST interface or as files from NAS interfaces
including native GlusterFS, GPFS, NFS and CIFS.
Swift-on-File is to be deployed as a Swift [storage policy](http://docs.openstack.org/developer/swift/overview_policies.html),
@ -49,6 +49,10 @@ Currently, files added over a file interface (e.g., native GlusterFS), do not sh
up in container listings, still those files would be accessible over Swift's REST
interface with a GET request. We are working to provide a solution to this limitation.
There is also a subtle but very important difference in the implementation of
[last write wins](doc/markdown/last_write_wins.md) behaviour when compared to
OpenStack Swift.
Because Swift-On-File relies on the data replication support of the filesystem the Swift
Object replicator process does not have any role for containers using the Swift-on-File
storage policy. This means that Swift geo replication is not available to objects in
@ -72,4 +76,4 @@ or find us in the #swiftonfile channel on Freenode.
# Guides to get started:
1. [Quick Start Guide with XFS/GlusterFS](doc/markdown/quick_start_guide.md)
1. [Developer Guide](doc/markdown/dev_guide.md)
2. [Developer Guide](doc/markdown/dev_guide.md)

@ -0,0 +1,77 @@
### Last write wins: Swift vs Swift-on-File
**OpenStack Swift:** The timestamp that the proxy server assigns to each
request ultimately decides which write counts as the *last* one. Both clients
receive a `201 Created` response. Example:
Transaction T1 at time = t seconds:
`curl -i http://vm1:8080/v1/AUTH_abc/c1/o1 -X PUT -T /tmp/reallybigfile`
(Assume it takes 5 seconds to upload the *reallybigfile*)
Transaction T2 at time = (t + 0.01) seconds:
`curl -i http://vm1:8080/v1/AUTH_abc/c1/o1 -X PUT -d 'tinydata'`
(Assume it takes 1 second to upload this tiny data)
Here T2 wins even though T1 completes last, because T2 was the last one to
reach the proxy server (the client-facing server process that tags each
request with a timestamp).
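The T1/T2 example above can be sketched as a toy model. This is only an illustration of timestamp-based resolution, not Swift's actual code; the `Write` class and the `swift_winner` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Write:
    name: str
    proxy_timestamp: float  # assumed: set when the request reaches the proxy
    completed_at: float     # assumed: when the upload finishes


def swift_winner(writes):
    """Timestamp-based last write wins: the highest proxy timestamp wins."""
    return max(writes, key=lambda w: w.proxy_timestamp)


t = 100.0
t1 = Write("T1", proxy_timestamp=t, completed_at=t + 5)
t2 = Write("T2", proxy_timestamp=t + 0.01, completed_at=t + 0.01 + 1)

print(swift_winner([t1, t2]).name)  # T2
```

Completion time plays no part in the decision, which is why the slower T1 loses.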
Simultaneous PUT and DELETE illustrated: If a client has a long-running PUT to
`AUTH_abc/c1/o1` and another client issues a DELETE on `AUTH_abc/c1/o1`
during the ongoing upload:
1. If the object existed earlier, the tombstone (`.ts`) created by the DELETE
request will have a later timestamp and will eventually take precedence, even
if the upload finishes afterwards. In effect, any new client performing a
HEAD/GET on `AUTH_abc/c1/o1` receives an `HTTP 404` response because the
object stands deleted.
2. If the object did not exist earlier, the client doing the DELETE receives
an `HTTP 404` response. The client doing the PUT request will successfully
upload the object.
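Case 1 above can be illustrated with a toy model in which the record with the newest timestamp decides what a later HEAD/GET sees. This is a simplified sketch, not Swift's on-disk logic; `get_status` and the record tuples are hypothetical.

```python
def get_status(records):
    """records: list of (timestamp, kind) pairs, kind is "data" or "tombstone".

    Returns the status a later HEAD/GET would see in this toy model.
    """
    if not records:
        return 404
    newest = max(records, key=lambda r: r[0])
    return 200 if newest[1] == "data" else 404


t = 100.0
records = [
    (t - 50, "data"),      # object that existed before the race
    (t + 1, "tombstone"),  # DELETE reaches the proxy during the upload
    (t, "data"),           # long-running PUT finishes last, older timestamp
]

print(get_status(records))  # 404
```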
**Swift-on-File:** Unlike vanilla OpenStack Swift, Swift-on-File does not
honour the timestamp set on the request by the proxy server when deciding
which write is the "last" one. In Swift-on-File, the last write to complete
(at the filesystem layer) is the one that wins. Example:
Transaction T1 at time = t seconds:
`curl -i http://vm1:8080/v1/AUTH_abc/c1/o1 -X PUT -T /tmp/reallybigfile`
(Assume it takes 5 seconds to upload the *reallybigfile*)
Transaction T2 at time = (t + 0.01) seconds:
`curl -i http://vm1:8080/v1/AUTH_abc/c1/o1 -X PUT -d 'tinydata'`
(Assume it takes 1 second to upload this tiny data)
Here T1 wins even though T2 is the last of the two transactions to reach the
proxy server. This is because T1 was the last to complete and overwrites
the object created by T2. For a short period, after T2 has completed and while
T1 is still in progress, clients are served the object created by T2.
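The same T1/T2 example can be sketched with completion time as the deciding factor. This is a toy model under the assumption that the object did not exist before T1 and T2; `Write`, `swiftonfile_winner` and `visible_at` are hypothetical names, not part of Swift-on-File.

```python
from dataclasses import dataclass


@dataclass
class Write:
    name: str
    proxy_timestamp: float  # assumed: set by the proxy, ignored in this model
    completed_at: float     # assumed: when the rename at the filesystem happens


def swiftonfile_winner(writes):
    """Completion-based last write wins: the last write to finish wins."""
    return max(writes, key=lambda w: w.completed_at)


def visible_at(writes, now):
    """Name of the write whose object clients would be served at time `now`."""
    done = [w for w in writes if w.completed_at <= now]
    return max(done, key=lambda w: w.completed_at).name if done else None


t = 100.0
t1 = Write("T1", proxy_timestamp=t, completed_at=t + 5)
t2 = Write("T2", proxy_timestamp=t + 0.01, completed_at=t + 0.01 + 1)

print(swiftonfile_winner([t1, t2]).name)  # T1
print(visible_at([t1, t2], t + 2))        # T2
print(visible_at([t1, t2], t + 6))        # T1
```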
Simultaneous PUT and DELETE illustrated: If a client has a long-running PUT to
`AUTH_abc/c1/o1` and another client issues a DELETE on `AUTH_abc/c1/o1` during
the ongoing upload, the DELETE request receives one of these responses:
* `HTTP 404` if the file does not exist yet, because it is still being uploaded
to a temp path and the rename has not been performed.
* `HTTP 204` if an older version of the file existed and the DELETE was
successful.
In effect, after both the PUT and the DELETE have completed, any new client
performing a HEAD/GET on `AUTH_abc/c1/o1` receives the newer object uploaded
by the last PUT operation.
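Both DELETE outcomes can be reproduced on a local filesystem with a small simulation. It is a sketch of the behaviour described above, not Swift-on-File code; `delete_object`, `race` and the `.o1.tmp` temp name are hypothetical.

```python
import os
import tempfile


def delete_object(path):
    """Return the status a DELETE would see in this toy model."""
    try:
        os.remove(path)
        return 204
    except FileNotFoundError:
        return 404


def race(object_existed):
    """Run a PUT with a DELETE arriving mid-upload; return (status, content)."""
    with tempfile.TemporaryDirectory() as d:
        final = os.path.join(d, "o1")
        if object_existed:
            with open(final, "w") as f:
                f.write("old")
        # Long-running PUT: data goes to a temp path first.
        tmp = os.path.join(d, ".o1.tmp")
        with open(tmp, "w") as f:
            f.write("new")
            # DELETE arrives while the upload is still in progress.
            status = delete_object(final)
        # PUT completes: the temp file is renamed to its final name.
        os.replace(tmp, final)
        with open(final) as f:
            return status, f.read()


print(race(True))   # (204, 'new')
print(race(False))  # (404, 'new')
```

In both cases the object uploaded by the PUT is what remains afterwards.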
### Access from filesystem interface
Operations done solely through the Swift interface create and modify objects
atomically. A PUT writes the data to a temporary file and, once the data and
metadata have been written completely, renames the temporary file to its
actual name. Hence, any client accessing the file/object from the file
interface or the Swift interface sees the file in a consistent state (either
the previous version or the newer version).
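The write-to-temp-then-rename pattern described above can be sketched as follows. This is a minimal illustration, not Swift-on-File's implementation: `atomic_put` is a hypothetical helper, and it omits the metadata step.

```python
import os
import tempfile


def atomic_put(path, data):
    """Write `data` to a temp file in the same directory, then rename it."""
    dir_name = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dir_name, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the data is on disk before renaming
        os.replace(tmp, path)     # readers see the old or the new file, never a mix
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "o1")
    atomic_put(target, b"version 1")
    atomic_put(target, b"version 2")
    with open(target, "rb") as f:
        result = f.read()
    leftovers = os.listdir(d)

print(result)  # b'version 2'
```

Creating the temp file in the same directory keeps the rename on one filesystem, which is what makes the final step a single atomic operation on POSIX systems.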
However, things are different when you create or modify a file from the
filesystem interface. Because the file is written to its actual path and not
to a temporary location, a GET on the file from the Swift interface while it
is being written from the filesystem interface might return a partial file to
the Swift client. In other words, swiftonfile serves the object present in the
filesystem "as is", without checking whether the file is still being written.
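The partial-read effect can be reproduced locally with a writer that writes in place and a reader that opens the same path between chunks. This is only a sketch of the effect; `in_place_write_with_reader` is a hypothetical helper, and the reader stands in for a GET served "as is".

```python
import os
import tempfile


def in_place_write_with_reader(chunks):
    """Write chunks directly to the final path, reading the file after each."""
    seen = []
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "o1")
        with open(path, "wb") as f:
            for chunk in chunks:
                f.write(chunk)
                f.flush()  # push the chunk out of Python's buffer to the OS
                # A concurrent reader sees whatever has been written so far.
                with open(path, "rb") as reader:
                    seen.append(reader.read())
    return seen


seen = in_place_write_with_reader([b"hello ", b"world"])
print(seen)  # [b'hello ', b'hello world']
```

The first read returns only part of the final content, which is the state a Swift client could be served while a file-interface write is still in progress.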