CS475 Final Project, due 12/7/19, 9:00pm, NO LATE SUBMISSIONS ACCEPTED

Start by downloading the handout

In this final project, you will create a consistent, fault-tolerant distributed key/value store using ZooKeeper. The design of this system has the following goals in mind:

  1. Consistency: Key/value pairs can be created by any node. Any node can query the system to find the current value of a key. Replication will be sequentially consistent: no node will be able to read an out-of-date value.
  2. Fault tolerance: The system overall will tolerate a simultaneous failure of a minority of nodes. Any data that was held only by those nodes will be lost, but the system will otherwise continue to function.
  3. Partition awareness: If a participant becomes aware that it is in a minority partition, it will cease to operate and abort any pending operations.

Due to the added technical complexity of this project (including its use of a new-to-you technology, ZooKeeper), you may complete this project individually or in a group of up to three (you plus up to two teammates). We are also trying something new with the test cases for this assignment: we are distributing some of the core test cases to you, but a significant portion of them will not be visible to you and will run only on AutoLab. You will be strictly limited to 10 submissions on AutoLab, so you are very strongly encouraged to (1) find one or two teammates, and (2) reason through all of the possible failure modes of your clients so that you can get the assignment correct, even without repeatedly running all of the tests.

You implemented a distributed key-value store in Homework 4. The design of Homework 4’s key-value store did not meet the goals outlined above: if a single replica failed, then all writes might be blocked (since the server, which was coordinating the writes, wouldn’t be able to receive a response from a replica). Moreover, if the server failed, then the entire game was over: there would be no way for the remaining replicas to pick up the pieces and continue operating.

Homework 4 consisted of a key-value store that featured transactions. For this final project, you will not need to worry about transactions; instead, you will focus on fault tolerance. For this project, you'll implement a distributed key-value store with a rotating leader that is in charge of coordinating writes. The primary reason we designated a single node as the lock server in the past is that it eliminates the need for all of our replicas to agree among themselves on who the lock server (leader) is. Now, we will allow any replica to be the leader, and use ZooKeeper to perform this coordination between replicas so that they can establish which one is the leader at any given point. Of course, if two replicas were to simultaneously consider themselves leader, that would be bad, since each would assume it had free rein to perform updates.

The primary role of the leader is to keep track of (1) all of the keys and their values and (2) which nodes have a copy of each key. When a key’s value is updated, the leader will notify all clients that have a copy of that key to invalidate their cache.
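For concreteness, the leader's bookkeeping might look something like the following sketch; the class and field names are illustrative only and are not prescribed by the handout.

    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative sketch only: the handout does not prescribe these names.
    class LeaderState {
        // key -> current value
        final Map<String, String> values = new ConcurrentHashMap<>();
        // key -> connect strings of the clients that currently have this key cached
        final Map<String, Set<String>> cachedBy = new ConcurrentHashMap<>();
    }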

Your key-value store will expose just two simple operations to the end-user:

  1. getValue(String key)
  2. setValue(String key, String value)

Overview

Reading values:

To read a key K, if node N currently has the value for that key cached (e.g. from previously reading or writing it), then it will return it directly, without contacting any other node. Otherwise, node N will contact the leader and retrieve the value. If the key is unknown to the leader, the leader will return null. Clients must not cache null values. If the key is known to the leader, it will record the fact that node N now has a cached copy of this key's value.

If the node is disconnected from ZooKeeper: If node N does not currently hold a live ZooKeeper session (e.g. it timed out), then it will throw an IOException for any read operation. It is OK to return a stale, cached value if node N is disconnected from ZooKeeper but has not detected that disconnection yet.

If the node is not able to reach the leader and the leader is disconnected from ZooKeeper: If node N is unable to contact the leader but does currently hold a live ZooKeeper session (and the leader does not have a live ZooKeeper session), then it will first participate in the leader election described below, and then complete the read as described above.

If the node is not able to reach the leader, and the node does not have the value cached, but the leader is still active in ZooKeeper: If node N is unable to contact the leader, but ZooKeeper indicates that the leader and client are both still active in ZooKeeper, then the client must wait for the leader to become available (this is the default RMI behavior when an RMI endpoint is not reachable). You do not have to consider the case of the leader being available on ZooKeeper at the start of an operation, and then disconnecting from ZooKeeper while an operation is in progress. If node N has the value cached, it can serve it directly.
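Putting these read rules together, the read path might look roughly like the following sketch. The connected flag, localCache map, leaderStub() helper, and the two-argument getValue call on the leader are all illustrative names (the actual peer-to-peer methods are declared in the provided IKVStore interface), and exception handling beyond the required IOException is elided.

    // Sketch only: connected, localCache, leaderStub(), and the two-argument getValue
    // are illustrative names, not the handout's.
    public String getValue(String key) throws IOException {
        if (!connected) {                               // disconnection has been detected
            throw new IOException("No live ZooKeeper session");
        }
        String cached = localCache.get(key);
        if (cached != null) {
            return cached;                              // cached values are served locally
        }
        IKVStore leader = leaderStub();                 // e.g. connectToKVStore(leaderLatch.getLeader().getId())
        String value = leader.getValue(key, getLocalConnectString()); // leader records N as a cacher
        if (value != null) {
            localCache.put(key, value);                 // never cache null
        }
        return value;
    }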

Writing values:

To write a key K, node N will ask the leader to update the value. The leader will:

  1. Notify all clients that have a copy of this key to invalidate their cache, and clear its list of clients that have this key cached
  2. Update the value
  3. Add node N to the (now emptied) list of clients with this key cached

It is very important that only one write to a key can occur at a time, and no reads can occur during a write.
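On the leader side, the three steps above might look roughly like this sketch (assuming the usual java.util and java.util.concurrent.locks imports). The method, field, and helper names (leaderSetValue, values, cachedBy, keyLock, connectToCacher, invalidate) are illustrative, and per-key reentrant read/write locking is just one way to enforce "one write at a time, no reads during a write."

    // Sketch only: names below are illustrative, not the handout's.
    public void leaderSetValue(String key, String value, String writerId) throws IOException {
        ReentrantReadWriteLock lock = keyLock(key);   // one reentrant read/write lock per key
        lock.writeLock().lock();                      // leader-side reads of this key would take the read lock
        try {
            // 1. Invalidate every cached copy (including the writer's, if it has one cached)
            for (String clientId : cachedBy.getOrDefault(key, Collections.emptySet())) {
                connectToCacher(clientId).invalidate(key);  // must be acknowledged, or the client seen as failed in ZooKeeper
            }
            cachedBy.put(key, new HashSet<>());
            // 2. Update the value
            values.put(key, value);
            // 3. The writer now holds the only cached copy
            cachedBy.get(key).add(writerId);
        } finally {
            lock.writeLock().unlock();
        }
    }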

If the node that wants to write the value is disconnected from ZooKeeper: If node N does not currently hold a live ZooKeeper session (e.g. it timed out), then it will throw an IOException for any write operation.

If the leader is disconnected from the client: If node N is not able to contact the leader, but does currently hold a live ZooKeeper session (and the leader does not), then it will first participate in a leader election described below, and then complete the write as described above.

If the node is able to reach the leader but the leader is disconnected from ZooKeeper: The leader must not begin any writes until it validates that it holds a valid ZooKeeper session. If it finds it does not hold a valid ZooKeeper session, it must throw an IOException.

If the leader is unable to contact a client with a cached version of that key: The leader must wait until all invalidate messages are acknowledged. However, if a client becomes disconnected from ZooKeeper, and the leader detects this, it should ignore the failure of the invalidate message (since ZooKeeper agrees that that node has failed). You do not have to consider the case of the client being available on ZooKeeper at the start of an operation, and then disconnecting from ZooKeeper while an invalidate is in progress.

If the node is not able to reach the leader, but the leader is still active in ZooKeeper:  If node N is unable to contact the leader, but ZooKeeper indicates that the leader and client are both still active in ZooKeeper, then the client must wait for the leader to become available (this is the default RMI behavior when an RMI endpoint is not reachable). You do not have to consider the case of the leader being available on ZooKeeper at the start of an operation, and then disconnecting from ZooKeeper while an operation is in progress.

Tracking node membership

Key to the successful operation of your key-value store will be a global, shared understanding of which nodes are currently alive and connected to the system. Otherwise, reads and writes might become stalled by partitioned or crashed nodes. You’ll use Curator’s PersistentNode abstraction to track which nodes are currently active participants.

Leader election

If there is currently no leader (e.g. when the system starts up, or when the leader becomes disconnected), then any node that can still contact a quorum of ZooKeeper nodes will participate in a leader election. You will use Curator's LeaderLatch to perform the election.

Leader initialization

If this is the first leader, then the leader won’t need to do anything to initialize itself – there will be no data at the start, and hence no nodes will have any copies of any values, and no nodes will currently hold any ownership of any keys. However, if this is a leader being promoted after a prior leader failed or was disconnected, then it will simply initialize itself using whatever cached data it had at the time it was promoted. Hence, some key/value pairs might be lost (if only the leader had them). When a node detects that the leader has changed (and that it is not the leader), it must flush its entire cache.

Disconnection and Reconnection

When a node is reconnected to a quorum of ZooKeepers, it will check to see how many other nodes are there. If there are none, it will perform the leader election described above and then assume a leader role, and can initialize itself using any cached data that it has. If, however, there are other nodes present at the moment that it reconnects (regardless of who the leader is/was), the node will flush its entire local cache, including all values and cache information.

General Instructions

We have provided you a baseline implementation of the Key/Value server that handles all user interaction, creates a ZooKeeper server, and connects to it, but does not do anything beyond that (that is, you will not need to implement any UI or command processors, and you will not need to add much RMI boilerplate). You may not include any additional libraries (e.g. download and require additional JARs), although feel free to use any of the various libraries included with Java 8 or already included by the starter project.

You must use exclusively reentrant locks (e.g. synchronized or ReentrantReadWriteLock): no StampedLocks.
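For example, a cache guarded by a ReentrantReadWriteLock (an acceptable reentrant lock) might look like this small sketch; the class and field names are illustrative.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Small illustration of an allowed locking style; names are illustrative.
    class GuardedCache {
        private final Map<String, String> cache = new HashMap<>();
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        String get(String key) {
            lock.readLock().lock();
            try {
                return cache.get(key);
            } finally {
                lock.readLock().unlock();
            }
        }

        void put(String key, String value) {
            lock.writeLock().lock();
            try {
                cache.put(key, value);
            } finally {
                lock.writeLock().unlock();
            }
        }
    }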

Your KV Store will be compiled and tested using Apache Maven, which automatically executes the provided JUnit tests. Please install Maven on your computer. Unfortunately, Maven is not installed on Zeus; however, you can download the most recent version (e.g. apache-maven-3.6.1-bin.zip) and unzip it in your home directory. Then, wherever mvn appears in the snippets below, run ~/apache-maven-3.6.1/bin/mvn instead. Note that you can easily import Maven projects into Eclipse and IntelliJ (we suggest IntelliJ over Eclipse).

To compile and run your shell, run mvn package in the top-level directory and then run java -jar target/kvstore-2019.3.5-SNAPSHOT.jar to start the KV server. Just run the jar once: it will create a ZooKeeper and a client. To create more clients, use the new-client command. You'll notice that the text-mode interface we've provided for you has a handy help command. Since this single CLI will be a front-end for multiple clients, you'll need to include the client ID in each command, e.g. get 0 foo will issue the get command to client 0, and put 1 foo bar will invoke setValue on client 1. We've also provided commands that you can use to simulate a service failing: either disabling a client's connection to ZooKeeper, or blocking other clients from talking to it over RMI.

To build the jar file without running the tests, run mvn -DskipTests package.

Your assignment will be automatically graded for correctness (note that there will also be a manual grading phase to check for issues that are hard to catch automatically, such as concurrency bugs). Included with your handout is part of the test script that we will use to grade your assignment. Upon submitting your assignment, our server will automatically compile and test your assignment and provide you with test results. You may submit at most ten times before the deadline. To run the portion of the tests that we are providing you with, simply execute mvn test (of course, if you do this right away, you'll see that they all fail!).

Note: Your code must compile and run on the autograder, under Java 8. It is unlikely that you will have any difficulties developing on a Mac, Linux or Windows, but please keep in mind the possibility of portability problems. When you feel satisfied with implementing one phase of the assignment, submit to AutoLab and verify that AutoLab agrees.

Academic honesty reminder: You may NOT share any of your code with anyone else. You may NOT post your code in a publicly viewable place (e.g. in a public GitHub repository).  You may face severe penalties for sharing your code, even “unintentionally.” Please review the course’s academic honesty policy.

General coding/grading requirements:

  1. You must use exclusively reentrant locks (e.g. synchronized or ReentrantReadWriteLock).
  2. You should feel free to add whatever additional classes you wish, or any additional methods to the existing edu.gmu.cs475.KVStore. You must not modify the edu.gmu.cs475.internal.IKVStore interface, the edu.gmu.cs475.AbstractKVStore, any of the tests, or any of the internal classes.
  3. Your code should be thread-safe: concurrent calls to any of these methods (or any other method in KVStore) should not have any races. It should now be clearer how this can occur — you will potentially have multiple clients attempting to interact with the server simultaneously.
  4. You must not store any state in static fields.

Part 1: ZooKeeper Setup (20/150 points)

Your KVServer will automatically get passed a Curator (ZooKeeper) client object. It’s up to you to manage that connection. To satisfy the requirements above, you’ll need to:

  1. Create an ephemeral PersistentNode to represent your KVServer in the pool of all active servers
  2. Create a LeaderLatch to elect and maintain a leader of the group. The best way to find out who the leader is is to call getLeader() on your LeaderLatch rather than using a LeaderLatchListener: the listener will notify you only when you become the leader or are no longer the leader; it will not tell you when you are not the leader and the leader changes.
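For instance, a sketch of checking leadership via getLeader(); here leaderLatch is whatever LeaderLatch field you create, and exception handling is elided:

    // Sketch: ask the latch who the current leader is, rather than relying on listener callbacks.
    Participant leader = leaderLatch.getLeader();   // org.apache.curator.framework.recipes.leader.Participant
    boolean iAmLeader = leader.isLeader() && leader.getId().equals(getLocalConnectString());
    String leaderConnectString = leader.getId();    // usable with connectToKVStore(...)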

Your first entry point will be initClient:

In initClient, use the ZooKeeper client zk to create your ephemeral node at the path ZK_MEMBERSHIP_NODE + "/" + getLocalConnectString(), and create your LeaderLatch at the path ZK_LEADER_NODE. Make sure to specify the id of your LeaderLatch in its constructor, passing getLocalConnectString() as the id.
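A minimal sketch of what initClient might do, assuming the zk client and the constants named above; the membershipNode and leaderLatch field names are illustrative, PersistentNode, LeaderLatch, and CreateMode come from the Curator/ZooKeeper packages already on the classpath, and exception handling is elided:

    // Sketch of initClient, using Curator's PersistentNode and LeaderLatch.
    // Field names (membershipNode, leaderLatch) are illustrative.
    membershipNode = new PersistentNode(zk, CreateMode.EPHEMERAL, false,
            ZK_MEMBERSHIP_NODE + "/" + getLocalConnectString(), new byte[0]);
    membershipNode.start();

    leaderLatch = new LeaderLatch(zk, ZK_LEADER_NODE, getLocalConnectString());
    leaderLatch.start();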

You can detect when your client becomes connected to and disconnected from ZooKeeper in the following callback:
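The handout's own snippet is not reproduced here; assuming the callback is Curator's standard ConnectionStateListener.stateChanged, a sketch of how you might track the session state (the connected flag is the same illustrative field used in the earlier read sketch):

    // Sketch, assuming the callback is ConnectionStateListener.stateChanged;
    // the "connected" flag is an illustrative field, not part of the handout.
    @Override
    public void stateChanged(CuratorFramework client, ConnectionState newState) {
        if (newState == ConnectionState.LOST || newState == ConnectionState.SUSPENDED) {
            connected = false;   // reads and writes should now throw IOException
        } else if (newState == ConnectionState.CONNECTED || newState == ConnectionState.RECONNECTED) {
            connected = true;    // rejoin: participate in leader election / flush caches as described above
        }
    }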

Frequently asked question: Why can’t we just use the LeaderLatch to keep track of all of the connected clients? Unfortunately, the Curator API does not guarantee that the list of participants in the LeaderLatch is up to date (it is cached, and refreshed at arbitrary intervals).

Warning: If you want to clean up these resources (e.g. call close in the _cleanup method), make certain that you only call close if you are actually connected to ZooKeeper; otherwise you will see timeouts!

Grading:

  • 16pts JUnit tests (8 pts x 2 tests)
  • 4pts Manual inspection

Part 2: Key-Value Server with no failover (80/150 points)

Implement the external API (which is used by the text interface and the tests) as well as the peer-to-peer API that is used by non-leaders to invoke functions on the leader, and by the leader to invoke functions on the followers. You should use the connectToKVStore method (in AbstractKVStore) to connect from one client to another (e.g. to the leader or to a follower). The parameter that you pass to it is the same value that the target client returns from getLocalConnectString().

The client-facing API consists of the two operations described in the overview: getValue(String key) and setValue(String key, String value).

To communicate between nodes, you’ll implement the following RMI-based API:
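The exact remote methods are declared in the provided edu.gmu.cs475.internal.IKVStore interface (which you must not modify) and are not reproduced here. Purely for illustration, a peer-to-peer API of this kind typically needs remote calls along the following lines; the interface, method names, and signatures below are hypothetical, not the handout's:

    import java.io.IOException;
    import java.rmi.Remote;

    // HYPOTHETICAL illustration of the kinds of remote calls needed; the real methods
    // are declared in the provided edu.gmu.cs475.internal.IKVStore interface.
    interface PeerApiSketch extends Remote {
        // follower -> leader: read a key, registering the caller as a cacher
        String readFromLeader(String key, String callerConnectString) throws IOException;

        // follower -> leader: ask the leader to perform a write
        void writeToLeader(String key, String value, String callerConnectString) throws IOException;

        // leader -> follower: invalidate a cached key (must be acknowledged)
        void invalidate(String key) throws IOException;
    }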

It is very important that you consider concurrency here: multiple clients might be attempting to write the same or different keys concurrently. A single client might be receiving multiple invalidate requests simultaneously. Your grade for part 2 will not consider any cases where a leader or client fails.

Reminder: When sending invalidate messages, the leader must send these messages to ALL clients with the key cached – even if that includes the very client proposing to update the key!

Grading:

  • 60 points JUnit tests (6 pts x 10 tests)
  • 20 points concurrency

Part 3: Fault Tolerance (50/150 points)

Implement the fail-over protocol described in the project summary above. For each method, consider “what would happen if I was disconnected from ZooKeeper at this point?” Consider: “what would happen if the leader stopped responding?” or “what would happen if a follower stopped responding?” If the system behavior in a specific failure is unclear to you after reading the summary above, then please post a specific question on Piazza.

Your fault-tolerance handling will be graded primarily by a battery of JUnit tests, which you will not have access to. However, every time that you submit your assignment on AutoLab, they will run. Hence, you are very strongly encouraged to reason through the failure cases and your code’s response to them, and to test different failure modes (using the text interface, you can simulate ZooKeeper disconnecting from a client, or a client ceasing to respond to RMI messages). You are certainly not required to write actual JUnit tests, but should feel free to do so if you think it will be helpful; we’ve provided one such sample test in your handout. Note 11/21: Please do not try to use the blockedUntilNotLeader method – it is not needed for the test scenarios, and is not correctly implemented. You should feel free to implement such functionality yourself, though, if you wish.

Grading:

  • 42 points JUnit tests (6 pts x 7 tests)
  • 8 points manual

Grading

Your assignment will be graded on a series of functional tests, making sure that it implements the specification above.

Hand In Instructions

IF YOU PLAN TO PARTICIPATE IN A TEAM, YOU MUST FORM THAT TEAM IN AUTOLAB BEFORE MAKING ANY SUBMISSION! Each team is limited to 10 submissions (total).

You must turn in your assignment using Autolab (You MUST be on the campus network, or connected to the GMU VPN to connect to Autolab). If you did not receive a confirmation email from Autolab to set a password, enter your @gmu.edu (NOT @masonlive) email, and click “forgot password” to get a new password.

Create a zip file of the root directory in your assignment (please: .zip, not .tgz or .7z, etc.) — this is the root directory that includes the src directory and the pom.xml file. When you upload your assignment, Autolab will automatically compile and test it. You should verify that the result that Autolab generates is what you expect. Your code is built and tested in a Linux VM. Assignments that do not compile using our build script will receive a maximum of 50%. Note that we have provided ample resources for you to verify that our view of your assignment is the same as your own: you will see the result of the compilation and test execution for your assignment when you submit it.

You can submit your assignment only ten times before the deadline – so it is in your best interest to thoroughly understand the requirements of this assignment and make judicious use of the autograder. For this project, no late submissions will be accepted, not even those only minutes late.

Note – You MUST be on the campus network, or connected to GMU VPN to connect to Autolab.

Decoding the output:

Note, AutoLab will run your code on the tests twice: once without RV-Predict (these are the scores used for parts 1-3), and once with RV-Predict (this is informational only). The outcomes should be the same with or without RV-Predict, but we wanted to make 100% sure that adding the tool doesn’t break your otherwise seemingly functioning code.

Questions

Please post questions on Piazza

 

Contact