CS 475, Fall 2019, Homework 1, Due Sept 18, 9:00 am

Start by downloading the handout:

Download the handout

Please post questions on Piazza

The purpose of this assignment is to get you familiar with basic concurrency control topics in Java. For this assignment, you will build a simple key-value store (KV Store) and client in Java. KV stores are widely used as layers underneath complex distributed systems, and examples include Redis, Voldemort, Dynamo. KV stores are very interesting to study from the perspective of concurrent and distributed software because the interface that they expose to client applications is incredibly simple (appearing effectively to be just a HashMap), but their implementation can be incredibly complex. In particular, once a KV store needs to be able to store more data than can fit in main memory on a single machine, deciding how to swap data to disk can be an interesting problem. If the KV store is then distributed amongst multiple machines, then the specific mechanisms used to keep data in sync can become quite complicated as well.

For this assignment, you will implement a simple KV store that operates entirely within memory, and on a single machine. However, the KV store will be accessed by multiple client threads, and hence, must maintain thread safety.

General requirements:

We have provided you a baseline implementation of the KV store that has various methods stubbed out, which you will implement. You may not include any additional libraries (e.g. download and and require additional JARs), although feel free to use any of the various libraries included with Java or already included by the starter project. Your project will be compiled and tested using Java 8, so please do not use any Java 9+ exclusive features (the first long-term support version of Java > 8 just came out in September; we’ll switch to Java 11 next semester, sorry).

Your KV Store will be compiled and tested using apache maven, which automatically executes the provided JUnit tests. Please install Maven on your computer. Unfortunately, Maven is not installed on Zeus, however you can download the most recent version (e.g. apache-maven-3.6.1-bin.zip) and unzip it in your home directory. Then, to run maven, type ~/apache-maven-3.6.1/bin/mvn in the snippets below (instead of just mvn). Note that you can easily import maven projects into eclipse and IntelliJ (we suggest IntelliJ over eclipse).

To compile your KV store, run mvn package.

Your KV store will be automatically graded for correctness (note that there will be a manual grading phase to check hard-to-automatically-catch concurrency issues). Included with your handout is a simple version of the (large) test script that we will use to grade your assignment. Upon submitting your assignment, our server will automatically compile and test your assignment and provide you with test results. You can resubmit 20 times before the deadline without penalty. To run these tests, simply execute mvn -Ptest test (of course, if you do this first off, you’ll see that they all fail!)

Note: Your code must compile and run on the autograder, under Java 8. It is unlikely that you will have any difficulties developing on a Mac, Linux or Windows. When you feel satisfied with implementing one phase of the assignment, submit to AutoLab and verify that AutoLab agrees.

Academic honesty reminder: You may NOT share any of your code with anyone else. You may NOT post your code in a publicly viewable place (e.g. in a public GitHub repository).  You may face severe penalties for sharing your code, even “unintentionally.” Please review the course’s academic honesty policy.

Tips and References

You might find it useful to reference the official Java tutorial on concurrency, which does a nice job outlining at a high level how you can create threads and locks. If you find any useful references, please feel free to share them on Piazza and we will update this page as well.

Part 1: The Single Threaded Key Value Store (35%)

To get started, you’ll implement the basic functionality of the key value store, and ensure that it works in a normal (single-threaded) environment.  This will act as something of a baseline for Java programming without concurrency. The tricky parts will come when we make this functionality thread-safe (in Part 2). You should implement Part 1 completely before moving on to parts 2 and 3. Within Part 1, we suggest starting with the non-list functionality (get, set, remove, append), and then moving on to support lists.

Your KV store will support two different kinds of values: strings and lists. Hence, a key might be mapped to: (1) nothing, (2) a single string, and (3) a list (containing any number of strings).

You will implement this behavior in  edu.gmu.cs475.KeyValueStore, specifically in the following 10 methods, noting again that to receive full credit for part 1, you can ignore all fo the concurrent specifications:

Note the following requirements:

  1. You should feel free to add whatever methods you would like to the  KeyValueStore class, but you must not change the names or parameters of the methods listed above.
  2. You must not add any fields to any classes.
  3. You can not use your own HashMap, but instead it is almost as easy: You must call  _get, _set, _remove, and  _listKeys in your  KeyValueStore to access the underlying data store. You must not maintain any mapping yourself between keys and values.
  4. All lists must be instances of KeyValueList, and when you create them, you must pass the name of the key that is storing that KeyValueList.

Precise grading breakdown:

  • Automated tests: 28 points
    • 7 JUnit tests, 4 points each
  • 7 points from manual grading

Part 2: Managing Thread Safety (45%)

Once you are satisfied that your key-value store works correctly in a single-threaded environment, your next task will be to add locking to it. You should implement all of your concurrency control through (1) the this.bigLock object (a  java.util.concurent.locks.Lock object), and (2) by using the Lock supplied by the method getLock() of each KeyValueList . Note 9/16: You MUST use the method getLock() of each KeyValueList to handle this synchronization, otherwise you will fail most of the tests.

Note the following concurrency requirements, paraphrased from the specification above:

  1. You must not allow concurrent calls to any of the following methods:  _get_set, _remove, or  _listKeys
  2. For the operations lPush, lPop, lLength, and lPeek: if any of these are concurrently called for two different keys, you must allow them to execute concurrently (subject to requirement 1), but do not allow them to occur concurrently. The overlapping behavior of these two methods is shown in the graphic below.


Precise grading breakdown:

  • Automated tests: 42 points
    • 7 JUnit tests, 6 points each
  • 3 points from manual grading

Part 3: Testing the Tests (20%)

We have included several functional tests with your handout that simply test the behavior of each of these methods. We have not provided any tests (for you to see) that test the behavior of your software under concurrency (the tests for part 2). One of your tasks is to write such tests.

Note that it is generally impossible to automatically prove that concurrent programs have no races, it is common instead to test for low-hanging fruit. For instance, it is possible to write tests that demonstrate that two events can happen concurrently, and tests that show that it’s unlikely (but not impossible) that two other events do not happen concurrently.

We are asking you to implement one of the tests that we are using to test part 2:  testPushOrPopFromDifferentListsShouldBeConcurrent. The goal of this test is to demonstrate that it is possible to call  lPush(key1,value1) and  lPush(key2,value2) concurrently, and for both of the underlying  push operations to actually occur concurrently (and, similarly for lPop). To implement this, you can create new threads (please, no more than 2 though) and use the entirety of Java’s thread API and use flags or locks as you feel appropriate.

The illustration in part 2 above shows the interleaving that we are trying to witness: that while Thread 1’s lPush is still running (that is to say, before lPush returns in Thread 1), Thread 2 successfully starts and adds an item to the same list at the same key as Thread 1. Note that in the graphic we show the two linearization points (from your call to kvStore.get and to list.add, which should both be protected with different locks). Because each list should have its own lock, you should be able to see this scenario. Hint (9/16): There are many ways to implement this using these different APIs, locks, and shared variables. Feel free to start however you like. If you are struggling to decide where to start, we will share that the reference solution uses only thread.start(), thread.join(), and thread.isAlive(). That said, there are many different ways to do this, so please feel free to proceed however you like. 

In order to force this scenario to occur, you may need to insert some code to run before any call to add or remove contents from the list. You can add such code by setting the fields KeyValueList.addCalledCallback and KeyValueList.removeCalledCallback. KeyValueList.addCalledCallback is a BiFunction that is passed as parameters the KeyValueList instance that add is being called on, as well as the value being added. KeyValueList.removeCalledCallback is a Function that is passed as a parameter the KeyValueList instance that remove is being called on. You can experiment by creating a function that, for instance, prints out every time that add is called on the list, like this:

Which will result in the output:

Implement your test in  Part3TestingTheTestTests. Feel free to add any non-static fields to this class.

Your test will be tested on Autolab on three different KeyValueStores: a correct KVStore (your test should pass on this implementation) and two incorrect KVStores (your test should fail on both of these implementations).

No marks will be awarded for empty tests, or tests that have no assertions (note that otherwise, these tests would all pass on our correct implementation). You will receive full marks only if your test successfully passes on the correct implementation and fails on the buggy versions.

Grading

Your assignment will be graded on a series of functional tests, making sure that it implements the specification above. Your test will be graded by running it on purposely buggy code and verifying that it throws some errors, and then manually inspecting them for completeness. We have provided a (not exhaustive) test suite which should help you judge (on your own) how well you have done with the functional requirements. We may add additional tests beyond what we are providing you with now, so please do not rely entirely on them.

In accordance with the “reasonable person principle,” we reserve the right to audit your code and correct any marks that are improperly assigned, for instance, due to your code incorrectly following the specification, but passing the test.  We would encourage you to spend your time correctly implementing the assignment, and not trying to force it to pass the test, even if it is not following the spirit of the assignment.

Hand In Instructions

You must turn in your assignment using Autolab (You MUST be on the campus network, or connected to the GMU VPN to connect to Autolab). If you did not receive a confirmation email from Autolab to set a password, enter your @gmu.edu (NOT @masonlive) email, and click “forgot password” to get a new password.

Create a zip file of only the src directory in your assignment (please: .zip, not .tgz or .7z etc). When you upload your assignment, Autolab will automatically compile and test it. You should verify that the result that Autolab generates is what you expect. Your code is built and tested in a Linux VM. Assignments that do not compile using our build script will receive a maximum of 50%. Note that we have provided ample resources for you to verify that our view of your assignment is the same as your own: you will see the result of the compilation and test execution for your assignment when you submit it.

You can resubmit your assignment up to 20 times before the deadline. Note the course late-submission policy: assignments will be accepted up until 24 hours past the deadline at a penalty of 10%; after 24 hours, no late assignments will be accepted, no exceptions.

Questions

Please post questions on Piazza

 

Contact