CSE 486/586 Programming Assignment 4 Replicated, Distributed Key-Value Storage solved

$34.99

Original Work ?

Download Details:

  • Name: Replicated_Distributed_Key-Value_Storage_Simple_Dynamo.zip
  • Type: zip
  • Size: 4.34 MB

Category: You will Instantly receive a download link upon Payment||Click Original Work Button for Custom work

Description

5/5 - (1 vote)

Introduction
For this assignment you will design and implement a simplified version of the Amazon Dynamo key-value storage system. (The simplifications are such that one would probably not consider it Dynamo any more!) You will have to implement partitioning, replication, and failure handling. Your goal is to implement a distributed key-value storage system that provides both availabilityandlinearizability. Yourimplementationmustbeabletoperformsuccessfulread and write operations even in the presence of failures. Thisdocumentprovidesanimplementationguidelineforachievingasuccessfulproject. However, you are free to come up with your own design in many parts of the project, provided that you provide both availability and linearizability at the same time in a way that the tester can understand. You must, however, implement partitioning and replication in exactly the manner that Dynamo does (so that the tester can understand your implementation). This document assumes that you are already familiar with Dyanmo. If you are not, you should first go and read the Dynamo paper (which is required reading for the course) and review the Dynamo slides from class. This assignment has many similarities to the previous assignment, your Chord ring implementation,andyoumaydrawfromyourChordimplementation(orotherpreviousprojects) asappropriate. Pleaseremembertociteanycodethatistakenfromexternalreferences,such as Android Developers or the Oracle Java tutorials. Please read this document in its entirety before you begin. It is long, but that is only because it is complicated and precise. Revisit the instructions regularly as you implement tobesureyou’reimplementingwhatisrequired. Therearenotesattheendthatmayassist you if you encounter common problems; make sure to review them as you go, as well. This assignment will require you to: • Implement data replication
• Implement data partitioning
• Handle node failures while continuing to provide availability and linearizability
1 GettingStarted
We are providing a project template for you to get started. As in the previous assignments, the template configures a number of things on your behalf. It is important that you do not 1Thisassignmentisbased,withpermission,onanassignmentdevelopedandusedbySteveKoinhisversion of CSE 486/586. Large portions of text are used verbatim from that assignment.
1
change settings that you do not have to change to complete your project. You will need to download the project from GitHub Classroom and open it in Android Studio. YoushouldhavereceivedtheGitHubClassroominvitationforthisprojectviaPiazza. Instructions for cloning the project and opening it in Android Studio are the same as before,exceptthatthebaserepositorynameisSimpleDynamo. YoucanfindYouTubetutorials detailing the process here.
2 TheContentProvider The graded portion of this project is a ContentProvider providing simplified Dynamo functionality to store and retrieve data across multiple devices even under failures. The providedtemplateusesthepackagenameedu.buffalo.cse.cse486586.simpledynamo,anddefines acontentproviderauthorityandclass. Pleasedonotchangethepackagename,andusethe content provider authority and class for your implementation. Like the SimpleDht project, all functionality for the requirements for this project should be provided by your content provider. It should handle availability in the face of failures via replication, as well as linearizability, independent of any Activity you may implement. The Activity in this project is used for your testing purposes only. YouwilluseSHA-1asthehashfunctiontogenerateIDsforthesystem,thesameaswith our Chord implementation. You should use the genHash() function from Project 3 for this purpose. For this project, you will need to run five emulators. We will be using the multiportconfigurationasdescribedinproject1. Yourappwillopenoneserversocketthatlistens on port 10000, but it will connect to a different port number on the IP address 10.0.2.2 for each emulator, as follows: emulatorserial port emulator-5554 11108 emulator-5556 11112 emulator-5558 11116 emulator-5560 11120 emulator-5562 11124
2.1 ImplementingtheContentProvider In implementing your content provider, you may make the following assumptions:
1. There will always be 5 nodes in the system (except for failures). There is no need to implementjoiningordepartureofnodes. Unlikethepreviousproject,yournodesmay hard-code the ring structure, as long as they handle failure.
2. Therewill be atmostonefailureatatime. Wewillemulate a failure byforce-closingan app on one emulator. We will not close the emulator itself.
3. All failures are temporary. You can assume that a failed node will recover soon. It will not be permanently unavailable during a run.
2
4. Correctness is more important than performance! Once you handle failures correctly, if you have time, you can improve performance.
5. You do not need to implement virtual nodes. (These are part of the Dynamo protocol, see the Dynamo paper!) All partitions are static and fixed.
6. Youdonotneedtohandlehintedhandoff. Thismeansthat,duringafailure,itisOKto replicate on only two nodes (if the third replica would be on a failed node). (Again, see the Dynamo paper.)
7. Linearizability will be checked only on a per-key basis. You do not need to provide any consistency guarantees between keys. Formally, you must implement per-key linearizability.
2.1.1 Requirements 1. The URI of your content provider must be: content://edu.buffalo.cse.cse486586.simpledynamo.provider 2. ThecontentprovidershouldimplementtheDynamo-likefunctionalitydescribedhere. Like the previous assignment, this includes all communication and functionality for the replicated store, including creation of appropriate tasks and sockets, response to requests from other AVDs, etc. No functionality for the Dynamo store should be provided by your Activity, which is used only for your own testing.
3. Whenanoderecoversfromfailure,itshouldcopyalloftheobjectwritesthatitmissed during the failure. This can be done by making requests to the appropriate nodes and copying their data.
4. Your content provider must support concurrentreadandwriteoperations.
5. Your content provider must handle a failure occurring during read and write operations.
6. Replication in your content provider should be handled just as it is in Dynamo. Each key-valuepairshouldbereplicatedoverthreeconsecutivepartitions,startingfromthe partition that the key belongs to (based on its consistent hashing).
7. All replicas must storethesamevalueforeachkey. This is per-keylinearizability.
8. EachcontentproviderinstanceshouldhaveanodeIDderivedfromitsemulatorserial number. This node ID must be obtained by applying the specified hash function (named genHash(), above) to the emulator serial number. For example, the node ID of the content provider running on emulator-5554 should benode_id = genHash(“5554”). The ID stringyoushoulduseisexactlythestringfoundinportStrinthetelephonyhack. This is necessary to place each node in the correct place in the Dynamo ring.
3
9. Anyappusingyourcontentprovidercansubmitarbitrarykey-valuepairsofJavastrings (e.g., <“I want to”, “store this”>), query arbitrary keys and retrieve their values, or delete arbitrary keys.
10. AsintheSimpleDhtassignment,yourappmustsupportqueryoperationsforthespecial keys “@” and “*”, which are defined as before.
11. As in the previous assignments, your content provider should have two columns.
• Thefirstcolumnshouldbenamed“key”(alllowercase,noquotationmarks). This column is used to store keys. • The second column should be named “value” (all lowercase, no quotation marks). This column is used to store the values associated with keys. • All keys and values stored by your provider should be Java strings.
12. Any app should be able to access (both read and write) your content provider. As in previous assignments, please do not require any permissions to access your content provider.
Note that each content provider must store only the key-value pairs local to its own partition oftheIDspace,andkey-valuepairsitisrequiredtoreplicatefromotherpartitions!
2.2 Writingthemainactivity Theprovidedtemplatecodeincludesanactivitytobeusedforyourowntestinganddebugging. It will not be specifically graded or evaluated in any way. You can (and should) use it to evaluate your Dynamo implementation as you see fit.
2.3 ImplementationGuidelines Thefollowingaresomeguidelinesforimplementingyourcontentprovider. Rememberthat you are free to implement your project however you wish, as long is it meets the project requirements! These guidelines are provided only to assist you in designing a successful implementation. In particular, many parts of these guidelines can be improved upon using techniques from the lectures or literature, if time allows. Remember that your partitioning and replication behaviors must be exactly as specified above, whatever techniques you choose.
Membership. JustasintheoriginalDynamopaper,everynodecanknowabouteveryother node,includingitslocationandpartitionofthering. Thismeansthatanynodecanforward a request that it cannot handle locally directly to a node that can without using ring-based routing.
Requestrouting. Whentherearenofailures,arequestcanbeforwardeddirectlytothecoordinatorforthekey(i.e.,thekey’ssuccessor),andthecoordinatorshouldhandleread/write operations for that key.
4
Quorum replication. In order to implement linearizability, you can implement quorumbased replication. Note that the original design does not handle linearizability, and you will need to adapt it. If you implement quorum replication, the replication degree (N) should be 3, and the reader and writer quorum sizes (R and W) should be 2. This means that, given a keyandnofailures,thekey’scoordinatoraswellasthenexttwosuccessornodesinthering shouldstorethekeyanditsvalue,andthatthecoordinatorshould,foreveryrequestforaget or put, contact two other nodes and get a vote from each. You can version objects in order to distinguishstaleobjectsfromthemostrecentcopy(e.g.,followingfailureandre-joiningofa node). Upon read, if more than one version of an object is returned, the coordinator should use the most recent version.
Chain replication. Another alternative for implementing linearizability is chain replication. ThedetailsforthiscanbefoundinChainReplicationforSupportingHighThroughputand Availability, by Robbert van Renesse and Fred B. Schneider, found athttps://www.cs.cornell. edu/home/rvr/papers/OSDI04.pdf. Inchainreplication,awriteisalwaysmadetothefirstpartition,andthenpropagatestothenexttwopartitionsinorder. Thelastpartitionreturnsthe result of the write. Read operations are always made to the last of the chain of partitions.
Failure handling. Failures must be handled very carefully, as there may be many corner cases to consider and cover. Just as described in the Dynamo paper, all messages between nodes can be used to detect failures. You can use a timeout on socket operations (in particular, read operations), just as we did in the GroupMessenger2 project for this, using a reasonable timeout period (e.g., 100 ms) and considering a node failed if it does not respond beforethetimeout. Justasinthepreviousproject,donotrelyonsocketcreationorconnection status to detect failure, as our emulated environment may distort this signal. When the coordinatorforarequestfailsanditthereforedoesnotrespondtoarequest,itssuccessorcan becontactednexttofulfilltherequest. Rememberthatyoucanassumeatmostonefailureat atime.
2.4 Testing We have testing programs to help you see how your code does with our grading criteria. If youfindroughedgesinthetestingprograms,pleasereportitsotheteachingstaffcanfixit. You should also test your program yourself by writing your own tests and drivers. Thetestingprogramforthisprojectisdividedintosixphases. Thefirstthreephaseswill not take much time compared to the second three, so you should focus on them first and finish them as quickly as possible. This will allow you to spend as much time as needed on the last three phases. The phases are:
1. Testing basic operations
• This phase will test insert, query, and delete (including the special keys @ and *). This will ensure that all data is correctly replicated. It will not perform any concurrent operations or test failures
5
2. Testing concurrent operations with differing keys
• Thisphasewilltestyourimplementationunderconcurrentoperations,butwithout failure. • Thetesterwillusedifferentkey-valuepairsinsertedandqueriedconcurrentlyon various nodes.
3. Testing concurrent operations with like keys
• This phase will test your implementation under concurrent operations with the same keys, but no failure. • The tester will use the same set of key-value pairs inserted and queried concurrently on all nodes.
4. Testing one failure
• This phase will test every operation under a single failure. • For each operation, one node will crash before the operation starts. After the operation is done, the node will recover. • This will be repeated for each operation.
5. Testing concurrent operations with one failure
• This phase will execute various operations concurrently and crash one node in the middle of execution. After some time, the node will recover (also during execution).
6. Testing concurrent operations with repeated failures
• This phase will crash one node at a time, repeatedly. That is, one node will crash and then recover during execution, and then after its recovery another node will crash and then recover, etc. • There will be a brief period of time between each crash-recover sequence.
Each testing phase is quite intensive, and will take quite some time to finish. Therefore, the tester allows you to specify which phase you want to test on the command line so that youneednotwaitforalltestseverytime. However,youwillstillneedtoensurethatyourun the tester in its entirety before you submit. We will not run the testing phases separately in our grading. • Youcanspecifywhichphaseofthetestyouwishtorunwiththe-por–phaseargument to the tester. • You can see the available arguments for the tester with the-hoption.
6
Notethatifyourunanindividualphasewith-por–phase,thetesterwillalwaysperform afreshinstall. However,whenrunningallphases,thetesterwillperformafreshinstallonly before phases 1 and 2. This means that from phase 2 onward, all data from previous phases willremainintact. The instructions for using the testing programs are the following: • Downloadatestingprogramforyourplatform. Ifyourplatformdoesnotrunit,please report it.
1. Windows: Tested on 64-bit Windows 8. 2. Linux: Testedon64-bitDebian9and64-bitUbuntu17.10(seebelowforimportant information about 64-bit systems). 3. Mac OS: Tested on 64-bit Mac OS 10.9 Mavericks.
• Before you run the program, please make sure that you are running all five AVDs. You can usepython run_avd.py 5to start them. • Remember to start the emulator network by runningset_redir.py 10000. • Like the previous testing program, this test will require you to provide your APK as a command line argument, and will take care of installing and uninstalling it on the emulators.
• Run the testing program from the command line.
• It may issue some warnings or errors during execution. Some of them are normal, some may indicate errors in your program. Examine them to find out!
• The testing program will give you partial and final scores. Remember during your own testing that you may see strange contents in your content provider if you do not uninstall your application between tests. In particular, updating the app from Android Studio does not uninstall the existing app first, so the content provider’s state will not be cleared. You can uninstall your app by hand with the following command: adb -s uninstall edu.buffalo.cse.cse486586.simpledht
Notesfor64-bitLinux: Thetestingprogramiscompiled32-bit. Ifyougetanerrorlikethe following, or the shell reports command not found when you run the executable, install the 32-bitlibzfor your system: ./simpledht-grading.linux: error while loading shared libraries: libz.so.1: cannot open shared object file: No such file or directory OnDebian-baseddistributions,youcanaccomplishthiswiththecommandapt-get install zlib1g:i386 as root (you may need to use sudo or su). If apt-get reports an error about the architectureorsaysthepackageisnotfound,youmayneedtoenablemultiarch. Todothis, rundpkg –add-architecture i386as root, then update your APT repositories withapt-get updateas root. Once this is done, you should be able to install the 32-bitlibz. For other distributions you will need to consult your distribution documentation.
7
3 Submission We use UB CSE autograder for submission. You can find autograder athttps://autograder. cse.buffalo.edu/, and log in with your UBITName and password. Once again, it is critical that you follow everything below exactly. Failure to do so willlead tonocreditforthisassignment. Zip up your entire Android Studio project source tree in a single zip file. Ensure that all of the following are true: 1. You didnot create your zip file from inside theGroupMessenger1directory. 2. Thetop-leveldirectoryinyourzipfileisSimpleDynamoorSimpleDynamo-,and it containsbuild.gradleand all of your sources. 3. Youusedaziputilityandnotanyothercompressionorarchivetool: thismeansno7-Zip, no RAR, no tar, etc.
4 Deadline
This project is due 2018-05-11 11:59:00 AM. This is one hour before our class. This is a firm deadline. If the timestamp on your submission is 11:59:01, it is a late submission. You are expected to attend class on this day! Note that, like the previous project, this deadline has been pre-extended, and it will not be extendedanyfurther. This deadline cannotbeextended due to the end of the semester!
5 Grading
This assignment is 20% of your final grade. Credit for this assignment will be apportioned as follows:
• Phase 1: 3%
• Phase 2: 4%
• Phase 3: 3%
• Phase 4: 5%
• Phase 5: 5%
• Phase 6: 3%
Thus, an application passing all six phases would receive 23 total points, representing 20% of the course grade as full credit for the project plus 3% of “extra credit”.
8
Notes
• Pleasedonotuseaseparatetimertohandlefailures,asthiswillmakedebuggingvery difficult and is likely to introduce race conditions. Use socket timeouts and handle all possible exceptions that may be thrown when there is a failure. They are: SocketTimeoutException,StreamCorruptedException,EOFException,andtheparentIOException. • PleasereuseyourTCPconnectionswithasingleremotehost,insteadofcreatinganew socket every time you send a message. You can use the same socket for both sending to and receiving from a remote AVD. Think about how you will coordinate this. • Please do not use Java object serialization (i.e., implementing Serializable). This will cause very large messages to be sent and received, with unnecessarily large message overhead.
• Please do not assume that a fixed number of DHT operations will be invoked on your system. Yourimplementationshouldnothardcodethenumberortypesofoperations in any way.
• Remember that there is a cap on the number of AsyncTasks that you can execute simultaneously. Plan accordingly, to keep the total number of threads that you require under this limit (which is about five).
• If the testing program cannot install your APK on one or more emulators, try each of the following steps:
1. Restart the problematic emulator. 2. Clean your build in Android Studio, then invoke the “Build APK” menu item and try again. 3. Manually install your APK usingadb installto ensure that it is installable.