The current plan is to store all our code in this one repo, with separate directories for nameNode, dataNode, and code needed by both (such as the various protocols). We'll add more as we need them.
Check the wiki for documentation!
- Install VirtualBox. Works with 5.1.
- Install Vagrant. Works with 1.8.5.
- Clone the repo: git clone https://github.com/Rice-Comp413-2016/Rice-HDFS.git
- Enter the repo directory: cd Rice-HDFS
- Run vagrant up (takes 17 minutes from scratch for me).
  - I (Stu) had to "sudo" these commands.
  - Make sure to do this from the repo directory (otherwise it asks for vagrant install).
- Run vagrant ssh.
- You should be in the development environment. Things to know:
  - The username is vagrant and the password is vagrant.
  - The machine has 1G of memory allocated. Change Vagrantfile if you need more.
  - The folder /home/vagrant/rdfs is synced from here (here being the location of this readme), meaning that all edits you make to files under the project are immediately reflected in the dev machine.
  - Hadoop binaries such as hdfs are on the PATH.
  - Google protobuf 3.0 is installed; you can run protoc to generate C++ headers from .proto specifications.
  - If you need external HTTP access, the machine is bound to the address 33.33.33.33.
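After logging in, one quick way to confirm that the tools mentioned above are actually on the PATH is a small shell helper. This is only a sketch: check_tools is a hypothetical name for illustration and is not part of the repo.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not shipped with the repo):
# report which of the given tools can be found on PATH.
# Returns 0 if every tool was found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example, inside the Vagrant machine:
#   check_tools hdfs protoc
```

If either tool is reported missing, the machine may not have finished provisioning.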
sudo apt-get install libboost-all-dev
sudo apt-get install libasio-dev
mkdir build
cd build
cmake ..
make
You will see a sample executable placed in build/rice-namenode/namenode. The
compiled protocols are in build/proto.
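As a sanity check after building, the two outputs named above can be verified with a short script. This is a sketch under the assumption that the build directory layout is exactly as described here; check_build_outputs is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch (hypothetical helper): confirm the build produced the outputs
# described in this readme. $1 is the build directory (default: build).
check_build_outputs() {
    build_dir="${1:-build}"
    status=0
    # The sample namenode executable should exist and be executable.
    if [ ! -x "$build_dir/rice-namenode/namenode" ]; then
        echo "missing executable: $build_dir/rice-namenode/namenode" >&2
        status=1
    fi
    # The compiled protocols should be in the proto directory.
    if [ ! -d "$build_dir/proto" ]; then
        echo "missing directory: $build_dir/proto" >&2
        status=1
    fi
    return "$status"
}

# Example, from the repo root:
#   check_build_outputs build
```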
The Google Test framework is now included in the development environment. You may need to do vagrant destroy and vagrant up to install it.
Tests should be placed in the /home/vagrant/rdfs/test directory.
After creating a new test file, you can modify the CMakeLists.txt file to create an executable
to run those tests.
There is a file, tests/run-all/run-all-tests.cc, that creates an executable running all tests.
If you create a new test executable, modify this to add yours.
There is currently a file in the test directory, tests.cc, with a sample test. You can run it by
executing the following in the test/ directory:
    cmake CMakeLists.txt
    make
    ./runTests
A beginner's guide to using Google Test is located here
A githook has been added at rdfs/test/pre-commit. It's a shell script that will build and run the unit tests. To use it, copy the file to rdfs/.git/hooks. Then, before each commit is made the tests will run, and a failure will halt the commit. If this is too restrictive, renaming the file to pre-push will do the same thing only when you try to push.
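The copy step can be scripted. The following is a minimal sketch, assuming the hook lives at test/pre-commit under the repo root as described above; install_hook is a hypothetical helper name, and the chmod is there because Git only runs hooks that are executable.

```shell
#!/bin/sh
# Sketch (hypothetical helper): copy the hook script into .git/hooks.
# $1 = repository root (e.g. rdfs)
# $2 = hook name to install as: pre-commit (default) or pre-push
install_hook() {
    repo="$1"
    hook_name="${2:-pre-commit}"
    src="$repo/test/pre-commit"
    dest="$repo/.git/hooks/$hook_name"
    if [ ! -f "$src" ]; then
        echo "hook script not found: $src" >&2
        return 1
    fi
    if [ ! -d "$repo/.git/hooks" ]; then
        echo "not a git repository root: $repo" >&2
        return 1
    fi
    # Copy the script and make sure Git is allowed to execute it.
    cp "$src" "$dest" && chmod +x "$dest"
}

# Examples:
#   install_hook ~/rdfs              # run tests before every commit
#   install_hook ~/rdfs pre-push     # run tests only before a push
```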
Namenode:
Run the namenode executable from build/rice-namenode.
Then run something like hdfs dfs -fs hdfs://localhost:port/ -mkdir foo
where port is the port used by the namenode (it will print the port used)
Datanode:
Run the datanode executable from build/rice-datanode.
Then run something like hdfs dfsadmin -shutdownDatanode hdfs://localhost:port/
where port is the port used by the datanode (it will print the port used)
If you want to do a quick end-to-end test, try the following to cat the file:
- Pull the code and build (as explained above).
- Run zookeeper (from ~, it’s sudo zookeeper/bin/zkServer.sh start). This will run in the background.
- Run namenode (rdfs/build/rice-namenode/namenode). This will run in the foreground.
- Run datanode (rdfs/build/rice-datanode/datanode). This will run in the foreground.
- Create a file with hdfs dfs -fs hdfs://localhost:5351 -copyFromLocal localFile /filename
- Try to cat that file with hdfs dfs -fs hdfs://localhost:5351 -cat /filename
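The last two steps can be wrapped in a small script so the check is easy to repeat. This is a sketch only: it assumes zookeeper, the namenode, and the datanode are already running as described above, and the names e2e_cat, run, NAMENODE_URI, and DRY_RUN are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): copy a local file in and cat it back.
# NAMENODE_URI defaults to the address used in the steps above.
NAMENODE_URI="${NAMENODE_URI:-hdfs://localhost:5351}"

# Print a command, then run it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# $1 = local file to upload, $2 = destination path in the filesystem.
e2e_cat() {
    local_file="$1"
    remote_path="$2"
    run hdfs dfs -fs "$NAMENODE_URI" -copyFromLocal "$local_file" "$remote_path" &&
    run hdfs dfs -fs "$NAMENODE_URI" -cat "$remote_path"
}

# Example:
#   e2e_cat localFile /filename
```

Setting DRY_RUN=1 only prints the commands, which is handy for checking the URI before touching a running namenode.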
Whether you use Google Mock is up to you. If you do, use it in conjunction with Google Test.
Google Mock is not a testing framework, but a framework for writing C++ mock
classes. A mock class is a simplified version of a real class that can be
created to aid with testing. However, Google Mock does automatically
verify expectations.
The typical flow is:
- Import the Google Mock names you need to use. All Google Mock names are in the testing namespace unless they are macros or otherwise noted.
- Create the mock objects.
- Optionally, set the default actions of the mock objects.
- Set your expectations on the mock objects (How will they be called? What will they do?).
- Exercise code that uses the mock objects; if necessary, check the result using Google Test assertions.
- When a mock object is destructed, Google Mock automatically verifies that all expectations on it have been satisfied.
You should read through all of the Google Mock documentation located at (/googletest/googlemock/docs/) before using it:
- ForDummies -- start here if you are new to Google Mock.
- CheatSheet -- a quick reference.
- CookBook -- recipes for doing various tasks using Google Mock.