Running REEF examples
I suggest using the persistent evaluators demo instead of the distributed shell: it has more functionality and is interactive, which makes for a great demo.
The script REEF/bin/persistent_eval.sh can run the persistent evaluators (PE) demo in interactive or batch mode. Here are some examples:
- ./persistent_eval.sh - run interactively in local mode
- ./persistent_eval.sh -local false - run interactively on YARN
- ./persistent_eval.sh -cmd date - run command date once on all evaluators in local mode (i.e. act as a distributed shell)
- ./persistent_eval.sh -cmd date -local false - same as above, on YARN
- ./persistent_eval.sh -cmd date -local false -num_runs 10 - run command date on YARN ten times, reusing the evaluators.
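The flag combinations above can be summarized in a small wrapper. This is only a sketch: pe_cmdline is a hypothetical helper, not part of REEF, and it prints the command line instead of running it. It uses only the flags listed above (-cmd, -local false, -num_runs).

```shell
#!/bin/sh
# Hypothetical helper (not part of REEF): print the persistent_eval.sh
# command line for a given mode, using only the flags listed above.
#   $1 = "local" or "yarn"
#   $2 = optional command to run on all evaluators (omit for interactive mode)
#   $3 = optional number of runs
pe_cmdline() {
  mode=$1
  cmd=${2:-}
  runs=${3:-}
  line="./persistent_eval.sh"
  if [ -n "$cmd" ]; then
    line="$line -cmd $cmd"
  fi
  if [ "$mode" = "yarn" ]; then
    line="$line -local false"
  fi
  if [ -n "$runs" ]; then
    line="$line -num_runs $runs"
  fi
  echo "$line"
}

# Example: the last command from the list above.
pe_cmdline yarn date 10
```

To actually launch the demo, run the printed line from REEF/bin.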
Try it out on vnectar-21
P.S. A few more things:
- press ^D or type exit to quit the interactive shell;
- the script runs on Linux only;
- on Windows, mvn -PPersistentEval in reef-examples will start the local interactive shell;
- I output results to the log (i.e. stderr) and to stdout. If you send stderr to /dev/null, the demo looks quite pretty, e.g. try mvn -PPersistentEval 2> /dev/null
- use rlwrap ./persistent_eval.sh (rlwrap is installed on vnectar-21) to have command line history and other readline functionality in interactive mode.
Terminal 1:
#
./suspend.sh -local false -cycles 120 -delay 1 -port 7008
#
Terminal 2:
#
./suspend-control.sh -port 7008 -cmd suspend -task <your task ID here>
#
./suspend-control.sh -port 7008 -cmd resume -task <your task ID here>
#
This demo starts a bunch of Tasks that do nothing but cycle from 0 to N and sleep for 1 second on each iteration. Tasks can be suspended and resumed; on resume, a task restores its state and continues its idle cycle from the point where it was suspended.
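The suspend/resume behavior can be pictured with a toy model. The sketch below is not REEF code: idle_cycle, the state file, and the suspend flag file are all made up for illustration (the real demo takes suspend/resume commands over a port and keeps its state inside REEF). It only shows the idea of saving the counter on suspend and picking it up again on resume.

```shell
#!/bin/sh
# Toy model of the demo's idle cycle (hypothetical, not REEF code).
# idle_cycle STATE_FILE CYCLES DELAY SUSPEND_FLAG
idle_cycle() {
  state_file=$1
  cycles=$2
  delay=$3
  suspend_flag=$4
  i=0
  # On resume, restore the counter saved by the previous run.
  if [ -f "$state_file" ]; then
    i=$(cat "$state_file")
  fi
  while [ "$i" -lt "$cycles" ]; do
    # A suspend request (here: the flag file exists) saves the counter
    # and stops the loop.
    if [ -e "$suspend_flag" ]; then
      echo "$i" > "$state_file"
      echo "suspended at $i"
      return 0
    fi
    echo "cycle $i"
    i=$((i + 1))
    echo "$i" > "$state_file"
    sleep "$delay"
  done
  echo "completed"
}
```

Calling idle_cycle again with the same state file after removing the flag file continues from the saved counter rather than from 0.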
To run the job:
- ./suspend.sh -cycles 600 -delay 1 -port 7008 - run in local mode, and cycle from 0 to 600 with a 1 second delay between iterations; it will listen on localhost:7008 for suspend/resume commands.
- ./suspend.sh -local false -cycles 600 -delay 1 -port 7008 - same as above, on YARN.
Look for task IDs in the log - you will need them to suspend/resume the tasks.
To suspend/resume the tasks:
- run ./suspend-control.sh -port 7008 -cmd suspend -task <your task ID here> to suspend the task
- run ./suspend-control.sh -port 7008 -cmd resume -task <your task ID here> to resume

You should see suspend/resume messages in the job logs.
A few things to note:
- ./suspend-control.sh connects to localhost only;
- shell scripts run only on Linux;
- to run the suspend demo on Windows, do mvn -PSuspendDemo - this will run the tasks in local mode for 10 cycles without suspending/resuming them;
- if the task ID is not found, the entire job fails;
- because of issue #235, currently there are no update messages from the tasks in the client log; however, there are messages when tasks complete or get suspended or resumed.
So the general strategy for the suspend/resume demo would be:
- open two windows with ssh to vnectar-21;
- in one, run ./suspend.sh -cycles 120 (i.e. roughly for 2 minutes);
- look for the task ID in the job logs;
- does not work: watch for updates from running tasks
- in the other window, run ./suspend-control.sh to suspend that task
- see the message in the job logs about the task being suspended
- does not work: watch for updates from all tasks except the suspended one
- run ./suspend-control.sh again to resume the task
- see the resumed task message in the job logs
- does not work: see updates from all tasks again
- wait to see some tasks complete
- does not work: see the resumed task catching up
- wait more to see the resumed task complete
On vnectar: To run, execute bin/bgd.sh. There isn't much to look at right now beyond nice log files flying by.
Locally: cd reef-examples followed by mvn -PBGD
To run, execute bin/pagerank.sh.
grep "vertex\[" the evaluator logs to see the final un-normalized ranks of all the vertices. grep "update counts" the evaluator logs to see the number of updates performed per vertex.
Run pagerank on other graphs with
bin/pagerank.sh -graphfile="/grid/0/rsync/t-branm/p2p-Gnutella08.txt" -sourceStr="com.microsoft.reef.examples.pagerank.FileGraphPartitionSource"
You should set .level=INFO in logging.properties and rebuild reef-common, or else this will be very slow.
cp /grid/0/rsync/t-branm/logging.properties /grid/0/rsync/reef-sailfish/REEF/reef-common/src/main/resources/com/microsoft/reef/logging.properties
cd reef-common
mvn clean install -DskipTests
# undo the above
git checkout reef-common/src/main/resources/com/microsoft/reef/logging.properties
cd reef-common
mvn clean install -DskipTests
Run on YARN on vnectar with
bin/pagerank.sh -local false -graphfile="/grid/0/rsync/t-branm/p2p-Gnutella08.txt" -sourceStr="com.microsoft.reef.examples.pagerank.FileGraphPartitionSource"
#
bin/pagerank.sh -local false -graphfile="/grid/0/rsync/t-branm/p2p-Gnutella08.txt" -sourceStr="com.microsoft.reef.examples.pagerank.FileGraphPartitionSource" 2>&1
#
cd /grid/0/rsync/reef-sailfish/reef-applications/pez/bin
./hadoop-examples.sh randomtextwriter test.txt
./hadoop-examples.sh pi 10 100 # 3.148
./hadoop-examples.sh wordcount
./hadoop-examples.sh grep
Kill all REEF Launchers:
kill -9 `jps | grep Launcher | cut -f1,1 -d' '`
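Since kill -9 is unforgiving, it can help to look at the selected PIDs before killing anything. The sketch below wraps the same grep/cut pipeline in a function and runs it on made-up sample input. launcher_pids is a hypothetical name and the PIDs are invented; the only assumption about jps is that it prints one "<pid> <main class>" pair per line.

```shell
#!/bin/sh
# launcher_pids: read jps-style lines ("<pid> <main class>") on stdin and
# print the pid of every line that mentions Launcher.
# Hypothetical helper, same pipeline as the kill command above.
launcher_pids() {
  grep Launcher | cut -d' ' -f1
}

# Dry run on made-up sample output instead of a real jps call.
printf '4711 Launcher\n4712 Jps\n4713 NodeManager\n' | launcher_pids
```

On a real machine, jps | launcher_pids lists the candidates; once the list looks right, pass it to kill as in the command above.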