wavefront_master uses the Work Queue system to distribute tasks among processors. After starting wavefront_master, you must start a number of work_queue_worker(1) processes on remote machines. The workers will then connect back to the master process and begin executing tasks.
-h, --help | Show this help screen.
-v, --version | Show version string.
-d, --debug <subsystem> | Enable debugging for this subsystem. (Try -d all to start.)
-N, --project-name <project> | Set the project name to
-o, --debug-file <file> | Write debugging output to this file. By default, debugging is sent to stderr (":stderr"). You may specify logs be sent to stdout (":stdout"), to the system syslog (":syslog"), or to the systemd journal (":journal").
-p, --port <port> | Port number for the queue master to listen on.
-P, --priority <num> | Priority. The higher the value, the higher the priority.
-Z, --port-file <file> | Select a port at random and write it to this file. (default is disabled)
--work-queue-preferred-connection <connection> | Indicate the preferred connection. Choose one of by_ip or by_hostname. (default is by_ip)

Before running wavefront_master, you need to create a file, say input.data, that lists initial values of the matrix (values on the left and bottom edges), one per line:
    0 0 value.0.0
    0 1 value.0.1
    ...
    0 n value.0.n
    1 0 value.1.0
    2 0 value.2.0
    ...
    n 0 value.n.0

To run a Wavefront workflow sequentially, start a single work_queue_worker(1) process in the background. Then, invoke wavefront_master. The following example computes a 10 by 10 Wavefront matrix:
    % work_queue_worker localhost 9123 &
    % wavefront_master function 10 10 input.data output.data

The framework carries out the computations in dependency order and writes the results one by one to the specified output file. The first two columns of each line are the X and Y indices in the resulting matrix. Below is an example of what the output file, output.data, would look like:
    1 1 value.1.1
    1 2 value.1.2
    1 3 value.1.3
    ...

To speed up the process, run more work_queue_worker(1) processes on other machines, or use condor_submit_workers(1) or sge_submit_workers(1) to start hundreds of workers in your local batch system.
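The boundary file described earlier (input.data) can be produced by a short script rather than by hand. The following Python sketch writes one "x y value" line per edge cell, in the same order as the listing above. It rests on assumptions that are not stated in this page: indices run from 0 to n inclusive, every initial value defaults to 0, and the names boundary_lines and write_boundary_file are hypothetical helpers, not part of wavefront_master.

```python
def boundary_lines(n, value=lambda x, y: 0):
    """Yield 'x y value' lines for the edge cells (x == 0 or y == 0).

    Assumption: indices run from 0 to n inclusive. `value` is a
    hypothetical callback supplying the initial value of each edge cell.
    """
    # Cells with x == 0, for y = 0..n (includes the corner 0 0).
    for y in range(n + 1):
        yield f"0 {y} {value(0, y)}"
    # Cells with y == 0, for x = 1..n (corner already emitted).
    for x in range(1, n + 1):
        yield f"{x} 0 {value(x, 0)}"


def write_boundary_file(path, n, value=lambda x, y: 0):
    """Write the boundary lines to `path`, one per line."""
    with open(path, "w") as f:
        for line in boundary_lines(n, value):
            f.write(line + "\n")
```

For example, write_boundary_file("input.data", 10) would write 21 lines, from "0 0 0" through "10 0 0". Adjust n if your matrix uses a different index range.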
The following is an example of adding more workers to execute a Wavefront workflow. Suppose your wavefront_master is running on a machine named barney.nd.edu. If you can log in to other machines, you can simply start a worker process on each one, like this:
    % work_queue_worker barney.nd.edu 9123

If you have access to a batch system like Condor, you can submit multiple workers at once:
    % condor_submit_workers barney.nd.edu 9123 10
    Submitting job(s)..........
    Logging submit event(s)..........
    10 job(s) submitted to cluster 298.
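
Once a run has finished, the output file described earlier can be loaded for inspection. The Python sketch below reads "x y value" lines into a dictionary keyed by (x, y). It is an illustration only: parse_results is a hypothetical helper, and values are kept as strings because this page does not specify their type.

```python
def parse_results(lines):
    """Map (x, y) -> value string for each 'x y value' line.

    The first two columns are taken to be the X and Y indices,
    as described for the output file; the rest of the line is the value.
    """
    cells = {}
    for line in lines:
        parts = line.split(None, 2)  # split on whitespace, at most 3 fields
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        x, y, value = parts
        cells[(int(x), int(y))] = value.strip()
    return cells
```

A typical use would be to open output.data, pass the file object to parse_results, and look up a cell such as cells[(1, 1)].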