For large sets of objects, allpairs_multicore will use as many cores as you have available, and will carefully manage virtual memory to exploit locality and avoid thrashing. Because of this, you should be prepared for the results to come back in any order. To run All-Pairs workflows in parallel across multiple (multicore) machines, see the allpairs_master(1) utility.
-b, --block-size <items> | Block size: number of items to hold in memory at once. (default: 50% of RAM) |
-c, --cores <cores> | Number of cores to be used. (default: # of cores in machine) |
-e, --extra-args <args> | Extra arguments to pass to the comparison program. |
-d, --debug <flag> | Enable debugging for this subsystem. |
-v, --version | Show program version. |
-h, --help | Display this message. |
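
The options above can be combined on one command line. The following is a minimal sketch of assembling such an invocation from Python. The helper name build_command is hypothetical, and placing the options before the two set files and the comparison program is an assumption; only the positional order is taken from the example invocation in this page. Nothing is executed here.

```python
def build_command(set_a, set_b, compare_program,
                  cores=None, block_size=None, extra_args=None):
    """Assemble an allpairs_multicore argument list (hypothetical helper).

    Assumption: options go before the positional arguments.
    The list is only built, never run.
    """
    cmd = ["allpairs_multicore"]
    if cores is not None:
        cmd += ["--cores", str(cores)]
    if block_size is not None:
        cmd += ["--block-size", str(block_size)]
    if extra_args is not None:
        cmd += ["--extra-args", extra_args]
    cmd += [set_a, set_b, compare_program]
    return cmd


# Example: limit the run to 4 cores and keep every other default.
print(build_command("set.list", "set.list", "compareit", cores=4))
```

The returned list can be handed to subprocess.run on a machine where allpairs_multicore is installed.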
    a b are 45 percent similar

To use the allpairs framework, create a file called set.list that lists each of your files, one per line:

    a
    b
    c
    ...

Then, invoke allpairs_multicore like this:

    % allpairs_multicore set.list set.list compareit

The framework will carry out all possible comparisons of the objects, and print the results one by one (note that the first two columns are X and Y indices in the resulting matrix):

    1 1 a a are 100 percent similar
    1 2 a b are 45 percent similar
    1 3 a c are 37 percent similar
    ...
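
Because results may come back in any order, it can help to key each line by its X and Y indices instead of relying on its position in the output. The following is a minimal sketch. It assumes the two index columns are separated from the rest of the line by whitespace, and the function name parse_results is hypothetical.

```python
def parse_results(lines):
    """Map (x, y) matrix indices to the comparison program's output text."""
    results = {}
    for line in lines:
        # Split off only the two leading index columns; keep the rest intact.
        fields = line.strip().split(None, 2)
        if len(fields) < 3:
            continue  # blank or truncated line
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # not a result line (for example, the "..." placeholder)
        results[(int(fields[0]), int(fields[1]))] = fields[2]
    return results


# Sample lines taken from the output above, deliberately out of order.
sample = [
    "1 3 a c are 37 percent similar\n",
    "1 1 a a are 100 percent similar\n",
    "1 2 a b are 45 percent similar\n",
]
results = parse_results(sample)
for key in sorted(results):
    print(key, results[key])
```

Sorting the keys recovers row-major matrix order regardless of the order in which the comparisons finished.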