- français
- English
Hypber Cube Partitioning
Source https://docs.google.com/document/d/1Gv4Smxgh6r9GpgIl8ZaukXpcQht4i8-mxwLE_II24OI/edit?usp=sharing
I. Third implementation:
CubeNAssignment.java: brute implementation for n-way partitioning
Principle: find best partition as hypercubes
Idea:
-
Get the prime factors of r: the number of reducers
-
Get the power set of these prime factors
-
Generate the combination of n dimensions.
-
In each combination, each dimension is assigned to an element of the power set (i.e. a subset of the prime factors). The number of reducers of each dimension is the multiply of its prime factor (1 if empty)
-
Evaluate only the combination in that the multiply of all dimensions equal to r.
-
Assess the computation cost and communication cost
-
Maintain the combination with smallest comp and comm costs.
-
Testcase:
input ([] sizes, r) --> output (int[] _rd) where sizes are the size each dimension, r is the number of reducers, _rd is the #reducers by each dimension
-
13, 7, 1 → 1, 1
-
4, 4, 4, 8 → 2, 2, 2
-
[4, 4, 4, 4], 16 → 2-2-2-2
-
[8, 4, 10, 7], 1000 → 5-4-10-5
https://wiki.epfl.ch/bigdata2015-hypercubejoins/documents/CubeNAssignment.java
II. Second implementation:
CubeAssignmentAdapter.java: adapter interface for brute-force 2,3,4-way partitioning
https://wiki.epfl.ch/bigdata2015-hypercubejoins/documents/CubeAssignmentAdapter.java
III. First implementation: Cube3Assignment
+ Define 3 dimensions:
private int _r0, _r1, _r2, _r;
private long size0, size1, size2;
private int[][][] regionsCube;
+ Approximate algorithm: find best partition as cubes (similar to MatrixAssignment)
- Find prime factors of r (i.e. to find all possibles combination of r0,r1,r2 such that r0 x r1 x r2 = r)
- Consider each possible combination: compare communication cost and computation cost
Computation cost = size0 / r0 * size1 / r1 * size2 / r2;
Commutation cost = size0 / r0 + size1 / r1 + size2 / r2;
- Get the combination with minimum computation cost and minimum communication cost.
+ Testcases: (size0, size1, size2, r)
222 [main] INFO ch.epfl.data.plan_runner.thetajoin.matrix_mapping.Cube3Assignment - Input: [4, 4, 4, 8]
225 [main] INFO ch.epfl.data.plan_runner.thetajoin.matrix_mapping.Cube3Assignment - #Reducers each dimension: [2, 2, 2]
225 [main] INFO ch.epfl.data.plan_runner.thetajoin.matrix_mapping.Cube3Assignment - Input: [8, 4, 10, 1000]
225 [main] INFO ch.epfl.data.plan_runner.thetajoin.matrix_mapping.Cube3Assignment - #Reducers each dimension: [8, 4, 10]
Full code:
https://wiki.epfl.ch/bigdata2015-hypercubejoins/documents/Cube3Assignment.java