Meeting Notes 2015.03.31

Source: https://docs.google.com/document/d/1IpDkgh6jWatBSOo6Dh3cAhQmjASCQjw0-bxHgWpQ_38/edit?usp=sharing

I. Progress this week:

1. Hypercube equal-size partitioning (inspired by the Zhang paper):

+ Details: https://wiki.epfl.ch/bigdata2015-hypercubejoins/hcp-equal-size

+ Runtime: < 5 ms with 8 relations and 1000 machines
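
As a rough illustration of what such a partitioner computes (this is not the project's code; the wiki page above describes the actual scheme), the sketch below picks one integer size per hypercube dimension so that the product of the sizes stays within the machine count and the per-machine input, sum(|R_i| / d_i), is as small as possible. The function name `best_dimensions`, the cost model, and the exhaustive search are all assumptions made for this example, and it is only meant for small inputs.

```python
from typing import List, Tuple


def best_dimensions(sizes: List[int], machines: int) -> Tuple[List[int], float]:
    """Return (dims, load) minimizing sum(sizes[i] / dims[i]).

    dims[i] is the number of partitions of relation i (hypothetical cost
    model); the product of all dims never exceeds `machines`.
    """
    best_load = float("inf")
    best_dims: List[int] = []

    def search(i: int, remaining: int, dims: List[int], load: float) -> None:
        nonlocal best_load, best_dims
        # Prune: the partial load only grows as more dimensions are added.
        if load >= best_load:
            return
        if i == len(sizes):
            best_load, best_dims = load, dims[:]
            return
        # Try every size that still fits into the remaining machine budget.
        for d in range(1, remaining + 1):
            dims.append(d)
            search(i + 1, remaining // d, dims, load + sizes[i] / d)
            dims.pop()

    search(0, machines, [], 0.0)
    return best_dims, best_load


if __name__ == "__main__":
    # Two equally sized relations on 4 machines give a 2 x 2 grid.
    print(best_dimensions([100, 100], 4))
```

With equally sized relations the search settles on a balanced grid; with skewed sizes it gives more partitions to the larger relation.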

 

2. Squall integration:

+ Select for tuples is partially ready; only the index part is left

+ Join: waiting for the predicate to be applied

...

 

II. Discussions in the meeting:

0. Squall codebase: currently being changed (renaming, refactoring, …)

1. Hypercube partition document:

2. Storm integration:

3. Deployment of Squall on Microsoft Azure:

4. Nested query (TPC-H):

SELECT A FROM R

WHERE R.A = SUM (S.B)

  1. store tuples locally and send them in batches (e.g., batch size = 100 tuples)

  2. communicate the PARTIAL aggregation result to machines

  3. re-partition

  4. query plan will specify the communication beforehand
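
The first two steps above can be sketched as a single-process simulation. This is only an illustration under assumptions: the function names `partial_sums` and `nested_query` are hypothetical, S is modeled as a list of per-machine partitions of its B values, and the result is computed once over finite input rather than in a streaming fashion. Re-partitioning and the query-plan side are not modeled.

```python
from typing import Iterable, Iterator, List, Tuple

BATCH_SIZE = 100  # batch size taken from the example in the notes


def partial_sums(values: Iterable[int], batch_size: int = BATCH_SIZE) -> Iterator[int]:
    """Buffer values locally and yield one partial SUM per full batch."""
    buffered, count = 0, 0
    for v in values:
        buffered += v
        count += 1
        if count == batch_size:
            yield buffered
            buffered, count = 0, 0
    if count:
        # Flush the last, incomplete batch.
        yield buffered


def nested_query(
    r_values: List[int],
    s_partitions: List[List[int]],
    batch_size: int = BATCH_SIZE,
) -> Tuple[List[int], int]:
    """Evaluate SELECT A FROM R WHERE R.A = SUM(S.B) over a partitioned S.

    Returns the matching R.A values and the number of partial-aggregate
    messages that the S partitions would have sent.
    """
    total, messages = 0, 0
    for partition in s_partitions:
        for partial in partial_sums(partition, batch_size):
            total += partial  # merge the partial aggregate
            messages += 1
    return [a for a in r_values if a == total], messages


if __name__ == "__main__":
    s = [list(range(250)), [1] * 50]
    print(nested_query([31175, 5], s))
```

Sending one partial sum per batch instead of one message per tuple is what keeps the communication volume down; a larger batch size means fewer messages but a longer delay before the aggregate is up to date.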

 

III. Plan for next week:

0) Hypercube partitioning code: document and clean up

1) Azure: attempt a deployment

2) Local join (without index): continue the Storm integration with a naive implementation

3) Khue: SquallToaster

4) Nested query discussion & preparation for experiments
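
For item 2) above, a naive local join without an index could look like the following sketch: every arriving tuple is stored and compared against all stored tuples of the other relation. The class `NaiveLocalJoin` and its methods are hypothetical names for this example and do not correspond to Squall or Storm APIs.

```python
from typing import Callable, List, Tuple

Row = Tuple  # a tuple of attribute values


class NaiveLocalJoin:
    """Symmetric nested-loop join with an arbitrary predicate and no index."""

    def __init__(self, predicate: Callable[[Row, Row], bool]) -> None:
        self.predicate = predicate
        self.left: List[Row] = []
        self.right: List[Row] = []

    def insert_left(self, row: Row) -> List[Tuple[Row, Row]]:
        """Store a left tuple and return its matches among stored right tuples."""
        self.left.append(row)
        return [(row, other) for other in self.right if self.predicate(row, other)]

    def insert_right(self, row: Row) -> List[Tuple[Row, Row]]:
        """Store a right tuple and return its matches among stored left tuples."""
        self.right.append(row)
        return [(other, row) for other in self.left if self.predicate(other, row)]


if __name__ == "__main__":
    join = NaiveLocalJoin(lambda r, s: r[0] < s[0])
    join.insert_left((1, "a"))
    print(join.insert_right((2, "x")))
```

Each insert scans the whole opposite side, so the cost per tuple grows linearly with the stored data; that is the part an index would later replace.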