Meeting Notes 2015.04.14

I. Progress this week:

 

1. Hypercube partitioning:

+ Integrated into internal codebase with factory design pattern

+ Provided documentation for all Java files

+ Runtime experiment: < 1 s (for 2 to 20 relations and 500 to 2000 machines)
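
As an illustration of the factory integration noted above, here is a minimal Java sketch. All names (`Partitioner`, `HashPartitioner`, `HypercubePartitioner`, `PartitionerFactory`) are hypothetical and are not the ones in the internal codebase. The hypercube dimension sizes are passed in rather than computed, and replication of tuples along dimensions a relation does not cover is omitted.

```java
// Hypothetical interface: maps the hashes of a tuple's join keys to a machine index.
interface Partitioner {
    int machineFor(int[] keyHashes);
    int machineCount();
}

// Baseline for comparison: hash on the first key only.
final class HashPartitioner implements Partitioner {
    private final int machines;
    HashPartitioner(int machines) { this.machines = machines; }
    public int machineFor(int[] keyHashes) { return Math.floorMod(keyHashes[0], machines); }
    public int machineCount() { return machines; }
}

// Hypercube: one dimension per join key; a cell is addressed in row-major order.
final class HypercubePartitioner implements Partitioner {
    private final int[] dims;
    HypercubePartitioner(int[] dims) { this.dims = dims.clone(); }
    public int machineFor(int[] keyHashes) {
        int index = 0;
        for (int i = 0; i < dims.length; i++) {
            index = index * dims[i] + Math.floorMod(keyHashes[i], dims[i]);
        }
        return index;
    }
    public int machineCount() {
        int product = 1;
        for (int d : dims) product *= d;
        return product;
    }
}

// Factory: callers pick a scheme by name and never touch the concrete classes.
final class PartitionerFactory {
    enum Kind { HASH, HYPERCUBE }

    static Partitioner create(Kind kind, int... sizes) {
        if (sizes.length == 0) throw new IllegalArgumentException("at least one size required");
        switch (kind) {
            case HASH: return new HashPartitioner(sizes[0]);
            case HYPERCUBE: return new HypercubePartitioner(sizes);
            default: throw new IllegalArgumentException("unknown kind: " + kind);
        }
    }
}

class HypercubeFactoryDemo {
    public static void main(String[] args) {
        Partitioner p = PartitionerFactory.create(PartitionerFactory.Kind.HYPERCUBE, 4, 5);
        System.out.println("machines: " + p.machineCount());
        System.out.println("cell for hashes (3, 2): " + p.machineFor(new int[] {3, 2}));
    }
}
```

With this shape, swapping the partitioning scheme is a one-argument change at the call site, which is the main reason to hide the concrete classes behind a factory.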

 

2. Windows Azure setup task:

- The cluster machines run on Windows. There appears to be a preview option for Ubuntu, but it is disabled at the moment.

- It is expensive. Charges are based on “computing hours”, which means cost is incurred as long as the cluster is up. Around 2 days with an idle 1-node cluster cost ≈ 50 CHF.

- There is no stop or suspend operation for the HDInsight cluster. The only way to stop the cluster is to delete it, which is not practical given the time re-provisioning takes (20 to 30 minutes). Local data also cannot be persisted when “suspending” the cluster this way.

- Not flexible (e.g., opening ports, creating the Squall tmp folder, submitting jars…)
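
A back-of-the-envelope extrapolation of the cost figure above, as a Java sketch. It assumes that "around 2 days" means exactly 48 hours and that billing is a flat per-hour rate; neither is confirmed by the actual invoice. The class and method names are illustrative only.

```java
final class ClusterCostEstimate {
    // Assumed inputs taken from the notes: about 50 CHF for about 2 days, idle 1-node cluster.
    static final double OBSERVED_COST_CHF = 50.0;
    static final double OBSERVED_HOURS = 48.0;

    // Flat hourly rate implied by the observation (an assumption, not a published price).
    static double hourlyRateChf() {
        return OBSERVED_COST_CHF / OBSERVED_HOURS;
    }

    // Cost of keeping the same cluster up for the given number of hours.
    static double costChf(double hours) {
        return hourlyRateChf() * hours;
    }

    public static void main(String[] args) {
        System.out.printf("implied hourly rate: %.2f CHF%n", hourlyRateChf());
        System.out.printf("30 days always on: %.0f CHF%n", costChf(30 * 24));
    }
}
```

Under these assumptions the idle cluster costs a little over 1 CHF per hour, or roughly 750 CHF for a 30-day month, which is why the missing suspend operation matters.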

 

So we need to find out how Microsoft Azure provides an HDFS-like service:

http://www.codatlas.com/github.com/apache/hadoop/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/
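
As a starting point, the hadoop-azure module linked above addresses blob storage through `wasb://` URIs and reads the storage key from a per-account configuration property. The Java sketch below only builds those two strings with the standard library. The container name `squall`, the account name `myaccount`, and the path are placeholders, and the helper class `WasbPaths` is hypothetical.

```java
import java.net.URI;

final class WasbPaths {
    // Builds wasb://<container>@<account>.blob.core.windows.net/<path>
    static URI wasbUri(String container, String account, String path) {
        String normalized = path.startsWith("/") ? path : "/" + path;
        return URI.create("wasb://" + container + "@" + account
                + ".blob.core.windows.net" + normalized);
    }

    // Name of the Hadoop configuration property that holds the storage account key.
    static String accountKeyProperty(String account) {
        return "fs.azure.account.key." + account + ".blob.core.windows.net";
    }

    public static void main(String[] args) {
        // Placeholder names; replace with a real container and storage account.
        System.out.println(wasbUri("squall", "myaccount", "data/tpch"));
        System.out.println(accountKeyProperty("myaccount"));
    }
}
```

With Hadoop and hadoop-azure on the classpath, such a URI would typically be passed to `FileSystem.get(URI, Configuration)` after setting the key property; that step is left out here so the sketch runs without extra dependencies.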

 

3. SquallToast Integration with Hypercube partitioning:

 

a. Hyracks query:

Task 1

HOUSEHOLD = 916

BUILDING = 1237

MACHINERY = 852

AUTOMOBILE = 1016

FURNITURE = 1030

 

Task 2

HOUSEHOLD = 931

BUILDING = 1201

MACHINERY = 800

AUTOMOBILE = 965

FURNITURE = 1005

 

Task 3

HOUSEHOLD = 925

BUILDING = 1268

MACHINERY = 884

AUTOMOBILE = 998

FURNITURE = 972
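
To compare these per-task outputs against a single-node run, they can be merged per market segment. The Java sketch below assumes the three task outputs are partial counts that should be summed; the notes do not state this explicitly. The class `PartialCountMerger` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartialCountMerger {
    // Sums the counts of each key across all tasks, keeping first-seen key order.
    static Map<String, Long> merge(List<Map<String, Long>> perTask) {
        Map<String, Long> total = new LinkedHashMap<>();
        for (Map<String, Long> task : perTask) {
            for (Map.Entry<String, Long> e : task.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }

    // One task's output, in the segment order used in the notes.
    static Map<String, Long> task(long household, long building, long machinery,
                                  long automobile, long furniture) {
        Map<String, Long> m = new LinkedHashMap<>();
        m.put("HOUSEHOLD", household);
        m.put("BUILDING", building);
        m.put("MACHINERY", machinery);
        m.put("AUTOMOBILE", automobile);
        m.put("FURNITURE", furniture);
        return m;
    }

    // The three Hyracks task outputs exactly as reported above.
    static List<Map<String, Long>> reportedTasks() {
        List<Map<String, Long>> tasks = new ArrayList<>();
        tasks.add(task(916, 1237, 852, 1016, 1030));
        tasks.add(task(931, 1201, 800, 965, 1005));
        tasks.add(task(925, 1268, 884, 998, 972));
        return tasks;
    }

    public static void main(String[] args) {
        merge(reportedTasks()).forEach((segment, count) ->
                System.out.println(segment + " = " + count));
    }
}
```

Under the summing assumption, the merged counts are HOUSEHOLD = 2772, BUILDING = 3706, MACHINERY = 2536, AUTOMOBILE = 2979, FURNITURE = 3007, for a total of 15000.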

b. TPCH3 queries:

Only part of the result is shown here. Task1 produces no output tuples.

Task1: (None)

Task2: ...

16322|19941220|0 = 92848.15920000001

44102|19950114|0 = 70760.14369999999

Task3: ...

44102|19950114|0 = 93895.6756

c. TPCH5 queries:

Task1: (None)

Task2:

VIETNAM = 706598.5371000001

INDONESIA = 383966.5044999999

CHINA = 439244.43220000004

JAPAN = 257033.24610000002

INDIA = 205155.0176

 

Task3:

VIETNAM = 294328.1628

INDONESIA = 182413.0231

CHINA = 300966.3248

JAPAN = 403617.9964

INDIA = 217719.6668

 

4. HyperCube Join implementation:

 

5. Indexing:

+ Decided on a pattern

+ Implementation of Create Index

 

II. Discussions in the meeting:

(to be continued)

 

III. Plan for next week:

(to be continued)