Meeting Notes 2015.04.24
Remaining requirements:
0) Writing query plans, nested part, Azure (cost, login, run scripts)
1) Experiments (e.g. 3 cases/queries per person)
2) Local join implementation
3) Integration with main branch
Next steps:
1) Unit tests:
+ Tuples
+ Distributed query plans
2) Local query plan (order of relations, cost optimization)
Local query plan - Cost optimization - Ordering heuristics:
1) Look at comparison predicate:
+ e.g. equality join output might be smaller than inequality join output.
2) Size lookup (relation size, selectivity).
+ e.g. with an equality join we can estimate the output size beforehand
3) Try different orders (randomly) and choose the cheapest one:
+ Different orders = different paths (depth-first search) through the join dependency graph, where each edge indicates a join condition between two relations. Worst case: complete graph (n! paths).
e.g. with R-S-T-V, other possible orders are T-V-S-R and T-S-R-V, but T-R-S-V is invalid because there is no join condition between T and R.
+ Online setting: the current tuple belongs to T, and we know the current sizes of the other relations (tuples received so far).
+ Keep track of the average running time of each order (e.g. <1ms, invoke1>), possibly normalized by the current sizes of the relations.
e.g.
T: (T-V-S-R, 1ms), (T-S-R-V, 2ms)
S: (S-T-V-R, AVG), (S-R-T-V, AVG)
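Sketch of heuristic 3 (illustration only, not the project's code). It assumes an order is valid when every newly added relation has a join condition with at least one relation already in the order, which matches the R-S-T-V example above. The names JOIN_GRAPH, valid_orders and OrderStats are hypothetical, and normalization by relation size is left out.

```python
from collections import defaultdict

# Join dependency graph for the chain R-S-T-V from the notes:
# an edge means there is a join condition between the two relations.
JOIN_GRAPH = {
    "R": {"S"},
    "S": {"R", "T"},
    "T": {"S", "V"},
    "V": {"T"},
}


def valid_orders(graph, start):
    """Enumerate (depth-first) all join orders that begin at `start`.

    Assumption: a relation may be appended only if it has a join
    condition with at least one relation already in the order.
    """
    orders = []

    def extend(order):
        if len(order) == len(graph):
            orders.append(tuple(order))
            return
        joined = set(order)
        for rel in sorted(graph):
            if rel not in joined and graph[rel] & joined:
                extend(order + [rel])

    extend([start])
    return orders


class OrderStats:
    """Running average of the observed join time (ms) per order."""

    def __init__(self):
        self.total_ms = defaultdict(float)
        self.count = defaultdict(int)

    def record(self, order, elapsed_ms):
        self.total_ms[order] += elapsed_ms
        self.count[order] += 1

    def average(self, order):
        return self.total_ms[order] / self.count[order]

    def cheapest(self, candidates):
        # Orders that were never measured rank first, so each one
        # gets tried at least once before averages are compared.
        return min(
            candidates,
            key=lambda o: self.average(o) if self.count[o] else float("-inf"),
        )


# Usage: the current tuple belongs to T, so only orders starting at T apply.
stats = OrderStats()
t_orders = valid_orders(JOIN_GRAPH, "T")
stats.record(("T", "V", "S", "R"), 1.0)
stats.record(("T", "S", "R", "V"), 2.0)
stats.record(("T", "S", "V", "R"), 3.0)
best_for_t = stats.cheapest(t_orders)
```

On a complete graph every permutation passes the validity check, which is the n! worst case mentioned above, so in practice only a random sample of orders would be tried.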