Query Understanding and Optimization
Project Plan
Description :
Database queries are usually formulated in a rigid syntax that can be tedious to specify. The goal of this project is to create a system that lets users specify queries partially in natural language, thereby lowering the entry barrier for users who are not computer scientists. Parsing natural language, however, is still very difficult for computers. We therefore plan to use crowdsourcing to integrate human workers into the query parsing and optimization process.
The team :
- Florian Chlan
- François Farquet
- Joachim Hugonot (team leader)
- Simon Rodriguez
- Kristof Szabo
- Florian Vessaz
- Guo Xinyi
- Vincent Zellweger
Milestones :
- Milestone 1 : 30.03.2015 : Working prototype for a fixed set of queries, still to be defined.
- Milestone 2 : 20.04.2015 : Working prototype for the general case.
- Milestone 3 : 12.05.2015 : Performance analysis and improvement of the prototype. (Possible improvements : caching of previous queries, strategy optimization, GUI. To be revisited after milestone 2.)
Project goals :
- Prototype implementation in Java, using Amazon Mechanical Turk (AMT) to handle the parts of queries written in natural language.
- Experimental evaluation of the implemented system.
- Analysis of the results and proposals for future improvements.
How we will achieve these goals :
- Definition of a language for the queries.
- Creation of Java methods, built on the AMT API, to interact with the AMT platform.
- Parsing of queries and creation of an associated “tree”.
- Definition of a strategy to split the “tree” into smaller parts that can be handled by AMT workers.
- Definition of a strategy to merge the results from the smaller parts to answer the query; the merging will also be handled by AMT workers.
- Analysis of performance. Metrics to consider : time, cost, quality of the answer.
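As an illustration of the parsing and splitting steps listed above, here is a minimal sketch of what the query “tree” might look like in Java. Everything in it is an assumption made for illustration: the class names (`QueryTreeSketch`, `Structured`, `NaturalLanguage`, `Combine`), the method `splitIntoSubtasks`, and the example query are hypothetical, and no AMT API call is shown. The idea is only that natural-language fragments sit at the leaves of the tree, and each such leaf becomes one subtask for a worker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a query tree; not the project's actual design. */
public class QueryTreeSketch {

    /** Marker interface for every node of the tree. */
    public interface Node {}

    /** Leaf written in the formal query language; needs no human input. */
    public static final class Structured implements Node {
        public final String text;
        public Structured(String text) { this.text = text; }
    }

    /** Leaf written in natural language; must be interpreted by a worker. */
    public static final class NaturalLanguage implements Node {
        public final String text;
        public NaturalLanguage(String text) { this.text = text; }
    }

    /** Inner node that combines its children with an operator such as AND. */
    public static final class Combine implements Node {
        public final String operator;
        public final List<Node> children;
        public Combine(String operator, List<Node> children) {
            this.operator = operator;
            this.children = children;
        }
    }

    /** Returns the text of every natural-language leaf, left to right. */
    public static List<String> splitIntoSubtasks(Node root) {
        List<String> tasks = new ArrayList<>();
        collect(root, tasks);
        return tasks;
    }

    private static void collect(Node node, List<String> tasks) {
        if (node instanceof NaturalLanguage) {
            tasks.add(((NaturalLanguage) node).text);
        } else if (node instanceof Combine) {
            for (Node child : ((Combine) node).children) {
                collect(child, tasks);
            }
        }
        // Structured leaves are skipped: they need no worker.
    }

    public static void main(String[] args) {
        // Hypothetical query mixing a formal condition and a natural-language one.
        Node query = new Combine("AND", Arrays.<Node>asList(
                new Structured("year > 2000"),
                new NaturalLanguage("movies suitable for children")));
        System.out.println(splitIntoSubtasks(query));
    }
}
```

In this sketch only the natural-language leaf is turned into a subtask; how the workers' answers are merged back into the tree is left open, since the plan assigns that step to AMT workers as well.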
Required resources :
Possibly a computer to host our application, if it needs to run for an extended period.
Work Packages for the first 2 milestones:
- Creation of an API to interact with AMT.
- Language definition, implementation, parsing and tree creation.
- Definition and implementation of strategies to split the query into subtasks and to merge the results to answer the query.
The risks to the success of the project:
Not identified yet.
Useful links :