Meeting Reports

Date: 06.03.2015

Time: 15.00 - 17.00
Attendees: Christian, Matthieu, Nicolas V., Nicolas H., Pascal
Author: Christian
Agenda: Create proposal

 

Done since the last meeting

 

Items discussed

 

 


 

Date: 13.03.2015
Time: 15.00 - 17.00
Attendees: All except Bastien, Mateusz, Nicolas V., Pierre and Raphael
Author: Nicolas Hubacher
Agenda: Define more precisely the first tasks

 

Items discussed

 

 


 

Date: 17.03.2015
Time: 14:30-16:00
Attendees: Everybody but Julien Graisse
Author: Damien Engels
Agenda: Summary of research tasks & planning for the week

 

Done since the last meeting

 

Items discussed

With the knowledge gained over the past few days, we decided to move the first milestone to Friday, March 27th.

The minimum viable project does not need machine learning, so we will first focus on getting the search working with a simple PageRank and simple feature intersection, and add machine learning afterwards.
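As an illustration of what "simple feature intersection" could mean, here is a minimal sketch: each file is described by a set of features, and a query is ranked against the files by the number of features they share. The function name `search`, the layout of the index, and the feature strings are all assumptions made for this example, not decisions taken in the meeting.

```python
def search(index, query_features):
    """Rank files by how many features they share with the query.

    index: dict mapping a file path to the set of features found in that file
           (hypothetical layout, chosen only for this sketch).
    query_features: iterable of features extracted from the query snippet.
    Returns the matching file paths, best match first.
    """
    query = set(query_features)
    scores = {}
    for path, features in index.items():
        overlap = len(query & features)  # size of the set intersection
        if overlap > 0:
            scores[path] = overlap
    # Sort by descending overlap, then by path to keep the order deterministic.
    return sorted(scores, key=lambda path: (-scores[path], path))


# Hypothetical feature strings; the real feature format is not defined here.
example_index = {
    "A.java": {"import:java.util.List", "class:Foo"},
    "B.java": {"import:java.util.List"},
    "C.java": {"class:Bar"},
}
results = search(example_index, ["import:java.util.List", "class:Foo"])
```

A ranking score such as PageRank could later be used to break ties between files with the same overlap.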

PageRank: Matthieu and Bastien

Look at what the devmine team did and find a way to rank repositories and files.
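For reference, a basic PageRank can be sketched as a power iteration over a link graph. This is only a sketch under assumptions: what counts as a "link" between repositories or files is still open, and the function name `pagerank`, the damping factor of 0.85 and the fixed iteration count are choices made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores by power iteration.

    links: dict mapping a node to the list of nodes it points to
           (what a "link" means for repositories is an open question).
    Returns a dict mapping each node to its score; the scores sum to 1.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node receives the same base amount from random jumps.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: repo_a and repo_b both point to repo_c, repo_c points to repo_a.
ranks = pagerank({
    "repo_a": ["repo_c"],
    "repo_b": ["repo_c"],
    "repo_c": ["repo_a"],
})
```

In this small graph, repo_c ends up with the highest score because two repositories point to it, and repo_b with the lowest because nothing points to it.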

Parsing Java: Nicolas V.

The parsers from devmine are too simple to do anything useful, so we will write our own to start with.

Distributed tree parsing: Nicolas H., Damien

Implement a Spark app to parse all repositories. We will need to ask for some space on HDFS to store the parsed trees.

Feature extraction & build index: Mateusz, Pierre, Raphael, Julien

Given the AST, implement a Spark app to extract all features.

Set up datastore for index: Pascal, Christian

HDFS has no index, which makes it impractical to search. See how we can use MongoDB to store the index; we will need computing resources for this (Azure).

 

 


 

Date: 27.03.2015
Time: 15.00 - 17.00
Attendees: Nicolas H, Nicolas V, Pierre, Damien, Julien, Matthieu, Pascal
Author: Pascal Lau
Agenda: Milestone I

 

Items discussed

 

 


 

Date: 02.05.2015
Time: 11.00 - 13.00
Attendees: Bastien, Christian, Matthieu, Nicolas H, Nicolas V, Julien, Pierre
Author: Nicolas Hubacher
Agenda: Discuss architecture, Define and distribute tasks for Milestone II

 

Done since the last meeting

The devsearch-concat script has finished its job. The output folder is 386.6 GB in size.

 

Items discussed

Architecture: The discussion about the architecture of our project resulted in a more detailed diagram. Some of the most important points:

 

Tasks for Milestone II (due by 15.04.15):
Our goal for Milestone II is to get everything integrated. The actual tasks and the responsible team members can be found on our Trello board.
Our slogan for this milestone: Don't make your code perfect from the beginning! Create something that works and make it better later...

 

 


 

Date: 21.04.2015
Time: 13.00 - 14.30
Attendees: Everybody
Author: Nicolas Hubacher
Agenda: Machine Learning, Implementation of Feature Matching, Inverted Index, Task Review, Task Assignment, Code Formatting

 

Done since the last meeting

We have managed to integrate the whole system. During a meeting with Amir, we were able to demonstrate that the code search basically works.

 

Items discussed

Remember: M3 is due by 12.5!