Saturday, February 08, 2020

On Data Analysis of Software Repositories

This article discusses the analysis of software repositories using data analysis methods. It reviews methods for analyzing programs based on information retrieved from the program code stored in repositories, focusing on works that apply classification, clustering, and deep learning in software development. Examples include classifying and predicting errors, tracking how code properties change as the code evolves, detecting design flaws and debt, and assisting with code refactoring. The ultimate goal of all these models is, of course, the automation of programming. In practice, we are talking about simpler tasks: information retrieval (over program code), error prediction, clone detection, link analysis, evolution analysis, etc. First, we discuss recurrent neural networks and their use for the analysis of software repositories. In the simplest case, recurrent networks model a programming language as a sequence of characters. The paper also covers clustering and topic modeling. - from our new paper
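
To make the "sequence of characters" idea concrete, here is a minimal sketch of the data preparation step only, not of a network and not of the method used in the paper. It turns a code fragment into integer ids and into (context, next character) pairs, which is the kind of input a character-level recurrent model is typically trained on. The function names, the sample fragment, and the window size are all assumptions made for illustration.

```python
def build_vocab(text):
    """Map each distinct character to an integer id (sorted for determinism)."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def encode(text, vocab):
    """Replace every character with its id from the vocabulary."""
    return [vocab[ch] for ch in text]


def next_char_pairs(ids, window):
    """Build (context, target) pairs: `window` ids and the id that follows them."""
    return [(ids[i:i + window], ids[i + window])
            for i in range(len(ids) - window)]


# Hypothetical one-line "program" standing in for a file from a repository.
source = "x = x + 1"

vocab = build_vocab(source)
ids = encode(source, vocab)
pairs = next_char_pairs(ids, window=3)

for context, target in pairs:
    print(context, "->", target)
```

A real pipeline would read files from a repository and feed such pairs to a recurrent network; richer models work on tokens or syntax trees instead of raw characters.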
