Graph perspective improves semantic similarity matching in program search

Program retrieval remains a cornerstone of software development and is critical to productivity throughout the development lifecycle. Many existing program search models overlook the discrepancies between natural language queries and code, which leaves a clear semantic gap. Programs and queries also carry rich structural and semantic information, yet dominant approaches often ignore the consistency between different aspects of the source code and treat queries as flat sequences, neglecting their inherent structural features.

To address these issues, a research team led by Yunwei DONG published their new research on June 15, 2024, in the journal Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

The team proposed a framework that formulates program search as a multi-relational graph similarity problem. A two-level attention mechanism assigns weights to nodes in the multi-relational graphs: one level attends within each relation type, and the other attends across relation types.
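To make the idea of two attention levels concrete, here is a minimal sketch in plain Python. It is an illustration only and not the team's implementation: the dot-product scoring, the `relation_query` vector, the toy features and the relation names `ast` and `data_flow` are all assumptions chosen for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def intra_relation(node, neighbors, feats):
    """Level 1: attend over the neighbors of `node` under ONE relation type."""
    if not neighbors:
        return list(feats[node])  # no neighbors: fall back to the node itself
    weights = softmax([dot(feats[node], feats[n]) for n in neighbors])
    return weighted_sum(weights, [feats[n] for n in neighbors])

def two_level_attention(node, graph, feats, relation_query):
    """Level 2: attend over the per-relation summaries of `node`.

    graph maps a relation name to an adjacency dict {node: [neighbor, ...]}.
    relation_query is a hypothetical vector used to score each relation.
    """
    relations = sorted(graph)
    summaries = [intra_relation(node, graph[r].get(node, []), feats)
                 for r in relations]
    weights = softmax([dot(relation_query, s) for s in summaries])
    return weighted_sum(weights, summaries), dict(zip(relations, weights))

# Toy graph with two assumed relation types over three nodes.
feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
graph = {
    "ast": {"a": ["b", "c"]},
    "data_flow": {"a": ["c"]},
}
embedding, relation_weights = two_level_attention("a", graph, feats, [0.5, 0.5])
```

In a trained model the scoring functions would be learned rather than fixed dot products. The sketch only shows how node weights can be computed first inside each relation and then across relations.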

First, the multi-relational graph construction module represents programs as code property graphs (CPG) and queries as abstract meaning representations (AMR). This gives a more comprehensive and fine-grained representation of program and query semantics. A graph neural network with two-level attention then learns semantic information from the AMR and CPG graphs. Finally, the semantic similarity computation module computes the similarity of each query-program pair. Compared with existing approaches, the proposed method performs relatively well across the reported evaluation metrics.
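The last step, scoring query-program pairs, can be pictured with a short Python sketch. The article does not say how the similarity is computed, so the mean pooling and cosine similarity used here are assumptions, as are the function names and the toy vectors.

```python
import math

def mean_pool(node_vectors):
    """Collapse a graph's node embeddings into one vector by averaging."""
    dim = len(node_vectors[0])
    return [sum(v[i] for v in node_vectors) / len(node_vectors)
            for i in range(dim)]

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 0.0 if norm == 0 else sum(x * y for x, y in zip(a, b)) / norm

def rank_programs(query_nodes, programs):
    """Score every candidate program against the query, best match first."""
    q = mean_pool(query_nodes)
    scored = [(name, cosine(q, mean_pool(nodes)))
              for name, nodes in programs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical node embeddings for one query graph and two program graphs.
query_nodes = [[1.0, 0.0], [1.0, 1.0]]
programs = {
    "sort_list": [[2.0, 1.0]],
    "open_file": [[0.0, 1.0], [0.0, 3.0]],
}
ranking = rank_programs(query_nodes, programs)
```

Here `sort_list` ranks first because its pooled embedding points in the same direction as the pooled query embedding.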

Future research efforts could focus on optimizing multi-relational graphs by minimizing redundant information, thus reducing graph complexity. Additionally, a promising avenue is the deliberate integration of external knowledge, such as knowledge graphs, with the goal of improving the representation of program semantics.

DOI: 10.1007/s11704-023-2678-8
