07/2023: ASE! Our paper: “DiffSearch: A Scalable and Precise Search Engine for Code Changes” has been accepted for the Journal-first track at ASE 2023 in Luxembourg.
06/2023: Won an Uber competition on Generative AI for developer productivity among 103 teams worldwide.
05/2023: Generative AI at Uber! I am joining Uber for a research internship in Amsterdam for summer 2023.
04/2023: Dagstuhl Seminar! I was invited to the Dagstuhl Seminar on Code Search, a gathering of the world’s leading researchers, taking place in spring 2024.
03/2023: I was invited by JetBrains to their Munich office to discuss our paper “DiffSearch: A Scalable and Precise Search Engine for Code Changes”.
02/2023: Started serving as a reviewer for the prestigious journal ACM Transactions on Software Engineering and Methodology (TOSEM).
12/2022: Distinguished Paper Award! Our paper “The Evolution of Type Annotations in Python: An Empirical Study” received an ACM SIGSOFT Distinguished Paper Award at ESEC/FSE 2022 in Singapore.
11/2022: IEEE TSE! Our paper: “DiffSearch: A Scalable and Precise Search Engine for Code Changes” has been accepted for the IEEE Transactions on Software Engineering journal.
10/2022: ACM CSUR! Our paper: “Code Search: A Survey of Techniques for Finding Code” has been accepted for the ACM Computing Surveys journal.
09/2022: ESEC/FSE! Our paper: “The Evolution of Type Annotations in Python: An Empirical Study” has been accepted for the ESEC/FSE 2022 conference.
05/2022: ICSE SRC 2022! Winner of the second prize ($300) for the ICSE ACM Student Research Competition (SRC) with the submission “Efficiently and Precisely Searching for Code Changes with DiffSearch”.
We extract 1.4 million type annotation changes from 9,655 Python repositories. Our results show that type annotations are clearly gaining traction, yet the large majority of code elements that could be annotated currently remains unannotated. We see a huge potential for techniques that automate the process of adding types into an existing code base, such as neural type prediction models. Finally, many developers seem to not regularly check their code for statically detectable type errors, or if they do, commit the code despite such errors.
We present a scalable and precise search engine for code changes. Given a query, the approach retrieves relevant examples within seconds from millions of code changes. Our query language extends the underlying programming language, providing an intuitive way of formulating queries to search for code changes. DiffSearch guarantees that every returned search result fits the query.
This article provides a comprehensive overview of 30 years of research on code search. Given the huge amounts of existing code, searching for specific code examples is a common activity during software development. We discuss what kinds of queries code search engines support, and give an overview of the main components used to retrieve suitable code examples. In particular, the article discusses techniques to pre-process and expand queries, approaches toward indexing and retrieving code, and ways of pruning and ranking search results.