Machine Learning for Programming (Seminar)
Quick Facts
Organizer | Michael Pradel |
Teaching assistants | Daniel Lehmann, Islem Bouzenia, Luca Di Grazia, Matteo Paltenghi |
Course type | Advanced seminar |
Language | English |
Ilias | Ilias course (for discussions, etc.) |
Place | Universitätstr. 38, room 0.463 |
Content
This seminar covers recent research on improving software and increasing developer productivity with machine learning, including deep learning. We will discuss research papers that present novel techniques for improving software reliability and security, for example program analyses that use machine learning models of code to detect bugs, complete partial code, or de-obfuscate code.
After the initial kick-off meeting, each student is assigned a research paper and presents it in a talk during the weekly meetings. Each talk is given twice: the first talk is an opportunity to get constructive feedback, which helps improve both the second talk and the student's general presentation skills. In addition, each student writes a term paper that summarizes the assigned research paper.
Organization
The course will be classroom-first, i.e., to the extent possible, all activities will be in a physical classroom or based on physical meetings.
Schedule
This schedule is preliminary and subject to change.
Date | Event |
Oct 21, 2021, 2:00pm | Kick-off meeting. Slides: Motivation and Organization, DeepBugs |
Oct 27, 2021, 11:59pm | Deadline for choosing topics |
Nov 18, 2021, 2:00pm | First round of talks: Swarnendu Sengupta (topic 8), Deepanshu Sonparote (topic 7) |
Nov 25, 2021, 2:00pm | First round of talks: Sinan Kurtyigit (topic 1), Simon Weiler (topic 14) |
Dec 2, 2021, 2:00pm | First round of talks: Koushik Ragavendran (topic 2), Rohit G Hegde (topic 5) |
Dec 9, 2021, 2:00pm | First round of talks: Keyuriben Patel (topic 6), Yiu Wai Chow (topic 9) |
Jan 13, 2022, 2:00pm | Second round of talks: Simon Weiler (topic 14), Deepanshu Sonparote (topic 7) |
Jan 14, 2022, 11:59pm | Deadline for first version of term papers |
Jan 27, 2022, 2:00pm | Second round of talks: Sinan Kurtyigit (topic 1), Koushik Ragavendran (topic 2), Rohit G Hegde (topic 5) |
Jan 28, 2022, 11:59pm | Deadline for peer feedback on term papers |
Feb 3, 2022, 2:00pm | Second round of talks: Swarnendu Sengupta (topic 8), Keyuriben Patel (topic 6), Yiu Wai Chow (topic 9) |
Feb 11, 2022, 11:59pm | Deadline for term papers |
Topics
The following research papers are available for discussion. Use Google Scholar to find a copy of a paper. After the kick-off meeting, each student gets assigned one paper for presentation.
[1] | Elizabeth Dinella, Hanjun Dai, Ziyang Li, Mayur Naik, Le Song, and Ke Wang. Hoppity: Learning graph transformations to detect and fix bugs in programs. In ICLR, 2020. |
[2] | Miltiadis Allamanis, Earl T. Barr, Soline Ducousso, and Zheng Gao. Typilus: Neural type hints. In PLDI, 2020. |
[3] | Yaniv David, Uri Alon, and Eran Yahav. Neural reverse engineering of stripped binaries using augmented control flow graphs. In OOPSLA, 2020. |
[4] | Yu Wang, Fengjuan Gao, Linzhang Wang, and Ke Wang. Learning semantic program embeddings with graph interval neural network. In OOPSLA, 2020. |
[5] | Baptiste Rozière, Marie-Anne Lachaux, Lowik Chanussot, and Guillaume Lample. Unsupervised translation of programming languages. In NeurIPS, 2020. |
[6] | Jian Zhang, Xu Wang, Hongyu Zhang, Hailong Sun, Yanjun Pu, and Xudong Liu. Learning to handle exceptions. In ASE, 2020. |
[7] | Fang Liu, Ge Li, Yunfei Zhao, and Zhi Jin. Multi-task learning based pre-trained language model for code completion. In ASE, 2020. |
[8] | Nghi D. Q. Bui, Yiyun Yu, and Lingxiao Jiang. InferCode: Self-supervised learning of code representations by predicting subtrees. In ICSE, 2021. |
[9] | Roei Schuster, Congzheng Song, Eran Tromer, and Vitaly Shmatikov. You autocomplete me: Poisoning vulnerabilities in neural code completion. In USENIX Security, 2021. |
[10] | Seohyun Kim, Jinman Zhao, Yuchi Tian, and Satish Chandra. Code prediction by feeding trees to transformers. In ICSE, 2021. |
[11] | Md. Rafiqul Islam Rabin, Vincent J. Hellendoorn, and Mohammad Amin Alipour. Understanding neural code intelligence through program simplification. In ESEC/FSE, 2021. |
[12] | Berkay Berabi, Jingxuan He, Veselin Raychev, and Martin Vechev. TFix: Learning to fix coding errors with a text-to-text transformer. In ICML, 2021. |
[13] | Qihao Zhu, Zeyu Sun, Yuan-an Xiao, Wenjie Zhang, Kang Yuan, Yingfei Xiong, and Lu Zhang. A syntax-guided edit decoder for neural program repair. In ESEC/FSE, 2021. |
[14] | Bo Li, Qiang He, Feifei Chen, Xin Xia, Li Li, John C. Grundy, and Yun Yang. Embedding app-library graph for neural third party library recommendation. In ESEC/FSE, 2021. |
[15] | Yi Li, Shaohua Wang, and Tien N. Nguyen. Fault localization with code coverage representation learning. In ICSE, 2021. |
Template for Term Paper
Please use this LaTeX template for writing your term paper. The page limit is six pages (strict).
Grading
Grading is based on the term paper, the talk, and active participation during the meetings.