LLMs for Coding
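Code LMs matter for two reasons: 1) code is very good training data for a model's logical reasoning ability; 2) a Code LM is a powerful tool in its own right: the "automation" we want is, after all, largely a matter of generating programs.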
Course Materials
- Berkeley CS294/194-196: Responsible GenAI, Google Brain, Application Domains II: Code Generation (slides)
- Stanford CS224n Code Generation slides
- University of Washington CSE 599 student slides
- Johns Hopkins UA 2024 Lec 14: Connecting language to outside world
- LMs for coding
- Berkeley Summit 2023, Replit, ReplitLM: Using Open-source from Training to Production for a Code Completion LLM, Slides, Video. Replit trained its Code LM on the MosaicML LLM training platform, and CodeInstruct was later built on top of that model.
Papers
Papers recommended by the Princeton course
Refer:
- A Conversational Paradigm for Program Synthesis
- InCoder: A Generative Model for Code Infilling and Synthesis
- A Systematic Evaluation of Large Language Models of Code
- Language Models of Code are Few-Shot Commonsense Learners
- Competition-Level Code Generation with AlphaCode
Papers recommended by Stanford CS224n
- Program Synthesis with Large Language Models
- Competition-Level Code Generation with AlphaCode
- Evaluating Large Language Models Trained on Code
Papers recommended by Johns Hopkins
Pretraining Coding Models
- Evaluating Large Language Models Trained on Code
Additional Reading(s):
- Competition-Level Code Generation with AlphaCode
- InCoder: A Generative Model for Code Infilling and Synthesis
- Solving Quantitative Reasoning Problems with Language Models
- Copilot’s impact on developer productivity
- Grounded Copilot: How Programmers Interact with Code-Generating Models
- Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models
Demo
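Google's Gemini model can read a line chart from a paper and produce the code that draws it (YouTube video), and can even take part in programming competitions (YouTube video).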
https://www.codacy.com/
Tool
- Aider: AI pair programming in your terminal, https://github.com/paul-gauthier/aider