“GitHub Copilot: Your AI pair programmer” (media); “Codex: Evaluating Large Language Models Trained on Code”, Chen et al 2021 (on small versions)
New GPT-3-based code completion for GitHub; like TabNine or IntelliCode, but more so, and a tasty lollipop indeed; puzzlingly, OA/GH appear to do no checks like n-grams for possible copying, but copying is rare anyway.
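For concreteness, an n-gram check of the kind alluded to above could be sketched as follows. This is only an illustration of the general idea, not anything OA/GH are known to run: the function names (`ngrams`, `build_index`, `copied_fraction`), the whitespace tokenization, and the in-memory set index are all assumptions made for the example; a real system would presumably need a proper code tokenizer and a far more compact index.

```python
from typing import Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> Set[NGram]:
    """Return the set of all contiguous n-token windows in `tokens`."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_index(corpus: Iterable[str], n: int) -> Set[NGram]:
    """Collect every n-gram seen anywhere in the (hypothetical) training corpus."""
    index: Set[NGram] = set()
    for doc in corpus:
        # Assumption: naive whitespace tokenization stands in for a real tokenizer.
        index |= ngrams(doc.split(), n)
    return index


def copied_fraction(completion: str, index: Set[NGram], n: int) -> float:
    """Fraction of the completion's distinct n-grams that also occur in the corpus."""
    grams = ngrams(completion.split(), n)
    if not grams:
        # Completion is shorter than n tokens: nothing to compare.
        return 0.0
    return len(grams & index) / len(grams)
```

A completion whose `copied_fraction` is close to 1 for a reasonably large `n` would be a candidate for verbatim copying and could be flagged or suppressed; where to set `n` and the threshold is a judgment call this sketch does not try to answer.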
It is darkly hilarious to see programmers react little better than artists did in peddling misinformation about Copilot like unlabeled “humor”, instantly turning into substance dualists insisting that “computers can never truly understand code or be creative unlike us humans with souls^Wminds”, Internet IP lawyers who have never heard of the word “transformative”, and infosec experts engaging in histrionics about it “leaking secrets” from public GitHub repos (y’know, made up of public commits to public repos you really should not be uploading any passwords or keys to, because attackers have been actively monitoring in realtime for credentials to steal for over a decade now?). In a year, who will remember any of this BS? A few months later (also like TFDNE), it looked like the world hadn’t ended and people moved on. Machiavelli had it right: “…there is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things. Because the innovator has for enemies all those who have done well under the old conditions, and lukewarm defenders in those who may do well under the new. This coolness arises partly from…the incredulity of men, who do not readily believe in new things until they have had a long experience of them.” (What if there was a revolution and no one cared?)
Regardless, “attacks only get better”, and Copilot surely will. I’m a little surprised that Copilot/Codex appear to be trained only on entire source code files, when patches are the perfect training data for making a promptable edit-capable LM: a patch comes with a human-readable summary/explanation of the changes that follow and provides an immediately promptable description of quality before/after, while a compact word-diff format is an ideal output format for an LM to bugfix/update the context without the overhead of generating the entire module, particularly if done repeatedly as part of an “inner monologue” approach to incrementally update a program towards correctness rather than attempting to brute-force program writing in a single shot. (Plus, who has more Git repos or expertise in parsing Git patches than GitHub?) I look forward to seeing what better Codex models can do: Sam Altman mentioned in the 2021-09-05 SSC Q&A that the next version will be much better, and that one underappreciated benefit of the OA/MS partnership is the sheer volume of human feedback that OA can train models on.
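A minimal sketch of what a patch-shaped training example might look like, under loudly stated assumptions: nothing here reflects how Codex is actually trained, and the helper names (`word_diff`, `make_training_example`) and the prompt layout are hypothetical. The only borrowed convention is the `[-removed-]{+added+}` marker style that `git diff --word-diff` emits by default.

```python
import difflib
import re
from typing import Dict, List


def _tokens(text: str) -> List[str]:
    """Split into alternating runs of whitespace and non-whitespace, losing nothing."""
    return re.findall(r"\s+|\S+", text)


def word_diff(before: str, after: str) -> str:
    """Render a compact word-level diff using [-removed-] and {+added+} markers."""
    a, b = _tokens(before), _tokens(after)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        if i1 < i2:  # tokens present only in the old version
            out.append("[-" + "".join(a[i1:i2]) + "-]")
        if j1 < j2:  # tokens present only in the new version
            out.append("{+" + "".join(b[j1:j2]) + "+}")
    return "".join(out)


def make_training_example(message: str, before: str, after: str) -> Dict[str, str]:
    """Pair (commit message + old code) as the prompt with the word-diff as the target.

    The prompt layout is an arbitrary illustrative choice, not a known format.
    """
    prompt = f"# Commit message: {message}\n# Before:\n{before}# Word-diff:\n"
    return {"prompt": prompt, "completion": word_diff(before, after)}
```

Applied repeatedly, the model would only ever have to emit the small marked-up delta, and the edited file becomes the context for the next step; that is the “inner monologue” loop described above, in which each step is one cheap patch rather than a full regeneration of the module.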
Incidentally, Perlis also remarked that “In a 5 year period we get one superb programming language. Only we can’t control when the 5 year period will begin.” A fancy tab-complete can do wonders in making an enormously-overcomplicated ecosystem ‘discoverable’ and getting one unstuck, but it is still just a tab-complete. If you were designing a language/IDE/ecosystem on the presumption of a GPT-3-level model, would you really settle on “Visual Studio IDE for an OO-glazed ALGOL language but with a fancier tab-complete”? Given the importance of prompt engineering, a native DL programming solution would probably emphasize writing docs/comments explaining the intent to the paired model—“au pair programming”.