How GitHub Copilot Works

Ryan Salva, the VP of Product at GitHub, gave a fantastic talk delving into how Copilot works. Plus, principles from The Effective Executive, a guide to stock options conversations, and more.

Hey Everyone!

Today we’ll be talking about

  • Under the Hood of GitHub Copilot

    • Ryan Salva is the VP of Product at GitHub and he recently gave a fantastic talk delving into how Copilot works

    • We’ll talk about how Copilot takes context from your editor, cleans it and then feeds it to the ChatGPT API. The LLM’s response is cleaned and then sent back to the user.

    • Copilot can also use Retrieval Augmented Generation (RAG) and plugins to add context to the input prompt

  • Principles from The Effective Executive by Peter Drucker

    • Being effective is a skill that needs to be learned and practiced

    • Spend a few weeks understanding exactly where your time goes

    • Shift your mindset to finding strengths and helping people utilize them.

    • Make effective decisions by following a systematic process and honing it over time.

  • Tech Snippets

    • The Guide to Stock Options Conversations

    • Building a Data Infrastructure Business in 2024

    • How Canva Built their Draw Tool

    • Write Clean Code to Reduce Cognitive Load

Index is a conference for backend engineers who want to learn about building search, analytics, and AI applications at scale.

Some of the talks this year will cover:

  • How Meta Built FAISS (a popular vector search library) by Matthijs Douze, Research Scientist at Meta AI Research and co-creator of FAISS

  • How DoorDash’s Shopping Recommendation System Works by Sudeep Das, Head of Machine Learning and AI at DoorDash

  • The Tech Behind the Online Data Systems Netflix Uses to Serve the Homepage by Shriya Arora, Engineering Manager at Netflix

  • The Architecture of the Uber Eats Recommendation System by Bo Ling, a Staff Software ML Engineer at Uber

You can join the conference virtually through Zoom or you can attend in-person at the Computer History Museum in Mountain View, CA.

It’ll be a fantastic learning experience for backend engineers and also a great networking opportunity.

It’s completely free to join!

sponsored

Under the Hood of GitHub Copilot

GitHub Copilot is a code completion tool that helps you become more productive. It analyzes your code and gives you in-line suggestions as you type. It also has a chat interface that you can use to ask questions about your codebase, generate documentation, refactor code and more.

Copilot is used by over 1.5 million developers at more than 30,000 organizations. It works as a plugin for your editor; you can add it to VSCode, Vim, JetBrains IDEs and more.

Ryan J. Salva is the VP of Product at GitHub, and he has been helping lead the company’s AI strategy for over 4 years. A few months ago, he gave a fantastic talk at the YOW! Conference, where he delved into how Copilot works.

We will be summarizing the talk and adding some extra context.

How GitHub Copilot Works

GitHub partnered with OpenAI to use the GPT-3.5 and GPT-4 APIs for generating code suggestions and handling question-answering tasks.

The key problem the GitHub team needs to solve is how to get the best output from these GPT models.

Copilot goes through several steps to do this:

  1. Create the input prompt using context from the code editor: Copilot needs to gather all the relevant code snippets and incorporate them into the prompt. It continuously monitors your cursor position and analyzes the code structure around it, including the current line where your cursor is placed and the relevant function/class scope.


    Copilot will also analyze any open editor tabs and identify relevant code snippets by computing Jaccard similarity between chunks of those files and the code around your cursor (see the sketch after this list).

  2. Send the input prompt to a proxy for cleaning: Once Copilot assembles the relevant content for the prompt, it sends it to a backend service at GitHub. This service sanitizes the input by removing any toxic content from the user, blocking prompts irrelevant to software engineering, checking for prompt hacking/injection, and more.

  3. Send the cleaned input prompt to the ChatGPT API: After sanitizing the user prompt, Copilot passes it to ChatGPT. For code completion tasks (where Copilot suggests code snippets as you program), GitHub requires very low latency, targeting a response within 300-400 ms. Therefore, they use GPT-3.5 turbo for this.

    For the conversational AI bot, GitHub can tolerate higher latency and they need more intelligence, so they use GPT-4.

  4. Send the output from ChatGPT to a proxy for additional cleaning: The output from the ChatGPT model is first sent to a backend service at GitHub. This service checks for code quality and identifies any potential security vulnerabilities.


    It’ll also take any code snippet longer than 150 characters and check whether it’s a verbatim copy of code in a public repository on GitHub (to make sure it isn’t violating any code licenses). GitHub built an index of the code stored across public repositories so they can run this search very quickly.

  5. Give the cleaned output to the user in their code editor: Finally, GitHub returns the code suggestion to the user and it’s displayed in their editor.
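To make the neighboring-tabs step concrete, here's a minimal sketch of how scoring chunks of open files by Jaccard similarity against the code around your cursor might work. This is not GitHub's actual implementation; the tokenizer, window size, and function names are assumptions for illustration.

```python
# Minimal sketch of Jaccard-similarity snippet retrieval (not GitHub's actual code).
# Assumes a simple identifier tokenizer and a fixed-size sliding window over each open tab.
import re

def tokenize(code: str) -> set[str]:
    """Split code into a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code))

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def best_snippets(cursor_context: str, open_tabs: dict[str, str],
                  window: int = 30, top_k: int = 3) -> list[tuple[float, str, str]]:
    """Score sliding windows from each open tab against the code around the cursor."""
    target = tokenize(cursor_context)
    scored = []
    for path, text in open_tabs.items():
        lines = text.splitlines()
        step = max(window // 2, 1)
        for i in range(0, max(len(lines) - window + 1, 1), step):
            chunk = "\n".join(lines[i:i + window])
            scored.append((jaccard(target, tokenize(chunk)), path, chunk))
    return sorted(scored, reverse=True)[:top_k]
```

The highest-scoring chunks would then be included in the prompt alongside the code around your cursor.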

Expanding the Context Window

In the previous section, we just talked about using context from the different tabs you have open in your code editor.

However, there’s a ton of other information that can be added to the prompt in order to generate better code suggestions and chat replies. Useful context can also include things like:

  • Directory tree - hierarchy and organization of files and folders within the project

  • Terminal information - commands executed, build logs, system output

  • Build output - compilation results, error messages, warnings

Copilot allows you to use tagging with elements like @workspace to pull information from these sources into your prompt.

There’s also a huge amount of additional context that could be helpful, such as documentation, other repositories, GitHub issues, and more.

To incorporate this information into the prompting, Copilot uses Retrieval Augmented Generation (RAG).

With RAG, you take any additional context that might be useful for the prompt and store it in a database (usually a vector database).

When the user enters a prompt for the LLM, the system first searches the database corpus for any relevant information and combines that with the original user prompt.

Then, the combined prompt is fed to the large language model. This can lead to significantly better responses and also greatly reduce LLM hallucinations (the LLM just making stuff up).
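Here's a minimal sketch of that RAG flow. The embedding function is a toy stand-in for a real embedding model, and the corpus entries and prompt template are made up for illustration; this is not how Copilot is actually implemented.

```python
# Toy RAG sketch for illustration only. A real system would use a proper
# embedding model and a vector database instead of this in-memory stand-in.
import math

def embed(text: str) -> list[float]:
    """Placeholder embedding: normalized bag-of-characters frequencies."""
    vec = [0.0] * 128
    for ch in text.lower():
        vec[ord(ch) % 128] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))

# Index additional context (docs, issues, other repos) ahead of time.
# These entries are invented placeholders.
corpus = {
    "docs/auth.md": "How to configure OAuth tokens for the API client...",
    "issues/482": "Bug: retry logic drops requests when the token expires...",
}
index = {doc_id: embed(text) for doc_id, text in corpus.items()}

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Search the corpus for the entries most relevant to the user's prompt."""
    q = embed(query)
    ranked = sorted(index, key=lambda doc_id: cosine(q, index[doc_id]), reverse=True)
    return [corpus[doc_id] for doc_id in ranked[:top_k]]

def build_prompt(user_prompt: str) -> str:
    """Combine the retrieved context with the original prompt before calling the LLM."""
    context = "\n".join(retrieve(user_prompt))
    return f"Context:\n{context}\n\nQuestion: {user_prompt}"
```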

Another feature Copilot has planned for expanding the context window is plugins. These allow Copilot to call other APIs/services to gather data and perform actions.

For example, let’s say you receive a notification about an outage in your service. You can ask Copilot to check Datadog and retrieve a list of the critical errors from the last hour. Then, you might ask Copilot to find the pull requests and authors who contributed to the code paths of those errors.

Note - this is what Ryan was talking about in the talk but I’m not sure about the current status of agents in Copilot. I haven’t been able to use this personally and wasn’t able to find anything in the docs related to this. Let me know if you have experience with this and I can update the post!

Custom Models with Fine Tuning

Everything we’ve talked about so far has been prompting: assembling better prompts so that ChatGPT can generate more relevant responses.

The other lever GitHub offers for Enterprises is custom models. More specifically, they can fine-tune ChatGPT to generate better responses.

Some scenarios where fine-tuning is useful include:

  • Stylistic Preferences - a team might have specific coding styles, naming conventions, formatting guidelines, etc. Using a fine-tuned version of ChatGPT will enable Copilot to follow these rules.

  • API/SDK Versions - a team might be working with a specific version of an API/SDK. The ChatGPT model can be fine-tuned on a codebase that uses the targeted version so it provides suggestions that are compatible with and optimized for that specific development environment.

  • Proprietary Codebases - some companies have proprietary codebases that use technologies not available to the public. Fine-tuning ChatGPT allows it to learn the patterns of these codebases for more relevant suggestions.
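GitHub hasn’t published exactly how these custom models are built, but as a rough illustration, here’s what preparing supervised fine-tuning data in OpenAI’s chat JSONL format could look like. The file name, example contents, and base model are assumptions, not details from the talk.

```python
# Illustrative sketch of preparing fine-tuning data in OpenAI's chat JSONL format.
# This is not how GitHub builds Copilot's custom models; the examples, file name,
# and base model below are assumptions.
import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "You follow our team's naming conventions."},
            {"role": "user", "content": "Write a function that fetches a user by id."},
            {"role": "assistant", "content": "def fetch_user_by_id(user_id: str) -> User:\n    ..."},
        ]
    },
]

with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# The JSONL file would then be uploaded and a fine-tuning job created, e.g. via
# the OpenAI API:
#   file = client.files.create(file=open("training_data.jsonl", "rb"), purpose="fine-tune")
#   client.fine_tuning.jobs.create(training_file=file.id, model="gpt-3.5-turbo")
```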


Tech Snippets

Premium Content

Subscribe to Quastor Pro for long-form articles on concepts in system design and backend engineering.

Past article content includes:

System Design Concepts

  • Measuring Availability

  • API Gateways

  • Database Replication

  • Load Balancing

  • API Paradigms

  • Database Sharding

  • Caching Strategies

  • Event Driven Systems

  • Database Consistency

  • Chaos Engineering

  • Distributed Consensus

Tech Dives

  • Redis

  • Postgres

  • Kafka

  • DynamoDB

  • gRPC

  • Apache Spark

  • HTTP

  • DNS

  • B Trees & LSM Trees

  • OLAP Databases

  • Database Engines

When you subscribe, you’ll also get Spaced Repetition (Anki) Flashcards for reviewing all the main concepts discussed in past Quastor articles.

Principles from The Effective Executive

The Effective Executive is a fantastic book by Peter Drucker that’s packed with timeless wisdom on how professionals can maximize their effectiveness. The book was first published in the 1960s and has been widely read and recommended for over 50 years.

We’ll be summarizing the principles from the book and giving some additional context.

“What gets measured gets done”

Peter Drucker

Effectiveness can be Learned

One of the first principles of the book is that effectiveness isn’t innate. Instead, being effective is something that needs to be learned and developed, like any other habit.

Being effective comes down to several key abilities:

  1. Understanding where your time goes

  2. Gearing your actions towards results

  3. Focusing on strengths

  4. Concentrating on areas with out-sized returns

Understand Where Your Time Goes

Instead of starting with your tasks or planning, begin with your time and figure out where it goes.

The three-step process is:

  1. Recording - Keep a log of your daily activities and note down how much time you’re spending on each task throughout the day. Try to do this in real-time since trying to recall at the end of the day can be unreliable. Try to do this for at least a week.

  2. Managing - After recording your time, analyze the data and identify patterns, time-wasters and areas for improvement. This can be quite tough, but try to find tasks that can be delegated and understand why interruptions in your day keep arising.

  3. Consolidating - Consolidate your time into larger, uninterrupted blocks. Most important work requires sustained concentration and this is only possible if you have significant chunks of time to devote to it. Constantly switching between tasks makes it extremely difficult to make meaningful progress on your important priorities.

Focus on Strengths

It’s crucial to shift your mindset to finding strengths and helping people utilize them. Don’t make staffing decisions based on minimizing weaknesses, but on maximizing strengths.

Making strengths productive is a mindset and a continuous process; it’s not a single insight or technique. You have to practice this diligently.

Concentrate on One Task

There will always be more opportunities and tasks than there is time available. Your job is to determine what is truly important.

You need to avoid getting caught up in the flow of events and letting short-term demands dictate all your priorities. You should challenge assumptions about what is truly important or urgent. Many “crises” and fires can be anticipated and prevented with foresight.

Over time, you should develop the discipline to concentrate on a few vital priorities and give them the time and attention they need.

Making Effective Decisions

The higher up you go, the more your effectiveness depends on your decision-making ability. Good decision-making is a systematic process that needs to be honed over time.

In the book, Drucker talks about the five elements of the decision process.

  1. Generic or Exceptional - identify if this situation is generic or if it’s exceptional. If it’s generic, then you can rely on past experience or rules around this situation. Otherwise, you’ll have to put more thought into the right course of action.

  2. Objectives - determine what objectives the decision has to accomplish. Are there any constraints (budget, time, etc.)?

  3. Establish the Right Action - Based on the first two elements, determine the right course of action. You should focus on long-term impact and make sure that it’s the correct course of action and not just something that’s acceptable and “sounds good”.

  4. Determine how it will be Implemented - Who has to know about the decision? Exactly what action has to be taken? Are the incentives properly set?

  5. Establish a Feedback Mechanism - There must be some clear way to measure results and see progress on the decision. What gets measured gets done.