How GitHub Copilot Works

Ryan Salva, the VP of Product at GitHub, gave a fantastic talk delving into how Copilot works. Plus, principles from The Effective Executive, a guide to stock options conversations, and more.

Hey Everyone!

Today we’ll be talking about

  • Under the Hood of GitHub Copilot

    • Ryan Salva is the VP of Product at GitHub and he recently gave a fantastic talk delving into how Copilot works

    • We’ll talk about how Copilot takes context from your editor, cleans it and then feeds it to the ChatGPT API. The LLM’s response is cleaned and then sent back to the user.

    • Copilot can also use Retrieval Augmented Generation (RAG) and plugins to add context to the input prompt

  • Principles from The Effective Executive by Peter Drucker

    • Being effective is a skill that needs to be learned and practiced

    • Spend a few weeks understanding exactly where your time goes

    • Shift your mindset to finding strengths and helping people utilize them.

    • Make effective decisions by following a systematic process and honing it over time.

  • Tech Snippets

    • The Guide to Stock Options Conversations

    • Building a Data Infrastructure Business in 2024

    • How Canva Built their Draw Tool

    • Write Clean Code to Reduce Cognitive Load

Index is a conference for backend engineers who want to learn about building search, analytics, and AI applications at scale.

Some of the talks this year will cover:

  • How Meta Built FAISS (a popular vector search library) by Matthijs Douze, Research Scientist at Meta AI Research and co-creator of FAISS

  • How DoorDash’s Shopping Recommendation System Works by Sudeep Das, Head of Machine Learning and AI at DoorDash

  • The Tech Behind the Online Data Systems Netflix Uses to Serve the Homepage by Shriya Arora, Engineering Manager at Netflix

  • The Architecture of the Uber Eats Recommendation System by Bo Ling, a Staff Software ML Engineer at Uber

You can join the conference virtually through Zoom or you can attend in-person at the Computer History Museum in Mountain View, CA.

It’ll be a fantastic learning experience for backend engineers and also a great networking opportunity.

It’s completely free to join!

sponsored

Under the Hood of GitHub Copilot

GitHub Copilot is a code completion tool that helps you become more productive. It analyzes your code and gives you in-line suggestions as you type. It also has a chat interface that you can use to ask questions about your codebase, generate documentation, refactor code and more.

Copilot is used by over 1.5 million developers at more than 30,000 organizations. It works as a plugin for your editor; you can add it to VSCode, Vim, JetBrains IDEs and more.

Ryan J. Salva is the VP of Product at GitHub, and he has been helping lead the company’s AI strategy for over 4 years. A few months ago, he gave a fantastic talk at the YOW! Conference, where he delved into how Copilot works.

We will be summarizing the talk and adding some extra context.

How GitHub Copilot Works

GitHub partnered with OpenAI to use the GPT-3.5 and GPT-4 APIs for generating code suggestions and handling question-answering tasks.

The key problem the GitHub team needs to solve is how to get the best output from these GPT models.

Copilot goes through several steps to do this:

  1. Create the input prompt using context from the code editor: Copilot needs to gather all the relevant code snippets and incorporate them into the prompt. It continuously monitors your cursor position and analyzes the code structure around it, including the current line where your cursor is placed and the relevant function/class scope.


    Copilot will also analyze any open editor tabs and identify relevant code snippets by computing Jaccard similarity between chunks of those files and the code around your cursor (see the sketch after this list).

  2. Send the input prompt to a proxy for cleaning: Once Copilot assembles the relevant content for the prompt, it sends it to a backend service at GitHub. This service sanitizes the input by removing any toxic content from the user, blocking prompts irrelevant to software engineering, checking for prompt hacking/injection, and more.

  3. Send the cleaned input prompt to the ChatGPT API: After sanitizing the user prompt, Copilot passes it to ChatGPT. For code completion tasks (where Copilot suggests code snippets as you program), GitHub requires very low latency, targeting a response within 300-400 ms. Therefore, they use GPT-3.5 turbo for this.

    For the conversational AI bot, GitHub can tolerate higher latency and they need more intelligence, so they use GPT-4.

  4. Send the output from ChatGPT to a proxy for additional cleaning: The output from the ChatGPT model is first sent to a backend service at GitHub. This service checks for code quality and identifies any potential security vulnerabilities.


    It’ll also take any code snippet longer than 150 characters and check whether it’s a verbatim copy of code in a public repository on GitHub (to make sure it isn’t violating any code licenses). GitHub built an index of the code stored across public repositories so they can run this search very quickly.

  5. Give the cleaned output to the user in their code editor: Finally, GitHub returns the code suggestion to the user and it’s displayed in their editor.
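To make the neighboring-tabs step concrete, here's a minimal sketch of how scoring chunks of open files by Jaccard similarity against the code around your cursor might work. This is not GitHub's actual implementation; the tokenizer, window size, and function names are assumptions for illustration.

```python
# Minimal sketch of Jaccard-similarity snippet retrieval (not GitHub's actual code).
# Assumes a simple identifier tokenizer and a fixed-size sliding window over each open tab.
import re

def tokenize(code: str) -> set[str]:
    """Split code into a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code))

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def best_snippets(cursor_context: str, open_tabs: dict[str, str],
                  window: int = 30, top_k: int = 3) -> list[tuple[float, str, str]]:
    """Score sliding windows from each open tab against the code around the cursor."""
    target = tokenize(cursor_context)
    scored = []
    for path, text in open_tabs.items():
        lines = text.splitlines()
        step = max(window // 2, 1)
        for i in range(0, max(len(lines) - window + 1, 1), step):
            chunk = "\n".join(lines[i:i + window])
            scored.append((jaccard(target, tokenize(chunk)), path, chunk))
    return sorted(scored, reverse=True)[:top_k]
```

The highest-scoring chunks would then be included in the prompt alongside the code around your cursor.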

Expanding the Context Window

In the previous section, we just talked about using context from the different tabs you have open in your code editor.

However, there’s a ton of other information that can be added to the prompt in order to generate better code suggestions and chat replies. Useful context can also include things like:

  • Directory tree - hierarchy and organization of files and folders within the project

  • Terminal information - commands executed, build logs, system output

  • Build output - compilation results, error messages, warnings

Copilot allows you to use tagging with elements like @workspace to pull information from these sources into your prompt.

There’s also a huge amount of additional context that could be helpful, such as documentation, other repositories, GitHub issues, and more.

To incorporate this information into the prompting, Copilot uses Retrieval Augmented Generation (RAG).

With RAG, you take any additional context that might be useful for the prompt and store it in a database (usually a vector database).

When the user enters a prompt for the LLM, the system first searches the database corpus for any relevant information and combines that with the original user prompt.

Then, the combined prompt is fed to the large language model. This can lead to significantly better responses and also greatly reduce LLM hallucinations (the LLM just making stuff up).
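Here's a minimal sketch of that RAG flow. The embedding function is a toy stand-in for a real embedding model, and the corpus entries and prompt template are made up for illustration; this is not how Copilot is actually implemented.

```python
# Toy RAG sketch for illustration only. A real system would use a proper
# embedding model and a vector database instead of this in-memory stand-in.
import math

def embed(text: str) -> list[float]:
    """Placeholder embedding: normalized bag-of-characters frequencies."""
    vec = [0.0] * 128
    for ch in text.lower():
        vec[ord(ch) % 128] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))

# Index additional context (docs, issues, other repos) ahead of time.
# These entries are invented placeholders.
corpus = {
    "docs/auth.md": "How to configure OAuth tokens for the API client...",
    "issues/482": "Bug: retry logic drops requests when the token expires...",
}
index = {doc_id: embed(text) for doc_id, text in corpus.items()}

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Search the corpus for the entries most relevant to the user's prompt."""
    q = embed(query)
    ranked = sorted(index, key=lambda doc_id: cosine(q, index[doc_id]), reverse=True)
    return [corpus[doc_id] for doc_id in ranked[:top_k]]

def build_prompt(user_prompt: str) -> str:
    """Combine the retrieved context with the original prompt before calling the LLM."""
    context = "\n".join(retrieve(user_prompt))
    return f"Context:\n{context}\n\nQuestion: {user_prompt}"
```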

Another feature Copilot has planned for expanding the context window is plugins. These allow Copilot to call other APIs/services to gather data and perform actions.

For example, let’s say you receive a notification about an outage in your service. You can ask Copilot to check Datadog and retrieve a list of the critical errors from the last hour. Then, you might ask Copilot to find the pull requests and authors who contributed to the code paths of those errors.

Note - this is what Ryan was talking about in the talk but I’m not sure about the current status of agents in Copilot. I haven’t been able to use this personally and wasn’t able to find anything in the docs related to this. Let me know if you have experience with this and I can update the post!

Custom Models with Fine Tuning

Everything we’ve talked about so far has been prompting: assembling better prompts so that ChatGPT can generate more relevant responses.

The other lever GitHub offers for Enterprises is custom models. More specifically, they can fine-tune ChatGPT to generate better responses.

Some scenarios where fine-tuning is useful include:

  • Stylistic Preferences - a team might have specific coding styles, naming conventions, formatting guidelines, etc. Using a fine-tuned version of ChatGPT will enable Copilot to follow these rules.

  • API/SDK Versions - a team might be working with a specific version of an API/SDK. The ChatGPT model can be fine-tuned on a codebase that uses the targeted version so it provides suggestions that are compatible with and optimized for that specific development environment.

  • Proprietary Codebases - some companies have proprietary codebases that use technologies not available to the public. Fine-tuning ChatGPT allows it to learn the patterns of these codebases for more relevant suggestions.
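GitHub hasn’t published exactly how these custom models are built, but as a rough illustration, here’s what preparing supervised fine-tuning data in OpenAI’s chat JSONL format could look like. The file name, example contents, and base model are assumptions, not details from the talk.

```python
# Illustrative sketch of preparing fine-tuning data in OpenAI's chat JSONL format.
# This is not how GitHub builds Copilot's custom models; the examples, file name,
# and base model below are assumptions.
import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "You follow our team's naming conventions."},
            {"role": "user", "content": "Write a function that fetches a user by id."},
            {"role": "assistant", "content": "def fetch_user_by_id(user_id: str) -> User:\n    ..."},
        ]
    },
]

with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# The JSONL file would then be uploaded and a fine-tuning job created, e.g. via
# the OpenAI API:
#   file = client.files.create(file=open("training_data.jsonl", "rb"), purpose="fine-tune")
#   client.fine_tuning.jobs.create(training_file=file.id, model="gpt-3.5-turbo")
```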


Tech Snippets

Premium Content

Subscribe to Quastor Pro for long-form articles on concepts in system design and backend engineering.

Past article content includes:

System Design Concepts

  • Measuring Availability

  • API Gateways

  • Database Replication

  • Load Balancing

  • API Paradigms

  • Database Sharding

  • Caching Strategies

  • Event Driven Systems

  • Database Consistency

  • Chaos Engineering

  • Distributed Consensus

Tech Dives

  • Redis

  • Postgres

  • Kafka

  • DynamoDB

  • gRPC

  • Apache Spark

  • HTTP

  • DNS

  • B Trees & LSM Trees

  • OLAP Databases

  • Database Engines

When you subscribe, you’ll also get Spaced Repetition (Anki) Flashcards for reviewing all the main concepts discussed in past Quastor articles.

Principles from The Effective Executive

The Effective Executive is a fantastic book by Peter Drucker that’s packed with timeless wisdom on how professionals can maximize their effectiveness. The book was first published in the 1960s and has been widely read and recommended for over 50 years.

We’ll be summarizing the principles from the book and giving some additional context.

“What gets measured gets done”

Peter Drucker

Effectiveness can be Learned

One of the first principles of the book is that effectiveness isn’t innate. Instead, being effective is something that needs to be learned and developed, like any other habit.

Being effective comes down to several key abilities:

  1. Understanding where your time goes

  2. Gearing your actions towards results

  3. Focusing on strengths

  4. Concentrating on areas with out-sized returns

Understand Where Your Time Goes

Instead of starting with your tasks or planning, begin with your time and figure out where it goes.

The three-step process is:

  1. Recording - Keep a log of your daily activities and note down how much time you’re spending on each task throughout the day. Try to do this in real-time since trying to recall at the end of the day can be unreliable. Try to do this for at least a week.

  2. Managing - After recording your time, analyze the data and identify patterns, time-wasters and areas for improvement. This can be quite tough, but try to find tasks that can be delegated and understand why interruptions in your day keep arising.

  3. Consolidating - Consolidate your time into larger, uninterrupted blocks. Most important work requires sustained concentration and this is only possible if you have significant chunks of time to devote to it. Constantly switching between tasks makes it extremely difficult to make meaningful progress on your important priorities.

Focus on Strengths

It’s crucial to shift your mindset to finding strengths and helping people utilize them. Don’t make staffing decisions based on minimizing weaknesses, but on maximizing strengths.

Making strengths productive is a mindset and a continuous process; it’s not a single insight or technique. You have to practice this diligently.

Concentrate on One Task

There will always be more opportunities and tasks than there is time available. Your job is to determine what is truly important.

You need to avoid getting caught up in the flow of events and letting short-term demands dictate all your priorities. You should challenge assumptions about what is truly important or urgent. Many “crises” and fires can be anticipated and prevented with foresight.

Over time, you should develop the discipline to concentrate on a few vital priorities and give them the time and attention they need.

Making Effective Decisions

The higher up you go, the more your effectiveness depends on your decision-making ability. Good decision-making is a systematic process that needs to be honed over time.

In the book, Drucker talks about the five elements of the decision process.

  1. Generic or Exceptional - identify if this situation is generic or if it’s exceptional. If it’s generic, then you can rely on past experience or rules around this situation. Otherwise, you’ll have to put more thought into the right course of action.

  2. Objectives - determine what objectives the decision has to accomplish. Are there any constraints (budget, time, etc.)?

  3. Establish the Right Action - Based on the first two elements, determine the right course of action. You should focus on long-term impact and make sure that it’s the correct course of action and not just something that’s acceptable and “sounds good”.

  4. Determine how it will be Implemented - Who has to know about the decision? Exactly what action has to be taken? Are the incentives properly set?

  5. Establish a Feedback Mechanism - There must be some clear way to measure results and see progress on the decision. What gets measured gets done.