☕ Imaging the Human Brain

Hey Everyone,

Hope you’re all having a fantastic day!

Today we’ll be talking about

  • Google AI’s reconstruction of the human brain

  • How to review code as a Junior Engineer

Today’s interview question is on Database Sharding.

The previous solution is on creating a deep copy of a linked list with a random pointer.

Interviewing.io is an awesome resource.

Book realistic mock interviews with senior FAANG engineers who will give you detailed and actionable feedback on exactly what you need to work on.

They prepare you for the pressure and stress that comes from an actual interview setting.

You don’t pay anything until you’re hired.

Check them out here.

Tech Snippets

How to Review Code as a Junior Developer - a terrific blog post by Emma Catlin at Pinterest. Here’s a quick summary.

  • Junior devs may lack confidence when asked to conduct code reviews. “Why would a senior engineer want to have a junior engineer review their code?”

  • However, there are several benefits to reviewing code as a junior dev:

    • Helps you learn the codebase - reading well-written code by senior engineers helps you quickly learn the codebase. It also helps you know who to ask when you need help with a section of the codebase.

    • Helps you build a feedback circle - when you give other engineers feedback on their code, it becomes easier for you to ask them for feedback.

    • Being an owner - Reviewing code helps you take partial ownership over the team’s codebase. Many companies encourage an ownership mentality for developers.

  • How do you develop the ability to review code?

    • Ask Questions - If something’s not clear to you, then it probably isn’t clear to others either. Ask questions on why a piece of logic was added in a specific file vs. further downstream, what a specific section of code does, or what a particular comment means.

    • Calibrate feedback - Identify what your team members care about and calibrate your feedback to that. Does your team have code guidelines that you can refer to?

    • Emulate others - Identify someone on your team who reviews code well and watch what they do. Observe things that they look/ask for and also observe how they ask.

Google AI’s reconstruction of the human brain

  • A connectome is a map of all the neural connections in an organism’s brain. It’s useful for understanding the organization of neural interactions inside the brain.

  • Releasing a full mapping of all the neurons and synapses in a brain is incredibly complicated. In January 2020, Google Research released a “hemibrain” connectome of a fruit fly: an online database with the structure and synaptic connectivity of roughly half the brain of a fruit fly.

  • The connectome for the fruit fly has completely transformed neuroscience, with Larry Abbott, a theoretical neuroscientist at Columbia, saying “the field can now be divided into two epochs: B.C. and A.C. — Before Connectome and After Connectome”.

    • You can read more about the fruit fly connectome’s influence here.

  • Google Research is now releasing the H01 dataset, a 1.4 petabyte (a petabyte is 1024 terabytes) rendering of a small sample of human brain tissue.

    • The sample covers one cubic millimeter of human brain tissue, and it includes tens of thousands of reconstructed neurons, millions of neuron fragments and 130 million annotated synapses.

  • The initial brain imaging generated 225 million individual 2D images. The Google AI team then computationally stitched and aligned that data to produce a single 3D volume.

    • Google did this using a recurrent convolutional neural network. You can read more about how this is done here.

  • You can view the results of H01 (the imaging data and the 3D model) here.

  • The 3D visualization tool linked above was written with WebGL and is completely open source. You can view the source code here.

  • H01 is a petabyte-scale dataset, but it is only one-millionth the volume of an entire human brain. The next challenge is a synapse-level mapping of an entire mouse brain (500x bigger than H01), but serious technical challenges remain.

    • One challenge is data storage - a mouse brain could generate an exabyte of data, so Google AI is working on image compression techniques for connectomics with negligible loss of reconstruction accuracy.

    • Another challenge is that the imaging process (collecting images of the slices of the mouse brain) is not perfect. There is image noise that has to be dealt with.

    • Google AI addressed the imaging noise by imaging the same piece of tissue in both a “fast” acquisition regime (leading to higher amounts of noise) and a “slow” acquisition regime (leading to low amounts of noise). Then, they trained a neural network to infer the “slow” scans from the “fast” scans, and can now use that neural network as part of the connectomics process.
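
To get a feel for the storage numbers above, here’s a quick back-of-the-envelope sketch in Python. It assumes (our assumption, not Google’s) that dataset size grows linearly with tissue volume, and it sticks to the 1024-based units used above. The function and constant names are made up for illustration.

```python
PB_PER_EB = 1024  # binary units, matching "a petabyte is 1024 terabytes"

H01_SIZE_PB = 1.4          # size of the H01 dataset, in petabytes
MOUSE_VOLUME_RATIO = 500   # a mouse brain is about 500x the H01 sample


def scaled_size_pb(base_size_pb, volume_ratio):
    """Estimate a dataset size in PB, ASSUMING size scales linearly with volume."""
    return base_size_pb * volume_ratio


def pb_to_eb(size_pb):
    """Convert petabytes to exabytes using 1024-based units."""
    return size_pb / PB_PER_EB


mouse_pb = scaled_size_pb(H01_SIZE_PB, MOUSE_VOLUME_RATIO)
print(f"Mouse brain estimate: {mouse_pb:.0f} PB = {pb_to_eb(mouse_pb):.2f} EB")
```

Under those assumptions, a mouse brain lands at about 700 petabytes, or roughly 0.68 exabytes, which lines up with the “could generate an exabyte” estimate.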

Interview Question

What is Database Replication and Database Sharding?

Why is Replication necessary?

What are the tradeoffs between Asynchronous vs. Synchronous replication?

Why is Sharding necessary?

What are some ways of implementing Database Sharding?

We’ll send a detailed solution tomorrow, so make sure you move our emails to primary, so you don’t miss them!

Previous Solution

As a refresher, here’s the last question

You are given a linked list with n nodes.

The nodes in the linked list have a next and prev pointer, but they also have a random pointer.

The random pointer points to a randomly selected node in the linked list (or it could point to null).

Construct and return a deep copy of the given linked list.

Solution

One way of solving this question would be with a hash table.

You create a hash table mapping the nodes from the old linked list to the nodes of the deep copy linked list. The keys in the hash table would be the nodes of the old linked list and the values would be the respective nodes in the deep copy.

You iterate through the given linked list and, for each node, you

  1. Check if the current node has a mapping in the deep copy. If not, then create a node in the deep copy that is a copy of the current node. Create the mapping in the hash table.

  2. Check if the next node has a mapping in the deep copy. If not, then create a node in the deep copy that is a copy of the next node and create the mapping in the hash table. Then set the next pointer of the current node’s copy to the next node’s copy.

  3. Check if the random node has a mapping in the deep copy. If not, then create a node in the deep copy that is a copy of the random node and create the mapping in the hash table. Then set the random pointer of the current node’s copy to the random node’s copy.

If the next or random pointer is null, the corresponding pointer in the copy stays null.

After iterating through the entire linked list, you can check the hash table for the copy of the original head node and return that.

Here’s the Python 3 code…
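
The snippet below is a minimal sketch of the hash table approach rather than a definitive implementation. It assumes a hypothetical Node class with val, next and random fields (the prev pointer from the question is left out to keep the sketch short), and the helper name get_copy is ours.

```python
class Node:
    """Hypothetical node type; the field names are assumptions."""

    def __init__(self, val, next=None, random=None):
        self.val = val
        self.next = next
        self.random = random


def copy_random_list(head):
    """Return a deep copy of the list starting at head, using a hash table."""
    if head is None:
        return None

    old_to_new = {}  # maps each original node to its copy

    def get_copy(node):
        # Look up the copy of node, creating and recording it on first sight.
        if node is None:
            return None
        if node not in old_to_new:
            old_to_new[node] = Node(node.val)
        return old_to_new[node]

    cur = head
    while cur is not None:
        copy = get_copy(cur)                 # step 1: copy of the current node
        copy.next = get_copy(cur.next)       # step 2: copy of the next node
        copy.random = get_copy(cur.random)   # step 3: copy of the random node
        cur = cur.next

    return old_to_new[head]
```

Each original node is visited once and gets exactly one entry in the dictionary, which is where the linear time and space come from.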

The time and space complexity are both linear.

However, this isn’t the optimal way of solving the question. We can actually do it without using any additional data structure (other than the new deep copy linked list we’re constructing).

The way we do it is by augmenting the original linked list.

We’ll first iterate through the original linked list and, for each node cur, create a new node in between cur and cur.next.

This new node has the same value as cur and is meant to be cur’s copy in the deep copy linked list. We’ll ignore the random pointer for now.

We go through the entire linked list and create the new nodes.

So, if we had a linked list that looked like

1 -> 2 -> 3 -> 4

It will now look like

1 -> 1 -> 2 -> 2 -> 3 -> 3 -> 4 -> 4

Now, we iterate through the augmented linked list and set the random pointers for the new nodes we created.

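Since the deep copy node will be immediately after the original node, we know that the deep copy node for the node pointed at by our random pointer will also be immediately after that node. In other words, for each original node cur whose random pointer isn’t null, we set cur.next.random to cur.random.next.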

After setting all the random pointers, we’ll have to iterate through the augmented linked list again.

Now, we’ll be breaking the linked list up into two linked lists. One linked list will be all the original linked list nodes and the other linked list will be the deep copy nodes.

Now, we can return the deep copy linked list.

Here’s the Python 3 code.
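
What follows is a minimal sketch of the three passes described above, not necessarily the exact code from the original solution. It assumes a hypothetical Node class with val, next and random fields (the prev pointer from the question is left out), and the function name is ours.

```python
class Node:
    """Hypothetical node type; the field names are assumptions."""

    def __init__(self, val, next=None, random=None):
        self.val = val
        self.next = next
        self.random = random


def copy_random_list_interleaved(head):
    """Return a deep copy of the list without using a hash table."""
    if head is None:
        return None

    # Pass 1: insert a copy of each node right after the original.
    cur = head
    while cur is not None:
        cur.next = Node(cur.val, next=cur.next)
        cur = cur.next.next

    # Pass 2: the copy of cur.random sits right after cur.random.
    cur = head
    while cur is not None:
        if cur.random is not None:
            cur.next.random = cur.random.next
        cur = cur.next.next

    # Pass 3: split the interleaved list back into originals and copies.
    copy_head = head.next
    cur = head
    while cur is not None:
        copy = cur.next
        cur.next = copy.next  # restore the original's next pointer
        copy.next = copy.next.next if copy.next is not None else None
        cur = cur.next

    return copy_head
```

The original list is restored to its starting shape by the third pass, and the only extra memory used is the copy itself.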
