☕ Google AI

We talk about self-driving cars in China, Google's new AI product for businesses, peer-to-peer networks and a medium-difficulty interview question on linked lists.

Good Morning Planet Earth!

Hope you’re all having a fantastic day!

Let’s get to the Interview Question, Daily News and Previous Solution!

Interview Question

You are given a linked list.

Sort the nodes in the list in ascending order and return the sorted linked list.

Do this in O(n log n) time and O(1) space.

Industry News

Driverless car startup Pony.ai raises $267 million at a $5.3 billion valuation

Pony.ai is a Chinese driverless car startup founded by James Peng and Tiancheng Lou. James Peng was formerly chief architect at Baidu (China’s Google) and Tiancheng Lou worked on Google X’s autonomous car project. Pony.ai is one of the few companies to have received an autonomous vehicle testing permit in Beijing.

The company uses a full-stack hardware platform, PonyAlpha, which combines LIDAR, radar and cameras. Last October, it partnered with Hyundai to launch BotRide, its second public robo-taxi service after a pilot program in Nansha, China. BotRide allowed users to hail autonomous Hyundai Kona electric SUVs, although the cars had human safety drivers behind the wheel.

Pony.ai’s main competition in China is Baidu. Baidu is developing their own autonomous vehicles called Apollo Auto. Other competitors include Didi (China’s Uber) and AutoX (backed by Alibaba).

The company has now announced a $267 million round of funding that values it at over $5.3 billion. Investors include Sequoia Capital China, Fidelity China and ClearVue Partners. Enthusiasm for driverless cars clearly has not waned during the pandemic!

Google launches Document AI

Only 18% of companies consider themselves to be paperless, and companies spend an average of $20 to file and store a single paper document. U.S. companies waste a total of $8 billion every year on managing physical documents, and the challenges of handling them result in an estimated 20% productivity loss.

Google is launching a new product, aptly called Document AI, to tackle this problem (and make some money!). DocAI provides a full suite of tools that use Optical Character Recognition to read physical documents and store them on Google’s servers. It then offers a host of tools that allow you to parse, split, modify or copy your online paperwork via the DocAI API. Users don’t need to do any training or data mapping.

Google’s parsers can extract and classify data like addresses, account numbers, signatures, supplier names, payment terms, etc. Google also has specialized parsers for specific documents such as W-2, 1040 and W-9 forms, from which they can extract extremely specific details.

Document AI is part of the Google Cloud Platform and is Google’s latest offering to entice businesses to switch over to GCP from AWS or Azure.

Cloud computing is very profitable: AWS (Amazon Web Services) has a 30.5% operating margin, and around 57% of Amazon’s operating income comes from AWS!

Previous Solution

As a reminder, here’s the previous question

What are Peer to Peer networks?

In what kind of scenarios do they come in handy for System Design?

Can you give an example of a system where you might use a Peer to Peer network?

Solution

In a traditional network, you have a client/server setup, where a client will request files from the server.

This works great for the majority of use cases, but what happens when you have thousands of clients all requesting a specific item from a server? What if the item is a large file (several gigabytes in size)?

The server quickly becomes a bottleneck and you’ll have to add more servers. However, this is an inefficient way of scaling your system. What if you could design your system so that your clients shared the item with each other, with each client also acting as a server?

This is what Peer to Peer Networks do. Rather than having a bunch of clients and a bunch of servers, they have a bunch of peers.
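
To put rough numbers on the bottleneck described above, here is a small back-of-the-envelope sketch. It uses the standard lower bounds on distribution time for the two models. Every number in it (file size, bandwidths, client counts) is a made-up assumption for illustration, not a measurement.

```python
def client_server_time(n_clients, file_bits, server_up, client_down):
    """Lower bound (seconds) for one server to send the file to every client."""
    return max(n_clients * file_bits / server_up, file_bits / client_down)


def p2p_time(n_peers, file_bits, server_up, peer_up, peer_down):
    """Lower bound (seconds) when every peer also uploads to other peers."""
    total_upload = server_up + n_peers * peer_up
    return max(
        file_bits / server_up,               # the original server sends each bit at least once
        file_bits / peer_down,               # each peer must download the whole file
        n_peers * file_bits / total_upload,  # every copy has to be uploaded by someone
    )


# Hypothetical numbers, chosen only for illustration.
FILE_BITS = 4 * 8 * 10**9   # a 4 GB file
SERVER_UP = 10**9           # 1 Gbit/s server uplink
PEER_UP = 20 * 10**6        # 20 Mbit/s upload per client
PEER_DOWN = 100 * 10**6     # 100 Mbit/s download per client

for n in (10, 1000):
    cs = client_server_time(n, FILE_BITS, SERVER_UP, PEER_DOWN)
    p2p = p2p_time(n, FILE_BITS, SERVER_UP, PEER_UP, PEER_DOWN)
    print(f"{n} clients: client/server >= {cs:.0f}s, P2P >= {p2p:.0f}s")
```

Under these assumptions, the two models are tied at 10 clients, but at 1,000 clients the client/server bound is about 32,000 seconds while the P2P bound is roughly 1,500 seconds, because every new peer brings its own upload bandwidth with it.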

Peer to Peer Networks typically have a tracker that contains a list of all the peers in the network. This allows computers that want the file to easily join the P2P network and get a list of all the nodes. Finding the other peers in the network is known as Peer Discovery.
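
As a rough illustration of the tracker idea, here is a minimal in-memory sketch. The `Tracker` class, its `announce` and `leave` methods, and the peer addresses are all hypothetical names made up for this example. A real tracker would sit behind a network endpoint and expire peers that stop checking in.

```python
class Tracker:
    """Toy in-memory tracker: maps a file ID to the peers sharing it."""

    def __init__(self):
        self._swarms = {}  # file_id -> set of peer addresses

    def announce(self, file_id, peer_addr):
        """Register a peer for a file and return the other known peers."""
        swarm = self._swarms.setdefault(file_id, set())
        others = sorted(swarm - {peer_addr})
        swarm.add(peer_addr)
        return others

    def leave(self, file_id, peer_addr):
        """Remove a peer from a file's swarm (no-op if it was never there)."""
        self._swarms.get(file_id, set()).discard(peer_addr)


tracker = Tracker()
print(tracker.announce("big-file.iso", "10.0.0.1:6881"))  # [] (first peer, nobody else yet)
print(tracker.announce("big-file.iso", "10.0.0.2:6881"))  # ['10.0.0.1:6881']
```

A new peer only needs to know the address of the tracker. One `announce` call both adds it to the network and tells it who else is there.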

Now, the download progress on each peer will differ depending on how long the peer has been in the network. Therefore, Peer Selection becomes an interesting question. How can we figure out which peers in the network have the data that we need to finish downloading our file?

Many P2P networks solve this with a DHT (Distributed Hash Table). A DHT is a key/value store that is spread across the peers themselves, and in this setting it records which peers hold which pieces of data.
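
Here is a toy sketch of that idea, assuming a fixed set of nodes and a simple hash-modulo rule for deciding which node owns a key. The `ToyDHT` class and the piece and peer names are hypothetical. Real DHTs such as Kademlia route by key distance rather than a modulo, and they cope with nodes joining and leaving, which this sketch ignores.

```python
import hashlib


class ToyDHT:
    """Key/value store whose keys are spread over several nodes by hash."""

    def __init__(self, node_names):
        # Each node holds only its own slice of the overall table.
        self.nodes = {name: {} for name in node_names}
        self._names = sorted(node_names)

    def _owner(self, key):
        """Pick the node responsible for a key (simplified: hash modulo node count)."""
        digest = hashlib.sha256(key.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def add_holder(self, piece_id, peer_addr):
        """Record that peer_addr has a copy of piece_id."""
        table = self.nodes[self._owner(piece_id)]
        table.setdefault(piece_id, set()).add(peer_addr)

    def holders(self, piece_id):
        """Return the peers known to hold piece_id."""
        return set(self.nodes[self._owner(piece_id)].get(piece_id, set()))


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.add_holder("big-file.iso/piece-0", "10.0.0.1:6881")
dht.add_holder("big-file.iso/piece-0", "10.0.0.2:6881")
dht.add_holder("big-file.iso/piece-1", "10.0.0.2:6881")
print(sorted(dht.holders("big-file.iso/piece-0")))  # ['10.0.0.1:6881', '10.0.0.2:6881']
```

A peer that is missing a piece hashes the piece ID, asks the node that owns that key, and gets back the peers it can download from. No single machine has to store the whole table.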

The main use case for P2P networks (as suggested earlier) is sharing a large file between many different computers in a network.

A real-world example of this use case is Docker images. Sharing Docker images between all the servers in your data center can be extremely tedious using the traditional client/server model. This becomes even more of a pain if you want to modify the Docker images.

Uber solved this problem by building Kraken, an open source P2P-powered Docker registry that allows for easy Docker image management, replication and distribution.

Another real-world use case is BitTorrent, a communication protocol for P2P file sharing that allows users to distribute files across the internet. The BitTorrent protocol is one of the most common protocols for transferring large files. In 2019, BitTorrent generated 27% of upstream traffic for the entire internet.

Thanks for reading and be sure to reply back with any feedback!

Best,

Arpan