My new job at Lyft

In October, I joined Lyft as Data Science Manager for Core Mapping. I wish I could have posted this update a while ago, but a big event got in the way (yes, we went public)...

Although it has low visibility, Mapping turns out to be a big deal for ride-sharing because it influences a lot of other services. In a nutshell, Mapping has an impact on pricing, driver dispatch, scheduled rides and customer experience. It is also usually the biggest friction point for seamless pickups.

Why is Mapping important for a ride-sharing company?

Mapping traditionally has four components: Basemap (representation of the world as a graph), Locations (where are drivers and passengers?), Routing (optimal paths between locations) and ETA (distance and time between locations).

When you open your app, you see the following screen:

Much of the information displayed is connected to my team's work:
  • The surrounding physical world: road segments, (train) stations, points of interest (POI)
  • The pick-up ETA (3min) looks at drivers around the pin and compares them to demand. This gives an estimate of how fast we can get a car to you
  • The drop-off ETA (10:03) estimates when we can get you to your destination
  • The price ($36.66), which uses a lot of indicators, including drop-off ETAs, to make the ride fair for both parties: riders and drivers
  • The polyline (purple) showing the route to your destination

Once you click on "Request", the application will dispatch a driver to you. This decision is central to the efficiency of the platform and is therefore very sophisticated. Getting a driver to you as soon as possible (using our ETAs) is critical to the best user experience. There are usually two ways of solving this: the greedy way (dispatching the closest driver) and the optimal way (solving dispatch as a global minimization problem).
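To make the difference between the two approaches concrete, here is a minimal sketch in Python. It is not how Lyft dispatch works: the ETA numbers are made up, the function names are hypothetical, and the "optimal" solver simply tries every matching, which is only workable for a handful of drivers. It also assumes as many drivers as passengers.

```python
from itertools import permutations

# Hypothetical pickup ETAs in minutes (made-up numbers):
# eta[d][p] is the time for driver d to reach passenger p.
eta = [
    [2, 3],  # driver 0
    [4, 9],  # driver 1
]


def greedy_dispatch(eta):
    """Serve passengers in order, giving each the closest driver still free."""
    free = set(range(len(eta)))
    assignment = {}
    for p in range(len(eta[0])):
        d = min(free, key=lambda driver: eta[driver][p])
        assignment[p] = d
        free.remove(d)
    return assignment


def optimal_dispatch(eta):
    """Try every one-to-one matching and keep the smallest total ETA."""
    n = len(eta)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(eta[perm[p]][p] for p in range(n)),
    )
    return {p: best[p] for p in range(n)}


def total_eta(eta, assignment):
    """Sum of pickup ETAs over all (passenger -> driver) pairs."""
    return sum(eta[d][p] for p, d in assignment.items())


print(total_eta(eta, greedy_dispatch(eta)))   # greedy total
print(total_eta(eta, optimal_dispatch(eta)))  # optimal total
```

In this toy case the greedy way gives passenger 0 the closest driver and leaves passenger 1 with a 9-minute pickup (11 minutes in total), while the global view swaps the drivers and brings the total down to 7 minutes. At real scale the exhaustive search would be replaced by a proper assignment solver, for example the Hungarian algorithm.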

Exciting problems in Core Mapping 

What are some of the most interesting problems in Mapping? This is a broad question, so I will sample just a few:

How does driver location influence ETA?
We collect GPS signals from our drivers by streaming them to our services. There is a chance that a signal is wrong: for instance, snapping the driver to the wrong road segment often occurs in urban canyons.

In the example below, what if the driver is falsely matched to Bryan Street but is actually about to enter I-80? Dispatching this driver would arguably be disastrous: the driver getting on I-80 may have to drive all the way to Oakland and back! A small map match variation (in distance) has a non-linear relationship with ETA.
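Here is a small sketch of why this happens, assuming the simplest possible map matcher: snap the GPS point to the nearest road segment. The segment geometry below is invented (two parallel segments 20 metres apart in a flat x/y frame), and the function names are hypothetical. A production map matcher would typically also use heading, speed and the trajectory history.

```python
import math

# Hypothetical road segments in a local flat x/y frame (metres).
# The geometry is invented for illustration, not real map data.
SEGMENTS = {
    "Bryan Street": ((0.0, 0.0), (100.0, 0.0)),
    "I-80 on-ramp": ((0.0, 20.0), (100.0, 20.0)),
}


def distance_to_segment(point, segment):
    """Euclidean distance from a point to a line segment."""
    px, py = point
    (ax, ay), (bx, by) = segment
    dx, dy = bx - ax, by - ay
    # Projection parameter of the point onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def snap(point, segments=SEGMENTS):
    """Return the name of the segment closest to the GPS point."""
    return min(segments, key=lambda name: distance_to_segment(point, segments[name]))


# Two GPS fixes only 2 metres apart land on different roads.
print(snap((50.0, 9.0)))
print(snap((50.0, 11.0)))
```

The two fixes differ by 2 metres, well within typical GPS noise, yet the first snaps to Bryan Street and the second to the on-ramp. The map match moves by a couple of metres while the resulting ETA can jump by many minutes.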

How does ETA influence Dispatch?

To provide the best user experience, dispatch uses pairwise (DVR,PAX) ETAs, one for each driver and passenger pair, to make the best Supply <==> Demand match. In the ideal state, dispatch minimizes the SUM(duration) across all dispatches. However, we only measure the dispatches that occurred and have little information about those that did not materialize. Gaining a better understanding of what leads to a "bad dispatch" (whether false positive or false negative) is a central topic for my team and for Lyft!

As Lyft speeds up into 2020, Mapping will become an even more important topic, for example for multimodal transportation. Reach out to me if you want to join the team!

