Loki

Loki can be used to associate location information to an underlying graph tile object for use in creating input to the routing engine. In keeping with the Norse mythological theme, the name Loki was chosen as a play on the word locate. Since loki deals mostly with correlating some input (minimally a lat,lon) to an object within a graph tile, this seemed like a fitting name!

Loki is essentially a set of various data structures and alogrithms which deal with things like: correlating an input location to the underlying graph, partial distance along an edge and filtering edges which shouldn't be considered for correlation.

Components

What follows are some notable components of loki.

What's it do?

The primary function of loki is to correlate the given input coordinate(s) to the underlying routing graph by searching over small portions of said graph near said input. The goal is to return information about the graph that can be used by a router to find a path in the graph from one point to another. Generally this is a set of candidate edges for each input coordinate. Additionally the actual correlated (snapped to route network geometry) coordinate is returned which can be useful for things like placing transit egress points on the route network or fun trivia stuff like dropping a point in the atlantic or a massive desert to find the closest signs of civilization to that point.

How's it work?

First, we exploit the fact that the graph is tiled and separated into levels of detail. See Why Tiles? for more detail. This is important because it lets us focus our search on specific partitions of the graph which are closest to the input coordinate. Also, note that loki is only interested in the most detailed level as it has all of the edges in the graph, even the lesser importance ones. You might be thinking ok so those tiles must be pretty darn small to make this reasonable (time/space complexity). Because the tiles are a regularly spaced grid, some will land on almost no routable edges and some will land on, well, Tokyo.

This may not seem like much of a problem at first but closer inspection reveals its very important to strike a balance between tile size and complexity concerns. The heart of the problem is that we use these tiles for several different use-cases (algorithms) and each has different needs. Performing the actual graph traversal algorithm benefits from loading larger tiles into memory so the expansion of the graph doesn't hit (load more) new tiles so frequently. At the same time you want smaller tiles so that when you need to find just a couple edges nearest to a point you don't need to consider so many edges. There are more concerns but these illustrate the point well enough.

So to settle on a size we looked at which operation dominated (in terms of complexity) with the intuition that optimizing things for that would give the most bang for the buck. That lead us to larger (for some value of large) tiles to benefit graph traversal. It did though mean that some tricks needed to be carried out on the graph correlation use-case. We're getting ahead of ourselves though. How did it work with native high detail level tile sizes? You noticed the past tense there right? Foreshadowing!

The initial implementation (lets call it v0) of this was very very simple. For a single input coordinate, loki would just open the tile that the coordinate was in, scan through all the nodes in that chunk of the routing graph and return the edges connected to the one that was closest to the input. If there were no nodes in the tile, it would return no result. As you may have guessed though, it was really fast (sub-millisecond in Tokyo). You may also have guessed that it was really annoying. The closest node is often not even connected to the correct edge and can be pretty far away from the input coordinate in sparser tiles. This brings up another point about what a good implementation of loki should do. It should find the most plausible result not just the quickest one. This sounds obvious but of course the two things are at odds.

Next we were onto the second implementation (v1 I guess?) which entailed actually looking at the edges' shapes. Now this operation is expensive... The shapes are compressed into strings via LEB128 encoding, which means to look at one we need to decompress it. This is expensive (comparatively), so its important to look at as few as possible. Here is where we run into a problem. We really want to look at all the edges that have shape near the input coordinate, but we have no idea which ones those are without looking at every edge's shape. So we did a few tricks. The first one was that since scanning through the nodes was free (again comparatively) we would do that first and if we were within a certain distance from a node we'd just return that result. This was problematic if the distance threshold wasn't set high enough because it could snap the result to a node on an opposing lane of traffic. Tune the threshold smaller and you make the likelihood of snapping to a node so low its not worth doing. The second trick we employed was using the length of an edge as a heuristic to determine whether an edge should or should not be considered (ie. have its shape decoded) with respect to a given input point. Basically if an edge was long enough to have reached the input point from one of its endpoints it would be considered. This weeded out a lot of short edges that weren't close to the input, but it had the side effect of over emphasizing larger edges nearer to the input. In other words it was not all that fast and there were lots of problems with it: finds result in wrong tile, finds no result because its in the adjacent tile, snaps to the wrong road because of node snapping, finds the wrong road because of over emphasis on longer edges (see issues: #72, #62, #52).

So we were in a bit of a terrible state... Wrong results or no results and nothing that was particularly fast (median 130ms across NYC). It was time to come back to the tile size... But honestly it was only affecting this use-case. The case in which you are using the tiles as a geospatial index to limit your search to a small geographic region. So we had an idea. Add an index to each tile at high enough resolution for it to be performant. So we hopped over to mjolnir and did it. Essentially for each tile, we cut it up into a 5x5 set of bins and for each bin we note which edges' shapes pass through it. Mjolnir is what makes our tiles, meaning we do this once at tile creation time so the index is cached with the tile (its basically appended to the end with the shape data and so forth). Another important trick we did was reference the edges in these bins in a global fashion (an edge from another tile can pass into a bin in this tile, I like to call these hairy tiles). This lets us find edges that cross tiles boundaries (one of the problems noted above). Finally we employed one more trick to make the indexes easier to use. The graph is arranged so that you can traverse an edge in both the forward and reverse direction. To do this, each edge is represented by two directed edges (one from A to B and one from B to A). Each of these directed edges has its own id. We only ever store one of these two ids in the bins the edges intersect. This cuts down on redundant shape decoding.

So what does the current implementation look like (v2?)? Well, after the indexes were added to the tiles we went to work using them. We wrote a little utility that lets you iterate (kind of like a python generator) over the bins in a closest first fashion. Basically we grab this closest bin first generator, seed it with the input point and iterate over the bins it passes us. If the bin is empty we move on, if its not, we decode the shapes of each edge in the bin and see if any edge was closer to this point than the current known closest. We terminate when either we've searched too far without finding anything or the bin we are about to look in is farther away than our current known closest edge. Since we look at bins in closest first order we shouldn't be able to find anything closer if the closest point in the bin is further than the current known closest edge.

This implementation is small, it prunes a lot of search space, it doesn't find the wrong edge (in terms of closeness), it can find things in tiles not containing the input coordinates and of course its way fast (median 4ms across NYC). It does cost in data size but it was measured as a less than %5 increase across the planet tileset.

What's next?

In terms of features there are a few things I didn't mention that the current (and previous) version of the algorithm do provide. The most important of these is probably the filtering out of edges that are not traversable by you for the mode of travel you've selected. For example, you can't drive a car on the Appalachian Trail so if you request a driving route with points along the trail, loki will filter those trail edges out because they don't provide car access. It will end up correlating your coordinates to edges closest to your input which do have car access though.

Another feature provided by the current (and previous) version of loki is that of favoring a direction of travel. This is very useful in a mobile navigation context. Say you are speeding down a two lane road when your passenger presses the 'get route' button. As previously mentioned, even if you two are smack in the middle of a long edge with no other connecting edges around, technically there are two results, the edge in the direction you are traveling and the edge in the reverse direction of travel. The thing is, you don't care about the route in the reverse direction of travel. So loki allows you to specify a favored direction of travel and if one of the directions is with in a 90 degree window of that then it will be included in the candidate set passed onto the routing algorithm.

You might be noticing a theme here. In some cases we want to reject or favor certain edges based on some criteria (usually derived from the users request parameters). And this is the first area of future work. For a given edge we essentially want to assign a probability which denotes its likelihood of being the actual edge intended in the request. At this point the probabilty is binary. It either is traversable by your mode of travel or not. It either is in the general direction you are going, or not. It either is the closest edge to your input or not. It would seem we should want to have each of these factors influence a portion of the final probability. This would allow us to add other factors in the future. For example, if the request supplied a street name with the coordinate we could give a higher weighting to edges with similar street names. This means we would want to actually look at least a little bit further than the absolute closest edge in the presence of that edge not matching the provided street name.

Another general problem is that of islands of connectivity. Without traversing the route network, one cannot say for certain that two edges are reachable from one another (for a given mode of travel). So it happens with some degree of probability that loki will correlate a pair of inputs but the routing algorithm will find no path between them. In some cases this is by design. In fact at a coarse level, if two regions are not possibly connected, think O'ahu and Mt. Everest, we won't even ask loki to correlate the coordinates to the graph. Even places that are quite close can have reachability issues which indeed should fail to find paths (unless you have a boat/hanglider/climbing gear handy). However there are a number of fairly frequently occurring cases that follow a pattern similar to the following. A user is somewhere on the road network (highly connected). User wants to go to a POI. The POI itself might be a polygon such as a park, the white house or a golf course, which will give an input coordiate as the centroid of that feature. Or it may be a point POI, such as a city center or something relatively small like an overlook on a mountain. The problem with all of these is that their location, the coordinate that will be provided as input to loki was picked so that the feature could be properly labeled on the map. In other words the coordinates are those you'd expect to use if you wanted to see where the thing is. This is often very unhelpful for the purpose of routing. Not only could routing to that label point be suboptimal (the parking for the golf course is not near its centroid), but it could actually put you on an island of limited connectivity. So that when you pass these results to the routing algorithm you won't find a path.

To tackle this issue we have a few options. We could at data creation time crawl the route network to find small islands of connectivity. We could mark the edges in these islands so that loki would know to only send them to the routing algorithm if both input coordinates were in the same island. Or we could use a multi-pass approach in which we have the routing algorithm detect when its search is trapped in an island of connectivity and send the list of edges with in back to loki as a set of edges excluded from the correlation process. That latter would seem like the best option at this point in time simply because the information needed to store and time to crawl the tiles to find these small islands of connectivity would be prohibative.

The final area for future work would be an elaboration to what was said earlier about wanting only to look a the highest detail level of route network data. One could conceive of a scenario in which a user has a route and they want to drag a portion of that route so as to force it toward a certain feature. If the route network is dense where that feature lives but the users map is zoomed out such that the user only sees certain route network edges loki should attempt to correlate to those rather than the possibly not visible edges in the area. Essentially when doing a correlation at a course zoom level we may want to exclude certain classes of edges that are unlikely to be visible to the user interacting with the map.

Benchmark

TODO: