Overview

Meili is a namespace within valhalla which is responsible for providing map matching functionality in the library. This scope of this functionality is essentially limited to the approach defined in the siminole microsoft paper outlining the technique: https://www.microsoft.com/en-us/research/publication/hidden-markov-map-matching-noise-sparseness/

The important thing to note here is that Meili is not responsible for packaging the results of the map match into a route path as defined in our primary routing module Thor. Thus there is a non-trivial amount of effort to take the output of Meili and convert it to the expected format.

Meili Code Layout

AppendMeasurements

The first step Meili does is decide which tracepoints its going to map match. You might think that it map matches every point (and it does) but it doesnt actually do the routing computation using every point if it can avoid it. We allow specifying an interpolation distance which groups close together points such that only one will be used and the rest will be interpolated onto the map match after it is performed. This not only speeds up the code but it also avoids problems with GPS jitter of stationary objects. This work is done in the function AppendMeasurements which append tracepoints into what can be thought of as a matrix.

Whats in this matrix? For every input point that AppendMeasurements decides it will use in the routing calculations a column is made in the matrix. Each column can have 1 or more rows but they will not all have the same number of rows. Each row represents an edge candidate for a given trace point. An edge candidate is a snap point along an edge in the graph within the radius of the input trace point.

Model

The above is an image of the 4 magenta edge candidates for an input trace point with a 50m radius. AppendMeasurements eventually calls down into CandidateQuery::Query to get the list of candidates for a given trace point. It should be noted that CandidateQuery provides the same functionality Loki::Search does however there are some key differences. The main difference is that CandidateQuery keeps a fine resolution in-memory spatial index/cache of route network geometries. What this means is that it can have much higher throughput (once the cache is warm) than loki for high numbers of points. This is a key difference between map matching and routing use-cases. In routing we dont expect 1000s of way points but GPS traces are frequently reported at 1hz. Which means a 15 minute trace is already close to 1000 points. In the future we'd like to remove CandidateQuery and replace it with functionality from Loki::Search but at this time performance considerations keep us from doing so.

Viterbi

Viterbi is a dynamic programming algorithm used to find paths (eg. Markov chains) through a hidden state diagram such as a hidden Markov model. Read more about it here: https://en.wikipedia.org/wiki/Viterbi_algorithm. Within Meili we use the algorithm to determine the highest probability map match while doing the least number of routing calculations possible. You can think of the matrix described above as a series of nodes in a state diagram. Each row in each column has a connection to each row in the prevoius and next column in the diagram. We refer to these unique column/row pairs as States through out the code and each is given a StateId which represents which column (0..n) and which candidate (0..m) it refers to. In the figure below candidates 0 1 and 2 are in column 0 whereas candidates 3, 4, 5 and 6 are in column 1.

Model

ViterbiSearch is run by iterating over pairs of columns and running routes between their candidates (rows). While doing so we measure two metrics, emission and transition cost. These costs are inversely related to the probabilities in the hidden markov model. Emmission cost is simply a measure of how close to the road network a given candidate is where as transition cost is a function of network distance along the route between two candidates in the graph.

At the end of the viterbi search a highest probability (lowest cost) path, ie a sequence of StateIds, is returned. There may or may not be sections of the path where no route was possible between a given pair of States or even between columns (all pairs of candidates in 2 adjacent columns).

Match Points

After a path through the state diagram as been computed we need to get the first part of the output from Meili which is to say which candidates were used in the final path. To do that we call FindMatchResults which loops over the states and an returns a vector of MatchResult objects. The MatchResult contains information about which edge it matched to, what the lat,lon is once snapped to the edge, the distance along the edge as a percentage among other things. This is useful because the final service APIs mapmatching output contains this metadata, either indirectly via a origin/destination location or directly in trace_attributes or the osrm flavored output.

Interpolation

The MatchResults that we got back only included those points which AppendMeasurements deemed necessary for the routing calculation. That means that for some of the input points, as mentioned earlier, we didnt actually attach a column to them or any states. This is kind of what it looks like in an ascii diagram:

p1p2------p3-------p4p5p6-----------p7--------------p8-----------p9p10

In the above {p2, p5, p6, p9} are all interpolated because they are close in distance to a prior point or in p9s case close to the last point which cannot be interpolated. So what we do next is we loop over pairs of states again and if between those two states there were unused (ie interpolated) input points then we project each of those points in succession onto the route between them. Using the example above, if we found a route between p4 and p7, then we would project p5 and p6 onto this route geometry to compute their MatchResults. Interpolation also gaurantees that sequence ordering remains unchanged, ie p4 comes before p5 comes before p6 comes before p7 in the final route geometry.

Route Building

Next we get the second part of Meil's output which is the actual path through the graph that was taken. We compute this with a call to ConstructRoute. Construct route uses the States to get at sequences of edges stored as a collection of EdgeLabels in a LabelSet. The LableSet contains all the edge labels a graph expansion saw. For a given State we know which was the last label it saw when it reached the destination State. So to get the route out we simply follow the chain of EdgeLabels back to the origin State. Its like a linkedlist but without pointers, instead it uses indices into the LabelSet. The MergeRoute function is responsible for this recovery of a path between two states.

From these EdgeLabels we make a vector of EdgeSegment objects, which is the final object that is returned. The EdgeSegment stores information about which edge in the graph it is, how much of it was used and what was the first and last MatchResult that got matched onto this EdgeSegment. It also contains information about whether or not there is a discontinuity after this EdgeSegment. As described before a discontinuity occurs when no paths for any candidates can be found from between two columns.

During this process we also take care to cut EdgeSegments where a MatchResult is marked as a break or break_through location. These types of locations denote where we want to have route legs in the final output. So that we don't have to break them down later in serialization, we make sure to split edges when constructing the route.

Alternatives

Meili supports the notion of alternatives. The API calls this "best_paths" but should be refactored to use "alternatives". What this allows a user to do is get the top k most probable paths with a couple caveats.

The first caveat is that if there is a discontinuity in one of the results we will not return any more results after that. So if the first match had a discontinuity you'll only get 1 result even if you asked for 2.

The second caveat is that redundant paths are not returned. What is a redundant path? Its a path that has already been seen in a previous result. Technically this means that the sequence of EdgeSegments has already been seen. Why would this happen? It is common that a MatchResult has two candidates, but no matter which one is used, the sequence of edges in the path remains the same. Basically the MatchResults may move around but the path is the same. This commonly occurs at intersections where one candidate may be some distance along the edge and the alternative candiate will be at the end of the previous edge. Either way both edges are on the path, the path didn't materially change.

Complications

The biggest complication in map matching by far is that of dealing with node snapped candidates. That is to say when an input trace points closest point on the graph is a node in the graph connecting 2 or more edges. This type of candidate presents 2 problems. The first is one of performance. You could say, no problem we'll just put a candidate for every edge at that node, but this is very wasteful as each candidate added increases the number of permutations of routes that viterbi will cause to be computed. So instead of doing that we have special logic in the router there handles node snapped candidates as a single candidate. The second problem with node candidates is that of ambiguity. Because a node does not refer to an edge but all of our other datastructures do (MatchResult and EdgeSegment) we end up needing to put a bunch of special case logic to patch things up if they are happening at a node. Specifically when we have a trace point that is the split between two legs in a route, we end up having the end of one leg on one edge and the beginning of the next leg on another edge and using the same MatchResult (which only stores one edge reference) means that it disagrees with one of the legs.

Thor Contract

Meili itself does not have an external API. It kind of used to but has since been refactored to be accessed via the rest of our routing APIs. What this means is that it must fulfill the same contract that Thor does. That contract consists of a series of Locations with the chosen candidate for each Location getting filled out (in this case we translate the MatchResult into this) as well as a path which is a series of PathInfos representing the edgs on the path and their cost/duration.

The bulk of what has been described about Meili above was refering to its main entry point OfflineMatch (offline refers to the type of algorithm, see here: https://en.wikipedia.org/wiki/Online_algorithm). Thor must take the results of this function which, as described earlier, is a series of MatchResults and a series of EdgeSegments and convert them. This conversion takes place first via FormPath (every path finding algorithm implements one of these) and then via a call to TripLegBuilder::Build for each leg of the route. FormPath builds the vector of PathInfo objects from the EdgeSegments. The MatchResults are used to build origin and destination Locations. The PathInfos and Locations are then sent to TripLegBuilder::Build as with every other routing operation.