by Andy Boyd
Today, we drive to work. The University of Houston presents this series about the machines that make our civilization run, and the people whose ingenuity created them.
Imagine you wake up tomorrow and find your favorite route to work is closed. Not only that, but so many roads are under construction you can't think of a good alternative. Going to the computer, you visit a website and ask for the shortest route. It responds in a fraction of a second with the answer. And if all the information's correct, it's sure to be the shortest route.
So how did the website do this? It certainly didn't try every option and pick the shortest one. Even at the speed of today's computers, such a strategy would take longer than the age of the universe for a city like Houston.
What the website did was apply a specialized optimization method; a method that can be used to solve any problem where there's a clear measure of "best."
Methods for problems like finding the shortest route to work often do just what you might -- they start with a good guess and refine it until nothing better can be found. Driving to work, you might try what seems like a good route and then improve upon it by taking shortcuts. At some point you find you just can't seem to do any better. But how do you know you've found the very best route? Perhaps there's another, shorter one that's very different from the one you've settled on. When optimization methods stop finding improvements they also provide a mathematical guarantee there's nothing better. For the shortest route problem this guarantee can be found very quickly. It's actually a clever argument that absolutely no shortcuts remain.
Not all problems are as easy as finding the shortest route to work. Take the closely-related traveling salesman problem, in which a mythical salesman seeks the shortest route to visit all of his customers in a city. While good solutions can be found in the same way we find a shortest route to work, actually showing a solution's the best is very difficult. So far a simple guarantee isn't known for the traveling salesman problem.
If the relative difficulty of these two problems seems puzzling, you're not alone. Over the last half century many of the brightest people in the world have tried to show that the traveling salesman problem is either harder than the shortest route problem or just as easy, but so far no luck. We just don't know.
In a slightly more general form, it's considered the most important problem in theoretical computer science. If the traveling salesman problem could be solved quickly, it would show that thousands of other very important problems could also be solved quickly. That could save the world economy hundreds of billions of dollars every year. On the flip side, methods for making secure internet transactions would crumble, wreaking havoc on the world's financial system. The problem's so important it even carries a million-dollar reward for its solution. Only six other mathematical problems offer such enormous rewards. But as any mathematician will tell you, it's not the money that makes the problem interesting. It's the opportunity to solve a fascinating problem that your peers agree is superbly difficult.
I'm Andy Boyd at the University of Houston, where we're interested in the way inventive minds work.
This is an updated version of episode 1897, OPTIMIZATION.
In 1962, Proctor and Gamble offered a contest for selecting the shortest route from Chicago through all the cities shown in red -- much more difficult than it first appeared.
The shortest route problem is one of a class of problems for which fast algorithms are known. "Fast" is formally defined as a problem that can be solved within a guaranteed number of computational steps that is dependent on the problem size. It's fair to let the algorithm take longer for Houston than for Terlingua, Texas, whose population is only 267.
Specifically, if S represents the problem size - which for our purposes we can think of as a list of the intersections, road segments, and their lengths - and, if there is an algorithm A and a polynomial function f(S) such that the number of steps required to solve the problem with A is no more than f(S), we say the problem belongs to the class P. P actually stands for "polynomially solvable," and the algorithm A is called "polynomial time."
There are actually many polynomial time algorithms for solving the shortest route problem. Dijkstra's algorithm is guaranteed to find a shortest route in no more than O(n*n) steps, where n is the number of intersections. The notation O(n*n) means that the computational time is dominated by the term n*n for large n. The Bellman-Ford algorithm is guaranteed to find a shortest route in no more than O(n*m) steps, where m is the number of road segments. These expressions are polynomial functions of the problem size.
Both the shortest route problem and the traveling salesman problem belong to another class of problems called NP. To be in the class NP, when a problem is a "yes instance," there must exist a certificate that verifies this in polynomial time.
For example, "is there a route to work that is no longer than 15 miles" is a problem in NP, since if there is such a route it's easy to verify -- simply check that its length is no longer than 15 miles. The route is the certificate. Similarly, "is there a traveling salesman tour that is no longer than 100 miles" is also a problem in NP.
Note that being in NP does not require a quick way to find a certificate, only that if the problem is a yes instance then there exists a certificate that can be verified in polynomial time. Further, being in NP does not require that if the problem is a "no instance" that this can be verified in polynomial time. If all traveling salesman tours are longer than 100 miles, how is this demonstrated? One way is to show that each and every tour is longer than 100 miles, but this is not a polynomial time-verification procedure.
On the other hand, if all routes to work are longer than 15 miles it is possible to verify this in polynomial time -- run the polynomial time algorithm that finds the shortest route, and if it's longer than 15 miles you're done. This is possible because the shortest route problem is in P. Note that if the problem is a "yes instance," that is, if there is a route no longer than 15 miles, this can also be verified by running the polynomial time algorithm that finds the shortest route. So the fact that a problem is in the class P is sufficient to demonstrate that it is also in NP; that is, the set NP contains the set P.
We thus come upon the big question: is every problem in NP also in P, in which case P = NP, or is NP a strict superset of P? This is a concise, abstract statement of the big question, but it is important to realize it is stems from a fundamentally practical problem. Problems in P are generally easy to solve to optimality, whereas there are many, many practical problems that we know are in NP but as yet we don't know if there exist polynomial time algorithms that will solve them. The definition of NP may seem a bit abstract, but it was, in fact, defined so as to capture an enormous number of very practical problems.
Algorithmic complexity theory goes further by showing that there are problems in NP that are NP-complete. Loosely speaking, a problem is NP-complete if it is in NP and at least as hard as any other problem in NP. More correctly, if a polynomial time algorithm can be found for an NP-complete problem, thus showing it is in P, then every problem in NP has a polynomial time algorithm and P = NP. The traveling salesman problem is NP-complete, so a polynomial time algorithm to solve it would prove P = NP. Interestingly, most known problems in NP are NP-complete. Because the proof methodology of finding a polynomial time algorithm for a problem is so straightforward, and because there are so many NP-complete problems, you regularly see proposed proofs that P = NP. So far, they have all been wrong.
The definitions of the classes P and NP are really quite different, so it might very well be expected that P is not equal to NP. This is further supported by the fact that no polynomial time algorithm has ever been found for an NP-complete problem even though many brilliant people have tried. Thus, the common conjecture is that P and NP are not identical. However, a proof of this fact has not been forthcoming. The conjecture is so widely held that a vast number of mathematical papers have been published based on the assumption P is not equal to NP. If by some chance it turns out that P = NP, we have wasted a lot of time and paper.
The million-dollar prize for resolving whether or not P = NP is sponsored by the Clay Mathematics Institute. See www.claymath.org/millennium/ for more information.
This episode was first aired on July 14, 2016