The past chapters we have discussed all involve efficient algorithms with a singular objective: shortest path, minimum spanning tree, maximum flows, etc. We consider them efficient because their time requirement grows as a polynomial function of the size of the input.

Consider what we are truly doing — we are finding a solution from among an exponential population of possibilities. Indeed, $n$ boys can be matched with $n$ girls in $n!$ different ways, and a graph with $n$ vertices has $n^{n-2}$ spanning trees. All these problems could, with enough time, be solved in exponential time by checking every candidate solution. But an algorithm with a running time of $2^n$ or worse is all but useless in practice, and the quest for efficient algorithms is about finding clever ways to reduce the time needed in this exahustive search, using clues from the input to dramatically narrow the search space.

But while this quest's successes, such as greedy algorithms and linear programming that we have seen in this book, are able to achieve this goal, it also has its failures as well. We will now observe some search problems with no apparent shortcut, and the fastest algorithms we know for them remain exponential.

Satisfiability

Satisfiability, or SAT, is a canonical hard problem with great practical importance. We first discussed this problem in the Greedy Algorithms chapter, specifically, the topic of Horn formulas.

An instance of SAT looks like this:

$$ (x\vee y\vee z)(x\vee\bar{y})(y\vee\bar{z})(z\vee\bar{x})(\bar{x}\vee\bar{y}\vee\bar{z}) $$

This is a Boolean formula in conjunctive normal form **(CNF), a collection of clauses (items in individual parentheses), each consisting of the disjunction of several literals, which is either a Boolean variable or its negation. A **satisfying truth assignment ****is an assignment of FALSE or TRUE values to each variable so that every clause contains a literal whose value is TRUE. The SAT problem asks that given a Boolean formula in CNF, either find a satisfying truth assignment, or else report that none exists.

In the above example, it is not difficult to see no such truth assignment exists, because the middle three clauses constrain all three varables to have the same value. But how would one decide this for a more general formula, one with $n$ variables, and thus with $2^n$ possible assignments?

SAT is a typical search problem — given an instance $I$ (some input data specifying the problem at hand), find a solution $S$ (an object that meets a particular specification), or report that no such solution exists. A search problem must have the property that any proposed solution $S$ to an instance $I$ can be quickly checked for correctness, or in other words, there exists a polynomial-time algorithm that takes in $I$ and $S$ and decides whether $S$ is a solution to $I$.

<aside> 📘 A search problem is specified by an algorithm $\mathcal{C}$ that takes an instance $I$ and a proposed solution $S$ as inputs, and runs in time polynomial in $|I|$. We say $S$ is a solution to $I$ if and only if $\mathcal{C}(I,S)$ is true.

</aside>

Given the importance of SAT, researchers have tried hard to find the most efficient way to solve it. However, the fastest algorithms remain exponential on their worst case inputs. Interestingly, there are two variants of SAT for which we do have good algorithms.

If all clauses contain at most one positive literal, then the Boolean formula is called a Horn formula, and a satisfying truth assignment can be found by the greedy algorithm we saw previously. Alternatively, if all clauses have only two literals, then graph theory comes into play, and SAT can be solved in linear time by finding the strongly connected components of a particular graph constructed from the instance. In fact, we will see later there is a different polynomial algorithm for this case called 2SAT.

On the other hand, if we allow clauses to contain three literals, the resulting 3SAT problem once again becomes difficult to solve.

Travelling Salesman Problem

In the traveling salesman problem, we are given $n$ vertices $1,\cdots,n$ and all $\frac{n(n-1)}{2}$ distances between them, as well as a budget $b$. We are then asked to find a tour, a cycle that passes through every vertex exactly once, of a total cost $b$ or less — or to report that no such tour exists.

In other words, we seek a permutation $\tau(1),\cdots,\tau(n)$ of the vertices such that when they are toured in this order, the total distance covered is at most $b$:

$$ d_{\tau(1),\tau(2)}+d_{\tau(2),\tau(3)}+\cdots+d_{\tau(n),\tau(1)}\leq b $$