oscarbonilla.com
25Nov/087

Simple Guide to Complexity Theory

After reading Jeff's terribly misguided post about NP-completeness I thought to myself: "If Jeff, who seems to be completely clueless about complexity theory, can write a blog post about it then so can I."

So without further ado, here's ob's complex guide to simplifying complexity theory.

First of all let's get the definitions out of the way.

Turing machine: An idealized computer, with infinite memory, an infinitely fast cpu, and unlimited bandwidth. It's a mathematical model of computer so it can kick your Core 2 Duo's sorry little ass faster than you can say Pentium. (update: I originally wanted to make the point that bandwidth and cpu speed are irrelevant, but it seems to have confused more than clarified.)

Problem: In complexity theory, a problem is really a decision problem. It has a simple yes/no answer. You don't ask "take this list and sort it and print the last ten items" that's not a decision problem. You do ask "does this program, given this input, produce the last ten items of the sorted input?".

(update: Although the sort problem as outlined here would be a valid decision problem, it is confusing since just changing my "last ten items" to "last C items" where C is a function of n would make it not be a decision problem. Better to use some other example, like "is the number given prime?")

n: Size of the input for the turing machine. E.g. if your turing machine is sorting strings, like in the example above, n is the number of symbols that it takes as input.

T(n): Number of steps (i.e. instructions) that a turing machine must execute to solve a problem with input of size n.

Polynomial Time: When T(n) can be expressed as an equation of n, and possibly some constants, using only addition, subtraction, multiplication, and constant non-negative whole number exponents. E.g. n² is a polynomial while 2ⁿ is not.

Non-deterministic turing machine: A turing machine that can, at every branch point, take all possible branches in parallel. We call this guessing the optimal answer because, since the turing machine takes all branches, it must take the optimal branch too. If any of the parallel turing machines finds an answer, the non-deterministic turing machine stops and gives that answer (this means that it's ok for some branches to never end).

So take your already idealized computer from above and make it able to reproduce itself on every branch point. I.e. it can follow all possible decisions in parallel.

P: The class of problems that can be solved by a turing machine in polynomial time.

NP: The class of problems that can be solved by a non-deterministic turing machine in polynomial time.

Reducing a problem to another: Imagine you have two problems, P1 and P2. You know something about P1 (e.g. that it belongs to NP). Now you want to prove some property of P2, for instance, that it belongs to NP as well.

One technique is to show that if we could solve P2 in polynomial time (i.e. P2 belongs to P), we could use that answer to solve P1 in polynomial time. This would mean that P1 belongs to P which is a contradiction. This technique is called reductio ad absurdum. (update: I completely botched this and commenter Bibi correctly called me on it. Here's an attempt to clarify a reduction: if you want to solve an instance of P2, you convert it to an instance of P1, and then solve P1 to get an answer. Thus, you've reduced P2 to P1. How's that?)

NP-complete: We say a problem is NP-complete if it satisfies two properties: 1) it belongs to NP, and 2) every other problem that belogs to NP can be reduced, in polynomial time, to it.

This means that if we come up with a solution for an NP-complete problem that runs in polynomial time in a deterministic turing machine (i.e. belongs to P), any other problem in NP can be solved in polynomial time by first reducing it (again, in polynomial time) to this problem, and using our solution. We would have proved that P = NP and we would have also won (pinky to corner of mouth) one million dollars.

NP-hard: A problem P1, that is so hard, that even though we can prove number two above, we can't prove number one. This means we know P1 to be just as hard as anything in NP and possibly harder, but we don't know if P1 is in NP or not. P1 is very likely to require exponential time.

(update: I stand by my definition of NP-hard. Although not being able to prove #1 is a sufficient condition, it is not necessary. All problems in P are in NP, and yet we call them P, not NP. Thus, even if all problems that are NP-complete are also NP-hard, it's uninteresting to call an NP-complete problem NP-hard. The point of calling a problem NP-hard is that, it's really hard! Even if it turned out that P = NP, an NP-hard problem might still be hard!)

So there you have it, go impress your friends. I hear that "hey, baby! your love is NP-hard to get" is a very successful pickup line at pubs around M.I.T.

Tagged as: Leave a comment
Comments (7) Trackbacks (1)
  1. I’m not sure where the “infinitely fast CPU” part came in, but that would kind of defeat the point of analyzing the time complexity of the machine. Other than that, good stuff.

  2. Time complexity T(n) refers to the number of steps it takes to solve a problem, not the amount of real time required. T(n) remains constant, regardless of the speed of the CPU.

  3. Nice overview. Except that calling a problem P where we already using P as a class of problem can be confusing.

  4. Butch, good point. I’ve changed it to P1 and bolded all the P’s and NP’s.

  5. There are some problems in the definitions.

    In Turing machine, infinitely fast CPU does not make sense, and infinite bandwidth is plain wrong: the “CPU” (head) of the TM only sees one symbol (bit) per cycle. It’s hardly possible to have a smaller bandwidth.

    In problem, the example given is not a decision problem, it’s a meta-question on the algorithm. This makes no sense. I’d advise not to use sorting as an example, as it is not easy to come up with a convincing decision problem associated with sorting. Shortest path in a graph seems better. Say “instead of returning the shortest path between A and B, tell whether they are at distance at most k”.

    Your definition of reduction is wrong. It is generally used to prove that it is NP-hard. (By reducing from another NP-hard problem). When you do that, you don’t assume that P2 is solvable in polynomial time at all. You show that you can use the solution to P2 as a solution from P1 in polynomial time, say TR(n). Then if P2 can be solved in time T2(n), P1 can be solved in time T2(n) + TR(n). So if P2 is in P, P1 too. There’s no proof by contradiction going on here at all.

    NP-hard is inexact: there’s no hypothesis that we can’t prove point 1 of NP-completeness. In particular, NP-complete problems are NP-hard.

    Folks, once you have read this post, read the wp article before you go and try to impress your friends (especially around reputables universities). You might save yourselves some embarassment.

  6. Bibi, thanks for the comments. I corrected most of the errors, although if you want to be pedantic, your statement:

    ‘the “CPU” (head) of the TM only sees one symbol (bit) per cycle. It’s hardly possible to have a smaller bandwidth.’

    is incorrect. There is no requirement that a turing machine operate on bits. The only requirement is that the alphabet contains a finite number of symbols. Thus I can define as many symbols as I want to have as much “bandwidth” as I want per step. But it seems saying “infinitely fast CPU and infinite bandwidth” was just confusing people, so I took it out.

  7. The problem i see with the definition of NP-hard you give (apart from its being non-standard) is that the statement “problem P is NP-hard” is a statement of our current knowledge, while with the standard definition, it is the statement of a mathematical truth. Maybe something like “as this notion is more general than NP-completeness, when one says a problem is NP-hard, it’s generally because we don’t know whether it’s in NP or because we know it’s not in NP” would make things clearer.


Leave a comment